BIOS 362: Advanced Statistical Inference (Statistical Learning)
Instructor
Teaching Assistants
Dates, Time, and Location
- First meeting: Tue. Jan. 26, 2021; Last meeting: Thu. Apr. 29, 2021
- Tuesday, Thursday 9:00AM-10:30AM
- Virtually using Zoom via Brightspace
- Office hours: By appointment, initially; a regular schedule will be determined.
- We will use the Graduate School Academic Calendar.
Textbook
The book for this course is listed below and is free to download in PDF format from the book webpage:
Hastie, Tibshirani, Friedman. (2009) The elements of statistical learning: data mining, inference and prediction. Springer, 2nd edition. In the course outline and class schedule, the textbook is abbreviated "HTF", often followed by chapter or page references "Ch. X-Y" or "pp. X-Y", respectively. The BibTeX entry for the book is as follows:
@book{HTF2009,
author = {Hastie, Trevor and Tibshirani, Robert and Friedman, Jerome},
title = {The elements of statistical learning: data mining, inference and prediction},
url = {http://www-stat.stanford.edu/~tibs/ElemStatLearn/},
publisher = {Springer},
year = 2009,
edition = 2
}
The wide margins of the linked PDF version of the book make it difficult to read on smart devices (e.g., an iPhone). The margins may be removed using the following Ghostscript command in Linux, where "output.pdf" and "input.pdf" should be substituted for the appropriate file names. Please see Dr. Shotwell for help with this.
gs -o output.pdf -sDEVICE=pdfwrite -c "[/CropBox [130 140 460 685] /PAGES pdfmark" -f input.pdf
Other Resources
Course Topics
- Overview of Supervised Learning and Review of Linear Methods: HTF Ch. 2-4
- Splines and Kernel Methods: HTF Ch. 5-6
- Model Assessment, Selection, and Inference: HTF Ch. 7-8
- Neural Networks: HTF Ch. 11
- Support Vector Machines: HTF Ch. 12
- Unsupervised Learning: HTF Ch. 14
Other Information
- Unless otherwise stated, assigned homework is due in one week.
- Students are encouraged to work together on homework problems, but they must turn in their own write-ups.
- Class participation is encouraged.
- Please bring a laptop to class.
Grading
- Homework: 40%
- Take-home Midterm Exam: 30%
- Take-home Final Exam: 30%
Schedule of Topics
Date | Reading (before class) | Homework | Topic/Content | Presentation
--- | --- | --- | --- | ---
Tue. 1/26 | none | none | Syllabus, introduction | Intro.pdf
Thu. 1/28 | HTF Ch. 1 and Ch. 2.1, 2.2, and 2.3 | See below: Thu. 1/28 | Least-squares, nearest-neighbors | lecture-1.pdf, mixture-data-lin-knn.R
Tue. 2/2 | none | none | Least-squares, nearest-neighbors code | mixture-data-lin-knn.R
Thu. 2/4 | HTF Ch. 2.4 | none | Decision theory | lecture-2.pdf
Tue. 2/9 | none | See below: Tue. 2/9 | Loss functions in practice | lecture-2a.pdf, prostate-data-lin.R
Thu. 2/11 | HTF Ch. 2.7, 2.8, and 2.9 | none | Structured regression | lecture-3.pdf, ex-1.R, ex-2.R, ex-3.R
Tue. 2/16 | HTF Ch. 3.1, 3.2, 3.3, 3.4 | none | Linear methods, subset selection, ridge, and lasso | lecture-4a.pdf, linear-regression-examples.R, lecture-5.pdf, lasso-example.R
Thu. 2/18 | none | See below: Thu. 2/18 | No class; reading day focused on linear methods for regression | Suggested supplemental reading: HTF Ch. 3.6, 3.7, 3.8, and 3.9. Suggested supplemental exercises: Ex. 3.12, 3.18
Tue. 2/23 | none | none | Linear methods, subset selection, ridge, and lasso (cont.) | lecture-5.pdf, lasso-example.R
Thu. 2/25 | HTF Ch. 3.5 and 3.6 | none | Linear methods: principal components regression | lecture-6.pdf, pca-regression-example.R, lec7.pdf, lec8.pdf, pca-and-g-inverses.html
Tue. 3/2 | HTF Ch. 4.1, 4.2, and 4.3 | See below: Tue. 3/2 | Linear methods: linear discriminant analysis | lecture-8.pdf, simple-LDA-3D.R
Thu. 3/4 | HTF Ch. 5.1 and 5.2 | none | Basis expansions: piecewise polynomials & splines | lecture-11.pdf, splines-example.R, mixture-data-complete.R
Tue. 3/9 | HTF Ch. 6.1-6.5 | none | Kernel methods | lecture-13.pdf, mixture-data-knn-local-kde.R, kernel-methods-examples-mcycle.R
Thu. 3/11 | HTF Ch. 7.1, 7.2, 7.3, 7.4 | See below: Thu. 3/11 | Model assessment: Cp, AIC, BIC | lecture-14.pdf, effective-df-aic-bic-mcycle.R
Tue. 3/16 | HTF Ch. 7.10 | none | Cross-validation | lecture-15.pdf, kNN-CV.R, Income2.csv
Thu. 3/18 | none | none | Midterm review | none
Tue. 3/23 | HTF Ch. 9.2 | none | Classification and regression trees | lecture-21.pdf, mixture-data-rpart.R
Thu. 3/25 | HTF Ch. 8.7, 8.8, 8.9 | none | Bagging | lecture-18.pdf, mixture-data-rpart-bagging.R, nonlinear-bagging.html
Tue. 3/30 | HTF Ch. 15.1, 15.2 | See below: Tue. 3/30 | Random forest | lecture-25.pdf, random-forest-example.R
Thu. 4/1 | HTF Ch. 10.1 | none | Boosting and AdaBoost.M1 (part 1) | lecture-22.pdf, boosting-trees.R
Tue. 4/6 | HTF Ch. 10.2-10.9 | Work through this nice GBM tutorial | Boosting and AdaBoost.M1 (part 2) | lecture-23.pdf
Thu. 4/8 | HTF Ch. 10.10, 10.13 | none | Boosting and AdaBoost.M1 (part 3) | lecture-24.pdf, gradient-boosting-example.R
Tue. 4/13 | HTF Ch. 11.1, 11.2, 11.3, 11.4, 11.5 | none | Introduction to neural networks | lecture-31.pdf, nnet.R
Thu. 4/15 | HTF Ch. 11.1, 11.2, 11.3, 11.4, 11.5 | See below: Thu. 4/15 | Introduction to neural networks (cont.) | lecture-31.pdf, nnet.R
Homework/Laboratory (other than problems listed in HTF)
Thu. 1/28
Using the RMarkdown/knitr/github mechanism, implement the following tasks by extending the example R script mixture-data-lin-knn.R (a sketch follows this list):
- Paste the code from the mixture-data-lin-knn.R file into the homework template Knitr document.
- Read the help file for R's built-in linear regression function lm.
- Re-write the functions fit_lc and predict_lc using lm and the associated predict method for lm objects.
- Consider making the linear classifier more flexible by adding squared terms for x1 and x2 to the linear model.
- Describe how this more flexible model affects the bias-variance tradeoff.
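Below is a minimal sketch of the lm-based rewrite, assuming the mixture data are available as a two-column matrix x and a 0/1 outcome y, as in mixture-data-lin-knn.R; the exact signatures of fit_lc and predict_lc in the course script may differ.
## fit the linear classifier by regressing the 0/1 outcome on x1 and x2
fit_lc <- function(y, x) {
  dat <- data.frame(y = y, x1 = x[, 1], x2 = x[, 2])
  lm(y ~ x1 + x2, data = dat)
}
## predict at new inputs using the predict method for lm objects
predict_lc <- function(fit, x) {
  predict(fit, newdata = data.frame(x1 = x[, 1], x2 = x[, 2]))
}
## more flexible variant with squared terms: lower bias, higher variance
fit_lc2 <- function(y, x) {
  dat <- data.frame(y = y, x1 = x[, 1], x2 = x[, 2])
  lm(y ~ x1 + x2 + I(x1^2) + I(x2^2), data = dat)
}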
Tue. 2/9
Using the RMarkdown/knitr/github mechanism, implement the following tasks by extending the example R script prostate-data-lin.R (a sketch of the loss functions follows this list):
- Write functions that implement the L1 loss and tilted absolute loss functions.
- Create a figure that shows lpsa (x-axis) versus lcavol (y-axis). Add and label (using the 'legend' function) the linear model predictors associated with L2 loss, L1 loss, and tilted absolute value loss for tau = 0.25 and 0.75.
- Write functions to fit and predict from a simple nonlinear model with three parameters defined by 'beta[1] + beta[2]*exp(-beta[3]*x)'. Hint: make copies of 'fit_lin' and 'predict_lin' and modify them to fit the nonlinear model. Use c(-1.0, 0.0, -0.3) as 'beta_init'.
- Create a figure that shows lpsa (x-axis) versus lcavol (y-axis). Add and label (using the 'legend' function) the nonlinear model predictors associated with L2 loss, L1 loss, and tilted absolute value loss for tau = 0.25 and 0.75.
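A minimal sketch of the two loss functions and a tilted-loss fit, assuming vectors x and y from the prostate data as in prostate-data-lin.R; the function names here are illustrative, not from the course script.
## L1 (absolute) loss
L1_loss <- function(y, yhat) abs(y - yhat)
## tilted absolute (quantile) loss with tilt parameter tau in (0, 1)
tilted_abs_loss <- function(y, yhat, tau) {
  d <- y - yhat
  ifelse(d > 0, tau * d, (tau - 1) * d)
}
## fit a linear predictor by minimizing mean tilted loss, e.g., tau = 0.75
fit_tilted <- function(x, y, tau, beta_init = c(0, 0)) {
  err <- function(beta) mean(tilted_abs_loss(y, beta[1] + beta[2] * x, tau))
  optim(beta_init, err)$par
}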
Thu. 2/18
Using the RMarkdown/knitr/github mechanism, implement the following tasks (a sketch of the ridge regression steps follows this list):
- Use the prostate cancer data.
- Use the cor function to reproduce the correlations listed in HTF Table 3.1, page 50.
- Treat lcavol as the outcome, and use all other variables in the data set as predictors.
- With the training subset of the prostate data, train a least-squares regression model with all predictors using the lm function.
- Use the testing subset to compute the test error (average squared-error loss) using the fitted least-squares regression model.
- Train a ridge regression model using the glmnet function, and tune the value of lambda (i.e., use guess and check to find the value of lambda that approximately minimizes the test error).
- Create a figure that shows the training and test error associated with ridge regression as a function of lambda.
- Create a path diagram of the ridge regression analysis, similar to HTF Figure 3.8.
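A minimal sketch of the glmnet steps, assuming the prostate data are loaded as a data frame prostate with a logical train indicator, as in the version distributed with HTF; variable names are illustrative.
library(glmnet)
## split into training and testing subsets, dropping the indicator column
data_train <- subset(prostate, train, select = -train)
data_test <- subset(prostate, !train, select = -train)
x_train <- as.matrix(subset(data_train, select = -lcavol))
x_test <- as.matrix(subset(data_test, select = -lcavol))
## alpha = 0 selects the ridge penalty; vary lambda by guess and check
fit <- glmnet(x_train, data_train$lcavol, alpha = 0, lambda = 0.05)
## test error: average squared-error loss on the held-out subset
pred <- predict(fit, newx = x_test)
mean((data_test$lcavol - pred)^2)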
Tue. 3/2
Goal: Understand and implement reduced-rank LDA in R. This homework introduces new material that we will not cover in class.
Using the RMarkdown/knitr/github mechanism, implement the following tasks (a sketch of the decomposition steps follows this list):
- Retrieve the vowel data (training and testing) from the HTF website or R package.
- Review HTF section 4.3.3 and (optionally) LA Examples and example.R.
- Implement reduced-rank LDA using the vowel training data. Check your work by plotting the first two discriminant variables as in HTF Figure 4.4. Hint: Center the 10 training predictors before implementing LDA; see the built-in R function 'scale'. The singular value or eigen decompositions may be computed using the built-in R functions 'svd' or 'eigen', respectively.
- Use the vowel testing data to estimate the expected prediction error (assuming zero-one loss), varying the number of canonical variables used for classification.
- Plot the EPE as a function of the number of discriminant variables, and compare this with HTF Figure 4.10.
- (Optional) Reproduce HTF Figure 4.11. Note: The reproduction need not be exact. However, the information content should be preserved.
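A minimal sketch of the centering and decomposition steps, assuming the vowel training data are loaded as a data frame vowel_train with outcome column y followed by predictors x.1 through x.10; this outline uses unweighted class means, which is reasonable here because the vowel classes are balanced.
## center the predictors; compute class means and the pooled
## within-class covariance
x <- scale(as.matrix(vowel_train[, -1]), center = TRUE, scale = FALSE)
y <- vowel_train$y
mu <- apply(x, 2, function(col) tapply(col, y, mean))
W <- Reduce(`+`, lapply(split(as.data.frame(x), y), function(g) {
  gc <- scale(as.matrix(g), center = TRUE, scale = FALSE)
  t(gc) %*% gc
})) / (nrow(x) - length(unique(y)))
## sphere the class means using the eigendecomposition of W, then take
## the SVD of the centered, sphered means for the discriminant directions
eW <- eigen(W)
W_inv_sqrt <- eW$vectors %*% diag(1 / sqrt(eW$values)) %*% t(eW$vectors)
mu_sphered <- scale(mu %*% W_inv_sqrt, center = TRUE, scale = FALSE)
sv <- svd(mu_sphered)
discrim <- W_inv_sqrt %*% sv$v  ## columns are discriminant directions
## first two discriminant (canonical) variables, as in HTF Figure 4.4
z <- x %*% discrim[, 1:2]
plot(z, col = y, xlab = "Coordinate 1", ylab = "Coordinate 2")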
Thu. 3/11
- Complete HTF exercises 7.4 and 7.6
- This homework should be submitted using the Github mechanism. However, you may complete the homework on paper and upload a scanned image, or use LaTeX-style markup in an RMarkdown document.
Tue. 3/30
- Complete HTF Ex. 15.4.
Thu. 4/15
Goal: Get started using Keras to construct simple neural networks.
- Work through the "Image Classification" tutorial on the RStudio Keras website.
- Use the Keras library to re-implement the simple neural network discussed during lecture for the mixture data (see nnet.R). Use a single, fully connected hidden layer with 10 nodes (a sketch follows this list).
- Create a figure to illustrate that the predictions are (or are not) similar using the 'nnet' function versus the Keras model.
- (optional extra credit) Convert the neural network described in the "Image Classification" tutorial to a network that is similar to one of the convolutional networks described during lecture on 4/15 (i.e., Net-3, Net-4, or Net-5) and also described in HTF section 11.7. See the ConvNet tutorial on the RStudio Keras website.
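A minimal sketch of the Keras re-implementation, assuming the R keras package is installed and the mixture data are available as a two-column matrix x with 0/1 labels y; the optimizer and training settings are illustrative, and the sigmoid hidden activation mirrors the logistic units used by 'nnet'.
library(keras)
## single fully connected hidden layer with 10 nodes
model <- keras_model_sequential() %>%
  layer_dense(units = 10, activation = "sigmoid", input_shape = 2) %>%
  layer_dense(units = 1, activation = "sigmoid")
model %>% compile(
  optimizer = "rmsprop",
  loss = "binary_crossentropy",
  metrics = "accuracy"
)
model %>% fit(x, y, epochs = 50, batch_size = 32)
## predicted probabilities, for comparison with the 'nnet' fit
p_keras <- predict(model, x)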
Links
RStudio/Knitr