Numbers to the right of topics indicate sequential lecture numbers.
Hn stands for Harrell Chapter n in the book's second edition. Ln stands for lecture n.
Introduction (H1, L1)
Course overview and logistics
Course philosophy
Hypothesis testing vs. estimation vs. prediction
Examples of multivariable prediction problems
Misunderstandings about classification vs. prediction (read this also)
Study planning considerations
Choice of model
Model uncertainty/data-driven model selection/phantom d.f.
General methods for multivariable models (H2, L2)
Notation for general regression models
Model formulations
Interpreting model parameters
nominal predictors
interactions
Review of chunk tests
Relaxing linearity assumption for continuous predictors
avoiding categorization
nonparametric smoothing
simple nonlinear terms (L3)
splines for estimating shape of regression function and determining predictor transformations
cubic spline functions
restricted cubic splines
nonparametric regression (smoothers)
advantages of splines over other methods
recursive partitioning and tree models in a nutshell
New directions in predictive modeling (L4)
Tests of association
Grambsch and O'Brien paper
Assessment of model fit
regression assumptions
modeling and testing complex interactions
interactions to prespecify
distributional assumptions
Missing data (H3, L5)
Types of missing data
Prelude to modeling
Missing values for different types of response variables
Problems with alternatives to imputation
Strategies for developing imputation models
Single imputation
Predictive mean matching
Multiple imputation
The aregImpute algorithm (L6)
Diagnostics
Summary and rough guidelines; effective sample size
Multivariable modeling strategy (H4)
Pre-specification of predictor complexity
Variable selection
Sample size, overfitting, and number of predictors (L7)
Shrinkage
Collinearity
Data reduction
Overly influential observations (L8)
Comparing two models
Improving the practice of multivariable prediction
Overall modeling strategies
Bootstrap, Validating, Describing, and Simplifying the Model (H5, L9)
Describing the fitted model
Bootstrap
Model validation
Bootstrapping ranks of predictors (L10)
Simplifying the model by approximating it
How do we break bad habits?
R Multivariable Modeling/Validation/Presentation Software (H6, BBR9)
Case Study in Longitudinal Data Modeling with Generalized Least Squares (H7, L11)
Notation and model for mean time-response profile
Keeping baseline variables as baseline
Modeling within-subject dependence
Overview of competing methods for serial data
Checking model fit
Software
Case study from a randomized trial
Case study in data reduction (H8, L12)
How many parameters can be estimated?
Redundancy analysis
Variable clustering
Transformation/scaling of variables using transcan
Principal components Cox regression
Sparse principal components
Nonparametric transform-both-sides regression for transforming/scaling variables
Maximum Likelihood Estimation (H9, L13)
Three test statistics
Robust covariance matrix estimator
Correcting variances for clustered or serial data using sandwich and bootstrap estimators
Confidence regions
Wald (large-sample normal approximation)
Bootstrap
Simultaneous (normal approximation)
General contrasts through differences in linear predictor
Further use of the log likelihood
Weighted MLE
Penalized MLE
Effective d.f.
Binary Logistic Model (H10, L15)
Model
Odds ratios, risk ratios, and risk differences
Detailed example
Estimation
Test statistics
Residuals
Assessment of model fit
Quantifying predictive ability
Validating the model
Describing fitted models
R functions
Binary Logistic Case Study 1 (H11, L16)
Binary Logistic Case Study 2 (H12, L17)
Ordinal Logistic Models (H13, L18)
Ordinality assumption
PO Model
Model
Assumptions, interpretations of parameters, estimation, residuals
Assessment of fit
Predictive ability measures
Describing the model
Validation
R functions
CR Model
Model
Assumptions, interpretation of parameters, estimation, residuals
Assessment of fit
Extended CR model including penalization
Validation
R functions
Ordinal Logistic Regression Case Study (H14, L19)
Case Study in Ordinal Regression for Continuous Univariate Y (H15, L21-22)
No transformation satisfying all linear model assumptions exists for the dataset
Assumptions of the proportional odds ordinal logistic model (semiparametric model) are not satisfied
Development and validation of a quantile regression model for median glycohemoglobin
Failure of linear multiple regression
Failure of proportional odds model for continuous gh