Binary Response, Random Sample of 1000 Patients from the SUPPORT Study, Missing Data

Analyze the support dataset, available at http://biostat.mc.vanderbilt.edu/twiki/pub/Main/DataSets/support.sav as an R save file (it can also be downloaded and loaded with the Hmisc getHdata function), to develop a model that predicts the probability that a patient dies in the hospital. Consider the following predictors: age, sex, dzgroup, num.co, scoma, race, meanbp, hrt, temp, pafi, alb. As part of your analysis, do the following:

Make a single chart showing proportions of deaths stratified by each of the other variables listed above
Characterize patterns of missing values in the predictors by plotting missingness tendencies of single predictors and jointly of two predictors at a time, and by using recursive partitioning to determine what kind of patients tended to have a higher proportion of missing measurements for the predictor that is missing most often
Impute missing lab data using "most normal" values; impute race using the most frequent category (hint: see the Hmisc impute function)
Initially estimate marginal relationships between continuous predictors and outcome using a nonparametric smoother
Use marginal potential predictive discrimination of predictors to decide on how to spend degrees of freedom
Fit a multivariable model with minimal observations deleted due to NAs
Test partial effects of all predictors
Graphically interpret the model three distinct ways
Validate the model for discrimination and calibration ability
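
One possible starting point for the steps above is sketched below in R. It is a rough outline under stated assumptions, not a reference solution. It assumes that the outcome variable is named hospdead, that the Hmisc, rms, and rpart packages are installed, and that the data can be downloaded. The "most normal" fill-in values (3.5 for alb, 333.3 for pafi) and the spline terms in the model are placeholders chosen for illustration; replace them with choices justified by your own analysis.

```r
library(Hmisc)   # getHdata, impute, naclus, naplot, spearman2
library(rms)     # datadist, lrm, rcs, validate, calibrate
library(rpart)   # recursive partitioning

getHdata(support)   # downloads the data and creates the data frame 'support'
preds <- c("age", "sex", "dzgroup", "num.co", "scoma", "race",
           "meanbp", "hrt", "temp", "pafi", "alb")
d <- support[, c("hospdead", preds)]   # assumption: outcome is 'hospdead'
form <- reformulate(preds, response = "hospdead")

# Single chart of death proportions stratified by each predictor
plot(summary(form, data = d))

# Missingness patterns: per variable, and joint clustering of NAs
na.patterns <- naclus(d)
naplot(na.patterns, which = "na per var")
plot(na.patterns)

# Recursive partitioning on the predictor that is missing most often
n.miss <- sapply(d[preds], function(x) sum(is.na(x)))
worst  <- names(which.max(n.miss))
d$miss <- as.integer(is.na(d[[worst]]))
tree   <- rpart(reformulate(setdiff(preds, worst), response = "miss"),
                data = d)
print(tree)
d$miss <- NULL

# Simple imputation; the numeric values are assumed "normal" defaults
d$alb  <- impute(d$alb, 3.5)
d$pafi <- impute(d$pafi, 333.3)
d$race <- impute(d$race, names(which.max(table(d$race))))  # most frequent

# Marginal predictive potential, to guide degrees-of-freedom allocation
plot(spearman2(form, data = d, p = 2))

# Multivariable logistic model; the spline terms are placeholders only
dd <- datadist(d)
options(datadist = "dd")
f <- lrm(hospdead ~ rcs(age, 4) + sex + dzgroup + num.co + scoma + race +
           rcs(meanbp, 4) + rcs(hrt, 4) + rcs(temp, 4) + pafi + alb,
         data = d, x = TRUE, y = TRUE)
anova(f)   # partial (Wald) tests for each predictor

# Bootstrap validation of discrimination and calibration
set.seed(1)
validate(f, B = 200)
plot(calibrate(f, B = 200))
```

The sketch leaves out the nonparametric smoothing of continuous predictors and the three graphical interpretations of the model; those steps depend on choices that are part of the exercise.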

Copyright © 2013-2022 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.