Binary Response, Random Sample of 1000 Patients from the SUPPORT Study, Missing Data

Analyze the support dataset to develop a model that predicts the probability that a patient dies in the hospital. The data are available at http://biostat.mc.vanderbilt.edu/twiki/pub/Main/DataSets/support.sav as an R save file, which can also be downloaded and loaded using the Hmisc getHdata function. Consider the following predictors: age, sex, dzgroup, num.co, scoma, race, meanbp, hrt, temp, pafi, alb. As part of your analysis, do the following:
  1. Make a single chart showing proportions of deaths stratified by each of the other variables listed above
  2. Characterize patterns of missing values in the predictors: plot the missingness tendencies of single predictors and of pairs of predictors jointly, and use recursive partitioning to determine what kinds of patients tended to have a higher proportion of missing measurements for the predictor that is missing most often
  3. Impute missing lab data using "most normal" values; impute race using the most frequent category (hint: see the Hmisc impute function)
  4. Initially estimate marginal relationships between continuous predictors and outcome using a nonparametric smoother
  5. Use the marginal potential predictive discrimination of the predictors to decide how to spend degrees of freedom
  6. Fit a multivariable model with as few observations as possible deleted due to NAs
  7. Test partial effects of all predictors
  8. Interpret the model graphically in three distinct ways
  9. Validate the model for discrimination and calibration ability
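
One possible outline of the steps above in R is sketched below. It is a starting point under stated assumptions, not a reference solution. It assumes that the Hmisc, rms, and rpart packages are installed, that getHdata can reach the data repository, that the outcome variable is named hospdead, that race is stored as a factor, and that pafi and alb are the only lab predictors with missing values. The fill-in values 333.3 (pafi) and 3.5 (alb) are the "normal" values commonly used with SUPPORT and should be checked against the dataset documentation. The knot counts in the model are placeholders; the actual allocation should follow from step 5.

```r
library(Hmisc)
library(rms)
library(rpart)

getHdata(support)                    # creates the data frame 'support'
preds <- c("age", "sex", "dzgroup", "num.co", "scoma", "race",
           "meanbp", "hrt", "temp", "pafi", "alb")
form  <- as.formula(paste("hospdead ~", paste(preds, collapse = " + ")))

# 1. One chart of death proportions stratified by each predictor
#    (continuous predictors are grouped into quantile intervals)
plot(summary(form, data = support))

# 2. Missingness: per-variable fractions, joint clustering, and a tree
nac <- naclus(support[, preds])
naplot(nac, which = "na per var")    # fraction of NAs for each predictor
plot(nac)                            # predictors that tend to be missing together
na.frac <- sapply(support[, preds], function(x) mean(is.na(x)))
worst   <- names(which.max(na.frac)) # predictor missing most often
others  <- c("age", "sex", "dzgroup", "num.co", "scoma", "meanbp", "hrt", "temp")
tree <- rpart(as.formula(paste0("is.na(", worst, ") ~ ",
                                paste(others, collapse = " + "))),
              data = support, minbucket = 15)
print(tree)
if (nrow(tree$frame) > 1) { plot(tree, margin = 0.1); text(tree) }

# 3. Impute labs with assumed "most normal" values; race with its modal level
support <- transform(support,
                     pafi.i = impute(pafi, 333.3),  # assumed normal value
                     alb.i  = impute(alb, 3.5),     # assumed normal value
                     race.i = impute(race))         # most frequent category for a factor
dd <- datadist(support); options(datadist = "dd")

# 4. Marginal nonparametric (lowess) estimates of P(death) vs. continuous predictors
par(mfrow = c(2, 4))
for (v in c("age", "num.co", "scoma", "meanbp", "hrt", "temp", "pafi", "alb")) {
  ok <- !is.na(support[[v]])
  plsmo(support[[v]][ok], support$hospdead[ok], xlab = v,
        ylab = "P(hospital death)", datadensity = TRUE)
}
par(mfrow = c(1, 1))

# 5. Generalized Spearman rho^2 (quadratic in rank) to guide d.f. allocation
plot(spearman2(form, data = support, p = 2))

# 6. Multivariable logistic model; knot counts here are placeholders
fit <- lrm(hospdead ~ rcs(age, 3) + sex + dzgroup + num.co + scoma + race.i +
             rcs(meanbp, 4) + rcs(hrt, 3) + rcs(temp, 3) +
             rcs(pafi.i, 3) + rcs(alb.i, 3),
           data = support, x = TRUE, y = TRUE)

# 7. Wald tests of partial effects
print(anova(fit))

# 8. Three graphical interpretations
print(ggplot(Predict(fit, fun = plogis)))   # partial effects on the probability scale
plot(summary(fit))                          # odds ratio chart
plot(nomogram(fit, fun = plogis, funlabel = "P(hospital death)"))

# 9. Bootstrap validation of discrimination and calibration
set.seed(1)
print(validate(fit, B = 200))
plot(calibrate(fit, B = 200))
```

Imputing before calling datadist lets the imputed variables pick up default plotting ranges, and fitting with x = TRUE, y = TRUE is what allows validate and calibrate to resample the model.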
