Health services research, diagnosis, and prognosis


Current Notes (2022)

2022 June 27

Shawniqua Williams Roberson (Leanne Boehm), Neurology

Q: Are there distinct subtypes of post-intensive care syndrome (PICS) with different cognitive, psychological, socioeconomic and physical profiles? If so, what are the demographic and clinical risk factors for these subtypes? We envision this project as a post-hoc analysis of data collected as part of the BRAIN-ICU and MIND-ICU observational cohort studies. We aim to aggregate the data from the two studies and use unsupervised clustering to identify constellations of PICS-related symptoms experienced by study participants at 3 and 12 months after discharge. Mentor confirmed.

Clinic Notes:
  • Post-hoc analysis of data from observational studies on post-ICU outcomes. Sample size: 564 at 3 months and 471 at 12 months for the combined MIND-ICU and BRAIN-ICU cohorts.
  • Aims of the proposed study: 1. examine the degree to which PICS-related deficits occur in clusters (i.e. PICS phenotypes). 2. Identify potential clinical and demographic risk factors for these clusters.
  • Questions: 1. Ways to get biostatistics help. 2. Statistical approaches for the two aims.
  • Recommendations:
    • Ways of getting biostatistics support: 1. VICTR voucher (the project needs to be translational research). 2. The delirium group on campus (Rameela Raman). 3. Working with a graduate student (for a one-off collaboration) in the Data Science Institute (DSI) or the Biostatistics department. For Biostatistics students, get in touch with the DGS of the Biostatistics program (Robert Greevy and Ben French). For DSI students, get in touch with people who have a direct connection with students.
    • Statistical approaches: Aim 1: K-means clustering is a good place to start. The important thing is to assess the stability of the clusters afterwards; bootstrapping the clustering is a good way to do so. Aim 2: multinomial logistic regression is a good option. Random forest or support vector machine (SVM) would also work.
    • Possible VICTR voucher. Application website (https://starbrite.app.vumc.org/) and research proposal template (https://starbrite.app.vumc.org/funding/templatesforms/).
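The Aim 1 suggestion (cluster, then check stability by bootstrapping) can be sketched as follows. This is a minimal illustration in plain Python, not the clinic's code. The data are synthetic, the farthest-point starting centers are a simplification chosen to keep the sketch deterministic, and the function names are hypothetical. Each cluster's stability is the average best Jaccard overlap between that cluster and the clusters found in each bootstrap resample.

```python
import random

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, n_iter=25):
    """Lloyd's k-means; returns one cluster label per point."""
    # Farthest-point start: a simplification that keeps the sketch deterministic.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = []
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def bootstrap_stability(points, k, n_boot=50, seed=0):
    """Mean best-match Jaccard similarity of each original cluster across resamples."""
    rng = random.Random(seed)
    n = len(points)
    base = kmeans(points, k)
    base_sets = [{i for i in range(n) if base[i] == c} for c in range(k)]
    totals = [0.0] * k
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        boot = kmeans([points[i] for i in idx], k)
        drawn = set(idx)
        boot_sets = [{i for i, lab in zip(idx, boot) if lab == c} for c in range(k)]
        for c in range(k):
            ref = base_sets[c] & drawn  # original cluster, restricted to drawn points
            totals[c] += max(
                len(ref & b) / len(ref | b) if ref | b else 0.0 for b in boot_sets
            )
    return [t / n_boot for t in totals]

# Synthetic, well-separated data (hypothetical; stands in for symptom scores).
rng = random.Random(1)
group_a = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(20)]
group_b = [(rng.gauss(10, 0.5), rng.gauss(10, 0.5)) for _ in range(20)]
stability = bootstrap_stability(group_a + group_b, k=2)
print(stability)
```

Values near 1 suggest a cluster that reappears in most resamples; values well below that suggest the cluster may be an artifact of the particular sample.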

2022 June 13

Whitney Barnett (Kathryn Humphreys), Psychology and Human Development

Help with a stats review for a VICTR resource request. The proposed project aims to use repeated measures of multiple inflammatory markers to 1) chart trajectories across pregnancy and 2) explore whether changes in inflammatory markers are associated with changes in reward responsiveness and depressive symptoms. The reviewer comment is "The response did not address checking of the assumed correlation pattern (compound symmetry?). And there is a misunderstanding of how preliminary data are used. They are used to show feasibility and to estimate either prevalence (if response is binary) or variability (if it is continuous) for use in a power/sample size calculation for another study. Power calculations never use effect sizes that were observed. They involve only the effect one would not want to miss." We are not quite sure how to respond. Mentor confirmed.

Clinic Notes:
  • Focus on brain and mood changes during pregnancy. Markers of inflammation. N=25, with 3 measurements per participant.
  • The reviewer's questions relate to what you will pull from the data and how you will use it. The data are correlated (multiple measurements per person), so you need to understand how to model those correlations and describe the assumed correlation structure.
  • Recommendations:
    • Check the assumptions about the correlation matrix in the analysis plan (e.g., inspect the correlation matrix and adjust accordingly).
    • Use the pilot study to look at distribution of data and the "noise".
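One way to "inspect the correlation matrix" is to compute the pairwise correlations between visits and see whether they look roughly equal (as compound symmetry assumes) or fall off as visits get further apart. The sketch below is only an illustration: the marker values are made up, and `correlation_matrix` is a hypothetical helper name.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def correlation_matrix(wide):
    """wide: one row per participant, one column per visit."""
    cols = list(zip(*wide))
    return [[pearson(a, b) for b in cols] for a in cols]

# Made-up marker values for 6 participants at 3 visits (hypothetical data).
wide = [
    (2.1, 2.4, 2.9),
    (3.0, 3.1, 3.8),
    (1.5, 1.9, 2.0),
    (4.2, 4.0, 4.9),
    (2.8, 3.3, 3.1),
    (3.6, 3.9, 4.4),
]
corr = correlation_matrix(wide)
off_diagonal = [corr[0][1], corr[0][2], corr[1][2]]
# Compound symmetry implies these three values are about the same.
print([round(r, 2) for r in off_diagonal])
```

With only a few participants these estimates are noisy, so the pattern is a rough guide for the analysis plan rather than a formal test.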

2022 May 02

Scott Miller (Aaron Yang), Physical Medicine & Rehabilitation

We are looking at the relationship between knee and spine surgery. I have data pulled regarding the surgeries but need help and direction as to the biostatistics section with regards to timing of procedures, which procedures happened first. Mentor confirmed.

Clinic Notes:
  • Looking at the relationship between knee and back pathology. Applied for a VICTR grant previously. Matched 1:2 cases (knee osteoarthritis (OA) with replacement) with controls (knee OA without replacement). Aims: 1) prevalence of low back pain in patients with knee OA necessitating total knee arthroplasty (TKA) compared to matched controls; 2) temporal relationship between diagnosis of low back pain and TKA; 3) determine the rate at which patients undergoing TKA also undergo lumbar spine surgery compared with demographically matched controls.
  • Could have missing data when patients have surgeries elsewhere or visit VUMC for the first time (missing date of onset).
  • Recommendations:
    • Focus on patients who are going to be captured in the EHR for both conditions so that the conclusions from the study will be internally consistent. This is needed to ensure quality matches and to be able to respond to reviewer criticism.

2022 April 04

Kayla Anderson (Yolanda McDonald), Human and Organizational Development

Return after first attending the clinic on 3/7/2022. Prior description: We are comparing regression methods, such as lasso regression and binomial logit regression, for analyzing weighted survey data. We have questions about how to best implement these methods and whether or not lasso regression can be used with weighted survey data. We have already conducted both methods of regression (lasso and binomial logit) with unweighted data. Questions for this clinic visit: We have questions around standardizing variables and using factor analysis to create latent variables. Mentor confirmed.

Clinic Notes:
  • Looking for ways to create indices for questions that are not on the same scale.
  • Recommendations:
    • Combining variables into one index imposes structure on the data. Bayesian analysis allows you to incorporate outside structure while still performing regression. Put all related variables in as covariates and use the coefficients as weights to combine the questions into a single score. Could incorporate shrinkage in the prior distribution, which protects from overfitting. Use shrinkage to generalize to the out-of-sample population.
    • Packages in R that are useful: rstanarm, rmsb. Use an ordinal regression model where the number of violations would be the outcome for the coefficients. Could incorporate sampling weights into the ordinal regression.

2022 March 21

Marshall Wallace, Division of Acute Care Surgery

Return after first attending the clinic on 2/28/22. Prior description of the project: We have developed and administered a 33-question survey which assesses provider perspectives, practice patterns and ethical considerations associated with discharge planning for victims of violence. This survey contains ~10 demographic questions, a set of 5 Likert questions which are repeated three times assessing provider responses to three unique clinical vignettes, followed by ~10 separate Likert-style questions. This survey has been administered to VUMC ED and Trauma physicians, with ~85 responses gathered so far. We would like to perform descriptive statistics on the data obtained. First, we would like to describe the variation in responses to the three clinical vignettes. Second, we would like to assess how covariates such as demographics and responses to the ~10 separate questions relate to trends in responses to the clinical vignettes. Questions for this clinic visit: I am working with R and Excel to perform the recommended statistics from my last clinic on 2/28. I would like help with this in R, help with interpretation of my results and help with determining how to describe this in the methods section of a write-up.

Clinic Notes:
  • Developed bar charts after last time's clinic visit.
  • Recommendations:
    • P-values should be used to supplement the bar charts.
    • Question: is there a way to summarize the difference between scenarios for each survey question? The answers one person gives to the same question in different scenarios are correlated. An independence test shows differences in the distributions (bar charts). Could look at the correlation between scenarios, but that works best for two scenarios rather than three.
    • Understanding patterns among respondents.
    • If you assume that the difference between adjacent levels of the scale is the same, you can use tools like repeated measures ANOVA, calculating means, or a random effects model. Use the lmer function from the package lme4 in R. Compare models with and without the scenario variable.

2022 March 14

Ryan Hsi, Urology

I'm planning an RC2 grant submission for a multicenter prospective registry for kidney stone disease. Need input on study design, sample size/power, resources for de-identified data management/data structure, specimen tracking and storage.

Clinic Notes:
  • Kidney stone registry with longitudinal data. Collect data on outcomes such as kidney stone recurrence and kidney stone growth. Timepoints: baseline and follow-up every 6 months for 3 years. Looking at risk factors for progression of disease. Multiple centers with a sample size of 300-500 patients.
  • Questions:
    • 1) power calculation - since it is not hypothesis testing, don't let "power" be seen anywhere. Make the case that you would learn something or that it would change clinical practice. Wanting to estimate risk of an outcome or how long it takes for something to develop (prediction problems). Can look into an ordinal outcome.
    • 2) resources for data collection/management.
  • Recommendations:
    • Getting intensive data on a smaller sample of patients may be ideal but patients may not show up. Could consider random intervals for imaging. Could do 3 CTs per patient and randomize the time of the middle one.
    • Rough rule of thumb: number of clinical events divided by 15 is the number of factors you can look at.
    • Attend design/grant-writing studio to get multi-disciplinary feedback.
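The rule of thumb above is simple arithmetic, sketched here. The 30% recurrence proportion is a made-up planning value, not a figure from the clinic, and `max_candidate_factors` is a hypothetical helper name.

```python
def max_candidate_factors(n_events, events_per_factor=15):
    """Rough rule of thumb: number of clinical events divided by 15."""
    return n_events // events_per_factor

n_patients = 300            # lower end of the planned sample size
assumed_recurrence = 0.30   # hypothetical proportion, for illustration only
n_events = round(n_patients * assumed_recurrence)

n_factors = max_candidate_factors(n_events)
print(n_events, n_factors)
```

The limit is driven by the number of events, not the number of patients, so a rarer outcome supports fewer candidate risk factors for the same registry size.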

2022 March 7

Kayla Anderson (Yolanda McDonald), Human and Organizational Development

We are comparing regression methods, such as lasso regression and binomial logit regression, for analyzing weighted survey data. We have questions about how to best implement these methods and whether or not lasso regression can be used with weighted survey data. We have already conducted both methods of regression (lasso and binomial logit) with unweighted data. Mentor confirmed.

Clinic Notes:
  • Main question: could we use LASSO method on weighted survey data?
  • Recommendations:
    • Using your own data to select variables in LASSO will affect your coefficients and confidence intervals. If using LASSO, you might need to do 10-fold cross-validation, bootstrap, etc. to make sure you have good out-of-sample information, and then fit a normal regression. Look into relaxed LASSO. To incorporate weight information in LASSO, you can take the variables related to the sampling design and force them into the model as predictors.
    • Could also recreate the sample based on the weights, so that it reflects the sampling probabilities.
    • Backward/forward variable selection methods are known to be unstable, and they give standard errors (confidence intervals) that are too small. One way to look at the stability of the method is to bootstrap the data and repeat the selection.
    • In LASSO, one way to get stable results is the one-standard-error (1se) rule for choosing the penalty. You could also force variables to stay in the model.
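The 1se rule mentioned above picks the largest penalty whose cross-validated error is within one standard error of the minimum, which gives a sparser and usually more stable model. A minimal sketch, assuming the cross-validation means and standard errors have already been computed by whatever software fits the LASSO; the numbers and the function name `lambda_min_and_1se` are hypothetical:

```python
def lambda_min_and_1se(lambdas, cv_mean, cv_se):
    """Return (penalty with minimum CV error, largest penalty within 1 SE of it)."""
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    threshold = cv_mean[i_min] + cv_se[i_min]
    eligible = [lam for lam, err in zip(lambdas, cv_mean) if err <= threshold]
    return lambdas[i_min], max(eligible)

# Hypothetical cross-validation summary over a grid of penalties.
lambdas = [0.01, 0.05, 0.1, 0.5, 1.0]
cv_mean = [0.55, 0.50, 0.51, 0.515, 0.60]
cv_se = [0.03, 0.02, 0.02, 0.03, 0.04]

lam_min, lam_1se = lambda_min_and_1se(lambdas, cv_mean, cv_se)
print(lam_min, lam_1se)
```

Here the minimum error is at 0.05, but 0.5 is still within one standard error of it, so the 1se rule would choose the heavier penalty.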

2022 February 28

Bill Nobis, Department of Neurology

We intend to use the Vanderbilt BioVU database to perform whole exome sequencing (WES) of a population of epilepsy patients enriched for Sudden Unexpected Death in Epilepsy (SUDEP). After WES, we will look for biomarkers of SUDEP risk in patients, including genetic and clinical factors. We'd like guidance on the number of control samples to include for the appropriate power.

Clinic Notes:
  • BioVU and SD databases, will use ICD10 codes to create cohort. There is a total of 70 records to compare against control samples. Would like to do sequencing on those 70 patients. Looking at how certain genes are predictive for Sudden Unexpected Death in Epilepsy (SUDEP). Controls are 1000 epilepsy patients, representative of population at Vanderbilt.
  • Recommendations:

Marshall Wallace (Allan Peetz), Division of Acute Care Surgery

We have developed and administered a 33-question survey which assesses provider perspectives, practice patterns and ethical considerations associated with discharge planning for victims of violence. This survey contains ~10 demographic questions, a set of 5 Likert questions which are repeated three times assessing provider responses to three unique clinical vignettes, followed by ~10 separate Likert-style questions. This survey has been administered to VUMC ED and Trauma physicians, with ~85 responses gathered so far. We would like to perform descriptive statistics on the data obtained. First, we would like to describe the variation in responses to the three clinical vignettes. Second, we would like to assess how covariates such as demographics and responses to the ~10 separate questions relate to trends in responses to the clinical vignettes. Mentor confirmed.

Clinic Notes:
  • Survey of emergency and trauma providers. Ethics of discharge planning when sending someone to a place that might not be safe. 33-question survey, including demographic information. 3 clinical scenarios, same 5 questions. Additional ethical prompts at the end of the survey. Questions are typically on a 5-point Likert scale. 83 responses received so far. The question is whether different levels/types of physicians react/think differently.
  • Recommendations:
    • What analyses can we do to perform descriptive statistics? A lot are visual, need to get clear idea of what you want to present. Could do stacked bar charts for likert scale questions. Could do separate stacked bar charts for different groups to see the distribution differences. Could do color-coded matrix (5x5 table) based on answer to a particular question.
    • There are single number summaries for group differences (common odds ratio). But the assumptions may not make sense in this case.
    • Can also look at agreement statistics between groups.
    • Possible VICTR voucher for sequencing. Application website (https://starbrite.app.vumc.org/) and research proposal template (https://starbrite.app.vumc.org/funding/templatesforms/).

2022 February 21

Christine Kimpel (Alvin Jeffery), School of Nursing

As part of a VA Quality Scholars improvement project, Drs. Amy Guidera and Alvin Jeffery and I are planning a QI project to increase nursing's awareness and use of a new quality metrics dashboard at the VA. At baseline and follow-up, we are collecting baseline and survey data (e.g., Evidence-based practice beliefs) consisting of three 3-item scales. Our questions include: what statistical tests may be used to account for dropout? As we are unable to collect demographic data, what options are there for statistical adjustment (e.g., unit type)? There are surrogate markers from the All Employee Survey available at the unit level. How could we use those metrics? Mentor confirmed.

Clinic Notes:
  • Intervention is training on use of the dashboard. The outcome is a 9-question evidence-based practice survey, measured at baseline and 6-12 months later. There will be 3 summary outcomes. The hypothesis is that scores changed after training/over time. No randomization in the study; comparing pre and post scores. Potential sample size is 12 records. This is a pilot study.
  • Recommendations:
    • What are the most appropriate pre/post comparison tests (high employee turnover [21-25%], no demographics, 50% response rate)? A paired test has the advantage of being much more precise, because it looks at within-person variation, which tends to be smaller. A paired test can only include people who completed both the pre-test and the post-test. Without pairing, you assume that the pre-test respondents are representative of the entire population (i.e., the same type of group/respondents at both time points). Compute a paired non-parametric test on the differences.
    • The pilot study shows that the data are feasible to collect. At the next phase, could do a stepped-wedge design with randomization across units.
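One common paired non-parametric test on differences is the Wilcoxon signed-rank test. With a sample as small as the one described, its exact two-sided p-value can be obtained by enumerating every possible sign pattern. The sketch below assumes that test is the one intended; the scores are made up and the function name is hypothetical.

```python
from itertools import product

def wilcoxon_signed_rank_exact(pre, post):
    """Exact two-sided Wilcoxon signed-rank test; returns (W+, p-value)."""
    diffs = [b - a for a, b in zip(pre, post) if b != a]  # drop zero differences
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:  # assign average ranks to tied absolute differences
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for t in range(i, j + 1):
            ranks[order[t]] = (i + j) / 2 + 1
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    expected = sum(ranks) / 2
    observed = abs(w_plus - expected)
    extreme = 0
    for signs in product((0, 1), repeat=n):  # all 2**n sign assignments
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - expected) >= observed - 1e-12:
            extreme += 1
    return w_plus, extreme / 2 ** n

# Made-up summary scores for 6 people who completed both surveys.
pre = [10, 12, 9, 11, 13, 8]
post = [12, 15, 10, 15, 18, 14]
w_plus, p_value = wilcoxon_signed_rank_exact(pre, post)
print(w_plus, p_value)
```

Enumeration is practical only for small samples, since the number of sign patterns doubles with each additional pair.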

2022 February 14

Ivana Thompson, OB/GYN

We have designed a survey to assess clinical practices for managing delivery of the placenta after second trimester delivery. We would like to share our survey and get your input on how we should approach data analysis.

Clinic Notes:
  • Management of the third stage of labor (delivery of placenta). Survey sent to providers who deliver babies to get sense of how to manage particular stage of labor. Some questions will have ordinal outcome.
  • Recommendations:
    • To analyze ordinal/ranked data, could 1) list out all possible preference orderings and report the percent at each one (good if there is a dominant ordering); 2) think about it as an instant runoff: rank the preference order, count up how many first ranks there are for each possibility (can identify which is least liked, then reallocate votes until a preference gets more than 50% of votes); 3) use rank aggregation methods: take the ranks and put them into a single score, which allows you to summarize the question by reporting scores.
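Option 2 (instant runoff) can be sketched in a few lines. This is an illustrative implementation under stated assumptions: the option labels and ballots are made up, ties for last place are broken alphabetically (an arbitrary choice), and ballots with no remaining options are dropped from later rounds.

```python
from collections import Counter

def instant_runoff(ballots):
    """ballots: list of rankings, most preferred first. Returns (winner, rounds)."""
    remaining = {option for ballot in ballots for option in ballot}
    rounds = []
    while True:
        tally = Counter({option: 0 for option in remaining})
        for ballot in ballots:
            # Each ballot counts for its highest-ranked option still in the race.
            top = next((o for o in ballot if o in remaining), None)
            if top is not None:
                tally[top] += 1
        rounds.append(dict(tally))
        total = sum(tally.values())
        leader, votes = max(tally.items(), key=lambda kv: kv[1])
        if votes * 2 > total or len(remaining) == 1:
            return leader, rounds
        # Eliminate the least-liked option (alphabetical tie-break, an assumption).
        loser = min(sorted(remaining), key=lambda o: tally[o])
        remaining.remove(loser)

# Made-up rankings of three hypothetical management options A, B and C.
ballots = [["A", "B", "C"]] * 4 + [["B", "C", "A"]] * 3 + [["C", "B", "A"]] * 2
winner, rounds = instant_runoff(ballots)
print(winner, rounds)
```

In this example A leads on first choices (4 of 9) without a majority; once C is eliminated its two ballots move to B, which then wins with 5 of 9.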

2022 January 24

Sarah Welch (Christianne Roumie), Physical Medicine & Rehabilitation

Discussed ahead of time with Rob Greevy. Measurement of exposure variables collected from chart. Proper way to model relationship between exposure variables and outcome. Studying the 4Ms of geriatric care (mobility, mentation, what matters most, medications) in the hospital setting at Vanderbilt. Data was pulled from the RD and abstracted through chart review. Hospitalized older adults are vulnerable to healthcare-related preventable harm. Age-friendly health systems is a national quality improvement initiative that addresses this problem by implementing a care delivery model called the 4M's (Mentation, Mobility, Medication, and what Matters most). We have adopted the 4Ms on Vanderbilt's Acute Care for Elders (ACE) unit. To our knowledge, no prior studies have been published correlating the measurement of the combined 4M's with hospital outcomes. We have a retrospective cohort study consisting of chart review on patients admitted to the ACE unit in 2020. We reviewed the electronic medical record to quantify the measurement of each M on the unit. We had an IDASC custom data pull from the Research Derivative to include hospital outcomes and additional health demographic variables/covariates on our ACE unit patients in 2020. We would like help with our analytic approach to examine the relationships between the exposure (measurement of the 4M's in the hospital) and the hospital outcomes pulled from the Research Derivative (discharge to post acute care, hospital readmissions, etc.). Mentor confirmed.

Clinic Notes:
  • Question: Is incremental delivery of 4Ms care associated with reductions in hospital readmissions or discharge to post-acute care among those 65+? 4Ms care (what matters most, medication, delirium/mentation, mobility).
  • Exposures are the delivery of 4Ms (binary at this point, categorical if needed). Covariates collected through RD include age, race, ethnicity, hospital LOS, social vulnerability index, BMI, insurance, admitting source (home vs. institution), Charlson comorbidity index. Covariates collected through chart review include Vanderbilt familiar faces, lives alone, discharge diagnosis, fall during hospitalization, cognitive impairment/dementia, frailty index. Outcomes include 30 day readmission (binary), 30 day ED visit (binary), 30 day death (binary), discharge destination (home vs. post acute care, binary), last follow up date (for censoring).
  • Prospective cohort has 400+ patients, retrospective chart review has 500+ patients. The two data sets do not necessarily have the same variables and the variables were not collected in the same way.
  • Recommendations:
    • Outcome should be redefined if you can be admitted from PAC and then return to it. Almost like an exclusion criterion, because you already experienced the outcome (discharge to PAC). A huge limitation is not having admission PAC status.
    • Useful descriptive statistics might serve better than creating complex analysis.
    • Could look at differences between level 0, 1, 2, 3 for outcomes.
    • If you want good estimate of direction of effect of 1 M on outcome, control for everything else including other M's. Would not combine M's at this point.
    • In readmit model, can add in discharge to PAC as covariate.
    • Use this data to trim down covariates for next study.
    • Include death in the outcome (e.g., readmit or death).

2022 January 10

Aileen Wright, Biomedical Informatics

Predicting hypoglycemia (low blood sugar) in inpatients. I am inputting many variables into a logistic regression (e.g., patient's age, weight, etc.). However, one of the variables is actually time series data, of glucose values over time. What's the best way to incorporate this time series into the logistic regression which also has other variables in it (i.e., this is not just a regression of time series data only)?

Clinic Notes:
  • Try to develop a predictive model for low blood sugar (lower than 70). Outcome is next glucose level. Looking for advice on time-series models, glucose over previous 24 hours.
  • Inclusion criteria: hospitalized adults with at least one order of insulin. Exclusion criteria: patients in ICU or palliative care.
  • Logistic regression will be least powerful statistically (standard errors will be much larger, less precise).
  • Recommendations:
    • Better to predict actual glucose level than low/high glucose.
    • Need to think about the correlated nature of the data (multiple rows per patient). Random effects model to incorporate correlation within patient.
    • Options to incorporate glucose trends into the model: 1) include covariates related to the number of glucose measurements in the last 24 hours; 2) include the most recent glucose value with a non-linear component; 3) include the most recent glucose value and the change between it and the one right before [build submodels for those with no glucose values, 1 glucose value, 2 glucose values, etc.]; 4) use a polynomial estimated on previous glucose values in the model [slope of trend of glucose values as covariate]; 5) relaxed Lasso.
    • Consider how to incorporate interactions or non-linear components in all the options of models above.
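Several of the options above amount to turning each patient's recent glucose history into a handful of covariates. The sketch below shows one way to derive the count of readings in the last 24 hours, the most recent value, the change from the previous value, and a least-squares slope. The function name, the dictionary keys and the readings are all hypothetical, and a straight-line slope is used as the simplest case of the polynomial option.

```python
def glucose_features(readings, now, window_hours=24.0):
    """readings: list of (time in hours, glucose in mg/dL) for one patient."""
    recent = sorted((t, g) for t, g in readings if now - window_hours <= t <= now)
    n = len(recent)
    feats = {"n_readings": n, "last": None, "delta": None, "slope": None}
    if n >= 1:
        feats["last"] = recent[-1][1]
    if n >= 2:
        feats["delta"] = recent[-1][1] - recent[-2][1]
        t_mean = sum(t for t, _ in recent) / n
        g_mean = sum(g for _, g in recent) / n
        sxx = sum((t - t_mean) ** 2 for t, _ in recent)
        if sxx > 0:  # slope is undefined if all readings share one timestamp
            sxy = sum((t - t_mean) * (g - g_mean) for t, g in recent)
            feats["slope"] = sxy / sxx  # mg/dL per hour
    return feats

# Made-up readings; the first one falls outside the 24-hour window.
readings = [(-30, 200), (0, 180), (6, 150), (12, 120), (18, 90)]
features = glucose_features(readings, now=20)
print(features)
```

Patients with zero or one reading get missing values for the change and slope, which is why the notes suggest separate submodels depending on how many glucose values are available.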
Topic revision: r766 - 27 Jun 2022, YueGao
 

This site is powered by Foswiki. Copyright © 2013-2022 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.