Health services research, diagnosis, and prognosis

Notes (2022)

2022 December 12

Paras Karmacharya (Leslie J. Crofford), Rheumatology

Determine the associations of body mass index (BMI) and PRS obesity with treatment response in psoriatic arthritis. We plan to: 1) examine the association of BMI with treatment response in PsA (marginal structural models to account for multiple treatment instances in the same patient), and 2) perform a mediation analysis to estimate the direct effect of PRS obesity on treatment response, and the indirect effect through BMI (Figure 1). Mentor confirmed.

Clinic Notes:
  • Determine the association of BMI and polygenic risk score for obesity (PRS obesity) with treatment response in PsA. Mediation analysis.
  • Recommendations:
    • Product of coefficients approach. Conduct 2 regressions and use the difference between the 2 to determine whether the associations between BMI and treatment and between BMI and outcome account for the association between treatment and outcome (the mediation effect can be extracted).
    • For power analysis, need to work with a statistician prior to submission to simulate the mediation effect and then calculate power. Due to lack of time, would suggest writing in the grant that you will work with a funded statistician to estimate the width of the confidence interval to determine if the trial will continue to be informative in 2B even though it will remain informative in 2A.
    • Sample size: rule of thumb is 50 patients plus 10-15 cases for each degree of freedom (df) for linear regression, or 15 events for each df in the model for logistic regression. Good to state that, based on previous literature, they have X for sample size. This also recognizes we might have additional degrees of freedom, so we should be okay with ~600.
    • Suggest modeling age with restricted cubic spline with minimum of 3 knots.
    • Would focus on difference in the proportion of events between the two groups rather than R^2 (doesn't really explain the variance in logistic regression). Could use nQuery to do a simple calculation for sample size for logistic regression.
    • Could look at IDI (integrated discrimination improvement) and NRI (net reclassification improvement) to quantify the impact the mediator variable has on increasing true negative rates. Could also look at the C statistic.
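
The product-of-coefficients recommendation above can be sketched as follows. This is a minimal illustration only: it assumes a continuous outcome and ordinary least-squares fits (the actual treatment-response outcome may be binary, in which case the product and difference versions of the indirect effect no longer agree exactly). The simulated data, effect sizes, and function names are hypothetical and are not taken from the study.

```python
import random

def cov(u, v):
    """Sample covariance of two equal-length lists."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (n - 1)

def mediation_effects(x, m, y):
    """Least-squares pieces of a simple mediation analysis.

    a        : slope of mediator m on exposure x
    b        : coefficient of m when y is regressed on x and m
    c_total  : slope of y on x alone (total effect)
    c_direct : coefficient of x when y is regressed on x and m (direct effect)
    """
    vx, vm, cxm = cov(x, x), cov(m, m), cov(x, m)
    cxy, cmy = cov(x, y), cov(m, y)
    det = vx * vm - cxm ** 2
    a = cxm / vx
    b = (cmy * vx - cxy * cxm) / det
    c_direct = (cxy * vm - cmy * cxm) / det
    c_total = cxy / vx
    return a, b, c_total, c_direct

# Hypothetical simulated data: exposure (PRS), mediator (BMI), continuous response.
rng = random.Random(2022)
n = 600
prs = [rng.gauss(0, 1) for _ in range(n)]
bmi = [28 + 1.5 * x + rng.gauss(0, 3) for x in prs]
response = [0.2 * x - 0.1 * m + rng.gauss(0, 1) for x, m in zip(prs, bmi)]

a, b, c_total, c_direct = mediation_effects(prs, bmi, response)
indirect_product = a * b                    # product of coefficients
indirect_difference = c_total - c_direct    # difference between the 2 regressions
print(f"indirect (a*b) = {indirect_product:.3f}, direct = {c_direct:.3f}")
```

In the linear case the two versions of the indirect effect coincide, which is why the notes can describe the approach in terms of either the product or the difference. A confidence interval for the indirect effect would typically come from bootstrapping or simulation, which this sketch does not attempt.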

Ya-Ching Hung (Bradley Hall, mentor not present), Plastic Surgery

We would like to perform a non-inferiority randomized trial to compare outcomes using different surgical techniques. We would like to discuss power analysis. Mentor confirmed.

Clinic Notes:
  • Effect of continuous vs interrupted suture on outcome. Want to do a non-inferiority study. Outcome is complication (binary, yes/no for each patient).
  • Delta (difference of two proportions) is an absolute value, odds ratio is a relative change.
  • Recommendations:
    • Need to determine what actually matters, relative difference or absolute difference (ex. group 1 has a 50% outcome rate and group 2 has 25%: relative is 50/25 = 2, absolute is 50 - 25 = 25 percentage points). What is the minimally important difference between the 2 groups you would accept?
    • Look at the known upper bound of the confidence interval for the old treatment. To demonstrate non-inferiority, need to show that the upper limit of the one-sided confidence interval for the new technique is less than that upper bound.
    • When using nQuery, since it would be a one-sided trial (only care it is not worse), put all 5% of the error rate on one side.
    • For the design of a difference trial, need to decide with colleagues on the minimally important difference in event rate.
    • Estimate the sample size needed to show that the proportion/percent rate is inside (less than) the 0.42 boundary.
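
To make the absolute-versus-relative distinction and the one-sided interval concrete, here is a small sketch. The 50% and 25% rates come from the example above; the event counts and the 0.10 margin are hypothetical. The non-inferiority check shown is one common formulation (a Wald interval for the difference in proportions compared against a pre-specified margin), stated here as an assumption rather than as the design chosen in clinic.

```python
from statistics import NormalDist

def effect_measures(p1, p2):
    """Absolute difference, relative risk, and odds ratio for two event proportions."""
    risk_difference = p1 - p2
    relative_risk = p1 / p2
    odds_ratio = (p1 / (1 - p1)) / (p2 / (1 - p2))
    return risk_difference, relative_risk, odds_ratio

def noninferiority_upper_limit(events_new, n_new, events_old, n_old, alpha=0.05):
    """Upper limit of the one-sided (1 - alpha) Wald interval for p_new - p_old."""
    p_new, p_old = events_new / n_new, events_old / n_old
    se = (p_new * (1 - p_new) / n_new + p_old * (1 - p_old) / n_old) ** 0.5
    z = NormalDist().inv_cdf(1 - alpha)  # all 5% of the error rate on one side
    return (p_new - p_old) + z * se

# Example from the notes: 50% vs 25% complication rate.
rd, rr, odds = effect_measures(0.50, 0.25)
print(f"absolute = {rd:.2f}, relative risk = {rr:.1f}, odds ratio = {odds:.1f}")

# Hypothetical trial: 20/100 complications in each arm, margin of 10 percentage points.
margin = 0.10
upper = noninferiority_upper_limit(20, 100, 20, 100)
print(f"upper limit = {upper:.3f}, non-inferior: {upper < margin}")
```

Note that the same pair of rates gives a relative risk of 2 but a larger odds ratio, so the scale chosen for the minimally important difference needs to be fixed before the sample size is calculated.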

2022 November 28

Seth Reasoner (Maria Hadjifrangiskou), Pathology, Microbiology, Immunology

The aim of this proposal is to examine the longitudinal change in fecal metabolites in cystic fibrosis patients before and after ELX/TEZ/IVA treatment. This work will be the first to elucidate overall microbiome response to ELX/TEZ/IVA and fecal metabolomic change. We will be applying for VICTR funds to perform metabolomic analysis. Mentor confirmed.

Clinic Notes:
  • Cystic fibrosis causes changes in the GI microbiome. The study looks at CFTR modulators, specifically Trikafta. Cohort: pediatric cystic fibrosis patients (N = 39). Look at patients twice before the study started, and at 6 and 12 months. Clinical outcomes: FEV1, BMI, antibiotic usage. Hypothesis: the treatment will improve microbiome diversity.
  • Recommendations:
    • Try to recruit a control cohort to establish the inference that treatment is the cause of the change.

Sriya Nemani (Patrick Assi), Plastic Surgery

We will be testing the superiority of polyethylene-glycol assisted fusion on sensation post-phalloplasty. Mentor confirmed.

Clinic Notes:
  • Hypothesis 1: patients with the drug will have greater sensation. Hypothesis 2: greater sensation is correlated with better quality of life. N = 30, 15 in the control group and 15 in the treatment group. 5 follow-up visits over the year after surgery.
  • Recommendations:
    • If you look at repeated measures, can look at time*treatment interaction using proportional odds model. If you look at cross-sectional point in time, you can look at difference at that time.
    • Would need more frequent measurement to look at time to recovery. With current frequency, can say there is more recovery at time 1 in treatment vs control compared to time 2 or time 3. This way you can say the difference in recovery is greater at 1 month than at 12 months, etc. Would use a mixed effects proportional odds model, which uses the rank order of the outcome values instead of the values themselves. Fixed effect for treatment, random effect for patient (takes away the individual patient average so it normalizes patients to each other). Can add a secondary analysis using linear regression.
    • For the purpose of this grant, drop time to event and ask patients the date when they think sensation returned.
    • Can do secondary analysis (quality of life) in the same way as the primary analysis.
    • nQuery for power/sample size calculations. Can do a power calculation for a Wilcoxon test (a special case of the proportional odds model) in nQuery.
    • For future, would take advantage of VICTR's studio program for feedback as you write/develop studies (request clinical trials expertise).
    • Possible VICTR voucher. Application website (https://starbrite.app.vumc.org/) and research proposal template (https://starbrite.app.vumc.org/funding/templatesforms/).

2022 November 7

Amelia Maiga (Mayur Patel), Surgery/Trauma Acute Care

Using a trauma national quality improvement database (TQIP) of patients with severe TBI from 2017-2020, does advanced neuromonitoring improve short-term outcomes compared to standard ICP-guided neuromonitoring? Mentor confirmed.

Clinic Notes:
  • TQIP database (2007-2020) of trauma patients collected from >850 trauma centers across the US. No long-term outcomes (post-discharge), so looking at short-term outcomes. Population will include those with severe TBI. Looking at impact of advanced neuromonitoring on outcomes. Research Questions: 1) How has the use of advanced neuromonitoring for TBI changed over time, geographically, and by patient population? 2) Does advanced neuromonitoring for TBI improve short-term outcomes compared to standard ICP-guided neuromonitoring? Primary outcome: hospital discharge disposition. Secondary outcomes: inpatient mortality, hospital LOS, ICU LOS, total ventilator days. Covariates: age, sex, race, admission GCS, ISS, withdrawal of care + timing, other patient and facility-level variables. Questions: 1. propensity score matching vs. other statistical approach. 2. ordinal vs. dichotomous primary outcomes. 3. best practices for constructing an analytic database out of relational database files. 4. section biostats support vs. VICTR grant.
  • Recommendations:
    • The observational data would be difficult to interpret since there is a clinical decision about whether or not a patient is placed on advanced neuromonitoring.
    • Analysis of interest would be time dependent likelihood of getting monitored. This would require time dependent/repeated measures.
    • Propensity score matching is not recommended since it excludes patients. Should adjust for patient characteristics in the usual fashion instead.
    • Ordinal outcome is preferred over binary. And it's better to think of outcome as longitudinal rather than a one-time status.
    • Group hospital discharge dispositions by severity to be an ordinal outcome.
    • Possible VICTR voucher. Application website (https://starbrite.app.vumc.org/) and research proposal template (https://starbrite.app.vumc.org/funding/templatesforms/).

2022 October 10

Soumya Gogia (Ivana Thompson), OB/GYN

I would like some assistance making an analysis plan for survey data regarding active management of the third stage of labor. Mentor confirmed.

Clinic Notes:
  • Looks at how providers manage the third stage of delivery (delivery of the placenta) for births after second trimester. Categorical options of how they manage. Providers rank the procedures based on their preferred choice. Looking for best way to analyze data.
  • Recommendations:
    • Need to write down hypotheses/manuscript prior to running any tests. When we are running test after test we are fishing for p-values where one is significant just based on chance.
    • Focus on the descriptive statistics. Continuous variables: mean, sd or median, iqr. Categorical variables: frequency/percentages. Can look at how often people said each management category was #1, etc.
    • Probability tree - probability of what they picked first, then given the rest what is the probability they picked each of the other 4 next and so on. Identify clear pathways that are densely populated.
    • Can also look at top 3 things that people said (this shows priority in practice).
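
The probability tree described above can be tabulated directly from the ranked responses. The sketch below is illustrative only: the option labels and the four example rankings are hypothetical, and the function name is not from any existing package.

```python
from collections import Counter, defaultdict

def ranking_tree(rankings, depth=2):
    """Estimate P(next choice | choices so far) from ranked lists, down to `depth` levels."""
    counts = defaultdict(Counter)
    for ranking in rankings:
        for level in range(min(depth, len(ranking))):
            prefix = tuple(ranking[:level])      # choices already made
            counts[prefix][ranking[level]] += 1  # choice made at this level
    return {
        prefix: {option: k / sum(c.values()) for option, k in c.items()}
        for prefix, c in counts.items()
    }

# Hypothetical provider rankings of three management options (most preferred first).
rankings = [
    ["A", "B", "C"],
    ["A", "B", "C"],
    ["A", "C", "B"],
    ["B", "A", "C"],
]
tree = ranking_tree(rankings)
for prefix, probs in sorted(tree.items()):
    print(prefix, probs)
```

The empty prefix `()` holds the first-choice probabilities, and each longer prefix holds the conditional probabilities of the next pick, so densely populated pathways show up as branches with high probabilities at every level.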

2022 October 3

Alyssa Merkel (Jillian Rhoads), VICTR

We are proposing a retrospective cohort epidemiological study to investigate if treatment with an SSRI would be beneficial in patients with Type 2 Diabetes (T2D) associated Kidney Disease. Our patient population will include overweight adults aged 18-85 years, who have been diagnosed with T2D with kidney complications. Data associated with SSRI exposure, renal function, diabetes, and inflammation will be collected through a custom programmatic pull of EHR data from the Synthetic Derivative. Our primary outcome is renal function decline in relation to SSRI exposure and will be analyzed using a linear regression model. We are seeking feedback on our study design and statistical analysis plan as well as a collaboration with biostatistics for our VICTR Resource Request. Mentor confirmed.

Clinic Notes:
  • EHR-based retrospective cohort study to investigate SSRI use in diabetic kidney disease (DKD) patients. The goal is to gather clinical preliminary data to inform a potential pilot drug repurposing project. Primary outcome: renal function measured by eGFR, adjusted for baseline status.
  • Recommendations:
    • Mixed-effects model with participants as random effect and baseline measurements as covariate. This allows us to take into account the correlation of measurements within the same patient.
    • Get variables and list them as confounder, covariate or effect modifier.
    • General rule of thumb: sample size is 50 people + at least 10 observations per degree of freedom.
  • Possible VICTR voucher. Application website ( https://starbrite.app.vumc.org/ ) and research proposal template ( https://starbrite.app.vumc.org/funding/templatesforms/ ).
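
The rule of thumb above is simple arithmetic, sketched here in both directions. The function names and the example numbers are hypothetical; each regression coefficient counts as one degree of freedom (for example, a categorical variable with k levels contributes k - 1).

```python
def min_sample_size(degrees_of_freedom, per_df=10, base=50):
    """Smallest n suggested by the '50 + per_df per degree of freedom' rule of thumb."""
    return base + per_df * degrees_of_freedom

def max_degrees_of_freedom(n, per_df=10, base=50):
    """Largest model (in degrees of freedom) the rule of thumb supports for a given n."""
    return max((n - base) // per_df, 0)

# Hypothetical model with 8 degrees of freedom.
print(min_sample_size(8))
# Hypothetical cohort of 400 patients.
print(max_degrees_of_freedom(400))
```

This is only a planning heuristic; a formal power or precision calculation would still be needed for the voucher application.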

Victoria Thomas (Josh Beckman), Cardiology

PRAISE project collecting community data and testing knowledge. Mentor confirmed.

Clinic Notes:
  • Community project. Aim 1: evaluate effects of educational intervention on familiarity with Peripheral Arterial Disease (PAD). Aim 2: determine prevalence of PAD in subgroup. Health disparities are one reason for the underlying difference; in general, African Americans have higher rates of PAD. Looking at whether going to churches in the community will improve knowledge about PAD and lessen the disparity.
  • Recommendations:
    • Make sure the questions in the survey do not accidentally give people the answer. In absence of a validated scale, maybe want to look at the answers to each question individually.
    • Greatest power is continuous outcome (ex. scale of numbers they got right).
    • Quasi-experimental design. If you do a power calculation for ABI, your power would be based on the confidence interval of the proportion of patients meeting the ABI definition.
    • Go through the tests and decide which ones you want to be the main comparisons, and we can do power based on those comparisons. The more tests you do, the greater the chance of finding a significant result by chance alone. Every time we do additional comparisons we decrease the critical p-value that is needed for significance (multiple comparisons).
  • Possible VICTR voucher. Application website ( https://starbrite.app.vumc.org/ ) and research proposal template ( https://starbrite.app.vumc.org/funding/templatesforms/ ).
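
The multiple-comparisons point above can be illustrated numerically. This sketch assumes independent tests with no true effects, and it uses the Bonferroni correction as one common way of lowering the critical p-value; the notes do not say which correction was recommended, so that choice is an assumption.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive across independent tests with no true effects."""
    return 1 - (1 - alpha) ** n_tests

def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test critical p-value that keeps the overall error rate at or below alpha."""
    return alpha / n_tests

for k in (1, 5, 10, 20):
    print(f"{k:>2} tests: chance of a false positive = {familywise_error_rate(k):.2f}, "
          f"Bonferroni threshold = {bonferroni_threshold(k):.4f}")
```

Choosing a small number of main comparisons up front keeps the per-test threshold from becoming so strict that power is lost.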

2022 September 26

Adam Yock, Radiation Oncology

Experimental devices were irradiated at three dose levels. Biological samples were drawn from the devices at four time points following irradiation and analyzed. I would like to discuss: 1. My statistical approach, including mixed-model ANOVA vs other (t-test or one-way ANOVA over compound metrics like AUC) 2. Management of statistical outliers.

Clinic Notes:
  • Two experiments on blood-brain barrier with chips (20 chips in total). Experiment 1: test blood-brain barrier integrity. Single-sized molecules. Three levels of radiation (0, 2, and 20 Gy) and four time points (0, 6, 24, and 48 hours). Experiment 2: cytokine activation. Same three levels and same four time points. 10 cytokines that could be considered as independent. Outcome: concentration of molecule in the brain, continuous variable.
  • Recommendations:
    • If no structure can be assumed for the between-time correlation, use an unstructured covariance matrix.
    • Mixed effects model with two main predictors (level and time); include baseline permeability as an adjusting covariate. Could start with a linear regression and look at the distribution of the residuals. If the distribution is not normal, do a proportional odds model.
    • Check outliers via reps. Do a sensitivity analysis and see the effects of outliers on the regression.
    • Peak value or AUC is not recommended since we don't know where the peak is or what is happening between time points.
    • In the dataset, record chip number, time, and radiation level.
  • Possible VICTR voucher. Application website ( https://starbrite.app.vumc.org/ ) and research proposal template ( https://starbrite.app.vumc.org/funding/templatesforms/ ).

2022 September 19

Marianna LaNoue, School of Nursing

Requesting assistance with analysis plan and sample size calculation for a nested case-control study (cohort of pregnant women enrolled prospectively at T1, cases identified at T2 (pre-eclampsia before delivery) and compared to rest of cohort (controls), then cases are matched with controls for additional comparisons at post-natal T3 and T4). DV is an expensive assay measure, prelim data (N = 10) indicates very large effect.

Clinic Notes:
  • Looking for predictors of preeclampsia after 20 weeks of gestation & postpartum hypertension. Follow up to 1 year after delivery. Biospecimens taken at 8-12 week gestational age.
  • Recommendations:
    • Possible outcomes: a single outcome with groups of 1) no evidence of hypertension, 2) hypertension in pregnancy (preeclampsia), 3) persistent hypertension, 4) early-onset postpartum hypertension, and 5) late-onset postpartum hypertension; or time to hypertension. Use gestational age as a covariate.
    • Model temporal trend of biomarkers.
    • Prospective case control with matching. Enroll case first, then look back in sample to see who matches. Measure everyone at same time point (ex. 27 weeks) so when you look back they all have measurements at the same time point.
    • Power, focus on known magnitude of early effect. Want to have enough patients to fit a regression model with at least 2-3 covariates.

2022 August 29

Natasha Belsky (Nick Desai), Adolescent Medicine

I have completed a QI project and would like to know the best way/help in assessing the data for significance/a poster/abstract. I want to compare the rates after various interventions. Mentor confirmed.

Clinic Notes:
  • QI project for class. Over 3 months, took patient reported and chart review screening information. Main question is how screening documentation has changed over time. Uses standardized screening tool, patient reported, as well as chart review (provider documented). Did two interventions and recorded the changes after each intervention. The sample sizes are different for baseline (~33)/after first intervention (~65)/after second intervention (~70). Binary outcome measured at each visit.
  • Recommendations:
    • Will need to consider provider in analyses as the intervention could affect each provider differently.
    • Traditional approach is to compare averages in each period. Instead we want to look at trends over time. Interrupted time series analysis looks at whether the difference in screening depends on time.
    • Need date of visit and screening (y/n).
    • Need to make sure the sample is not biased. Data needs to be representative of general population. Selection needs to be as random as possible.
  • Possible VICTR voucher. Application website ( https://starbrite.app.vumc.org/ ) and research proposal template ( https://starbrite.app.vumc.org/funding/templatesforms/ ).
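
One way the visit-level data (date of visit and screening y/n) could be coded for an interrupted time series, or segmented regression, is sketched below. The dates, the two intervention dates, and the function name are hypothetical; the sketch only builds the predictor columns and does not fit the model.

```python
from datetime import date

def its_row(visit, start, interventions):
    """Code one visit date for segmented regression.

    Returns [time, post_1, time_since_1, post_2, time_since_2, ...] with time in days:
    `time` captures the underlying trend, each `post_k` a level change at
    intervention k, and each `time_since_k` a change in slope after it.
    """
    row = [(visit - start).days]
    for when in interventions:
        after = visit >= when
        row.append(1 if after else 0)
        row.append((visit - when).days if after else 0)
    return row

# Hypothetical project start, intervention dates, and (visit date, screened y/n) records.
start = date(2022, 3, 1)
interventions = [date(2022, 4, 1), date(2022, 5, 1)]
visits = [(date(2022, 3, 15), 0), (date(2022, 4, 11), 1), (date(2022, 5, 21), 1)]

design = [its_row(visit_date, start, interventions) for visit_date, _ in visits]
screened = [outcome for _, outcome in visits]
for row, y in zip(design, screened):
    print(row, y)
```

These columns, together with the binary screening outcome, could then be passed to a logistic regression in the statistical software of choice, with provider added as a covariate or random effect as recommended above.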

2022 August 22

Alexander Sin (Eric Bowman), Ortho Sports Medicine

1. Study comparing clinical outcome (return to play, pain, function scores via REDCap survey) in collegiate athletes who have an ankle sprain; will block randomize to treatment group (blood flow restriction with physical therapy) and control group (sham blood flow restriction with physical therapy). Aiming for 20 subjects in each group. Mentor confirmed.

2. Epidemiologic study extracting 22 years (2000-2022) of deidentified / publicly available data from NEISS (national injury registry), regarding keyword 'rugby' related injuries. Will categorize body parts / injury types for the past 22 years and overlap with recent World Rugby rule changes that were aimed to decrease injury rates, and see if the rules are effective.

3. Cross-sectional study, will survey transgender high school students, primary outcome: sports participation (competitive and non-competitive), secondary outcome: depression (PHQ-9 questionnaire) and anxiety scores (SCARED / GAD-7). Compare states with legislation banning transgender sports participation and those without ban in place.

Clinic Notes:
  • Study 1: comparing clinical outcome (return to play, pain, function scores via REDCap survey) in collegiate athletes who have an ankle sprain; will block randomize to treatment group (blood flow restriction with physical therapy) and control group (sham blood flow restriction with physical therapy). Sample size 40 (20 each group).
  • Recommendations:
    • Select main outcome. If the primary outcome is less variable, would need fewer people. If standard deviation/variability is large, would need more subjects. If you can't pick a main one, you will change the decision-making criteria and have to adjust the p-value to account for multiple testing.
    • Possible outcome, time to return to play/readiness to return to play (need objective criteria to define).
    • You lose statistical power if you measure the outcome with lots of time in between (weeks/months). Suggest measuring daily.
    • Next steps - redefine primary outcome. Is time to return to play feasible, or is the proportion ready to play at a specific point in time more appropriate? Need to think through what is possible in this study. Try to minimize burden to athletes. Make sure statistician is involved throughout study. Send over protocol for us to help finalize before voucher submission.
  • Study 2: NEISS survey. Research question: do rule changes in World Rugby lead to changes in the epidemiology of rugby-related injuries? Will pull large amount of data and categorize injury types.
  • Recommendations:
    • Interrupted time series or time-trend analysis. Look at the direction of the trend beforehand, at the moment the rule was implemented, and afterwards.
    • Need to identify specific rule change you want to look at.
  • Study 3: Examine mental health status of transgender (LGBTQ+) high school athletes. Want to look at the impact/access of physical activity/sports on mental health. Goals - evaluate differences in sports participation, identify barriers to sports participation, evaluate prevalence of mental health issues in states with laws banning competitive participation vs those without a law. Using BPASQ-LGBTQ+ survey.
  • Recommendations:
    • Difficult to separate mental health based on sports vs other political difficulties. If you can adjust in some way for access then doing something with participation is feasible. Need help from people who are experts in this type of field.
    • This is a very complicated topic. There are a lot of things that can influence mental health and sports participation. Need to create theoretical framework of how this fits together. What you can and cannot control. This will help guide you in what you can do.
    • Sign up for another clinic to talk more about this project.
  • Possible VICTR voucher. Application website ( https://starbrite.app.vumc.org/ ) and research proposal template ( https://starbrite.app.vumc.org/funding/templatesforms/ ).
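
The Study 1 point that a more variable primary outcome needs more subjects can be seen from the usual normal-approximation formula for comparing two means. This is a rough sketch: the standard deviations and the target difference are hypothetical, and dedicated software such as nQuery would give slightly larger numbers because it uses the t distribution.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(sd, difference, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided, two-sample comparison of means."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sd / difference) ** 2)

# Hypothetical function score: detect a 10-point difference at two levels of variability.
for sd in (10, 20):
    print(f"SD = {sd}: about {n_per_group(sd, difference=10)} athletes per group")
```

Doubling the standard deviation roughly quadruples the required sample size, which is why the choice of primary outcome matters so much with 20 subjects per group.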

2022 August 15

Benjamin Collins (Ellen Clayton), Biomedical Informatics & Biomedical Ethics

Examining patient trust in artificial intelligence for cancer care among racial and ethnic minorities. Two stage study, survey and focus group. Need help with setting power of study and thinking about number of participants to recruit at each stage, planning statistical analysis. Mentor confirmed.

Clinic Notes:
  • There is mistrust in health care & artificial intelligence among oncology patients, higher in racial minority populations. Two stage study: first is a survey to collect demographics and measure trust (Likert style). The second stage is to have focus groups with people who completed the first stage. Need information about sample size in both the survey and focus group stages.
  • Recommendations:
    • For focus groups, it is generally suggested 10-12 people for homogeneous group. If heterogeneous, at least 12 of each "group". Cannot really do a power calculation here.
    • Need to validate whatever questions you are going to develop against what exists currently. Develop additional questions that are related to AI and healthcare. Early focus groups/surveys should probably be about what affects trust, not just measuring trust.
    • For the same group of patients, do a survey on trust of healthcare, and another survey on trust of AI. Incorporate different vignettes with different levels of AI. Arrange vignettes in different orders and each question should be considered on its own.
    • Do focus groups first to get study/survey scale set. For sample size, have to determine what difference on that scale you do not want to miss. Focus group will help you measure distribution of scale and variability, which is needed for sample size.

2022 August 1

Santaria Geter (Izabela Galdyn), Plastic Surgery

This is an anonymous cross-sectional survey, distributed to Vanderbilt University’s plastic surgery residency director and/or coordinator to be forwarded to their residents and faculty. The anonymous survey consists of the Impostorism Scale by Dr. Mark Leary with additional demographic and plastic-surgery specific questions including geographic region of the residency program, biological sex, age, race, United States Medical Licensing Exam score, and post-graduate year (residents) or years in practice (faculty). Risk factors for the prevalence and severity of IP will need to be assessed for demographic factors using univariate and multivariate analyses. Mentor confirmed.

Clinic Notes:
  • Studying imposter syndrome (Leary Impostorism Scale, 7-item validated Likert scale). Sent to all residents and faculty. Added a few demographic variables. Hypothesis is that the syndrome will correlate with various demographics. The more junior (younger) you are, the higher the imposter scores are expected to be.
  • Recommendations:
    • Understand how the scale is measured and/or overall score is calculated. Then come back to clinic for advice on how to proceed with analysis.

Michaela Crawford (Izabela Galdyn), Plastic Surgery

I am sending out a REDCap survey on the imposter syndrome phenomenon in Vanderbilt residents and faculty. The results need to be analyzed. Mentor confirmed.

Clinic Notes:
  • Same study as Santaria. Hypothesis: imposter syndrome is prevalent in both residents and faculty but has a decreased prevalence in faculty.
  • Recommendations:
    • Compare residents and faculty in their prevalence.
    • Understand the scale and come back to clinic.

2022 July 25

Luis Okamoto, Clinical Pharmacology

Our study is a pilot study aimed at collecting preliminary data on the perceived clinical efficacy of the central sympatholytic guanfacine in postural tachycardia syndrome patients with and without chronic fatigue syndrome who have been treated with this medication. We will be using an online questionnaire to evaluate their symptoms and the impact on their quality of life. We would like biostatistics support in determining the best way to evaluate and analyze our data.

Clinic Notes:
  • Postural tachycardia syndrome (POTS): people who have it cannot stand for long periods of time due to orthostatic intolerance. Pathophysiology is not understood. The hypothesis is that central sympatholytics such as guanfacine would improve POTS symptoms in hyperadrenergic POTS and CFS-related symptoms. Pilot study looking at efficacy of guanfacine, survey assessment, medical record abstraction, n=200. The goals are to determine if guanfacine improves symptoms in a subset of POTS patients with and without CFS, and to identify potential predictors of response. Primary outcome is Patient Global Impression of Change (PGIC), 7-point Likert scale.
  • Recommendations:
    • If the primary question is whether guanfacine improves symptoms, the inclusion criteria need to include only people with POTS and use guanfacine as an exposure. With no control group it is difficult to know which group will benefit from guanfacine and which won't. We actually care about the interaction of guanfacine and the subgroup.
    • Could use propensity score to look at what clinical characteristics predict outcome of guanfacine treatment, then do propensity matched analysis with guanfacine use as predictor of the outcome (perceived symptom improvement).
    • Need to determine if this is a treatment effect, differential treatment effect or other type of question.
    • Need to have a control group (patients who did not receive guanfacine) to have an effective study.

Candace Grisham (Marjan Rafat), Biomedical Engineering

I am investigating the prognostic value of immune system inflammatory markers on recurrence of head and neck squamous cell carcinoma after radiation therapy. We are specifically looking at recurrence free survival and overall survival using Kaplan Meier curves. Mentor confirmed.

Clinic Notes:
  • In laryngeal squamous cell carcinoma, looking at effects of radiation therapy on recurrence & survival. Study variables: ALC, AMC, ANC, NLR. Time frame: from one year prior to one year after radiation therapy. Study outcome: recurrence free survival and overall survival. Number of patients: 69, prior to RT: 41.
  • Recommendations:
    • Could look at a biomarker as a time-varying covariate in a Cox regression.
    • Could measure periods of time that patients experience lymphopenia or absolute days free of lymphopenia. Another option is using proportion of days with lymphopenia that way it is related to how long the patient was followed individually.
  • Possible VICTR voucher. Application website (https://starbrite.app.vumc.org/) and research proposal template (https://starbrite.app.vumc.org/funding/templatesforms/).
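
The lymphopenia summaries suggested above (days with, days free of, and proportion of followed days with lymphopenia) could be computed as in the sketch below. It assumes one ALC value per followed day, which real lab data will not have, and the threshold of 1.0 (in units of 10^9 cells/L) and the example values are assumptions for illustration only.

```python
def lymphopenia_summary(daily_alc, threshold=1.0):
    """Days with lymphopenia, days free of it, and the proportion of followed days with it."""
    days_followed = len(daily_alc)
    days_low = sum(1 for value in daily_alc if value < threshold)
    return {
        "days_with": days_low,
        "days_free": days_followed - days_low,
        "proportion_with": days_low / days_followed,
    }

# Hypothetical patients with different lengths of follow-up.
patients = {
    "patient_1": [1.4, 0.9, 0.8, 1.2],
    "patient_2": [0.7, 0.6],
}
for name, values in patients.items():
    print(name, lymphopenia_summary(values))
```

Using the proportion rather than the raw count puts patients with different follow-up lengths on a comparable scale, which is the motivation given in the recommendation.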

2022 July 18

Carlos Plancarte, Pediatric Hospital Medicine

Recently the Children’s Hospital Association has declared a behavioral health emergency with the increasing number of children hospitalized with behavioral health concerns. One population (children with Autism) seems to have significant barriers to inpatient psychiatric treatment; some reasons may be their unique behavioral health issues, developmental level, and/or how comfortable psychiatric hospitals are with helping this population. Our aim is to compare length of stay, as well as final disposition, between children with and without Autism (hospitalized at the children’s hospital, awaiting inpatient psychiatric transfer). Our hypothesis is that children with Autism are more likely to be discharged home when compared to children without Autism. We also hypothesize that when children with Autism can be transferred to a psychiatric hospital, they have a longer LOS at our institution while awaiting placement. Specific questions below: Is logistic regression the best method to analyze this retrospective data? Is a power analysis needed for this study? I obviously have the sample size, but in these kinds of studies I always struggle to find the population effect size (especially when it's not been previously well studied in the literature). (Question for my own learning) If you find a significant difference between two populations, does the power analysis even matter? I will also propose the potential covariates to include in the model and would appreciate input on whether they would be appropriate to include/exclude depending on if they are true confounders and/or part of the causal pathway. After presenting our proposed design and approach (including proposed confounders): is there anything else we could do to further strengthen the study?

Clinic Notes:
  • No previous study has focused on LOS at a children's hospital for children with Autism awaiting inpatient psychiatric care. Retrospective chart review. Time frame: January 2019 through May 2022. Sample is from behavioral medicine appts with diagnosis of autism. Physician has reviewed all charts to confirm. N (number of visits) = ~7,000, 791 with autism. Readmission is possible. Outcomes: 1. Length of stay. 2. Disposition (home or not home).
  • Recommendations:
    • Power should always be considered.
    • Two ways to pick patients 1. who cross paths with group 2. specific characteristics. Need to be aware and be explicit about how to pick people. Patient could have driver/risk factor that could influence LOS so would recommend considering repeat visits.
    • Regression approach with consideration of repeat visits. Ex: mixed effects with patient as random factor in model.
    • Need to know whether the patient is diagnosed with autism at the time of hospital visit.
    • Primary outcome: Decide up front what units you will use for outcome. Possible for LOS to have a skewed distribution. Consider either 1. gamma regression or 2. Cox proportional hazards model (time to discharge, primary predictor is autism). Most popular is Cox regression, which can adjust for baseline characteristics.
    • Secondary outcome: disposition (home or not home). Use a logistic regression with whether or not patient was diagnosed with autism as the primary predictor.
    • Tactic for missingness: change wording for variables (ex. if variable is only measured in certain people use 1. known "abnormal" value 2. no known abnormal value)
    • Rule of thumb is 10-20 events per degree of freedom. Since the sample is large, could do validation on the data.
  • Possible VICTR voucher. Application website (https://starbrite.app.vumc.org/) and research proposal template (https://starbrite.app.vumc.org/funding/templatesforms/).
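
For the binary disposition outcome, the events-per-degree-of-freedom rule above translates into a rough limit on model size. In this sketch the counts are hypothetical (the notes do not report how many visits ended in discharge home), and the limiting sample size for a binary outcome is taken to be the smaller of the event and non-event counts.

```python
def supportable_df(n_events, n_nonevents, events_per_df=15):
    """Largest number of model degrees of freedom supported by the rule of thumb."""
    limiting = min(n_events, n_nonevents)  # the rarer outcome category limits the model
    return limiting // events_per_df

# Hypothetical: 300 discharged home and 491 not discharged home.
for rule in (10, 20):
    print(f"{rule} events per df: up to {supportable_df(300, 491, rule)} degrees of freedom")
```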

2022 July 11

Alyssa Merkel (Jillian Rhoads, Jana Shirey-Rice), Vanderbilt Center for Bone Biology

We are designing an EHR-based retrospective cohort study to investigate selective serotonin reuptake inhibitor use in patients with type 2 diabetes with renal complications. Our primary objective is to see if SSRI use is correlated with a change in rate of disease progression over 5 years from date of diagnosis. We would like input on design of a proper statistical analysis plan to address our primary objective and to address any confounding bias in this study. Mentor confirmed.

Clinic Notes:
  • Gather clinical data to inform potential pilot drug repurposing project. Currently majority of the basis for project is rooted in literature. N ~ 25,000 (would be less with more i/e criteria). 5 year follow-up. Primary research question is to compare change in rate of disease progression with/without SSRI use. Retrospective data.
  • Recommendations:
    • Make sure everyone has full 5-years of follow-up. Newly diagnosed patients while at Vanderbilt. Timing of exposure (SSRI) will be tricky to pull from EHR.
    • Analyze creatinine directly since age plays a part in eGFR calculation (eGFR changes every year even if creatinine doesn't). Consider 2022 definition of eGFR without race.
    • Time-varying covariate Cox model. Try to work with Nephrology statisticians to work up the analysis plan prior to submitting the voucher.
  • Possible VICTR voucher. Application website (https://starbrite.app.vumc.org/) and research proposal template (https://starbrite.app.vumc.org/funding/templatesforms/).
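
The point that eGFR drifts with age even when creatinine is flat can be illustrated with a short sketch. It assumes the race-free CKD-EPI creatinine equation published in 2021 is the definition the note refers to; the function name and the example patient (male, creatinine held at 1.2 mg/dL) are hypothetical.

```python
def egfr_ckd_epi_2021(scr_mg_dl: float, age_years: float, female: bool) -> float:
    """Race-free CKD-EPI creatinine equation (2021), in mL/min/1.73 m^2.

    Assumption: this is the 'eGFR without race' definition meant in the notes.
    """
    kappa = 0.7 if female else 0.9
    alpha = -0.241 if female else -0.302
    ratio = scr_mg_dl / kappa
    egfr = (142.0
            * min(ratio, 1.0) ** alpha
            * max(ratio, 1.0) ** -1.200
            * 0.9938 ** age_years)
    if female:
        egfr *= 1.012
    return egfr


# Hypothetical patient: serum creatinine fixed at 1.2 mg/dL over 5 years of follow-up.
same_creatinine = [egfr_ckd_epi_2021(1.2, age, female=False) for age in range(60, 66)]
for age, value in zip(range(60, 66), same_creatinine):
    print(age, round(value, 1))
```

With creatinine fixed, each additional year multiplies eGFR by 0.9938, so an eGFR-based progression rate is never exactly zero. This is one reason to model creatinine directly.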

Dalton Nelson (Megan Pask), Biomedical Engineering

Development of a Sequencer-Free SARS-CoV-2 Variant Detection platform. Key questions: • demonstrating data distribution matches population distribution (pdfs, cdfs, pp-plots) • PCR Cq value estimations between empirical data points • probability of certain variants in a given time frame considering US population stats. Mentor confirmed.

Clinic Notes:
  • Variant detection assay/sequencing. Want to show study is robust and valid via random sampling. Looking to compare distribution for generalizability to population.
  • Recommendations:
    • KS test (quantitative measure), but the challenge is that it tends to be a test of sample size. A large sample size tends to pick up the most minute differences that don't matter. Better to rely on graphical display: QQ plot, CDF, PDF to show how much agreement you have in terms of distribution. Can look at the max difference between the 2 lines. Prespecify the amount of wiggle room (deviation) you will allow between the 2 lines.
    • Dose response curve to determine what it takes to get 19/20 samples. For CI, use a logistic regression model and transform the standard deviation from the model, but would suggest using methods from the dose response curve to show interval/error. Could also use Bayesian logistic regression.
    • To show performance of assay, report results to be variant specific.
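
As a rough sketch of the "max difference between the 2 lines" idea, the snippet below computes the largest vertical gap between two empirical CDFs (the two-sample KS statistic) and compares it with a prespecified tolerance. The Cq values and the 0.25 tolerance are hypothetical placeholders, not values from the study.

```python
from bisect import bisect_right


def max_ecdf_gap(sample_a, sample_b):
    """Largest absolute difference between the two empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    gap = 0.0
    # The largest gap between two step functions occurs at an observed data point.
    for x in a + b:
        fa = bisect_right(a, x) / len(a)  # share of sample A that is <= x
        fb = bisect_right(b, x) / len(b)  # share of sample B that is <= x
        gap = max(gap, abs(fa - fb))
    return gap


# Hypothetical Cq values for the study sample and a reference population sample.
study_cq = [18.2, 19.5, 21.0, 22.4, 24.1, 25.3, 27.8, 30.2]
reference_cq = [17.9, 19.1, 20.6, 22.9, 23.8, 26.0, 28.4, 31.0]

TOLERANCE = 0.25  # hypothetical, fixed before looking at the data
gap = max_ecdf_gap(study_cq, reference_cq)
within_tolerance = gap <= TOLERANCE
print(gap, within_tolerance)
```

Reporting the gap alongside the CDF plot keeps the focus on the size of the disagreement, not on a p-value that shrinks with sample size.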

2022 June 27

Shawniqua Williams Roberson (Leanne Boehm), Neurology

Q: Are there distinct subtypes of post-intensive care syndrome (PICS) with different cognitive, psychological, socioeconomic and physical profiles? If so, what are the demographic and clinical risk factors for these subtypes? We envision this project as a post-hoc analysis of data collected as part of the BRAIN-ICU and MIND-ICU observational cohort studies. We aim to aggregate the data from the two studies and use unsupervised clustering to identify constellations of PICS-related symptoms experienced by study participants at 3 and 12 months after discharge. Mentor confirmed.

Clinic Notes:
  • Post-hoc analysis on data from observational studies on post-ICU outcomes. Sample size: 564 at 3 months and 471 at 12 months for combined MIND-ICU and BRAIN-ICU.
  • Aims of the proposed study: 1. examine the degree to which PICS-related deficits occur in clusters (i.e. PICS phenotypes). 2. Identify potential clinical and demographic risk factors for these clusters.
  • Questions: 1. Ways to get biostatistics help. 2. Statistical approaches for the two aims.
  • Recommendations:
    • Ways of getting biostatistics support: 1. VICTR voucher (needs to be translational research). 2. Delirium group on campus (Rameela Raman). 3. Working with a graduate student (for one-off collaboration) in the Data Science Institute or Biostatistics department. Get in touch with the DGS of the Biostatistics program (Robert Greevy and Ben French). For DSI students, get in touch with people who have a direct connection with students.
    • Statistical approaches: Aim 1: K-means clustering is a good place to start. The important thing is to assess the stability of the clusters afterwards. Bootstrap clustering is a good way to do so. Aim 2: multinomial logistic regression is a good option. Random forest or support vector machine (SVM) also works.
  • Possible VICTR voucher. Application website (https://starbrite.app.vumc.org/) and research proposal template (https://starbrite.app.vumc.org/funding/templatesforms/).

2022 June 13

Whitney Barnett (Kathryn Humphreys), Psychology and Human Development

Help with a stats review for a VICTR resource request. The proposed project aims to use repeated measures of multiple inflammatory markers to 1) chart trajectories across pregnancy and 2) explore whether changes in inflammatory markers are associated with changes in reward responsiveness and depressive symptoms. The reviewer comment is "The response did not address checking of the assumed correlation pattern (compound symmetry?). And there is a misunderstanding of how preliminary data are used. They are used to show feasibility and to estimate either prevalence (if response is binary) or variability (if it is continuous) for use in a power/sample size calculation for another study. Power calculations never use effect sizes that were observed. They involve only the effect one would not want to miss." We are not quite sure how to respond. Mentor confirmed.

Clinic Notes:
  • Focus on brain and mood changes during pregnancy. Markers of inflammation. N=25, with 3 measurements per participant.
  • The reviewer questions are related to what you will pull from the data and how you will use it. You have correlations in the data (multiple measurements per person), you need to understand how to model those correlations and need to describe the assumptions of the correlation.
  • Recommendations:
    • Check assumptions of correlation matrix in analysis plan (ex. inspect correlation matrix and adjust accordingly).
    • Use the pilot study to look at distribution of data and the "noise".
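
One way to inspect the correlation pattern is to compute the pairwise correlations between visits and see whether they are roughly equal, as compound symmetry assumes. The sketch below does this for three visits; the marker values are hypothetical and the helper names are illustrative only.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical inflammatory marker values: one row per participant, one column per visit.
visits = [
    [2.1, 2.4, 3.0],
    [1.8, 2.0, 2.1],
    [3.2, 3.1, 3.9],
    [2.7, 3.0, 3.2],
    [1.5, 1.9, 2.6],
    [2.9, 3.3, 3.4],
]

columns = list(zip(*visits))  # one tuple of values per visit
pairwise = {
    (i + 1, j + 1): pearson(columns[i], columns[j])
    for i, j in combinations(range(len(columns)), 2)
}
# Compound symmetry implies similar correlations for every pair of visits.
spread = max(pairwise.values()) - min(pairwise.values())
for pair, r in pairwise.items():
    print(pair, round(r, 2))
print("spread:", round(spread, 2))
```

If the correlation between visits 1 and 3 is clearly lower than between adjacent visits, a structure that decays with time may describe the data better than compound symmetry.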

2022 May 02

Scott Miller (Aaron Yang), Physical Medicine & Rehabilitation

We are looking at the relationship between knee and spine surgery. I have data pulled regarding the surgeries but need help and direction as to the biostatistics section with regards to timing of procedures, which procedures happened first. Mentor confirmed.

Clinic Notes:
  • Looking at relationship between knee and back pathology. Applied for VICTR grant previously. Matched 1:2 cases (knee osteoarthritis (OA) with replacement) with controls (knee OA without replacement). Aims 1) prevalence of low back pain in patients with knee OA necessitating total knee arthroplasty (TKA) compared to matched controls 2) temporal relationship between diagnosis of low back pain and TKA 3) determine rate which patients undergoing TKA also undergo lumbar spine surgery compared with demographically matched controls.
  • Could have missing data when patients do surgeries elsewhere or visit VUMC for the first time (missing date of onset).
  • Recommendations:
    • Focus on patients who are going to be captured in the EHR for both conditions so that the conclusions from the study will be internally consistent. This is needed to ensure quality matches and to be able to respond to reviewer criticism.

2022 April 04

Kayla Anderson (Yolanda McDonald), Human and Organizational Development

Return after first attending the clinic on 3/7/2022. Prior description: We are comparing regression methods, such as lasso regression and binomial logit regression, for analyzing weighted survey data. We have questions about how to best implement these methods and whether or not lasso regression can be used with weighted survey data. We have already conducted both methods of regression (lasso and binomial logit) with unweighted data. Questions for this clinic visit: We have questions around standardizing variables and using factor analysis to create latent variables. Mentor confirmed.

Clinic Notes:
  • Looking for ways to create indices for questions that are not on the same scale.
  • Recommendations:
    • Combining variables into one index imposes a data structure. Bayesian analysis allows you to incorporate outside structure while still performing regression. Put all related variables in as covariates, and use the coefficients as weights to combine questions into a single score. Could incorporate shrinkage in the prior distribution, which protects from overfitting. Use shrinkage to generalize to the out-of-sample population.
    • Packages in R that are useful: rstanarm, rmsb. Use an ordinal regression model where the number of violations would be the outcome for the coefficients. Could incorporate sampling weights into the ordinal regression.

2022 March 21

Marshall Wallace, Division of Acute Care Surgery

Return after first attending the clinic on 2/28/22. Prior description of the project: We have developed and administered a 33-question survey which assesses provider perspectives, practice patterns and ethical considerations associated with discharge planning for victims of violence. This survey contains ~10 demographic questions, a set of 5 likert questions which are repeated three times assessing provider responses to three unique clinical vignettes, followed by ~10 separate likert style questions. This survey has been administered to VUMC ED and Trauma physicians, with ~85 responses gathered so far. We would like to perform descriptive statistics on the data obtained. First, we would like to describe the variation in responses to the three clinical vignettes. Second, we would like to assess how covariates such as demographics and responses to the ~10 separate questions relate to trends in responses to the clinical vignette responses. Questions for this clinic visit: I am working with R and excel to perform the recommended statistics from my last clinic on 2/28. I would like help with this on R, help with interpretation of my results and help with determining how to describe this in a methods section of a write up.

Clinic Notes:
  • Developed bar charts after last time's clinic visit.
  • Recommendations:
    • P-values should be used to supplement the bar charts.
    • Question: is there a way to summarize the difference between scenarios for each survey question? There is a correlation of how 1 person answers the same question in different scenarios. Independence test shows differences in the distributions (bar charts). Correlation between scenarios but works best for two instead of three.
    • Understanding patterns among respondents.
    • If you make the assumption that the differences between levels of the scale are the same, you can use tools like repeated measures ANOVA, calculating means, or a random effects model. Use the lmer function from the package lme4 in R. Compare models with and without the scenario variable.

2022 March 14

Ryan Hsi, Urology

I'm planning an RC2 grant submission for a multicenter prospective registry for kidney stone disease. Need input on study design, sample size/power, resources for de-identified data management/data structure, specimen tracking and storage

Clinic Notes:
  • Kidney stone registry with longitudinal data. Collect data on outcomes such as kidney stone recurrence and kidney stone growth. Timepoints, baseline and followup every 6 months for 3 years. Looking at risk factors for progression of disease. Multiple centers with a sample size of 300-500 patients.
  • Questions:
    • 1) power calculation - since it is not hypothesis testing, don't let "power" be seen anywhere. Make the case that you would learn something or that it would change clinical practice. Wanting to estimate risk of outcome/how long for something to develop (prediction problems). Can look into an ordinal outcome.
    • 2) resources for data collection/management.
  • Recommendations:
    • Getting intensive data on a smaller sample of patients may be ideal but patients may not show up. Could consider random intervals for imaging. Could do 3 CTs per patient and randomize the time of the middle one.
    • Rough rule of thumb: number of clinical events divided by 15 is the number of factors you can look at.
    • Attend design/grant-writing studio to get multi-disciplinary feedback.
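
The rule of thumb above is simple arithmetic and can be sketched as a small helper. The count of 120 recurrence events is a hypothetical figure for illustration. The divisor defaults to 15 as in the note, and can be changed to reflect the 10-20 events per degree of freedom range quoted elsewhere on this page.

```python
def max_candidate_factors(n_events: int, events_per_factor: int = 15) -> int:
    """Rough upper limit on the number of risk factors a model can examine."""
    return n_events // events_per_factor


# Hypothetical: 120 stone recurrences observed across the registry.
print(max_candidate_factors(120))      # divisor of 15
print(max_candidate_factors(120, 10))  # more liberal divisor
print(max_candidate_factors(120, 20))  # more conservative divisor
```

Note that the limit is driven by the number of events, not by the number of enrolled patients.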

2022 March 7

Kayla Anderson (Yolanda McDonald), Human and Organizational Development

We are comparing regression methods, such as lasso regression and binomial logit regression, for analyzing weighted survey data. We have questions about how to best implement these methods and whether or not lasso regression can be used with weighted survey data. We have already conducted both methods of regression (lasso and binomial logit) with unweighted data. Mentor confirmed.

Clinic Notes:
  • Main question: could we use LASSO method on weighted survey data?
  • Recommendations:
    • Using your own data to select variables in LASSO will affect your coefficients and confidence intervals. If using LASSO, might need to do 10-fold cross validation, bootstrap, etc. to make sure you have good outside sampling information, then do normal regression. Look into relaxed LASSO. To incorporate weight information in LASSO, can grab variables related to sampling design and force them to be in the model as predictors.
    • Could also recreate sample based on weights. And it will reflect the sampling probabilities.
    • Backward/forward variable selection methods are known to be unstable. And they give standard errors (confidence intervals) that are too small. One way to look at the stability of the method is to bootstrap the data and do the selection.
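    • In LASSO, one way to get stable results is the 1se rule (choose the largest penalty whose cross-validated error is within one standard error of the minimum). And you could force variables to stay in the model.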
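
A minimal sketch of the one-standard-error ("1se") selection mentioned above, assuming the cross-validation results are already available as (penalty, mean error, standard error) triples. The numbers are hypothetical, and the function name is illustrative and not taken from any package.

```python
def one_se_choice(cv_results):
    """Pick the largest penalty whose CV error is within 1 SE of the minimum.

    cv_results: list of (penalty, mean_cv_error, standard_error) tuples.
    """
    best = min(cv_results, key=lambda row: row[1])
    threshold = best[1] + best[2]  # minimum error plus its standard error
    eligible = [row for row in cv_results if row[1] <= threshold]
    return max(eligible, key=lambda row: row[0])[0]


# Hypothetical 10-fold cross-validation summary for five candidate penalties.
cv_results = [
    (0.01, 0.250, 0.010),
    (0.05, 0.242, 0.011),
    (0.10, 0.240, 0.012),  # minimum mean error
    (0.20, 0.249, 0.012),
    (0.40, 0.270, 0.015),
]
chosen_penalty = one_se_choice(cv_results)
print(chosen_penalty)
```

Here the minimum error occurs at a penalty of 0.10, but 0.20 is still within one standard error of it, so the sparser model at 0.20 is chosen.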

2022 February 28

Bill Nobis, Department of Neurology

We intend to use the Vanderbilt BioVU database to perform whole exome sequencing (WES) of a population of epilepsy patients enriched for Sudden Unexpected Death in Epilepsy (SUDEP). After WES, we will look for biomarkers of SUDEP risk in patients, including genetic and clinical factors. We'd like guidance on the number of control samples to include for the appropriate power.

Clinic Notes:
  • BioVU and SD databases, will use ICD10 codes to create cohort. There is a total of 70 records to compare against control samples. Would like to do sequencing on those 70 patients. Looking at how certain genes are predictive for Sudden Unexpected Death in Epilepsy (SUDEP). Controls are 1000 epilepsy patients, representative of population at Vanderbilt.
  • Recommendations:
    • Email statisticians who are specialized in genome analysis (Ran Tao and Yaomin Xu).
  • Possible VICTR voucher for sequencing. Application website (https://starbrite.app.vumc.org/) and research proposal template (https://starbrite.app.vumc.org/funding/templatesforms/).

Marshall Wallace (Allan Peetz), Division of Acute Care Surgery

We have developed and administered a 33-question survey which assesses provider perspectives, practice patterns and ethical considerations associated with discharge planning for victims of violence. This survey contains ~10 demographic questions, a set of 5 likert questions which are repeated three times assessing provider responses to three unique clinical vignettes, followed by ~10 separate likert style questions. This survey has been administered to VUMC ED and Trauma physicians, with ~85 responses gathered so far. We would like to perform descriptive statistics on the data obtained. First, we would like to describe the variation in responses to the three clinical vignettes. Second, we would like to assess how covariates such as demographics and responses to the ~10 separate questions relate to trends in responses to the clinical vignette responses. Mentor confirmed.

Clinic Notes:
  • Survey on emergency and trauma providers. Ethics on discharge planning when sending someone to a place that might not be safe. 33-question survey, including demographic information. 3 clinical scenarios, same 5 questions. Additional ethical prompts at end of survey. Questions are typically on 5-point likert scale. 83 responses received so far. Question is do different levels/types of physicians react/think differently.
  • Recommendations:
    • What analyses can we do to perform descriptive statistics? A lot are visual, need to get clear idea of what you want to present. Could do stacked bar charts for likert scale questions. Could do separate stacked bar charts for different groups to see the distribution differences. Could do color-coded matrix (5x5 table) based on answer to a particular question.
    • There are single number summaries for group differences (common odds ratio). But the assumptions may not make sense in this case.
    • Can also look at agreement statistics between groups.
  • Possible VICTR voucher. Application website (https://starbrite.app.vumc.org/) and research proposal template (https://starbrite.app.vumc.org/funding/templatesforms/).

2022 February 21

Christine Kimpel (Alvin Jeffery), School of Nursing

As part of a VA Quality Scholars improvement project, Drs. Amy Guidera and Alvin Jeffery and I are planning a QI project to increase nursing’s awareness and use of a new quality metrics dashboard at the VA. At baseline and follow-up, we are collecting baseline and survey data (e.g., Evidence-based practice beliefs) consisting of three, 3-item scales. Our questions include, what statistical tests may be used to account for dropout. As we are unable to collect demographic data, what options are there for statistical adjustment (e.g., unit type)? There are surrogate markers from the All Employee Survey available at the unit level. How could we use those metrics? Mentor confirmed.

Clinic Notes:
  • Intervention is training on use of the dashboard. The outcome is a 9-question evidence-based practice survey, measured at baseline and 6-12 months later. There will be 3 summary outcomes. Hypothesis is that scores changed after training/over time. No randomization in the study; comparing pre and post scores. Potential sample size is 12 records. This is a pilot study.
  • Recommendations:
    • What are the most appropriate pre/post comparison tests (high employee turnover [21-25%], no demographics, response rate is 50%)? A paired test has the advantage of much more precise estimates, since you are looking at within-person variation, which tends to be smaller. A paired test can only include people who completed both the pre test and the post test. The assumption you make without a paired test is that the pre test would be representative of the entire population (i.e. same type of group/respondents). Compute a paired non-parametric test on the differences.
    • Pilot study shows that the data is feasible to collect. At the next phase, could do a stepped-wedge design with randomization of different units.
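
The notes do not name a specific paired non-parametric test. As one simple illustration, the sketch below computes an exact two-sided sign test on pre/post differences, using only respondents with both scores. The Wilcoxon signed-rank test is a common alternative. The scores are hypothetical.

```python
from math import comb


def sign_test_p_value(pre, post):
    """Exact two-sided sign test on paired scores; tied pairs are dropped."""
    diffs = [after - before for before, after in zip(pre, post) if after != before]
    n = len(diffs)
    if n == 0:
        return 1.0
    n_positive = sum(d > 0 for d in diffs)
    tail = min(n_positive, n - n_positive)
    # Probability of a split at least this lopsided when improvement and
    # decline are equally likely, doubled for a two-sided test.
    p = 2 * sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(1.0, p)


# Hypothetical summary scores for 8 nurses who answered both surveys.
pre_scores = [10, 9, 11, 8, 12, 10, 9, 11]
post_scores = [12, 11, 11, 9, 13, 12, 10, 10]
p_value = sign_test_p_value(pre_scores, post_scores)
print(p_value)
```

With so few complete pairs the test has little power, which fits treating this phase as a feasibility pilot.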

2022 February 14

Ivana Thompson, OB/GYN

We have designed a survey to assess clinical practices for managing delivery of the placenta after second trimester delivery. We would like to share our survey and get your input on how we should approach data analysis.

Clinic Notes:
  • Management of the third stage of labor (delivery of placenta). Survey sent to providers who deliver babies to get sense of how to manage particular stage of labor. Some questions will have ordinal outcome.
  • Recommendations:
    • To analyze ordinal/ranked data, could 1) list out all possible preference orderings and report the percent at each one (good if there is a dominant ordering) 2) think about it as an instant runoff: rank preference order, count up how many first ranks there are for each possibility (can identify which is least liked, reallocate votes until a preference gets more than 50% of votes) 3) rank aggregation methods: take ranks and put them into a single score, which allows you to summarize the question by reporting scores.
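
Option 2 above (instant runoff) can be sketched as follows. The management options and ballots are hypothetical, and ties for last place are broken alphabetically, which is an arbitrary assumption made only to keep the example deterministic.

```python
def instant_runoff(ballots):
    """Return the option that first exceeds 50% of active first-choice votes.

    ballots: list of rankings, each ordered from most to least preferred.
    Assumes at least one non-empty ballot.
    """
    remaining = {option for ballot in ballots for option in ballot}
    while True:
        counts = {option: 0 for option in remaining}
        for ballot in ballots:
            for option in ballot:
                if option in remaining:  # top-ranked option still in the race
                    counts[option] += 1
                    break
        total = sum(counts.values())
        leader = max(sorted(counts), key=lambda option: counts[option])
        if counts[leader] * 2 > total or len(remaining) == 1:
            return leader
        # Drop the least-liked option; its votes are reallocated on the next pass.
        loser = min(sorted(remaining), key=lambda option: counts[option])
        remaining.discard(loser)


# Hypothetical rankings of three management options from nine respondents.
ballots = (
    [["expectant", "manual", "curettage"]] * 4
    + [["manual", "expectant", "curettage"]] * 3
    + [["curettage", "manual", "expectant"]] * 2
)
winner = instant_runoff(ballots)
print(winner)
```

In this example no option has a majority in the first round, "curettage" is eliminated, and its two ballots transfer to "manual", which then wins 5 to 4.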

2022 January 24

Sarah Welch (Christianne Roumie), Physical Medicine & Rehabilitation

Discussed ahead of time with Rob Greevy. Measurement of exposure variables collected from chart. Proper way to model relationship between exposure variables and outcome. Studying the 4Ms of geriatric care (mobility, mentation, what matters most, medications) in the hospital setting at Vanderbilt. Data was pulled from the RD and abstracted through chart review. Hospitalized older adults are vulnerable to healthcare-related preventable harm. Age-friendly health systems is a national quality improvement initiative that addresses this problem by implementing a care delivery model called the 4M's (Mentation, Mobility, Medication, and what Matters most). We have adopted the 4Ms on Vanderbilt's Acute Care for Elders (ACE) unit. To our knowledge, no prior studies have been published correlating the measurement of the combined 4M's with hospital outcomes. We have a retrospective cohort study consisting of chart review on patients admitted to the ACE unit in 2020. We reviewed the electronic medical record to quantify the measurement of each M on the unit. We had an IDASC custom data pull from the Research Derivative to include hospital outcomes and additional health demographic variables/covariates on our ACE unit patients in 2020. We would like help with our analytic approach to examine the relationships between the exposure (measurement of the 4M's in the hospital) and the hospital outcomes pulled from the Research Derivative (discharge to post acute care, hospital readmissions, etc.). Mentor confirmed.

Clinic Notes:
  • Question: Is incremental delivery of 4Ms care associated with reductions in hospital readmissions or discharge to post-acute care among those 65+? 4Ms care (what matters most, medication, delirium/mentation, mobility).
  • Exposures are the delivery of 4Ms (binary at this point, categorical if needed). Covariates collected through RD include age, race, ethnicity, hospital LOS, social vulnerability index, BMI, insurance, admitting source (home vs. institution), Charlson comorbidity index. Covariates collected through chart review include Vanderbilt familiar faces, lives alone, discharge diagnosis, fall during hospitalization, cognitive impairment/dementia, frailty index. Outcomes include 30 day readmission (binary), 30 day ED visit (binary), 30 day death (binary), discharge destination (home vs. post acute care, binary), last follow up date (for censoring).
  • Prospective cohort has 400+ patients, retrospective chart review has 500+ patients. The two data sets do not necessarily have the same variables and the variables were not collected in the same way.
  • Recommendations:
    • Outcome should be redefined if a patient can be admitted from PAC and then return to it. This is almost like an exclusion criterion because the patient has already experienced the outcome (discharge to PAC). A huge limitation is not having admission PAC status.
    • Useful descriptive statistics might serve better than creating complex analysis.
    • Could look at differences between level 0, 1, 2, 3 for outcomes.
    • If you want good estimate of direction of effect of 1 M on outcome, control for everything else including other M's. Would not combine M's at this point.
    • In readmit model, can add in discharge to PAC as covariate.
    • Use this data to trim down covariates for next study.
    • Include death in outcome (ex. readmit or death).

2022 January 10

Aileen Wright, Biomedical Informatics

Predicting hypoglycemia (low blood sugar) in inpatients. I am inputting many variables into a logistic regression (e.g. patient’s age, weight, etc.) However, one of the variables is actually time series data, of glucose values over time. What’s the best way to incorporate this time series into the logistic regression which also has other variables in it (i.e., this is not just a regression of time series data only.)

Clinic Notes:
  • Try to develop a predictive model for low blood sugar (lower than 70). Outcome is next glucose level. Looking for advice on time-series models, glucose over previous 24 hours.
  • Inclusion criteria: hospitalized adults with at least one order of insulin. Exclusion criteria: patients in ICU or palliative care.
  • Logistic regression will be least powerful statistically (standard errors will be much larger, less precise).
  • Recommendations:
    • Better to predict actual glucose level than low/high glucose.
    • Need to think about the correlated nature of the data (multiple rows per patient). Random effects model to incorporate correlation within patient.
    • Options to incorporate glucose trends into the model 1) include covariates related to the number of glucose measurements in the last 24 hours 2) include most recent glucose value with non-linear component 3) include most recent glucose value and the change between that and the one right before [build submodels for those with no glucose values, 1 glucose value, 2 glucose values, etc.] 4) use a polynomial estimated on previous glucose values in the model [slope of trend of glucose values as covariate] 5) relaxed Lasso.
    • Consider how to incorporate interactions or non-linear components in all the options of models above.
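
Several of the options above amount to turning the previous 24 hours of glucose values into a handful of covariates. The sketch below derives the number of measurements, the most recent value, the change from the prior value, and a least-squares slope. The readings, the mg/dL units, and the feature names are hypothetical assumptions.

```python
def glucose_features(readings, now_hours):
    """Summarize the last 24 hours of glucose readings as model covariates.

    readings: list of (time_hours, glucose_mg_dl) pairs; units are assumed.
    """
    window = sorted((t, g) for t, g in readings if now_hours - 24 <= t <= now_hours)
    features = {"n_last_24h": len(window), "latest": None,
                "change": None, "slope_per_hour": None}
    if window:
        features["latest"] = window[-1][1]
    if len(window) >= 2:
        # Change between the most recent value and the one right before it.
        features["change"] = window[-1][1] - window[-2][1]
        mean_t = sum(t for t, _ in window) / len(window)
        mean_g = sum(g for _, g in window) / len(window)
        sxx = sum((t - mean_t) ** 2 for t, _ in window)
        if sxx > 0:
            sxy = sum((t - mean_t) * (g - mean_g) for t, g in window)
            features["slope_per_hour"] = sxy / sxx  # linear trend of glucose
    return features


# Hypothetical patient with four readings before the prediction time (hour 20).
example = glucose_features([(0, 180), (6, 150), (12, 120), (18, 90)], now_hours=20)
print(example)
```

Patients with zero or one reading get `None` for the features that cannot be computed, which lines up with the suggestion to build submodels by number of available glucose values.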
Topic revision: r2 - 18 Dec 2023, IneSohn