Health services research, diagnosis, and prognosis

Current Notes (2024)

2024 June 24

Nikita Bastin (Alaina Brown), Ob/Gyn

In this retrospective study, we will assess the extent to which symptoms at presentation varied by race and socioeconomic status. We will also evaluate whether such disparities in symptomatic variation were associated with differences in ovarian cancer diagnosis and outcomes, time to referral to palliative care, and symptomatic improvement at palliative care consultation and following palliative care treatment.

Questions on statistical plan:
We will run a linear regression to assess the relationship between race and pain, and SES and pain. We will run a logistic regression to assess the relationship between race/SES and the other categorical primary outcomes (fatigue, GI symptoms, urinary symptoms, and depressed mood).
We will run logistic and linear regressions to assess the relationship between our primary outcomes and ovarian cancer specific secondary outcomes.
We will run logistic regressions to assess the relationship between our primary outcomes and palliative care specific secondary outcomes. We will be using regressions to assist with controlling for confounding.

Clinic Notes:
  • Retrospective study that looks at the disparities in ovarian cancer over race and socioeconomic status. Expect to get more than 100 patients.
  • SES will mostly be extrapolated from ZIP code.
  • May expect challenges in enrolling enough participants from a lower SES class based on the design.
  • Recommendations:
    • If the sample size is small, may be difficult to adjust for all covariates. Can consider ordinal logistic regression instead of linear regression.
    • For biostatistics support, can consider 1) coming back to clinic for consult, 2) applying for a VICTR voucher (https://starbrite.app.vumc.org/s/vrr), or 3) seeking department collaboration.
    • In the proposal, specify plans for patients who are lost to follow-up, for example a sensitivity analysis.

2024 June 10

Derek Williams and Leigh Howard, Pediatrics/Hospital Medicine and Infectious Diseases

We are proposing a substudy nested within a multicenter RCT of antibiotic treatment duration to test hypotheses regarding presence and durability of antibiotic resistance measured using a next gen sequencing platform (Illumina Respiratory Pathogen ID Panel) from upper airway samples.

The parent RCT will compare effectiveness of a short (5d) vs standard (10d) course of antibiotics for children of all ages with 3 acute infectious conditions (pneumonia, urinary tract infection, or cellulitis/skin abscess). The study will enroll 1200 participants. The primary outcome will be measured ~14d post-enrollment. Secondary outcomes at 30 and 90 days.

For the substudy, we hope to enroll ~50% of subjects (n=600 total) and collect repeated upper airway samples at enrollment, 2 wks, 1 mo, 3 mo.

The analytic method detects the presence or absence of numerous known antibiotic resistance genes, each represented with a binary 0/1. We hypothesize that antibiotic duration will affect the presence and pattern of antibiotic resistance genes (# and types of resistance genes) immediately following treatment and that these changes will be durable to 3mo.

Our ask is brainstorming outcome constructs that will best allow us to quantify and compare resistance patterns across treatment arms at discrete time points and over time.

Clinic Notes:
  • Secondary analysis of a larger randomized clinical trial concerning hospitalized children with acute infectious conditions; primary outcome is ordinal effectiveness. Detecting respiratory bacteria and antibiotic resistance genes (ARGs) by next-generation sequencing (NGS). The goal is to enroll ~600 subjects.
  • 4 proposed time points to collect nasal swabs - enrollment, primary outcome visit (~5 days after end of study treatment), 30 days, 3 months.
  • Recommendations
    • Treating each ARG-related outcome as a separate binary 0/1 variable implicitly gives every gene equal weight. Can consider a count outcome (number of ARGs detected) instead, but be aware that the counts may be zero-inflated; a zero-inflated Poisson model can be used in that case.
    • If sample size permits, can consider joint models - fitting multiple outcomes at the same time.
    • If using a longitudinal model, can add the interaction between time and treatment.
    • For power analysis, can use the timepoint of primary outcome visit.
    • To analyze patients' profiles over time, can consider latent transition analysis.
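As a rough illustration of the zero-inflated Poisson idea recommended above, the sketch below writes out the probability mass function for a per-sample ARG count. The names `zip_pmf`, `lam`, and `pi_zero` and the numeric values are illustrative assumptions, not quantities from the study.

```python
import math


def zip_pmf(k: int, lam: float, pi_zero: float) -> float:
    """P(Y = k) under a zero-inflated Poisson model.

    pi_zero: probability of a structural zero (hypothetical parameter name).
    lam:     Poisson mean among samples that are not structural zeros.
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # Zeros come from two sources: structural zeros and Poisson zeros.
        return pi_zero + (1.0 - pi_zero) * poisson
    return (1.0 - pi_zero) * poisson


def zip_mean(lam: float, pi_zero: float) -> float:
    """Expected count under the same model."""
    return (1.0 - pi_zero) * lam


# Illustrative values only: 30% structural zeros, mean of 2 genes otherwise.
probs = [zip_pmf(k, lam=2.0, pi_zero=0.3) for k in range(50)]
```

In practice the two parameters would be estimated from the data, usually with covariates such as treatment arm and time, using an established statistics package.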

2024 May 20

Emma Clark (Marianna LaNoue), Nursing

Criteria for non-nested model selection. I used several methods of combining survey data into a composite score to serve as a predictor variable, and used the different versions of that predictor to model three outcomes. I applied Vuong's and Clark's tests for non-nested models and am having trouble interpreting the results to come to a concrete conclusion about model selection and, if different models are chosen for different outcomes, whether we are selecting appropriately in each case.

Clinic Notes:
  • Composite score analysis using data from three studies with 26-27 components spanning across five domains.
  • Sample size is ~140 for two studies and ~90 for one study.
  • Comparing the fit of linear models using AIC.
  • Recommendations:
    • It is to be expected that model selection methods do not always agree. With this sample size, the model selection methods may not be reliable.
    • Redundancy analysis from Frank Harrell's book chapter: https://hbiostat.org/rmsc/multivar#sec-multivar-data-reduction
    • To execute redundancy analysis, can use R function 'redun' in Hmisc package.
    • To evaluate model improvement, can use adjusted R-squared.
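For reference, adjusted R-squared penalizes the ordinary R-squared for the number of predictors. The sketch below uses the standard formula; the function name and the example numbers (an R-squared of 0.30 with about 90 subjects) are illustrative assumptions.

```python
def adjusted_r_squared(r2: float, n_obs: int, n_predictors: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    residual_df = n_obs - n_predictors - 1
    if residual_df <= 0:
        raise ValueError("need more observations than predictors + 1")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / residual_df


# Same raw R^2, very different adjusted values (illustrative numbers only):
few_predictors = adjusted_r_squared(0.30, n_obs=90, n_predictors=5)
many_predictors = adjusted_r_squared(0.30, n_obs=90, n_predictors=27)
```

With roughly 90 subjects, entering all 27 components separately can drive the adjusted value to zero or below, which is one motivation for reducing the components to a composite first.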

2024 May 13

Rachel Azevedo (Ashley Shoemaker), Pediatric Endocrinology

My project is a two-phase study where we will analyze patient perceptions of genetic testing in the evaluation and management of pediatric obesity to see if these perceptions influence patient outcomes. The first phase is nearing completion; we are using surveys and qualitative interviews on a cohort of ~117 patients who have already completed genetic testing to learn broad themes about patient perceptions of genetic testing in regard to pediatric obesity management.

We are finalizing the protocols for the second phase of this study, where we will actively recruit patients aged 12-21 at the VUMC weight management clinic. These adolescents are all offered free genetic testing as part of their management, so for patients choosing to get genetic testing we will ask them to complete a pre- and post- testing survey and will follow their outcomes (including BMI, appointment show-rate, medication fill-rate) to determine if there is a correlation between their views on genetic testing and their outcomes. Based on recommendations from my previous biostats clinic, patients who initially agree to get genetic testing but don't get this completed will not be excluded from the study and will serve as a reference group.

As the second phase is searching for correlations between patient perceptions and the collected data, I am requesting assistance with planning this analysis & creating a statistical model.

Clinic Notes:
  • Two-phase study to examine whether patient perceptions of genetic testing in evaluation and management of pediatric obesity influence outcomes. Sample size of 54 surveys of patients and parents combined. Estimating drop out of ~20% over 6 month period (about 45 surveys left - 30 patient, 15 parent).
  • Survey patients prior to genetic testing and 6 months into treatment, as well as quantitative health variables measured at each follow up visit.
  • Recommendations:
    • There are a few ways to look at longitudinal/pre-post data. Depending on sample size, can look at correlation or fit a model. For modeling, can consider mixed effects model.
    • For Likert variables, Spearman's correlation is more suitable.
    • It would be more powerful to get longitudinal data and build models based on that.
    • Suggest not excluding patients who complete the first survey but do not complete the second survey due to small sample size.
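Spearman's correlation, suggested above for Likert items, is the Pearson correlation computed on ranks, with tied responses sharing their average rank. A minimal standard-library sketch follows; the function names are illustrative.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because only the ordering of the responses is used, the result does not depend on treating the distance between adjacent Likert categories as equal.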

2024 April 22

Valeria Rullan (Jessica Leschied and Katherine Van Schaik), Radiology

The purpose of our project is to develop a predictive algorithm of clinical severity (no infection vs local infection only vs disseminated) of pediatric musculoskeletal infections based on imaging findings. Three different reviewers evaluated images of 220 patients obtained at time of initial presentation and assessed each image for specific findings. Radiographs, ultrasound, MRIs were all included if they were obtained on the same day. A value of either 0 or 1 was given for presence (1) or absence (0) of each predetermined predictive variable. Our goal is to see if there are any variables that can predict the clinical outcome of the patient. We will use data from a previous research project that categorized each of the cases reviewed into either no infection, local infection, or disseminated infection based on clinical information.

Clinic Notes:
  • Would like to build a predictive model using binary imaging features for musculoskeletal infection in children.
  • Recommendations:
    • Consider an ordinal regression model to maximize interpretability; the three-level outcome can be fit in a proportional odds fashion, with a regression coefficient for each covariate.
    • Can also consider this as a classification problem with random forest method in machine learning.
    • For collaborative options, there are VICTR voucher or department collaboration.
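To make the proportional odds recommendation concrete, the sketch below shows how one common parameterization turns a linear predictor into probabilities for the three ordered categories (no infection, local, disseminated). The cutpoints and linear predictor values are hypothetical, and software packages differ in sign conventions.

```python
import math


def logistic(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def proportional_odds_probs(linear_predictor: float, cutpoints):
    """Category probabilities when P(Y <= j) = logistic(cutpoint_j - linear_predictor).

    cutpoints must be increasing; K - 1 cutpoints give K ordered categories.
    """
    cumulative = [logistic(c - linear_predictor) for c in cutpoints] + [1.0]
    probs = [cumulative[0]]
    for j in range(1, len(cumulative)):
        probs.append(cumulative[j] - cumulative[j - 1])
    return probs


# Hypothetical cutpoints; the linear predictor would be the sum of
# coefficient * (0/1 imaging finding) over the findings in the model.
low_risk = proportional_odds_probs(0.0, cutpoints=[0.0, 1.0])
high_risk = proportional_odds_probs(2.0, cutpoints=[0.0, 1.0])
```

Under this parameterization, a larger linear predictor shifts probability toward the most severe category, and each coefficient is a single log odds ratio shared across both cutpoints.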

2024 April 15

Carrie Donohue, Hearing and Speech Sciences

We are submitting an Alzheimer's Association grant to determine the prevalence of voice and swallowing impairments in 80 people with AD, and to examine dysphagia-related caregiver burden and risk factors for caregiver burden in 80 caregivers of people with AD via the following specific aims:
Aim 1: To determine prevalence of voice and swallowing impairments in people with AD using imaging.
Aim 2: To evaluate concordance between patient and caregiver-perceived voice and swallowing function.
Aim 3: To examine dysphagia-related caregiver burden for caregivers of people with AD.
Aim 4: To identify risk factors for caregiver burden in caregivers of people with AD.
Questions related to statistical analysis plan and power analysis- is a power analysis still appropriate to do given that there is limited prevalence data of voice and swallowing impairments in individuals with AD?

Clinic Notes:
  • There are three aims: 1. Determine the prevalence of voice and swallowing impairments in people with Alzheimer's disease. 2. Evaluate the concordance between patient and caregiver evaluations. 3. Examine caregiver burden for caregivers of people with AD. Plan to do a cross sectional study, maybe follow up patients for patient-reported outcomes. Plan to use a validated, 8-point ordinal scale to measure swallowing impairment.
  • Limited literature on prevalence of voice and swallowing impairments
  • For another proposal, looking at screening tools for voice and swallowing impairments in people with AD. Plan to look into a combination of screening tools.
  • Recommendations:
    • For foundational grant, perhaps restrict to cross sectional study design and include some aspects of longitudinal follow up in consent form
    • Can do a precision calculation, for example the number of patients to have a confidence interval of X
    • For concordance analysis, consider Kappa agreement score with caveat that it can be "harsh" for ordinal data. Can also consider ROC curve to compare against the "gold standard"
    • For caregiver burden, consider assessing univariate association, and using ordinal regression model
    • For the second proposal, the sample size is limited to evaluate a combination of screening tools. Can be an exploratory analysis.
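The precision calculation mentioned above can be sketched as follows, assuming a simple normal-approximation (Wald) confidence interval for a prevalence. The function names and the assumed prevalence of 0.5 (the most conservative case) are illustrative; only the planned sample size of 80 comes from the proposal.

```python
import math
from statistics import NormalDist


def ci_half_width(p: float, n: int, level: float = 0.95) -> float:
    """Half-width of the Wald confidence interval for a proportion."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return z * math.sqrt(p * (1.0 - p) / n)


def n_for_half_width(p: float, half_width: float, level: float = 0.95) -> int:
    """Smallest n whose Wald interval is no wider than +/- half_width."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return math.ceil(z ** 2 * p * (1.0 - p) / half_width ** 2)


# With 80 participants and an assumed prevalence of 0.5:
margin_at_80 = ci_half_width(0.5, 80)
# Participants needed for a margin of +/- 0.10 at the same prevalence:
needed_for_10_points = n_for_half_width(0.5, 0.10)
```

Other interval methods (for example Wilson intervals) behave better when the prevalence is near 0 or 1, so this is a planning approximation only.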

2024 March 18

Taneya Koonce, Mallory Blasingame, and Jing Su, Center for Knowledge Management

We are conducting a project with ChatGPT to assess accuracy of responses. Our questions are about sample size calculation and statistical approach method.

Investigator requested no notes to be posted.

2024 March 4

Kathryn Lewis (Daniel Romero), Hearing and Speech Sciences

A prospective observational mechanistic study assessing the association between autonomic and vestibular dysfunction in sub-acute and chronic moderate-severe traumatic brain injury and its impact on long-term disability and early death.
  • We are measuring autonomic and vestibular function via standardized tests at 2-4 months and again at 9-12 months.
  • We are comparing objective results from these tests to age matched controls
  • We will also be conducting semi-structured interviews and subjective questionnaires
  • Participants will be recruited from an inpatient rehab unit
  • This is a feasibility study and my target sample size is 60 participants, though this can change.
  • Need help with
    • research study design
    • power analysis
    • effect size/sample size
    • planned statistical analyses
  • Already have submitted a VICTR studio request
Clinic Notes:
  • Feasibility study on concurrent incidents of vestibular dysfunction and autonomic dysfunction on TBI patients 2-4 months post-injury. Sample size target is 60 with age-matched controls.
  • Dysfunction will be considered binary (present vs. absent) based on test results.
  • Originally considered longitudinal component, currently unsure whether one-time assessment is better.
  • Recommendations:
    • Can consider paired t-test for continuous outcome (McNemar's test for categorical), or consider adjusted analysis based on age. Can also consider conditional logistic regression for binary outcome adjusted for age, conditioned on yes/no autonomic dysfunction and other covariates.
    • For longitudinal analysis, can consider longitudinal conditional logistic regression. Consider dropout due to death if doing a longitudinal analysis.
    • May consider calling it a pilot study rather than a feasibility study.
    • Instead of power, can consider reporting precision. For a statistic, consider reporting the confidence interval for it.
    • For the proposal, can report sample size based on a test (for example, McNemar's test). One tool to consider is G*Power. Need to make assumptions for marginal counts without pilot data, based on past literature.
    • Fisher's exact test makes fewer assumptions than conditional logistic regression, and could be a better choice if there is an uneven spread of samples among results.
    • Be aware of bias in assessments of vestibular and autonomic dysfunction.
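For matched patient-control pairs with a binary outcome, McNemar's test uses only the discordant pairs. The sketch below computes the exact (binomial) two-sided version; the function name and the example counts are illustrative assumptions.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Exact two-sided McNemar p-value.

    b: pairs where only the patient shows dysfunction.
    c: pairs where only the matched control shows dysfunction.
    Under the null hypothesis, b ~ Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence either way
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Hypothetical discordant counts from 60 matched pairs.
p_value = mcnemar_exact(12, 2)
```

Concordant pairs do not enter the calculation, which is why sample size planning for this test needs assumptions about the expected proportion of discordant pairs.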

2024 February 26

Chen Chia (Charles) Wang (Sunil Geevarghese), Hepatobiliary Surgery & Liver Transplantation

Starting a survey based project to look at impact of preoperative, intraoperative, and postoperative stretches on surgeon's physical health. Working with Dr. Geevarghese.
Clinic Notes:
  • Current plan: have surgeons take a baseline (prelim) survey, intervention of stretches pre-, intra-, post- operation (liver transplant), and conduct pre- and post- op surveys on pain levels. Concerned about data quality issue due to the time constraint of surgeons
  • Sample size is undecided, concerned about missing data (censored) and confirming that the surgeons actually completed the stretches
  • Recommendations:
    • Can use a validated pain scale.
    • Ordinal regression modeling for the pain outcome with data on length of surgery.
    • Do not recommend complete case analysis. Alternatively, can use multiple imputation (assumes missing at random) to model the missing values. Can further explore sensitivity analysis regarding missing data. Making surveys easy to fill out can reduce missing data.
    • May be more feasible to consider the intervention as recommending to do the intra-operation stretches rather than the intra-operation stretches themselves.
    • VICTR research design studio may be worth exploring research.support.services@vumc.org. Could also look at attending Wednesday clinic with specialty in surgery.
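After multiple imputation, the same model is fit to each imputed dataset and the results are combined with Rubin's rules. The sketch below pools one coefficient; the function name and the input values are hypothetical, and the imputation step itself is assumed to be done with an established package.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one parameter across m >= 2 imputed datasets using Rubin's rules.

    estimates: the coefficient estimate from each imputed dataset.
    variances: the squared standard error from each imputed dataset.
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)       # average sampling variance
    between = variance(estimates)  # variance across imputations (divisor m - 1)
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total


# Hypothetical results from three imputed datasets.
pooled_estimate, total_variance = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation term is what carries the extra uncertainty due to the missing survey responses; ignoring it would make the pooled standard error too small.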

Hannah Chew (Neerav Desai), Pediatrics - Adolescent Medicine

Mixed methods study involving qualitative interviews of patients 18 years and older on how to improve the Adolescent Transition Clinic (transition of care site for patients with HIV), and quantitative descriptive analysis of patient demographics, time in care, viral loads, retention in care, etc.

Need help with making Kaplan Meier plot and designing figures.

Clinic Notes:
  • Things that the transition clinic can do better for adolescents. Interested in the viral load for patients during the transition period
  • Qualitative analysis is being carried out by VICTR qualitative research core
  • Recommendations:
    • Can do the analysis and come back to clinic, or apply for a VICTR voucher. VICTR support augmentation to the qualitative core may be feasible.
    • May need to reformulate the question to fit a Kaplan Meier curve, some information may be lost.
    • Consider plotting time viral load is suppressed vs. time viral load is not suppressed.
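If the question is reformulated as a time-to-event problem, a Kaplan-Meier curve can be computed as in the sketch below. The choice of event (for example, first loss of viral suppression after transition) and the data values are assumptions for illustration only.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimates at each distinct event time.

    times:  follow-up time for each patient.
    events: 1 if the event was observed at that time, 0 if censored.
    Returns a list of (time, estimated probability of remaining event-free).
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Process everyone whose follow-up ends at time t together.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Hypothetical follow-up in months; the second patient is censored.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

Plotting the returned points as a step function gives the usual curve; censored patients lower the number at risk without causing a drop.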

2024 February 12

Daniel Romero, Hearing and Speech Sciences

It is well known that TBI patients experience a variety of cognitive impairments including deficits in spatial memory. More recent evidence suggests there is an association between vestibular impairment and spatial memory/navigation deficits. However, it is unknown whether deficits in the vestibular system contribute (directly or indirectly) to the chronic spatial memory impairments commonly seen in TBI.

The long-term goals of my research are to determine how the vestibular system functions and contributes to memory impairments in patients with cognitive disorders, and to explore the role that the assessment and rehabilitation of vestibular impairments play in the care of this patient population.

The purpose of the proposed project is to begin this line of work in TBI using established measures of vestibular function and spatial cognition.

Questions to ask: It may be possible that TBI participants with and without vestibular impairment could be impaired at all memory tasks (temporal and spatial), making it difficult to tease apart the contribution of vestibular impairment to spatial memory. If this is the case, we would want to see where vestibular impairment is contributing to the memory performances above and beyond the effect of TBI. We have a sample size of 30. Are there any statistical approaches/analyses we can take to see where vestibular impairment is contributing to the memory performances above and beyond the effect of TBI?

Clinic Notes:
  • Established relationship between spatial memory and vestibular impairment. Interested in how TBI factors in, given that TBI is known to be strongly related to spatial/temporal memory impairment. Two groups of patients: patients with moderate to severe TBI and control patients, with 30 patients from each group. Of the TBI group, approximately 60% have vestibular impairment.
  • Memory impairment is measured as continuous variables, e.g. time, accuracy
  • Vestibular impairment is measured as continuous variables, possible to use as categorical normal/abnormal
  • Recommendations:
    • With the sample size, it might be difficult to do a formal mediation analysis. Can do two analyses: spatial memory with TBI and vestibular impairment, and spatial memory with TBI alone.
    • Formal mediation analysis assumes all confounders are adjusted for, need to specify in hypothesis
    • Find a meaningful composite of outcome variables so that they are expressed as a single variable
    • Baron & Kenny (1986) framework & causal mediation analysis (software packages available) for reference
    • May be helpful to compare mediator model for TBI and spatial memory impairment to mediator model for TBI and temporal memory impairment to illustrate the narrative, even if the p-value is not statistically significant

2024 January 22

Scott Risney (Jeffrey Weiner), Pediatrics

The data are in a STATA file (the only way the database provides). The project utilizes a dataset of around 57,000 patients that was extracted from a multi-state database. Our research question is "what is the effect of race on presentation to the emergency department requiring a procedure for critical congenital heart disease (CCHD)."

Our variables:
  • diagnoses - categorical, includes number of patients for each level of CCHD
  • procedure - categorical, includes number of patients for each surgical intervention
  • race - categorical, includes number of individuals of each race
  • gender - male/female (I cleaned to remove missing)
  • median income - continuous

Questions:
  1. How could we use population data (i.e. race and ethnicity statistics at a state/national level) to assess if certain racial groups present more frequently to the ER with CCHD in comparison to our race data. Ideally, we would be able to compare race within each diagnosis, controlling for gender, median income, and +/- state. Would we need to find/upload those statistics?
  2. What is the best statistical model for using count variables with categories such as these?
  3. All of the observations in the dataset are for those patients that needed a procedure, so I have no comparison (eg. procedure vs no procedure). The only comparison I can make is across race or across the procedures/diagnoses. How can we report our data in a way that is meaningful?

Clinic Notes:
  • The study team is interested in how patient reported race affects clinical outcomes. The STATA file contains ~19,000 cases that the study team is interested in (newborns with critical congenital heart disease from the emergency department).
  • Data is mainly in population level; there is no control group.
  • Recommendations:
    • Multi-site studies may comprise a population whose characteristics differ from the census, so may need to think of a way to estimate the racial distribution of the underlying population. Upon further description, it sounds like the data are a good representation of the whole population of interest.
    • Pay attention to language usage: refrain from using words such as "impact", lean towards correlation and association adjusted for confounders.
    • Count data are usually modeled with Poisson or negative binomial regression, conditioning on race and other confounding variables. For highly skewed (overdispersed) counts, negative binomial regression is preferred; Poisson regression suits less skewed data.
    • Be cautious about removing observations with missing info, known as complete case analysis/rowwise deletion. Consider using multiple imputation instead.
    • Try fitting a Poisson model first, then tackle multiple imputation.

2024 January 8

Neeraja Swaminathan, Pediatrics

Our aim is to do a systematic review on the prevalence of heavy menstrual bleeding in adolescent females on anticoagulation for venous thromboembolism. I am in the process of drafting the proposal, based on the PRISMA checklist of items. I need help with the biostats items (some parts of the methods and results section pertaining to risk of bias/risk assessment). In the PRISMA 2020 checklist in the link below, see items 11-15 under Methods and 18-22 under Results.
http://www.prisma-statement.org/PRISMAStatement/Checklist

Clinic Notes:
  • In the process of drafting the proposal on prevalence of heavy menstrual bleeding in adolescent females on anticoagulation for VTE, need assistance on biostatistics portion.
  • Recommendations:
    • Effect measures: add how to report summary statistics in prevalence in case different sources use different measures
    • Synthesis methods: use a mixed effects model. Could use a weighted meta-regression, where the weight can be the sample size of the study
    • Synthesis methods: address heterogeneity by calculating the estimate and confidence interval of the I^2 statistic
    • Risk of bias assessment: assess each study for aspects such as selection bias, data quality
    • Risk of bias assessment: may use sensitivity analysis
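The heterogeneity recommendation can be sketched with Cochran's Q and the I^2 statistic. The example below uses conventional inverse-variance weights, which is an assumption here and differs from the sample-size weights mentioned above; the function name and the study values are hypothetical.

```python
def heterogeneity(estimates, variances):
    """Fixed-effect pooled estimate, Cochran's Q, and I^2 (as a percentage).

    estimates: per-study prevalence estimates (possibly transformed).
    variances: per-study sampling variances of those estimates.
    """
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))
    df = len(estimates) - 1
    # I^2 is the share of total variation attributed to between-study differences.
    i_squared = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, q, i_squared


# Two hypothetical studies with equal precision.
pooled, q_stat, i2 = heterogeneity([0.2, 0.4], [0.01, 0.01])
```

The confidence interval for I^2 is not computed here; established meta-analysis packages report it alongside the point estimate.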

Mae Wimbiscus (Michael Topf), Otolaryngology

Looking for someone to help us with multivariate analysis for a chart review that we have compiled of data for head and neck cancers.

Clinic Notes:
  • Abstract on early stage tongue cancer and rate of recurrence in patients who do not receive PORT, sample size ~ 263. Looking at predictors of the rate of recurrence.
  • Manuscript is due May 2024.
  • Recommendations:
    • Effect measures: add how to report summary statistics in prevalence in case different sources use different measures
    • Both the hypothesis and the analysis plan look reasonable from clinic attendance
Topic revision: r890 - 24 Jun 2024, YueGao