Health services research, diagnosis, and prognosis

Click here for 2022, 2021, 2020, 2019, 2018, 2017, 2016, 2015, 2014, and before.

Current Notes (2023)

2023 December 11

Megan Passarelle & Thu Can (Sarah Welch), Acute Rehabilitation

We are studying co-treatment in the hospital setting. We are planning on distributing a survey to occupational therapists, certified occupational therapy assistants, physical therapists and physical therapy assistants who work in an adult, inpatient setting to understand what factors influence their decision to co-treat. Co-treatment occurs when 2 therapists of different disciplines (OT and PT) work together within one patient treatment session. It is a controversial topic with limited published evidence currently. There is little guidance on when co-treatment is most appropriate, leaving many therapists confused surrounding its practice.

We (myself and co-investigator Thu Can) attended a biostats clinic in 9/2023. It was recommended we apply for VICTR funding. SRC recommended that we submit our survey prior to requesting funding. To prepare for survey distribution, our questions are:
  • Is the calculated sample size of 385 accurate for the main hypothesis that therapists are documenting rationale for co-treatment less than 50% of the time?
  • We are using REDCap for our survey. Do we need to code the answers in a certain way in order to make analyzing the data easier?
Clinic Notes:

  • The study team has questions about sample size calculation and coding in REDCap. The null hypothesis is that greater than 50% of therapists will respond "always" or "sometimes" for co-treatment.
  • Recommendations:
    • For the sample size calculation, this will be a one-tailed hypothesis test on a proportion.
    • In REDCap, code answers consistently across questions (for example, 1=Yes, 0=No for all questions).
    • In the proposal, the justification can be framed either as a sample size calculation or as a margin of error. Formula for the 95% margin of error of a proportion with a large sample size: 1.96*sqrt(p*(1-p)/n), where p is the proportion, n is the sample size, and sqrt is the square root.
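A minimal sketch of the margin-of-error formula from the recommendations above, and of the same formula inverted to give a sample size. The inputs p = 0.5 and a margin of 0.05 are assumptions for illustration (the notes do not say how 385 was derived), and the function names are hypothetical. The sketch uses the two-sided 1.96 multiplier from the formula and does not implement the one-tailed test.

```python
import math

Z_95 = 1.96  # two-sided 95% normal quantile, as in the formula above


def margin_of_error(p: float, n: int, z: float = Z_95) -> float:
    """Large-sample margin of error for a proportion: z * sqrt(p*(1-p)/n)."""
    return z * math.sqrt(p * (1 - p) / n)


def sample_size_for_margin(p: float, margin: float, z: float = Z_95) -> int:
    """Smallest n whose margin of error is at most `margin` (formula inverted)."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


# Assumed inputs (not stated in the notes): p = 0.5 and a margin of 0.05.
n_needed = sample_size_for_margin(p=0.5, margin=0.05)
moe_at_385 = margin_of_error(p=0.5, n=385)
print(n_needed, round(moe_at_385, 4))
```

Under these assumed inputs the inverted formula returns 385, so the figure in the question is consistent with a precision-based calculation at p = 0.5 and a margin of 0.05.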

2023 November 27

Praveen Vimalathas (David Bichell), School of Medicine

Regarding VICTR VR64421 - Dr. Harrell had stated in his review “The proposal is significantly improved, but some statistical problems remain. Detecting a reduction in NGAL from 473 to 241 was not justified as being the minimal clinically significant effect/difference (MCID) to detect, i.e., the effect you don’t want to miss. A 50% relative reduction is massive. Powering for this will miss a 25% observed reduction. MCID needs to be specified by a disinterested party. The statistical plan assumes that non-normality in the data will be obvious, which is not a very sound assumption. Quantiles are descriptive for both Gaussian and non-Gaussian data distributions. Similarly, it is not best statistical practice to assess data characteristics and then choose between parametric and nonparametric tests. First of all, nonparametric tests have excellent power for Gaussian distributions, and secondly the data may not possess the information to highly accurately judge normality.”

We are hoping to gain some guidance and to address his comments. Literature available to us shows that in adults NGAL decreases by ~75% (112 in patients who received nitric oxide vs. 462 in those who did not). After this clinic, we hope to respond clearly to his comments about the appropriateness of our n = 50 sample size and of powering to detect a 50% reduction (in fact, we may even see the same 75% reduction in our pediatric population).
Clinic Notes:
  • Congenital heart surgery patients. Evaluating biomarkers for the impact of nitric oxide on patients. N=10. Hypothesis is that nitric oxide decreases the NGAL value by at least 50%. Submitted a VICTR voucher but need help with feedback on the analysis plan. Currently in data collection.
  • Recommendations:
    • Power section: consider changing the language to the minimal clinically significant difference instead of the effect size you expect to see. Communicate that you are trying to confirm this result in an extremely similar population. State that you define clinical significance as what it would take to change practice, and explain how you got there.
    • Parametric vs. non-parametric tests: it is not a good idea to check for normality and then choose the test. Pre-specify the distribution of variables from previous studies and use this information to decide whether to use a parametric or non-parametric approach.

2023 November 13

Cosby Stone, Allergy-Pulmonary Critical Care

Appropriate power for validating a patient-performed penicillin allergy risk questionnaire and for a stepped-wedge randomized controlled trial.
Clinic Notes:
  • Providers ask patients whether they have a penicillin allergy, but this traditionally has a high false negative rate.
  • Aim 1: can we validate a questionnaire where patients answer the questions that will be considered equivalent to providers asking questions in-person? Binary outcome of low-risk or not.
  • Recommendations:
    • If response rate to electronic survey or postcards is not high enough (e.g., 90%), it can introduce nonresponse bias to the data.
    • Looking at ways to optimize patient interaction might be beneficial for this and future studies.
    • Frame the Aim 1 question as a non-inferiority comparison for the power calculation. Want it no less than 95%.
    • Discuss biases with pre-post in comparison to complexity of stepped-wedge design. In general, there are a lot of things that can go wrong with pre-post.
    • One of the most accurate confidence interval estimates for a proportion is the Wilson confidence interval. For this study we feel that computing the sample size based on precision is more appropriate than based on power. Using the Wilson method, a sample size of 200 yields a lower confidence limit of about 0.85 when the observed proportion is 0.9.
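The Wilson interval in the last recommendation can be checked with a short sketch. It implements the standard Wilson score formula with a 1.96 multiplier (an assumed 95% level) and plugs in the scenario from the notes, n = 200 with an observed proportion of 0.9. The function name is hypothetical.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score confidence interval for a binomial proportion."""
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    center = (p_hat + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half_width, center + half_width


# Scenario from the notes: n = 200, observed proportion 0.9 (180 of 200).
lower, upper = wilson_interval(successes=180, n=200)
print(f"95% Wilson interval: ({lower:.3f}, {upper:.3f})")
```

The lower limit comes out at about 0.85, in line with the recommendation. Rerunning with other values of n shows how the precision changes with sample size.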

2023 November 6

Rachel Mersfelder (Scott Borinstein), Pediatrics

This project is a retrospective cohort study examining whether the interval between neoadjuvant chemotherapy and surgical resection correlates with tumor necrosis determined at the time of surgical resection in children and young adults with osteosarcoma. We will secondarily examine how this interval is related to outcome measures such as ability to achieve remission and event-free survival following completion of adjuvant chemotherapy. We have used a database of 121 patients and, after applying exclusion criteria, have gathered information on approximately 85 patients.
Clinic Notes:
  • Retrospective cohort study of 121 patients from VUMC in 2020-2022; information collected on 92 patients. Outcome is time interval between chemotherapy and surgery, and time from chemotherapy to event or last follow-up. Plan on univariate Pearson's correlation between the chemotherapy-to-surgery interval and tumor necrosis, and MLR. Hypothesis is that a longer interval (chemotherapy to surgery) correlates with worse tumor necrosis and clinical results. Secondary outcome is event-free survival modeled with a Kaplan-Meier curve.
  • Recommendations
    • Need to estimate based on precision (width of confidence interval). Power would be more arbitrary, but would be able to say power to reject the null hypothesis.
    • Be flexible on how to include the interval variable in the model to allow for non-linear relationships, for example spline functions in regression models (piecewise days can change). A linear effect might be overly simplistic.
    • Project would fit in the scope of a voucher. See the application website and the research proposal template.

Neeraja Swaminathan, Pediatrics

We have collected data from patients who were followed at the vascular anomalies clinic during a 2-year time frame. This includes demographic data, clinical data and quality of life surveys that were filled out by families. I would like to analyze the data, which is in REDCap, and need help with data analysis.
The questions are:
  1. Present the demographic data in the form of descriptive statistics
  2. Analyze and present the QOL data (PedsQL surveys for parents and children/teens/adults used in the study)
  3. How does QOL correlate with age, gender, vascular anomaly diagnosis, presence of a genetic diagnosis?
Clinic Notes:
  • Retrospective study of 75 patients at the vascular anomalies clinic in a two-year period; data collected includes demographic and clinical data as well as quality of life survey responses. Data is collected in REDCap. Need help on how to present and analyze the data.
  • Recommendations
    • Use modeling to look at the relationship between QOL and the characteristics of interest. For a Likert scale (0-100) or continuous outcome, use a linear regression model. For a Likert scale (0-5), use an ordinal regression/proportional odds model. Suggest a composite outcome as the primary analysis to avoid multiple comparison issues.
    • Considering sample size, may want to narrow down QOL results for testing correlation.
    • Possibly include covariate for whether patient/surrogate filled out the survey.
    • To get statistical help: 1) can learn online and run the analysis yourself, 2) check whether your department has a collaboration with the biostatistics department, or 3) apply for a VICTR voucher. With the pending deadline, will need to complete the work on your own, as collaboration and voucher work will take weeks/months.

2023 October 30

Taneisha Gillyard Cheairs, Biomedical Sciences at Meharry Medical College

Our VICTR pilot project proposal is for an enhanced, collaborative doula training program that incorporates clinical knowledge from academic researchers/clinicians into a community-based doula training aimed at addressing Black maternal mortality rates in Middle Tennessee. This is an observational, qualitative study. In pre-review, we were advised to seek counsel from Biostats Clinic to ensure that our sample size was in line with a qualitative analysis perspective.
Clinic Notes:
  • Community-based doula training to address Black maternal mortality rate in Middle Tennessee. A one-year, observational, qualitative study. Five doulas, two pregnant women each. The sample size needs to be in line with qualitative study.
  • Recommendations:
    • Data saturation needs to be considered in qualitative research (min-max range in sample size)
    • Clarify that inference is not a goal in the proposal
    • For moms, they may not be able to assess the effectiveness of training if this is their first pregnancy. Consider collecting data with moms about their previous pregnancies.

2023 October 23

Christopher Khan (Brett Byram), Biomedical Engineering

My project focuses on applying an ultrasound processing method that has been developed by our lab in order to improve the quality of transthoracic echocardiography images. Specifically, I am planning on performing an observer study where I have clinicians score ultrasound images produced using both our lab’s developed processing method and the standard ultrasound processing method. The scores will be quantitative based off image quality. The purpose of this study is to determine whether our lab’s processing method provides a statistically significant difference in image quality when compared to the standard method. Moreover, we would also like to analyze the effects of other variables on image quality, such as BMI, age, and biological sex. Therefore, the questions that I would like to address are regarding what the best statistical analysis method would be for doing this. For example, should the images be scored using a Likert scale from 1-5 and then an ordinal logistic regression be performed using our different variables, or should the images be scored on a continuous scale and then a linear regression be performed using the different variables?
Clinic Notes:
  • Ultrasound image data. Processing data with the standard ultrasound algorithm plus the new method. Want to show the new method improves image quality. 3 clinicians will score the images on a Likert scale.
  • Covariates include age, biological sex, BMI
  • Recommendations:
    • Can we use regression to predict score from the different imaging methods, adjusted for characteristics? With a continuous score you get more power to find smaller differences. A weakness is that the data may not turn out as expected (not as much variation). Can also use an ordinal (1-5) outcome, which makes more sense and would be more comfortable.
    • Sample size calculation depends on the hypothesis. Think about the unadjusted analysis (covariates not included) first. For a t-test, need to make assumptions about the mean and standard deviation of the data. Be conservative in the power calculation since the actual multivariable model is different from a t-test.
    • Look into other research on imaging data for analysis method.
    • May want to explore regressions with interaction terms (for example, intervention and BMI).
    • Within-clinician scores will not be independent, so the analysis will need to adjust for this. Different methods can address this, such as including covariates for the three clinicians or running a mixed-effects model (sometimes called a random effects model or multi-level regression model). Look into software instructions on different models.
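The sample size recommendation above (start from an unadjusted t-test, assume a mean difference and standard deviation, and stay conservative) can be sketched with a normal approximation. The difference of 0.5 points and the standard deviations of 1.0 and 1.25 are hypothetical planning values, not taken from the project, and the function name is made up for this example. The approximation slightly understates what an exact t-test calculation gives, and it ignores the within-clinician correlation mentioned in the last bullet.

```python
import math
from statistics import NormalDist


def n_per_group(delta: float, sd: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate per-group n for a two-sided, two-sample comparison of means.

    Normal approximation: n = 2 * (z_{1-alpha/2} + z_{power})^2 * sd^2 / delta^2.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 * sd ** 2 / delta ** 2)


# Hypothetical planning values: 0.5-point difference in mean score, SD of 1 point.
base = n_per_group(delta=0.5, sd=1.0)
# Conservative variant: assume more variability than expected.
conservative = n_per_group(delta=0.5, sd=1.25)
print(base, conservative)
```

Comparing the two results shows how sensitive the required number of images is to the assumed standard deviation, which is one reason to err on the conservative side.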

2023 October 16

Alex Gimeno (Cristin Fritz and Neerav Desai), General Pediatrics

Our project is a pilot study involving chart review of uninsured clinic patients to identify and describe demographic information and compare based on insurance eligibility (age, gender, country of origin, primary language, secondary language, number of medical diagnoses, and mean ZIP code distance from clinic; n=~25 total). There is also a qualitative component, outside the realm of this meeting.

We received feedback on our submission for a lack of yield/margin of error estimation prior to statistical analysis. We are hoping for advice on yield/margin of error/power analysis for a pilot study, as data regarding population standard deviation on numerical variables is unknown/unavailable.
Clinic Notes:
  • Pilot work on healthcare access for uninsured patients at the Adolescent Clinic of Shade Tree Clinic (STC). N=25.
  • Questions for estimating margin of error
  • Eligible: ineligible ratio is 3:22 based on the criteria
  • Recommendations:
    • Think about approaches to boost the sample size (past studies, other clinics, etc.).
    • May be easier to focus on the description, rather than comparison, of the two populations based on the ratio, with a focus on the ineligible group
    • Can work backwards from the confidence interval formula of the one-sample t-test (if focusing on the ineligible group only) to derive the margin of error. The formulas for a proportion and a continuous variable will be different. Even better if the study team can cite a standard deviation from a previous study.
    • For a dichotomous variable, can report the proportion and generate a confidence interval using statistical software.

2023 October 9

Brian Hou (Jonathan Schoenecker), Orthopedic Surgery

Pediatric orthopedic surgeons are frequently consulted to triage musculoskeletal infections (MSKI). Rapid triage is essential, as timely intervention can significantly impact the outcomes of children with MSKI. The primary goal is to quickly determine if these children are at risk for adverse outcomes resulting from the local infection and potential disease dissemination. Proper triage includes the laboratory tests that reflect the likelihood and severity of infection. Although white blood cell (WBC) count has been traditionally used to assess infection in other pediatric conditions, it has limited diagnostic utility for MSKI. Instead, C-reactive protein (CRP) has been demonstrated to be a reliable diagnostic marker for differentiating MSKI from non-infectious causes and predicting disease severity. However, CRP testing is limited in some settings. To address the need for a rapid and accurate diagnostic marker, we hypothesized that the neutrophil-to-lymphocyte-to-platelet ratio (NLR/P) could be a useful alternative to CRP in predicting MSKI and disease severity.

WBC count is a metric used in the Kocher criteria, a set of clinical indicators used to differentiate septic arthritis from non-infectious etiologies in pediatric patients presenting with hip pain. While often used in clinical practice, their effectiveness and usefulness have come under scrutiny. The criteria, including fever, refusal to bear weight, elevated WBC count, and erythrocyte sedimentation rate (ESR), were developed based on adult studies and may not adequately capture the unique clinical presentation of MSKIs in children. Additionally, the sensitivity and specificity of the Kocher criteria have been found to be suboptimal, leading to both over- and underestimation of infection severity. To improve upon the limitations of the Kocher criteria, there is a need for more refined and accurate tools that consider the specific characteristics and challenges associated with pediatric MSKIs.

Question 1: Can we predict infection versus no infection using admission variables? Variables to include in the model (either individually or as a combo): temperature and WBC. We would anticipate these variables will not predict well: absolute neutrophil, absolute lymphocyte, absolute platelets, and NLR/P. CRP is the gold standard, so we would love to know how NLR performs compared to CRP.

Question 2: Can we use admission values to predict the future severity of disease (no infection vs local vs disseminated vs disseminated + complications)? Values to include: Same as above. How does this perform vs CRP (which should work great)?

Question 3: (data set still being cleaned) COMING LATER
Does this new multivariable system (NLR + WBC + Temp) perform better than the standard Kocher criteria as defined? We would anticipate our model here would be better than the Kocher system at predicting infection vs no infection of the hip. Can this now be applied to other areas of the lower extremity like the knee?

Mentor confirmed.

Clinic Notes:
  • Pediatric musculoskeletal infection. Trying to come up with the best lab markers to answer: are they infected, and how severe? C-reactive protein is the gold standard marker but is not commonly available in all labs. What about the neutrophil-to-lymphocyte-to-platelet ratio instead? Have generated an ordinal outcome. N=712. What models for outcome as well as lab values? Would like to use the dynamic course of the case (longitudinal).
  • Recommendations:

2023 October 2

Cara Donohue, Hearing and Speech Sciences

Study design for a pragmatic clinical trial in heart transplant patients.

Clinic Notes:
  • Working on research grant for American Heart Association. Research study for those listed for heart transplant. Wanting to enroll patients in a program to help with outcomes post transplant (swallowing, cough, etc). Need help with design.
  • Recommendations:
    • Need to think about exclusion criteria for patients who are on the waitlist for a relatively short period of time.
    • Think about having a different person other than the primary clinician for assessment to get rid of blinding issues.
    • For primary outcome, since it's measured before surgery, death doesn't affect that. Though death would affect secondary outcomes.
    • Can do a two-tiered analysis with different sample sizes. For the binary comparison, can exclude patients who wait for fewer than two weeks. For dose-response analysis, can include these patients.
    • Plan to keep track of adherence in the two arms.
    • Block randomized study design.

2023 September 25

Mikaela Bradley (Sarah Stallings), Genetic Counseling Program

Neurofibromatosis Type 1 (NF1) is a common genetic condition that affects approximately 1 in 2,500-3,000 individuals. The goal of this study is to investigate if a reported family history of NF1 influences perceived levels of stress and coping styles in adults with NF1. To do this, adults with NF1 have been recruited to complete a survey that includes questions about their diagnosis, their family history, the Perceived Stress Scale 10-Item Version, the Brief Coping Orientation to Problems Experienced Inventory, short response questions, and demographics. The scores from the two validated scales will be used to evaluate how our participants perceive their stress and how they use coping styles, respectively. Scores will be compared between individuals with “inherited NF1” and “sporadic NF1” to evaluate if a reported family history influences perceived stress and coping styles of adults with NF1. During this clinic, I would like to review the types of analyses I plan to run and make sure I am setting things up correctly.

Clinic Notes:
  • Data collection will be complete in a couple of weeks. Outcomes are the PSS10 (stress) score and 3 Brief COPE scales. Primary independent variable is whether there is a family history of NF1 or not. Want to verify tests used, sample size, design.
  • Recommendations:
    • A good first step is t-test. Would consider non-parametric tests as well to avoid normality assumptions.
    • Can also run regression models with covariates (sex, age, race, etc.). Suggest ordinal regression model.
    • Would not worry about looking at individual questions, as you run into a multiplicity issue; you could get a significant result simply by chance with so many tests run.
    • Possibly look at correlation between stress and coping. Pearson correlation or Spearman's correlation and corresponding confidence intervals.
    • Interpretation is association not causal.
    • Ensure you keep track of what you do in SPSS so that you can reproduce it later.
    • Would address variable missingness as a limitation in the paper. Can describe the group who didn't finish the whole survey.
    • For those who are adopted, do a sensitivity analysis. Start by excluding them in the main analysis since they don't nicely fit in a bucket. But then do the sensitivity analyses with them included once in each group and see how it influences the overall results.
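The recommendation above to report a correlation between stress and coping with a confidence interval can be illustrated as follows. This is a minimal sketch using the Fisher z-transformation for an approximate 95% interval around a Pearson correlation. The scores are made-up values and the function names are hypothetical. A Spearman correlation could be obtained by applying the same function to the ranks of the scores.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_z_interval(r: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate CI for a correlation via the Fisher z-transformation (n > 3)."""
    center = math.atanh(r)
    se = 1 / math.sqrt(n - 3)
    return math.tanh(center - z * se), math.tanh(center + z * se)


# Made-up scores for eight respondents (illustration only).
stress_scores = [1, 2, 3, 4, 5, 6, 7, 8]
coping_scores = [2, 1, 4, 3, 6, 5, 8, 7]

r = pearson_r(stress_scores, coping_scores)
ci_low, ci_high = fisher_z_interval(r, n=len(stress_scores))
print(f"r = {r:.3f}, 95% CI ({ci_low:.3f}, {ci_high:.3f})")
```

With a sample this small the interval is wide, which is why the interval is worth reporting alongside the point estimate.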

Katherine Hajdu (Jon Schoenecker), Orthopedics

There is a commonly used classification system for a pediatric injury that is used to report the overall risk of an adverse event. We are adding a subclassification to that classification system that increases the specificity of identifying patients with that adverse event. We want to make sure we are calculating the specificity appropriately for this condition.

Clinic Notes:
  • Rare disease, small n. Prognostication of AVN in SCFE. SCFE: In obese kids, as they develop, the ball of the hip can slip off either over time or abruptly. When this happens slowly, kids' legs are rotated out and they cannot sit, but it is fixable with surgery. If the ball dies, it is not fixable and the patient will need a hip replacement. What are the predictors of getting AVN? Goal is prevention. Currently the classification is based on whether the patient can walk (Loder Stable versus Loder Unstable). The Vanderbilt standard further classifies Loder Unstable into Epiphyseal Stable and Epiphyseal Unstable. How to detect 1) Loder false positives and 2) the predictive value of Loder with the addition of intra-operative assessment.
  • Recommendations:
    • Present a 2x2 table of classifiers (will give you sensitivity, specificity, PPV and NPV and their confidence intervals). If confidence intervals overlap, reviewers might say that adding the extra variable is not significant.
    • Second step could be logistic regression: one model with the published classifiers, then a second model with those plus the additional classifier you want to add. This will show whether the additional classifier contributes anything above and beyond what the previous method can do.
    • Instead of a logistic regression, can first show that the estimated classifier sensitivities are both equal to 100%. Then do a direct statistical test of whether the specificities are different using a Fisher's exact test on a 2x2 table with rows as "Classifier #1", "Classifier #2", and columns as "Classifier says negative and it's not true AVN", "Classifier says positive and it's not true AVN".
    • Might also need to decide whether accepting a few false negatives is an option if it means removing many more false positives. Is there a give and take you are okay with?
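The Fisher's exact test described in the recommendations above can be sketched with the standard library alone. The counts below are hypothetical (30 patients without true AVN, invented for illustration) and the function name is made up. The two-sided p-value sums the probabilities of all tables that are no more likely than the observed one, with the margins held fixed. The sketch treats the two rows as independent groups, as the 2x2 layout above does.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    support = range(max(0, col1 - row2), min(row1, col1) + 1)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    p_value = sum(prob(x) for x in support if prob(x) <= p_observed * (1 + 1e-9))
    return min(1.0, p_value)


# Hypothetical counts among 30 patients WITHOUT true AVN (not from the study).
# Each tuple is (classifier says negative, classifier says positive).
classifier_1 = (10, 20)
classifier_2 = (24, 6)

specificity_1 = classifier_1[0] / sum(classifier_1)
specificity_2 = classifier_2[0] / sum(classifier_2)
p_value = fisher_exact_two_sided(*classifier_1, *classifier_2)
print(f"specificity: {specificity_1:.2f} vs {specificity_2:.2f}, p = {p_value:.4f}")
```

Here specificity is the share of non-AVN patients that a classifier labels negative, so each row of the table yields one specificity estimate.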

2023 September 11

Megan Passarelle & Thu Can (Sarah Welch), Acute Rehabilitation

We are studying co-treatment in the hospital setting. We are planning on distributing a survey to occupational therapists, certified occupational therapy assistants, physical therapists and physical therapy assistants who work in an adult, inpatient setting to understand what factors influence their decision to co-treat. Co-treatment occurs when 2 therapists of different disciplines (OT and PT) work together within one patient treatment session. It is a controversial topic with limited published evidence currently. There is little guidance on when co-treatment is most appropriate, leaving many therapists confused surrounding its practice.

The questions we would like answered are:
  1. What type of analysis should we conduct?
  2. How large should our sample size be? What number would you recommend for conducting a pilot study?
  3. Do we need a statistician to analyze our data?
Clinic Notes:
  • Perspectives on co-treatment in hospitals. Aims are to see how the following factors impact the perspectives: 1. Demographics. 2. Billing practice. 3. Conditions that people find beneficial to co-treat. Has a survey with 22 questions (Likert scale).
  • Recommendations:
    • Choose a primary hypothesis and base sample size calculation on that.
    • Start with basic descriptive statistics and figures. Then think about ordinal regression model.
    • Can come back to clinic to talk about sample size, and/or apply for a VICTR voucher. If applying for a VICTR voucher, then it's recommended to apply before distributing the survey.
    • VICTR voucher is appropriate to apply for if interested. See the application website and the research proposal template.

Mark Rolfsen (Wes Ely and Matt Mart), Allergy, Pulmonary and Critical Care

We are performing a multi-arm, observational, survey-based study to assess the awareness, communication practices and patient preferences of Post Intensive Care Syndrome. We will have several data sets of varying but generally low complexity and are requesting biostats support.
Clinic Notes:
  • Survey on Post Intensive Care Syndrome (PICS) for patients discharged from ICU. Current design has four arms: 1. Patients. 2. Providers for patients in arm 1. 3. General ICU providers. 4. Patients with PICS.
  • Recommendations:
    • Will need to create an official hypothesis and get a power/sample size calculation for VICTR voucher application.
    • Can use agreement statistics to measure the discordance between patient and provider experience, there are several approaches.
    • VICTR voucher is appropriate to apply for if interested. See the application website and the research proposal template.
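One of the agreement statistics mentioned in the recommendations above is Cohen's kappa, which corrects observed agreement for the agreement expected by chance. The sketch below is one possible approach and not necessarily the one the clinic had in mind. The table of paired patient and provider answers is hypothetical, as is the function name.

```python
def cohens_kappa(table: list[list[int]]) -> float:
    """Cohen's kappa for a square agreement table (rows: rater A, columns: rater B)."""
    n = sum(sum(row) for row in table)
    k = len(table)
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance, from the product of the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical paired answers to one yes/no survey item:
# rows = patient answer (yes, no), columns = provider answer (yes, no).
paired_answers = [[20, 5], [10, 15]]
kappa = cohens_kappa(paired_answers)
print(round(kappa, 2))
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance, so lower values indicate more discordance between patients and providers.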

2023 August 28

Erica Carballo (Courtney Penn), OB/GYN

I'd like to run an ANOVA analysis I did by you for this project:

We seek to estimate the annual percentage of patients with advanced-stage epithelial ovarian cancer in the United States who are eligible for and will derive benefit from PARP inhibitor therapy based on US FDA-approved indications. We compare the rates of eligibility and expected benefit, then analyze these trends over time.

Indications have changed year to year since 2014 as trial data emerges.

Each PARPi indication specifies one of 3 treatment timings (maintenance after first-line, maintenance after recurrence, late-line treatment) and biomarker status that falls into one of 3 distinct categories (BRCAmut, homologous recombination (HR) deficient/ BRCAwt, HR proficient).

I wanted to compare the effects of these two variables. By 2018, there was at least one PARPi FDA approval across all treatment timings and homologous recombination statuses. A two-way ANOVA demonstrates significant effects of biomarker status (F(2, 45) = 10.6, p ≤ 0.001) and treatment timing (F(2, 45) = 36.7, p ≤ 0.001) on number of patients benefiting from PARPi from 2018 to 2023. There was a significant interaction effect between biomarker status and treatment timing (F(2, 45) = 3.8, p = 0.010). Variation in populations with FDA eligibility for PARPi therapy is a known contributing factor to this interaction, in addition to variation in efficacy (Figure 3). Subsequent ANOVA analyses without replication were performed separately comparing biomarker status and treatment timing to year (and therefore FDA approvals for a given year), which also show significant effects of biomarker status (p = 0.001) and treatment timing (p ≤ 0.001) independently.
Clinic Notes:
  • Estimating annual percentage of patients with advanced-stage epithelial ovarian cancer in the USA eligible for PARP inhibitor therapy and expected benefit
  • Three biomarker status categories and treatment timing tested using two-way ANOVA
  • Recommendations:
    • Instead of focusing on p-values, focus on newly eligible patients.
    • Explore PubPeer for any possible criticisms against previously published articles using similar methodology

2023 August 21

Rachel Azevedo (Ashley Shoemaker), Pediatric Endocrinology

My project is a two-phase study where we will analyze patient perceptions of genetic testing in the evaluation and management of pediatric obesity to see if these perceptions influence patient outcomes. In the first phase, we will be sending surveys to ~117 patients who have already completed genetic testing to determine if there are themes in patient perceptions towards testing, followed by interviews for 10-15 of these patients. Once we have distinguished any themes in these surveys, we will proceed with our second phase, where we will actively recruit patients aged 12-19 at the VUMC weight management clinic. These children are all offered free genetic testing as part of their management, so for patients choosing to get genetic testing we will ask them to complete a pre- and post-testing survey and will follow their outcomes (including BMI, appointment show-rate, medication fill-rate) to determine if there is a correlation between their views on genetic testing and their outcomes. I plan to also apply for a separate VICTR voucher with the QRC to aid in the design of the interview protocol and analysis of surveys in the first phase. As the second phase is searching for correlations between patient perceptions and the collected data, I am requesting assistance with planning this analysis & creating a statistical model.
Clinic Notes:
  • Survey cohort study, 117 pediatric patients with genetic testing and obesity. Phase 1 is qualitative interviewing. Phase 2 is going to recruit patients from the weight management clinic to see how their understanding of genetic testing impacts outcomes. Follow for 6 months to track BMI, visit rates, medication fill rates.
  • Phase 1 population is people who have already taken genetic testing, phase 2 population is eligible for genetic testing which they can accept/refuse
  • Recommendations:
    • For phase 2, patients who refused to take genetic test can serve as a reference group
    • Clearly note the primary outcome in the application, better to select an outcome that will be most impacted
    • Sample size calculation/justification needed for VICTR voucher, base on primary outcome

Jason Samuels, General Surgery

Polycystic ovarian syndrome (PCOS) is one of several obesity-related diseases that results from derangements in hormone signaling pathways. Bariatric surgery has been shown to improve PCOS through weight loss and improvements in patients’ metabolic profiles. However, to date, whether a particular bariatric surgery, i.e., sleeve gastrectomy or Roux-en-Y gastric bypass, provides greater resolution of PCOS via greater weight loss is unknown. This study consists of two parts. 1. A retrospective analysis will be conducted identifying patients with PCOS using the research derivative. This will provide preliminary data for a society grant through SAGES that will fund a prospective observational study comparing outcomes following sleeve versus gastric bypass in patients with PCOS. Our hypothesis is that gastric bypass will achieve greater weight loss than sleeve gastrectomy, resulting in greater disease remission in PCOS.
Clinic Notes:
  • Retrospective analysis to compare different types of bariatric surgery on obesity related to polycystic ovarian syndrome (PCOS). Missing data is a major issue.
  • Follow up post-surgery is not required, about 50% of eligible patients are seen up to a year
  • Primary outcome is weight loss in patients after bariatric surgery, secondary outcome is resolution of PCOS with measurement to be determined
  • Population is patients diagnosed with PCOS who are referred to bariatric surgery
  • Recommendations:
    • Due to missing data, using pregnancy indicator as a secondary outcome may not be optimal
    • Identify a biomarker that is reliably collected for measuring secondary outcome
    • Need to use a data source that contains the same population of interest
    • May be helpful to identify patients whose primary care is within VUMC (for more reliable outcomes related to pregnancy)

2023 August 14

Elizabeth Longino (Shiayin Yang), Otolaryngology/Head & Neck Surgery

Need statistician to help perform statistical analysis on data from our Randomized Controlled Trial investigating the use of TXA (tranexamic acid) in rhinoplasty surgery. 2 groups (TXA and control, total ~70 patients) with intraoperative data and postoperative data to analyze. We are nearly done accruing patients – data will be ready for analysis within the next month and will be presented at a national conference in late October 2023.
Clinic Notes:
  • A randomized controlled trial investigating the use of TXA (tranexamic acid) in rhinoplasty surgery, using intraoperative data and postoperative data to analyze. Still in the enrollment process, with 60+ patients now.
  • Primary outcomes: reduction in intra- and post-operative bleeding, reduction in post-operative edema. All are measured on a Likert scale.
  • Sample size calculation is normally conducted before data collection; recalculating the sample size now may result in requiring more patients for enrollment.
  • Given the time constraints, a decision is needed on whether to move forward with statistical analysis or with a sample size calculation.

Chen Bo Fang (Avni Finn), Vanderbilt Eye Institute

Identify ophthalmologic features on OCT imaging predictive of clinical outcomes such as visual acuity. Would like to examine how 3-4 quantitative variables from the OCT imaging correlate with pre-operative visual acuity (quantitative variable). There are also additional variables we would like to control for in a multivariate analysis.
Clinic Notes:
  • Project is to identify OCT features predictive of pre-/post-op visual acuity and successful outcomes in patients with macular hole repair.
  • Interested in the effect of number of cysts and the overall cystic volume on the outcomes (pre-operative vision, minimum linear diameter, lines of visual improvement, symptom duration). Sample size is 46 patients.

  • Unless there is a clear clinical cut-off for a feature that is meaningful, it is usually better to leave well-defined continuous variables as is
  • At least 96 patients are needed to estimate a proportion within ±0.1. Given the current sample size (46), the number of confounders adjusted for will need to be reduced.
  • Instead of adjusting confounding variables separately, can consider adjusting for the propensity of confounders
  • For lines of visual improvement, analyze the post result adjusting for the pre value
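The 96-patient figure is consistent with the large-sample margin of error for a proportion, 1.96*sqrt(p*(1-p)/n), evaluated at the worst case p = 0.5. A minimal Python sketch of that arithmetic follows; the function names are illustrative and not from the clinic.

```python
import math

Z_95 = 1.96  # two-sided 95% normal quantile used in the margin-of-error formula

def margin_of_error(p, n, z=Z_95):
    """Half-width of the large-sample 95% confidence interval for a proportion."""
    return z * math.sqrt(p * (1 - p) / n)

def n_for_margin(margin, p=0.5, z=Z_95):
    """Unrounded sample size that gives the requested margin of error."""
    return z ** 2 * p * (1 - p) / margin ** 2

print(round(n_for_margin(0.1), 2))         # sample size for a margin of 0.1
print(round(margin_of_error(0.5, 96), 3))  # margin with 96 patients
print(round(margin_of_error(0.5, 46), 3))  # margin with the 46 patients in hand
```

With 46 patients the worst-case margin is noticeably wider than 0.1, which is one way to see why the notes suggest limiting the number of adjustment variables.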

2023 July 31

Rachel Appelbaum, Surgical Sciences

• In the field of Trauma Surgery, many essential research questions revolve around the initial resuscitation and management of unstable trauma patients. Due to their clinical status on presentation, they are often incapable of providing informed consent or identifying their legally authorized representative (LAR) [1-3].
• In 1996, the Food and Drug Administration (FDA) established Exception From Informed Consent (EFIC), a set of federal regulations implemented when conducting a clinical trial to inform best practices in an emergency [4]. These regulations include utilizing a process known as community consent, defined as consent obtained through collaboration between the research team and community members. The requirement of community consultation remains poorly defined and variably interpreted [3]. Additionally, once approved, the education of patients’ families around the concept of EFIC and the consent process are variable and researcher dependent.
• EFIC studies help researchers perform trials with significant societal benefit that would otherwise be unattainable. It is incumbent upon the researchers to prioritize respect of patient concerns and experiences [5].
• In the setting of EFIC, pre-existing interactions are not present to build the trust and rapport between the researcher and patient/caregiver that is so important in conducting meaningful research.
• Previous efforts to gain patient/caregiver insight regarding EFIC highlight the need for continued work to determine how best to partner with a patient’s family to improve understanding and the importance of EFIC.
• This work will inform a more patient-centered community consultation process as well as create a more collaborative consent process for individual patients/caregivers and researchers.
Clinic Notes:
  • Perception of patients on informed consent (EFIC).
  • Qualitative study: phase 1 (subject enrollment) would consider demographics such as gender, race, time from injury
  • Phase 2 - trial of delivery methods; phase 3 - standardized EFIC education; phase 4 - validate materials. For phases 1-3, three different groups of participants will be enrolled.
  • Recommendations
    • Biostatistics has a collaboration plan with Surgical Sciences. Biostatistics faculty contact is Fei Ye.
    • For phase 1, descriptive analysis would be enough and sample size doesn't really matter.
    • For phases 2-4, quantitative comparison requires justifying sample size for detecting meaningful effect size and listing inclusion/exclusion criteria
    • Define outcome measure to refer to in hypotheses
    • Combining phases (to reduce the number of phases) is recommended: it simplifies the sample size calculation, IRB approval, and inclusion/exclusion criteria, and reduces dependency on previous phases.
    • Clarify the outcomes of phase 2 and how to proceed to phases 3-4 depending on the outcome of phase 2.

2023 July 24

Lindsay Podraza (Maya Neeley), Pediatrics

We implemented a new curriculum within the pediatrics clerkship to improve student knowledge, confidence, and clinical skills related to a diagnosis of streptococcal pharyngitis. We have pre- and post-intervention survey data (REDCap), as well as a small control group that we want to compare. Our question is what type of test would be best to analyze knowledge questions (right vs wrong), confidence questions (Likert scale), and clinical skills (based on a rubric, gauging # of satisfactorily performed items). We have tried t-tests and Wilcoxon Rank Sum tests so far.
Clinic Notes:
  • Curriculum intervention. Three different domains: knowledge (percent), attitude (Likert scale), and clinical skill (percent). Have control group (post assessment) and intervention group (pre and post assessment). From the intervention group, 65 students had a pre assessment and 28 had a post assessment. Control group had 8 students.
  • Recommendations:
    • Likert scale scores are ordinal; some information might be lost if using the Wilcoxon Rank Sum test (which treats variables as categorical).
    • Pre vs. post assumes that everyone who had a pre-assessment also has a post-assessment, and usually can't withstand even a single loss in the post group. Missingness in post assessment is usually not random.
    • Question: How to best share the data that we have collected?
      • Descriptive statistics can be used to illustrate what happened to the specific students in the study, but making inference is hard with the current study design and collected data
      • Mean works well for summarizing Likert scale as well as continuous data with symmetric distribution
      • Plot raw data, scatter plot/histogram to show variation between students
      • Correlational analysis (Spearman's rank correlation) within the sample is possible even for biased data, to describe the relationship between variables
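As an illustration of the rank-based correlation mentioned above, the following Python sketch computes Spearman's coefficient as the Pearson correlation of average ranks, which handles the ties that are common in Likert data. The data values and variable names are hypothetical and do not come from the study.

```python
import math

def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical post-assessment data: confidence (Likert 1-5) and knowledge (% correct)
confidence = [1, 2, 2, 3, 4, 5]
knowledge = [40, 55, 50, 70, 65, 90]
print(round(spearman(confidence, knowledge), 2))
```

The coefficient describes only the students who were actually assessed; it does not repair the non-random missingness noted above.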

2023 July 17

Anna Pfalzer, Neurology

We have run proteomics on approximately 500 plasma samples collected from individuals with 30 different Neurodevelopmental Disorders. I have several pieces of clinical information from these individuals that I would like to include in my analysis. I have several primary research questions I would like to address, but need help articulating the correct analysis plan and procedure.
Clinic Notes:
  • Grant to collect plasma samples on children with neurodevelopmental disorders. Sample size is about 20 samples for each of the 22 disorders and 40 samples from controls (n~480). Hypotheses are looking at proteomic differences between disorders, within disorders by variant, within disorders by covariates. Also looking into similarities of disorders and which ones cluster together. Interested in how neural components (e.g., seizures) impact neuro-specific proteins.
  • Recommendations:
    • Track record of research when you have more features than you have patients is not the best. It has to be a "smoking-gun" for the research to work. Do literature review to see what has been done. Need to ask more general question that can live within the constraints of the sample size.
    • Start with the simplest question: one protein with one trait still needs N=300 to get a reliable correlation coefficient.
    • Principal Component Analysis should not go above 3 components based on literature. First PC maximizes the disagreement between patients. Info would not be available for which protein maximized that difference.
    • 22 disorders may be too many to research for the given sample size (N=20 per disorder), try to group them by characteristics/biological function so that we're not limited by N=20 per disorder, or use a numeric metric/trait as a summary measure to get more power.
    • A mixed approach would be selecting specific proteins and grouping all the remainder proteins to conduct PCA.
    • For the clinical covariates, do sparse PCA first to decide which ones to include in the model.
    • Another method to discover potentially important proteins is bootstrapping the importance of variables: do the data support that you will find the same "smoking gun" over and over?
    • Possible VICTR voucher; see the application website and research proposal template.
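One way to see why a sample of roughly 300 is suggested for a single correlation is to look at how the confidence interval narrows with N. The sketch below uses Fisher's z transformation for an approximate 95% interval; the choice of this approximation is an assumption made for illustration, since the notes do not say how the figure was derived.

```python
import math

def correlation_ci(r, n, z=1.96):
    """Approximate 95% CI for a correlation using Fisher's z transformation."""
    zr = math.atanh(r)           # transform r to the z scale
    se = 1 / math.sqrt(n - 3)    # approximate standard error on the z scale
    return math.tanh(zr - z * se), math.tanh(zr + z * se)

# Interval around an observed correlation of 0 at several sample sizes
for n in (20, 100, 300):
    lo, hi = correlation_ci(0.0, n)
    print(n, round(lo, 2), round(hi, 2))
```

With 20 samples per disorder the interval around a correlation is very wide, which is in line with the advice to pool disorders or use a summary trait.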

2023 May 22

Amelia Maiga (Mayur Patel), Surgery / Acute Care Surgery

We are designing an unfunded multicenter retrospective study to look at the impact of an institutional process improvement program (trauma video review) on outcomes in trauma patients (time to hemorrhage control, inpatient mortality, etc.) at a systems level (i.e., by hospital, not by patient). We are specifically looking for assistance with power calculations to identify the number of centers that need to be recruited and the number of patients per center.

Clinic Notes:
  • Hypothesis: a TVR (trauma video review) program improves systems-level outcomes for trauma patients at highest risk of death and disability, specifically in 2 populations. Aims: 1) shorten time to hemorrhage control for trauma patients presenting in shock; 2) shorten time to neurosurgical intervention for trauma patients presenting with blunt traumatic brain injury and midline shift.
  • Retrospective multicenter cohort study; exposure is presence of TVR program (institution/hospital level); primary outcome is time to hemorrhage control/neurosurgery (min); covariates at both patient and hospital level.
  • Recommendations:
    • Need to include a fair number of hospitals so that the hospital characteristics are not related to whether TVR is used. In cross-sectional, the more hospitals you get the more imbalances can cancel out but still does not remove all bias. Having a study where 20 hospitals can be their own controls (pre v post) is much more powerful than having a cross-sectional study with 40 hospitals. Pre-post design removes the bias between hospitals since the specific hospital can be its own control.
    • Since primary outcome is per patient unit, there is more room to include per-patient covariates even with relatively smaller number of hospitals (cluster).
    • If system-level covariates are collinear with use of video review, then the impact of video review cannot be assessed. The best situation is when use of video review cannot be predicted from hospital characteristics; in that case the effects of video review can be disentangled.
    • Possible use of state-transition model. Finds expected length of time patients are in each state. Death is worst state, discharge home is best state. Would measure every 30 min-1 hour during hospitalization. (Link to lecture:
    • VICTR vouchers limit biostatistics support to 90 hours and a 12-month duration. Contact the department's collaboration statistician first; otherwise the best possibility would be to look into the learning health system (if chosen, unlimited help). Contact is Cheryl Gatto.

2023 April 24

Aileen Wright, Biomedical Informatics

LHS-supported trial: Interruptive Versus Non-Interruptive Alerts to Recommend Statin Prescribing in Primary Care. Questions about outcome measures, statistical analysis, power calculation.

Clinic Notes:
  • Purpose of trial is to determine if interruptive vs. non-interruptive alerts increase statin prescription. Three arms (interruptive, non-interruptive, control). Primary outcome is percent of patients prescribed a statin within 24 hours of alert firing. Secondary outcomes: 1. the percentage of patients prescribed a statin within 12 months post-intervention. 2. First postintervention LDL.
  • Recommendations:
    • Better to state outcome as presence/absence of event for a patient (e.g. statin prescription within 24 hours of alert firing).
    • For analysis, test for a difference among the 3 proportions using logistic regression; this provides an overall test of differences among the 3 groups with 2 degrees of freedom, analogous to the overall F test in a one-way ANOVA for continuous outcomes. If the 3-group comparison shows evidence of a difference, pairwise comparisons can then be done freely.
    • Increase the power to 0.9 or higher. The power calculation program PS Power and Sample Size can take three different probabilities.
    • Time to event assumes equal opportunity surveillance of outcome. Time to event up to 6 months, 12 months, etc.
    • The R function power.prop.test from the stats package does power calculation for two-sample test for proportions.
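For readers without R at hand, the kind of calculation described above can be sketched in Python with the standard normal approximation for a two-sided, two-sample test of proportions with equal group sizes. This is an illustrative sketch, not a reproduction of any particular package's output; the prescribing rates (10% vs. 15%) and the group size are hypothetical.

```python
import math
from statistics import NormalDist

def power_two_proportions(p1, p2, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample test of proportions.

    Normal approximation with equal group sizes; the small chance of
    rejecting in the wrong direction is ignored.
    """
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)
    p_bar = (p1 + p2) / 2
    sd_null = math.sqrt(2 * p_bar * (1 - p_bar))          # SD under the null
    sd_alt = math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))     # SD under the alternative
    z = (math.sqrt(n_per_group) * abs(p1 - p2) - z_alpha * sd_null) / sd_alt
    return nd.cdf(z)

# Hypothetical: 10% statin prescribing in control vs. 15% with an alert
for n in (500, 1000, 1500):
    print(n, round(power_two_proportions(0.10, 0.15, n), 3))
```

A three-arm trial would need this repeated for each pairwise contrast of interest, or a dedicated three-group calculation as suggested in the notes.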

2023 April 17

Rong Wang, Human and Organizational Development

My team has been working on a large Facebook dataset (1.3 million posts) to analyze how moral framing and message sentiment influence people’s reactions to posts on social media. Our dependent variables (DVs) are emotional reactions indicated by how people use the Facebook reaction buttons. For DVs, we have tried using both the raw count of emotional reactions and also the proportions of emotional reactions (e.g., number of angry buttons used divided by number of total reactions received by a post) in the measurement. And we used negative binomial regression to run the models. For both approaches of measuring DVs, our negative binomial regression results generated high incidence rate ratios (IRRs). All of our IVs (moral framing and message sentiment) are measured on a scale of 0-1, so it is impossible for any of our DVs to reach a significant increase in score.

Because of that, we wonder if we could interpret our high IRRs differently through some form of transformation. For example, if we assume changes in sentiment scores are in units of .01, we could take the 100th root of the IRR (say, 170 becomes 1.05). The interpretation would then be that a .01 increase in this IV multiplies the expected DV by 1.05, and a .02 increase multiplies it by 1.05*1.05. Another potential solution we are exploring is rescaling our IVs. For example, we could multiply our IVs by 100 and then rerun the model. Is this feasible?
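The arithmetic behind this question can be checked directly. In a log-linear count model the IRR for a change of c units in a predictor is exp(beta*c), so an IRR of 170 per full unit corresponds to 170**0.01 per 0.01 unit, and multiplying the predictor by 100 simply divides its coefficient by 100. The sketch below uses the IRR of 170 quoted in the question; the function name is illustrative.

```python
import math

def rescale_irr(irr_per_unit, step):
    """IRR for a change of `step` in the IV, given the IRR for a 1-unit change."""
    return irr_per_unit ** step  # same as exp(beta * step)

irr = 170.0                          # IRR per full 0-to-1 change, from the question
irr_step = rescale_irr(irr, 0.01)    # IRR per 0.01 change in the IV
print(round(irr_step, 4))

# Equivalent view: multiplying the IV by 100 divides the coefficient by 100
beta = math.log(irr)
beta_rescaled = beta / 100
print(round(math.exp(beta_rescaled), 4))

# A 0.02 change compounds multiplicatively
print(round(rescale_irr(irr, 0.02), 4), round(irr_step ** 2, 4))
```

Both routes give the same per-step IRR, so the two proposals in the question are the same reparameterization and do not change the model fit.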

Clinic Notes:
  • Data: 1.3 million posts on Facebook about vaccines and reactions. All publicly available posts from March 2020 - August 2021. Used negative binomial regression for count variables.
  • Independent variables: moral framing and message sentiment (of a Facebook post), on a scale of 0-1
  • Dependent variable: Facebook reaction button to a post (6 categories)
  • Received critique on interpretation of effect size
  • Recommendations:
    • Calendar time and time of day (local time) can be relevant.
    • Offset in Poisson regression (negative binomial regression) should be independent of the thing you're studying.
    • The independent variable should not penalize the length of the post. Could see the effect of interaction with the length of the post.
    • Use LOESS non-parametric trend smoother to visualize data points, could color-code according to dimensions (ex. volume or length of post)
    • Ordinal regression (proportional odds model) does not need as many assumptions (on distribution shape) as negative binomial regression, use around 1000 categories
    • Allow variables to be non-linear (splines). Could also explore interaction with time.
    • Would not use p-values for interpretation.
    • <- resource for ordinal regression models
    • Do a validation study with 200 random samples on the scale.

2023 March 6

Margaret Rutherford (Byron Schneider), Physical Medicine and Rehabilitation

Study title: Outcomes Associated with Guideline-Based Conservative Care for Low Back Pain. Question: We would like assistance with some more advanced data analysis. Mentor confirmed.

Clinic Notes:
  • Study population: standard of care patients (therapy + medication). Six-week outcomes: pain score and ODI. Interested in types of prior treatment (physical therapy (Y/N), medications (several types, Y/N), injection (Y/N)).
  • Aim 1: Assess whether conservative care improves outcomes (pre/post). Aim 2: Assess whether prior treatment type affects outcomes.
  • Recommendations:
    • Collapse treatments into meaningful groups.
    • For Aim 1, could do non-parametric tests and pairwise comparison. For Aim 2, could do ANCOVA to adjust for baseline characteristics. The variable of interest would be prior treatment type.
    • Power calculation/sample size justification can be discussed with VICTR biostatisticians and should be included in the voucher proposal.
    • Possible VICTR voucher; see the application website and research proposal template.

Aileen Wright, Biomedical Informatics

This LHS study is planned as a randomized trial to compare interruptive vs non-interruptive vs. no alert to increase statin prescribing in primary care. I plan to randomize at the patient level. I would like advice about my power calculation, and whether I should target a certain number of patients in each arm or have a recruitment time period.

Clinic Notes:
  • Randomized trial on alerts (interruptive vs. non-interruptive) for statin use. Study population: statin-eligible patients not currently on a statin. Objective: to determine whether an interruptive or non-interruptive alert increases the proportion of statin-eligible patients who are on a statin. Intervention: 2 interventions (comparator arm A: interruptive BPA recommending statin, comparator arm B: non-interruptive BPA recommending statin), 1 usual care arm (control arm (usual care) with no BPA displayed). Primary outcome: the percentage of patients prescribed a statin in each arm within 3 months after the implementation.
  • Recommendations:
    • Consider focus group interviews for physicians to know their reaction for non-interruptive alerts.
    • Potentially, there is a gap between physicians prescribing statin and patients picking up the medication.
    • Primary outcome may need to be more conservative in order to accurately measure the effect of the BPA alert on the physician's decision to prescribe a statin.
    • Look at pilot study data to check the proportion of patients prescribed statin on the day of visit (and the next day) after the intervention to refine the time window for primary outcome; large proportion can be reason to shorten the time window.
    • For the secondary outcome, 18 months may be too long to attribute the effect to BPA alerts.
    • If doing two tests (interruptive vs. control, and non-interruptive vs. control), could do a Bonferroni correction and use alpha=0.05/2.
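To give a sense of what the Bonferroni correction costs in enrollment, the sketch below computes an approximate per-arm sample size for comparing two proportions at alpha=0.05 and at alpha=0.05/2. It uses the usual normal-approximation formula with equal arm sizes; the prescribing rates (10% vs. 15%) and the 90% power target are hypothetical placeholders, not values from the trial.

```python
import math
from statistics import NormalDist

def n_per_arm(p_control, p_alert, power=0.9, alpha=0.05):
    """Approximate per-arm sample size for a two-sided test of two proportions."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)
    z_beta = nd.inv_cdf(power)
    p_bar = (p_control + p_alert) / 2
    sd_null = math.sqrt(2 * p_bar * (1 - p_bar))
    sd_alt = math.sqrt(p_control * (1 - p_control) + p_alert * (1 - p_alert))
    n = ((z_alpha * sd_null + z_beta * sd_alt) / (p_control - p_alert)) ** 2
    return math.ceil(n)  # round up to whole patients

n_plain = n_per_arm(0.10, 0.15)                  # single comparison, alpha = 0.05
n_bonf = n_per_arm(0.10, 0.15, alpha=0.05 / 2)   # two comparisons vs. control
print(n_plain, n_bonf)
```

Under these assumed rates the corrected alpha raises the per-arm requirement noticeably, which bears on whether to target a fixed number of patients per arm or a fixed recruitment period.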

2023 February 20

Giovanna Giannico, Pathology

Project 1: I have created a survey, and would like to analyze survey results and association with epidemiological variables.

Clinic Notes:
  • Survey of genitourinary pathologists, ~120 records (50% complete). Involved epi data (pathologist demographic questions) and prostate biopsy (practice specific questions). Survey was disseminated using both emails and social media.
  • Purpose of survey is to find how much uniformity there is in reporting prostate cancer across different societies and countries.
  • Recommendations:

Project 2: I would like to perform an outcome analysis of patients with prostate cancer age ≤ 45 years (young) compared to those with regular screening age (old).

Clinic Notes:
  • The recommended age to begin PSA screening (for detecting prostate cancer) is 50 years, so the comparison is patients 45 years old or younger vs. patients over 45 years old. Collected data on pathological features and outcomes. Outcome: biochemical recurrence. Time 0: time of surgery.
  • Recommendations:
    • The projects may be appropriate for cancer center biostat help.
    • Good to have an analysis plan drafted in voucher proposal.

2023 January 30

Benjamin Collins (Ellen Clayton), Biomedical Informatics, Biomedical Ethics and Society

Development of a measure for patient literacy of artificial intelligence in healthcare. Planning sample size for testing and validation of scale and statistical analysis of results. Mentor confirmed.

Clinic Notes:
  • Develop education module for patients on artificial intelligence.
  • Need to be aware of bias in surveys. Surveys are better suited to estimation (how much does something happen) than to testing hypotheses.
  • Recommendations:
    • Can get proportions/correlations and confidence intervals on them.
    • A margin of error of ±0.1 (on a scale of 0-1) takes 96 subjects; helpful to keep in mind when considering sample size.
    • Randomizing which questions are asked of different patients can reduce respondent burden and help with low response rates; also consider randomizing the order of questions if they are sensitive.
    • Could do redundancy tests on cluster of questions.
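For the confidence intervals on proportions mentioned above, one commonly used option is the Wilson score interval, which behaves better than the simple plus-or-minus formula when the proportion is near 0 or 1 or the sample is small. The choice of the Wilson method here is an assumption for illustration, and the counts are hypothetical.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score 95% confidence interval for a binomial proportion."""
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    center = (p_hat + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half, center + half

# Hypothetical: 48 of 96 respondents answer a literacy item correctly
lo, hi = wilson_interval(48, 96)
print(round(lo, 3), round(hi, 3))

# Hypothetical small-sample case with no correct answers out of 20
lo0, hi0 = wilson_interval(0, 20)
print(round(lo0, 3), round(hi0, 3))
```

Unlike the simple formula, the interval for 0 of 20 has a positive upper bound rather than collapsing to a single point.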

2023 January 23

Matt Christensen (Michael Ward), Pulmonary and Critical Care Medicine

We aim to develop a clinical prediction score to estimate the risk of a MRSA infection among patients diagnosed with Sepsis in the ED.

1) feedback on appropriate population? We can analyze data from all ED encounters, or use data from an existing clinical trial (ACORN) which enrolled patients at VUMC with an order for an anti-pseudomonal antibiotic.

2) How to estimate sample size for building a clinical prediction tool?

3) Which method to use for building clinical prediction score (EHR based that can be augmented by provider input)? - principal component analysis? - sequential component selection? - Others? Mentor confirmed.

Clinic Notes:
  • Looking to develop a risk prediction score for MRSA. Aims: 1. Compare performance of existing strategies (risk factors, syndrome-specific scores, MRSA PCR) for predicting MRSA risk to current practice (provided order for anti-MRSA). 2. Derive an automated EHR-based MRSA risk score in sepsis and validate internally. Population: adults with suspected sepsis. Effective sample size: number of MRSA*3.
  • Recommendations:
    • Try to collapse variables that act similarly (variable reduction).
    • Broader population (from a non-study) vs. narrower population (from a regulated prior study): use the clinical trial data to get a model, and not assume the model is calibrated correctly for low-severity patients. Weakness: when real data has many noise predictors, obtaining intercept needs to be carefully done.
    • Wouldn't recommend assessing sensitivity/specificity.
    • Can use C index or AUC (area under the curve) to assess the discriminative power of the prediction score.
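For a binary outcome, the C index equals the AUC: the proportion of (case, non-case) pairs in which the case received the higher predicted risk, with ties counted as one half. A minimal Python sketch of that definition follows; the risk scores and outcomes are made up for illustration.

```python
def c_index(scores, outcomes):
    """Concordance index (AUC) for a binary outcome coded 1 = event, 0 = no event."""
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    controls = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for sc in cases:
        for sn in controls:
            if sc > sn:
                concordant += 1.0   # case ranked above non-case
            elif sc == sn:
                concordant += 0.5   # tie counts as half
    return concordant / (len(cases) * len(controls))

# Hypothetical predicted MRSA risks and observed MRSA status
risk = [0.10, 0.40, 0.35, 0.80]
mrsa = [0, 0, 1, 1]
print(c_index(risk, mrsa))
```

The pairwise loop is quadratic in the number of patients, which is fine for a sketch but would be replaced by a rank-based calculation on a full EHR cohort. The C index measures discrimination only, so calibration still needs separate assessment, as the notes caution for low-severity patients.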

Carson Moore (Thomas F. Scherr), Chemistry

I work in the Mobile Health for Global Health group. We are currently working on a project that maps human schistosomiasis infections and environmental factors related to the spread of this disease. We are interested in consulting with the Biostats Core to work on calculating the number of sites needed to sample and number of intermediate host snails needed to collect to provide statistically valid results, and assistance on determining the best methods to randomly sample mapped areas. Mentor confirmed.
Clinic Notes:

  • Interested in schistosomiasis infections. Would like to know the best way to sample to validate the risk map (existence of snails).
  • Recommendations:
    • Do not recommend classification method (presence/absence of snails). Consider semiparametric model that uses ranks (more abundance means higher in rank).
    • Use lift curve (as in marketing) for areas.
    • Consider geospatial models that account for correlation between areas. Use random effects in the model.
    • Stratifying regions is valid and saves resources, whereas complete randomization assures unbiasedness.
Topic revision: r860 - 11 Dec 2023, YueGao