Recommendations, Analyses, and Data for Health Services Research, Diagnosis, and Prognosis Clinic
Notes 2016


Angela Maxwell-Horn, MD, Assistant Professor of Developmental Pediatrics, Monroe Carell Jr. Children’s Hospital at Vanderbilt

I am a pediatrician wanting to do a study about the effectiveness of a medication to treat ADHD symptoms in children with autism. I would like to come to a biostats clinic for help figuring out what type of analysis I should do and how many subjects I need to adequately power my study. I have attached a copy of my study proposal.

  • Recommend a randomized cross-over study design with double blinding if possible
  • Select a side-effect measurement tool
  • Clearly state inclusion/exclusion criteria

Heather Limper, Center for Clinical Quality and Implementation Research

"I would like to get some help with execution of times series analysis using STATA (ideally)."


Katie McGinnis, MPH Candidate, Global Health

Surveys of parents and staff were performed in three children's hospitals: 69 parent respondents and 97 staff respondents. Parent survey: demographics of parents and children, how experiences in the hospital affect parents and children, patient satisfaction. Staff survey: demographics, education, child's hospitalization needs.

Research questions: what do you think caused the child's illness? The language barrier in receiving proper care? The correlation between child's experience in hospital and staff's education and experience.

The survey matrices are similar in the parents' survey and the staff's survey (about a dozen Likert-scale questions). The goal is to check the correspondence between parents' responses and staff's responses. First, check whether parents agree with each other: code the answer to each question as 1, 2, 3, 4, 5 and summarize the score of each question across all parents. A small SD indicates better agreement among parents. Second, check the consensus of staff in the same way. Third, to evaluate staff characteristics, compare staff's responses to the parents' consensus; to evaluate parents' characteristics, compare parents' responses to the staff's consensus. Take the difference between a staff member's response and the parents' consensus as the outcome, and fit a regression model on providers' characteristics.
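The agreement and consensus steps described above can be sketched as follows. This is a minimal illustration, not the clinic's own code: the function names and the tiny data set are hypothetical, and the per-question mean is assumed to serve as the "consensus".

```python
from statistics import mean, stdev

def question_summaries(responses):
    """responses: one list per respondent, each holding Likert codes (1-5)
    for the same ordered set of questions.
    Returns (means, sds) with one entry per question."""
    per_question = list(zip(*responses))  # transpose: respondents -> questions
    means = [mean(q) for q in per_question]
    sds = [stdev(q) for q in per_question]  # small SD = better agreement
    return means, sds

def differences_from_consensus(responses, consensus):
    """Per-respondent, per-question difference from the other group's consensus.
    These differences would be the outcome in a later regression."""
    return [[r - c for r, c in zip(resp, consensus)] for resp in responses]

# Made-up illustrative data: 3 parents and 2 staff answering 3 questions
parents = [[4, 2, 5], [4, 3, 5], [5, 2, 4]]
staff = [[3, 2, 5], [5, 4, 4]]

parent_means, parent_sds = question_summaries(parents)
staff_vs_parents = differences_from_consensus(staff, parent_means)
print(parent_means, parent_sds)
print(staff_vs_parents)
```

The resulting differences would then be regressed on staff characteristics, which is outside the scope of this sketch.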

Could generate a summary score over multiple questions in one category (Rockwood's index).


Samantha Gustafson, Hearing and Speech Sciences

VICTR application for dissertation research. EEG measures for speech sound processing in quiet and in noise. Looking for age effects. How does the effect of noise change with age? Proposed analysis based on linear regression. Expects one EEG measure to be more sensitive than the other. Second question is to look for mediator with EEG response and how well they do behaviorally depending on age. Particularly tricky how to size a study for an exploratory mediation analysis. Have replaced repeated measures ANOVA with a linear model. Each EEG task takes 10 minutes. Two listening conditions, same task. Quiet vs noise order is randomized. Half of participants hear "da" and the other half receive "ga" (randomized). Model: EEG = intercept + age effect + noise/quiet + age x noise/quiet interaction. Can use generalized least squares (correlation structure irrelevant except don't assume the correlation is zero, since only 2 times per subject) or repeated measures ANOVA if very careful to use the correction for correlation (if can handle interaction between group (noise/quiet) and age). But GLS is ideal. Need to check normality assumption of residuals.

Power of a test of interaction is much lower than that of a test of a main effect (difference in slopes vs. slope not being flat). No data are available for making an initial guess of the sample size required to achieve a given precision or power. The only thought relates to a minimum possible sample size: the size needed to estimate a difference in mean EEG for an adult with very good precision. The SD of the noise-quiet difference is used here. Once the acceptable margin of error (half-width of the 0.95 confidence interval for the mean difference) is determined, it can be plugged into formulas related to precision. Beware: the sample size needed for the interaction is easily 4 times as large.
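As a rough sketch of the precision calculation mentioned above, the following uses the normal-approximation formula n = (z × SD / margin)², treating the SD as known. The SD and margin values are hypothetical placeholders, and the function name is illustrative only.

```python
import math
from statistics import NormalDist

def n_for_margin(sd_diff, margin, conf=0.95):
    """Smallest n such that the half-width of the confidence interval for a
    mean paired difference is at most `margin` (normal approximation)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return math.ceil((z * sd_diff / margin) ** 2)

# Hypothetical inputs, in the units of the EEG measure
n_main = n_for_margin(sd_diff=1.0, margin=0.5)
n_interaction = 4 * n_main  # the "easily 4 times as large" warning from the notes
print(n_main, n_interaction)
```

Because n grows with the inverse square of the margin, halving the margin of error roughly quadruples the required sample size.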

Alec Pawlukiewicz, Neuroscience and Psychiatry

Effect of exercise on neurocognitive testing. Database of 20,000 participants; 9,000 after exclusions. Control for covariates: sex, age, education level, # prior concussions. Interested in a matched analysis, but there are not enough controls. Suggested using the full qualifying sample without matching, to maximize power and avoid any arbitrariness in how matches are determined. A non-matched analysis requires careful specification of the statistical model.

Several neuro scores are given by the test. If the scales are continuous enough, the standard multiple linear regression model can be used, analyzing one score at a time. May need to model age as a smooth nonlinear effect and perhaps likewise for education. Age and education may be collinear. The variable of major interest is exercise (binary). Need to consider whether exercise may interact with age, sex, etc. What about type of exercise? For variables such as # prior concussions a quadratic effect often suffices.
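One common way to model age as a smooth nonlinear effect is a restricted cubic spline, which is linear beyond the outer knots. The sketch below builds the (unnormalized) spline basis and one row of a design matrix that also includes a quadratic term for prior concussions and an exercise-by-age interaction. The knot locations, variable names, and the choice to interact exercise with linear age only are all assumptions made for illustration.

```python
def rcs_basis(x, knots):
    """Restricted cubic spline basis for x with the given ordered knots.
    Returns len(knots) - 1 columns: x itself plus the nonlinear terms."""
    t_last, t_prev = knots[-1], knots[-2]
    d = t_last - t_prev

    def pos3(u):
        return max(u, 0.0) ** 3  # truncated cubic (u)+^3

    terms = [float(x)]
    for t_j in knots[:-2]:
        terms.append(
            pos3(x - t_j)
            - pos3(x - t_prev) * (t_last - t_j) / d
            + pos3(x - t_last) * (t_prev - t_j) / d
        )
    return terms

AGE_KNOTS = [20, 30, 40, 55]  # hypothetical knot locations

def design_row(exercise, age, female, n_concussions):
    """One row of a hypothetical design matrix for a single neuro score."""
    return (
        [1.0, float(exercise), float(female)]
        + rcs_basis(age, AGE_KNOTS)
        + [float(n_concussions), float(n_concussions) ** 2]  # quadratic effect
        + [exercise * float(age)]  # exercise x age interaction (linear part only)
    )

print(design_row(exercise=1, age=35, female=0, n_concussions=2))
```

The rows would be passed to any least-squares routine; the fitting step is omitted here.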

Dillon Pruett, Hearing and Speech Sciences

Respiratory sinus arrhythmia. Comparing in children who do not stutter, stutter and persist, stutter and stop. They watch a video followed by a task, and this is repeated with different videos/tasks. Baseline re-measured at the end. Question about whether to form groups or to have a continuous-time longitudinal model with stuttering measures as the response variables (without categorization). Answer questions by estimating difference in means over time. Need to interpret the result in a clinically meaningful way. Need to adjust for baseline stuttering measure as a covariate. This might possibly be interacted with the intervention effect. Need to carefully formulate the linear model and account for within-subject correlation using something like GLS or mixed effects models (the latter is mainly used if there are more than 2 or 3 measurements over time within subject).


Omair Khan, Center for Research on Men's Health

  • "I would like to request some time to talk to another statistician about exploratory factor analysis I am doing in R with the psych package. This procedure is fairly new to me and I have some questions that I would like help with."


Mary Lauren Neel, Neonatal

  • Association between ITSP and illness severity score
  • Association between parenting style (PSDQ) and infant adaptation.

Mark Tyson, Urology

  • Bladder neck size on incontinence, controlling for BMI, age, preop score, disease status, and stitch.
  • Restricted cubic spline examples: MSCI Biostat II STATA


Dillon Pruett, PhD student in the dept of hearing and speech sciences working with Dr. Robin Jones

  • I'm working on a project involving longitudinal data with children who stutter and persist, children who stutter and recover, and children who do not stutter.


Scott Karpowicz

  • Matched design, 1:1, 1:many, BOOM
    • match on socio-economic, clinical factors, etc.
  • Change point analysis
    • see if readmission rates change at time of policy implementation
  • REQUEST FOR VICTR SUPPORT: Clinic statisticians recommend a 90 hour voucher.


Sam Gannon


  • Developing a randomized controlled clinical trial in mental literacy. Working notion: increase mental literacy and communication, which in turn improve mental health outcomes.
  • Plans to submit a concept paper to the National Institute of Mental Health (NIMH). Has questions to address and wants statistical expertise.
  • Questions: 4 educational arms and a control group, for a total of 5 groups. Setting: community mental health clinics. Metric for the outcome measure: clinician report notes on self-management, behavioral adherence to protocol, and rate of compliance. These response measures are known to be correlated. Intervention: different educational programs. Control will receive standard of care.
  • Consider cluster randomization. Figure out how many clinics you will have access to. With five arms, note that each clinic receives one arm.
  • How to assess "fidelity"? Record data consistently. Include some assessment of inter-rater reliability. How do you capture your outcome? If by survey or standard form, it will be much easier to make results consistent. For example, if reporting is done through REDCap, you will have the opportunity to formalize or standardize the process.
  • Mediation analysis (Baron & Kenny, structural equation modeling). First you need to show that your intervention has an association with the response variable. The mediator will be communication, for example. What factors mediate the intervention?
    • (Y~X) Education is associated with improved mental health.
    • (X~M) Education works through health literacy and/or communication (mediators) to improve mental health.
  • Will I benefit from a cross-over design? We believe that once knowledge is gained it will be difficult to have a "wash out". A cross-over design is more appropriate for a setting such as the development of a new drug with a clear wash-out.
  • Question from biostatisticians: do you need 4 arms? Can you combine some of these educational programs?
  • Transient effect: is it common in the literacy literature? Look into other clinical studies, such as those in diabetes, that require behavioral changes. There are issues of relapse and maintaining adherence.
  • Timeline: Extend two years follow up time to address the "transient effect", although most studies have short follow-up. Can you follow up subjects on StarPanel to show that you can address long-term effects? Need to sit down with statisticians to address the multiple issues realistically. How many clinics do you think you could have access to? Recruitment time? How many subjects are needed?
  • Consider short-term effects and long-term outcomes. Can you design your study pragmatically, without too much effort to collect data? Use the real setup, e.g., Dr. entries, for follow-up assessment.
  • Recommendation: Follow up with VICTR voucher and statistician for help with proposal.


Heather Lillimoe

General Surgery Resident

I am currently in the process of designing a research study pertaining to resident feedback within the department of surgery. My hope is to utilize REDCap as my primary mode of obtaining data. I was hoping to meet with a biostatistician as I apply for VICTR funding for the study. It involves an educational timeout before an operation. This is a 3rd-year rotation in plastic surgery. There is an iPhone app to do a competency rating.
  • Survey - baseline assessment - residents and attendings - 85 questions
  • Additional survey after rotation

Cara Singer

I am a PhD student in the Department of Hearing and Speech Sciences. I would like to attend the biostat clinic today (if possible) to discuss appropriate analyses for a study I am conducting under the mentorship of Robin Jones (Developmental Stuttering Lab). The study is investigating whether a risk factor assessment (a mix of categorical and continuous variables) can predict stuttering persistence. 70-80% spontaneously recover. Would like to identify those likely to persist, in advance, for focusing therapy. Multiple risk factors have been identified. Empirical evidence for supporting predictive ability of the risk factors is sought.
  • Children previously seen - diagnostic visit; 4y ago; stuttering up to 18m; English is primary language
  • New follow-up for status at one point in time
  • Baseline variables that originate from continuous measurements (e.g., age at onset) need to be analyzed as continuous variables
  • Include baseline stuttering severity as a predictor
  • With a maximum of 150 children the maximum number of candidate predictors might be around 10 if the outcome variable is almost continuous (it's worse if outcome is almost binary)
  • Stuttering is multi-dimensional, e.g., some children may reduce amount of speaking because of the problem, so they seem to stutter less
  • May consider a compound summary of all the outcome measures, e.g., average rank across children; clinical ranking of scenarios can also be used
  • Dependent variable needs to have at least 5 frequently occurring levels and be ordered or continuous
  • If there is one standout, popular scale, that one could be used by itself
  • Empirical variable selection requires an enormous sample size to reliably find the "right variables" so it's best not to use selection procedures; can find various approximations to the model for clinical non-computerized application
  • Data reduction methods (variable clustering, principal components, redundancy analysis) can be useful for effectively reducing the number of predictors to use in the multivariable model


Chris Brown, Internal Medicine Resident

  • To go over analysis produced by VICTR biostatisticians


Kazeem Oshkoya, Division of Clinical Pharmacology, Dr. Dan Roden's Lab

Data analysis on blood sample storage and drug concentration - look at whether a gel absorbs too much of a drug in the blood to make drug assessment accurate enough. Measured at baseline and 4h. Need to know how to describe the base value. Triplicate measurements available. More interested in relative comparison.
  • Best to present all the raw data
  • Might use 3 quartiles (25th and 75th percentiles and median) as descriptive stats and use Wilcoxon signed rank test for testing for a difference between baseline and 4h
  • There are also two types of samples - same study repeated with different samples, sample drug concentration
  • Only have 2 patients; plan to have 5 later
  • Better to not average over the 3 replicates - may hide variability
  • Bland-Altman plot (mean-difference plot) is a good way to show agreement and whether variation is stable over base levels. If band of variability expands going from left to right, this is an indication that perhaps the analysis should be done on the log concentration scale.
  • Other useful ways to summarize data: mean absolute difference between estimated and true concentrations - separately by no gel and gel
  • Can also show mean absolute differences between replicates ignoring the true concentrations
  • There are problems with lower limit of detection, representing missing values that are not randomly missing; ordinary analysis may be problematic
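The mean-difference (Bland-Altman) summary and the mean absolute difference suggested above can be sketched as follows. The concentrations are made up, the names are illustrative, and the ±1.96 SD limits are the conventional choice rather than something stated in the notes. The example is built so that variability grows with the level on the raw scale but is constant on the log scale.

```python
import math
from statistics import mean, stdev

def bland_altman(a, b):
    """Return per-pair means and differences, the mean difference (bias),
    and conventional limits of agreement (bias +/- 1.96 SD of differences)."""
    means = [(x + y) / 2 for x, y in zip(a, b)]
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return means, diffs, bias, (bias - 1.96 * sd, bias + 1.96 * sd)

def mean_abs_diff(estimated, true):
    """Mean absolute difference between estimated and true concentrations."""
    return mean([abs(e - t) for e, t in zip(estimated, true)])

# Made-up concentrations: gel tubes read about 10% lower than no-gel tubes
no_gel = [10.0, 20.0, 40.0]
gel = [9.0, 18.0, 36.0]

_, raw_diffs, raw_bias, raw_limits = bland_altman(no_gel, gel)
_, log_diffs, log_bias, _ = bland_altman(
    [math.log(x) for x in no_gel], [math.log(x) for x in gel]
)
print(raw_diffs, raw_bias, raw_limits)
print(log_diffs, log_bias)
print(mean_abs_diff(gel, no_gel))
```

Plotting the differences against the means (not shown) gives the usual picture; keeping the replicates as separate points rather than averaging them preserves the variability the notes warn about hiding.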

Jessica Dennis, Lea Davis, Genetic Medicine

Modeling lab values to look for genetic variation; data from the synthetic derivative
  • Interested in variation over time within patient
  • Variants are summarized into polygenic risk scores
  • Difficulty in interpreting results if patients are being treated for the lab abnormality being studied
  • How to define time zero?
  • May want to ignore records corresponding to post-Rx periods
  • Started with HDL
  • Side study: confirm that med initiation that is supposed to modify HDL really does
  • Simplest longitudinal analyses:
    • Compute within-patient Gini's mean difference to correlate with gen. risk score; asks whether gen. risk is correlated with variability
    • Similar but summarize with the median to correlate gen. risk with overall height of the longitudinal records
    • Summarize entire longitudinal record with slope and intercept; AUC and relate summary measures to gen. risk score
  • Would be useful to summarize the data using representative patients after clustering on mean HDL, shape, number of observations, maximum time gap between any two measurements
  • Another type of analysis: summarize each patient using the 9 deciles of HDL; use these deciles to predict polygen. risk score
    • Does not take time ordering into account
    • Might add a slope or shape summary to the deciles
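The simplest within-patient summaries listed above (Gini's mean difference, median, and slope/intercept) can be sketched for a single patient as follows. The HDL values and times are made up, and the function names are hypothetical; each summary would then be correlated with the risk score across patients.

```python
from itertools import combinations
from statistics import mean, median

def gini_mean_difference(values):
    """Mean absolute difference over all pairs of observations (variability)."""
    return mean([abs(a - b) for a, b in combinations(values, 2)])

def ols_slope_intercept(times, values):
    """Least-squares slope and intercept of values regressed on times."""
    tbar, vbar = mean(times), mean(values)
    sxx = sum((t - tbar) ** 2 for t in times)
    sxy = sum((t - tbar) * (v - vbar) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, vbar - slope * tbar

# Made-up record for one patient: years since a chosen time zero, HDL values
times = [0, 1, 2, 3]
hdl = [40, 44, 42, 50]

summary = {
    "gmd": gini_mean_difference(hdl),  # within-patient variability
    "median": median(hdl),             # overall height of the record
    "slope_intercept": ols_slope_intercept(times, hdl),
}
print(summary)
```

Note that Gini's mean difference and the median ignore time ordering, while the slope does not; patients with a single measurement need separate handling.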


Amanda Peltier, Department of Neurology

Discuss Aims and power analysis for R01


Jake Landes, PT, DPT Vanderbilt Sports Medicine, Rehabilitation Services

  • I am a physical therapist in the Sports Medicine outpatient department and we are planning two studies that we would like to discuss. Primarily, though, we would like to discuss a prospective observational study we will be performing this coming school year with overhead athletes – we will be looking at the relationship of core strength to the likelihood of shoulder injury in overhead athletes. We plan to test the athletes’ core strength at start of their season and then collect data on injuries and time lost from playing their sport during the season. Specifically, we have questions about what our number of subjects should be in order to determine a difference and what we will need to do statistically in order to analyze the data.
  • Outcomes: number of days (or proportion) lost during the season due to shoulder injuries
  • Need information on the proportion of athletes who would get shoulder injury during a season. Sample size needed would be large if the proportion is very low.
  • Could use logistic regression to examine association between core strength and incidence of injury
  • Consider other factors that could affect shoulder injury such as the type of sport, number of years practicing, etc. These factors can be adjusted for in the regression model.
  • To calculate the sample size, need to specify the outcome, type of analysis used, the meaningful difference (effect size: odds ratio of injury upon one unit change in core strength) you want to detect, and some preliminary data on the outcome measurements (rate or variation). A rule of thumb: 20 cases of injury are needed for each factor you'd like to analyze.
  • Consider choosing a type of sports with the greatest association between core strength and shoulder injury.
  • How to quantify core strength? A single summary score?
  • A second study I am wondering about is an Anterior Cruciate Ligament Reconstruction study where we are going to compare a group of patients in a home based program versus standard care (control). We are wanting to do a feasibility study this year in our clinic, and I think it will be a prospective case-control study, or maybe prospective cohort; we also want to know about N size and analysis afterward.
  • Enroll 7 patients in one month. Feasibility study.
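For the shoulder-injury study, the rule of thumb of 20 injury cases per factor can be turned into a rough enrollment target once the seasonal injury proportion is known. The sketch below is only that arithmetic; the number of factors and the injury proportions are hypothetical, and the result is not a substitute for a formal power or precision calculation.

```python
import math

def athletes_needed(n_factors, injury_proportion, cases_per_factor=20):
    """Return (injury cases needed, athletes to enroll) under the
    cases-per-factor rule of thumb."""
    cases_needed = cases_per_factor * n_factors
    # the small tolerance guards against floating-point noise before rounding up
    athletes = math.ceil(cases_needed / injury_proportion - 1e-9)
    return cases_needed, athletes

# Hypothetical: core strength plus two adjustment factors, 10% injured per season
cases, athletes = athletes_needed(n_factors=3, injury_proportion=0.10)
print(cases, athletes)
```

The calculation makes the earlier point concrete: if only 5% of athletes are injured in a season, the same three factors would require twice as many athletes.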


Katherine McDonell, Neurology

  • Parkinson's disease - norepinephrine; VICTR application
  • Original intention peripheral blood pressure support
  • Interested in a combined medication regimen
  • Goal to get nor. into CNS
  • Propose to study n=16 patients
  • Need dose titration 100mg bid -> 600mg 3/day
  • Which dose do patients tend to end up with?
  • Is a safety & tolerability study, partly dose-finding
  • Patient response that is monitored is blood pressure - minimizing orthostatic symptoms without side effects; target supine BP plus headaches, dizziness, mania; symptoms are of primary emphasis
  • Is there an accepted symptom summary scale? If not may need to just count the number of symptoms present
  • But dose adjustments are clinical adjustments based on a symptom "gestalt"
  • Target for analysis is final dose
  • Need SD of dose; best available data will probably come from what doses are used long-term in clinical practice; we'll assume this is a stand-in for the final tolerable dose
  • Once a useful SD estimate is found, it can be used to compute the likely margin of error in estimating the population mean required dose when n=16, with say 0.95 confidence. The margin of error is the half-width of the confidence interval.
  • Would be good to know what evidence exists for the usefulness of plasma drug concentrations in estimating the final required dose
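The margin-of-error calculation for n=16 can be sketched as follows, using margin = t × SD / √n. The SD value is a hypothetical placeholder to be replaced by whatever estimate is found from long-term clinical dosing; the critical value 2.131 is the 0.975 quantile of Student's t with 15 degrees of freedom.

```python
import math

T_975_DF15 = 2.131  # 0.975 quantile of Student's t with n - 1 = 15 df

def margin_of_error(sd, n, crit):
    """Half-width of the confidence interval for a population mean."""
    return crit * sd / math.sqrt(n)

assumed_sd_mg = 150.0  # hypothetical SD of final dose, in mg
moe = margin_of_error(assumed_sd_mg, n=16, crit=T_975_DF15)
print(round(moe, 1))
```

With n=16 the margin is a little over half the SD, whatever the SD turns out to be, which gives a quick way to judge whether 16 patients will estimate the mean final dose usefully.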


Reagan Leverett, MD, MS, Assistant Professor, Department of Radiology, Women's Imaging

  • PQI project. Two types of images (new vs. old method) were performed for each patient.
  • Examine the agreement between the two methods based on the paired data (kappa stat). Readings are ordinal values.
  • Let a few radiologists read the two sets of images in random order to study the agreement.
  • May need a couple hundred patients and a few (2 to 6) radiologists. (Also want to have good agreement between radiologists; that is, readings from a given method should not depend heavily on the experience of the radiologist.)
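Because the readings are ordinal, a weighted kappa, which gives partial credit for near-misses, is one reasonable form of the kappa statistic mentioned above. The sketch below is a plain implementation of Cohen's weighted kappa for paired readings coded 0, 1, 2, ...; the choice of linear weights and the example readings are assumptions for illustration.

```python
def weighted_kappa(ratings_a, ratings_b, n_levels, power=1):
    """Cohen's weighted kappa for paired ordinal readings coded 0..n_levels-1.
    power=1 gives linear disagreement weights, power=2 quadratic."""
    n = len(ratings_a)
    counts = [[0] * n_levels for _ in range(n_levels)]
    for a, b in zip(ratings_a, ratings_b):
        counts[a][b] += 1
    row = [sum(counts[i]) / n for i in range(n_levels)]
    col = [sum(counts[i][j] for i in range(n_levels)) / n for j in range(n_levels)]

    def weight(i, j):
        return (abs(i - j) / (n_levels - 1)) ** power

    cells = [(i, j) for i in range(n_levels) for j in range(n_levels)]
    observed = sum(weight(i, j) * counts[i][j] / n for i, j in cells)
    expected = sum(weight(i, j) * row[i] * col[j] for i, j in cells)
    return 1 - observed / expected

# Made-up readings of the same 6 patients with the old and new methods
old_method = [0, 0, 1, 1, 2, 2]
new_method = [0, 1, 1, 1, 2, 1]
kappa = weighted_kappa(old_method, new_method, n_levels=3)
print(round(kappa, 3))
```

The same function could be applied to pairs of radiologists reading the same method to check between-reader agreement.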


Akshitkumar Mistry

Reserved spot for consulting with Chris F. about meta-analysis


Stephen Patrick, Assistant Professor of Pediatrics and Health Policy, Division of Neonatology

  • Mary-Margaret Fill, TDH EIS
  • Neonatal abstinence syndrome and long term outcomes
  • Merge TennCare data with educational data
  • Suggest regression model with traditional covariate adjustment unless need to do special matching (family, neighborhood)
  • Biggest assumptions: children move away from TN for reasons unrelated to potential educational achievement
  • Confounding: women giving birth to infant with NAS may tend to be different from those not having an NAS child; need to adjust for all factors related to this that might be associated with educational outcome
  • Also what is the effect of school on test scores?
  • Birth records have mother's educational level, zip code, tobacco use
  • Matching records may be challenged by mother changing last name
  • Might also look at infant and mother utilization of services, diagnosis of ADHD, etc.; cross-correlate with educational achievement


Lindsey McKernan

Here is the feedback I received on my application: Power analysis never should involve having a power of detecting a previously observed (and probably measured with bias) effect. Power should always be defined as the probability of detecting a minimal clinically meaningful effect. Also, this type of study is more suited for justifying sample size on the basis of precision of an effect of interest (usually a difference or a correlation). Precision is stated as a margin of error e.g. half-width of a confidence interval. Please revise Section E of the proposal and feel free to attend a clinic to discuss.

What was initially written: Power Analyses: Previous researchers have found moderate relationships between trauma severity and pain symptoms (r = .29; Poundja, Fikretoglu, & Brunet, 2006). Power analyses using unadjusted effect size from this study based on their sample size of 130 suggest a necessary sample size of at least 97 for the present study to reveal similar effects. Power analyses of the results of studies of the relationships between trauma severity, pain severity, experiential avoidance, and anxiety sensitivity (Gootzeit, 2014; Ruiz-Párraga & López-Martínez, 2015) suggest that a sample size of 144-158 is necessary to find these associations. The hypotheses outlined above will be tested through bivariate correlation and linear regression analyses. Specifically, relationships among variables of interest (Hypotheses 1A, 1B, 2A, 2B) will be assessed through Pearson product-moment correlation analyses to determine the strength of the association among these constructs in our sample. Tests of moderation (Hypotheses 2C, 3) will be tested using multiple linear regression with cross-products of the variables of interest to assess the interaction between predictors. All analyses will be carried out on either SPSS 22 (IBM, 2013) or the R statistical package (R Development Core Team, 2010)
  • See Chapter 8, P. 8-12 of - suggest using the r=0 curve. This approach is using the margin of error based on 0.95 confidence limits. E.g.: "With a sample size of N subjects we can estimate the correlation coefficient between two variables to within a margin of +/- xx with 0.95 confidence (see graph)."
  • Important to prioritize the comparisons and to report them in this pre-specified order so that no multiplicity corrections will be needed
  • A regression model that allows for interaction between time since trauma and amount of trauma would allow for estimation of the time-decay or enhancement of memories-effect. The time interaction effect may be nonlinear.
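The margin of error for a correlation coefficient near r=0 can be approximated with Fisher's z transformation, as in the sketch below. This is an assumed stand-in for the graph referenced above, not a reproduction of it, and the sample sizes shown are only examples.

```python
import math
from statistics import NormalDist

def corr_margin_at_zero(n, conf=0.95):
    """Approximate half-width of the confidence interval for a correlation
    when the true r is 0, via Fisher's z transformation."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return math.tanh(z / math.sqrt(n - 3))

for n in (50, 100, 150, 200):
    print(n, round(corr_margin_at_zero(n), 3))
```

The output fills in the "N" and "xx" of the suggested sentence; for example, with 100 subjects the correlation is estimated to within roughly ±0.2 with 0.95 confidence.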
Topic revision: r1 - 15 Jan 2021, DalePlummer
