Clinical and Health Research Clinic
Current Notes (2023)
2023 December 7
Tara Helmer, VICTR
We are working with a PI who wants to use My Health at Vanderbilt to recruit patients for her survey study. We would like to randomize patients to receiving the study invitation via a central mechanism (MHAV Recruitment Requests) or direct messaging by the study team (similar to clinical communications). We often get asked if one mechanism is more effective than another and want to explore this research-on-research opportunity.
In attendance: Frank Harrell, Terri Scott, Tara Helmer, Jackson Resser, Cass Johnson
Is it feasible to study the difference in response rates between MHAV recruitment request and MHAV messages?
Caveats: Patients can manage notifications for both, and may set different preferences for each.
Notifications may look different, and message appearance would be different as well.
Study of interest: Impact of Restricted Medication Access on Care of Multiple Myeloma Patients (PI: Autumn Zuckerman). About 90 patients will be enrolled. PI asked if there was difference in response rate between the two mechanisms.
Patients would need a MHAV account.
Can it be determined whether a patient has viewed the message? Response rates may be highly dependent on patient notification settings, which we may not be able to confirm.
We can determine when the message was sent and when a patient showed or refused interest in the study.
Tara may consider randomization of whether a patient receives message via recruitment request or MHAV message. Currently, the choice is largely determined by study team size and available time, as MHAV messages are more manual than the recruitment request.
All patients will come from the same report (must be marked OK to contact). Randomization could occur from there. Eligibility criteria would have already been accounted for (fairly basic inclusion / exclusion criteria for this study). It is also possible that additional screening is required.
Also limited to "active" patients (per Epic classification).
Frank also notes on sample size: if you recruit 90, the more even split you can get the better. The margin of error for estimating the difference may be around 0.15 (15 percentage points) at that size; not very precise, but may still give useful insight.
Goals would be to have a response for investigators as they determine how they would like the study to be messaged; VICTR may also want a manuscript as a result.
As long as denominators stay constant, the information should provide meaning.
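The margin of error mentioned above can be sketched with the usual Wald formula for a difference in two proportions. This is only an illustration: the response rates below are hypothetical (the notes do not say which rates were assumed), and the function name is made up for this example.

```python
from math import sqrt
from statistics import NormalDist


def moe_diff_proportions(p1, p2, n1, n2, conf=0.95):
    """Half-width of the Wald confidence interval for p1 - p2."""
    z = NormalDist().inv_cdf(0.5 + conf / 2)
    return z * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)


# Hypothetical response rates; 45 per arm is the even split of 90 patients.
for p in (0.10, 0.15, 0.20, 0.50):
    print(f"rate {p:.2f}: margin of error {moe_diff_proportions(p, p, 45, 45):.3f}")

# An uneven split of the same 90 patients gives a wider margin of error.
print(f"30/60 split at rate 0.20: {moe_diff_proportions(0.2, 0.2, 30, 60):.3f}")
```

Under these assumptions the margin of error is near 0.15 when response rates are around 15% and grows toward 0.2 as rates approach 50%.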
2023 November 16
Kelly Vittetoe (Alexander Gelbard), Otolaryngology
Clinical trial studying the effect of peripheral nerve block on postoperative opioid requirements after head and neck cancer resection with free flap reconstruction
Current pain management: narcotics/IV meds
Investigate if treatment will reduce pain scores and overall satisfaction
Items for today: sample size, optimal study design
150 free flaps a year
Best way to randomize
Literature review: effect of IV tylenol
Frank: previous studies provide measures of patient to patient variability
 Power of two-sample t-test depends on standard deviation of the outcome
 More variability, power goes down, more sample size needed
 Test-retest unreliability is noise: a bad characteristic for an outcome
Want response to have little variability within the patient
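To illustrate how the outcome's standard deviation drives sample size, here is a minimal sketch using the normal approximation to the two-sample t-test. The SDs and the 1-point difference in pain score are hypothetical placeholders, not values from this study, and the approximation slightly understates the n an exact t-based calculation would give.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(sd, delta, alpha=0.05, power=0.9):
    """Approximate n per arm to detect a mean difference `delta`
    when the outcome has standard deviation `sd` (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sd / delta) ** 2)


# Hypothetical SDs of a 0-10 pain score; delta = 1 point is also an assumption.
for sd in (2, 3, 4):
    print(f"SD {sd}: about {n_per_group(sd, delta=1)} patients per arm")
```

Required n grows with the square of the SD, which is why a noisy outcome is costly.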
Is administration of morphine standardized?
 Pain score correlates with dosage
Pain level is logical primary outcome, secondary could be narcotic utilization
 Pain score would be reliably captured in EMR (0-10 scale; pain of 1-7 = oxy5, 8-10 = oxy10)
 Want to be able to distinguish level 7 from level 10
Placebo effect: surprisingly absent for a lot of medical outcomes
Reliability of pain assessment and ability to be assessed by blind observer
 Bedside nurses rate the pain scale and administer the medicine
 In most patients, nurse will not know whether block was done or not
Improve power: measure calibration between patients
Covariate adjustment (no downside to adjusting for a covariate if it turns out it doesn't predict anything):
 neuro assessment to adjust for risk
 cancer site pain
 age
 smoking status (most would be smokers and drinkers)
Sample size calculation using:
 relative frequencies of pain level
 Minimal clinically important difference in pain level (difference you don't want to miss)
Group-level randomization vs participant-level randomization
 intracluster correlation coefficient
Will need at least 20 clusters in a cluster randomized trial
 Fewer leaves you susceptible to chance imbalance
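The cost of cluster randomization can be sketched with the standard design effect, 1 + (m - 1) x ICC, where m is the cluster size. The cluster size and ICC values below are hypothetical; only the 150 free flaps per year comes from the notes.

```python
def design_effect(cluster_size, icc):
    """Variance inflation from randomizing clusters instead of individuals."""
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(n_total, cluster_size, icc):
    """Number of independent patients the clustered sample is worth."""
    return n_total / design_effect(cluster_size, icc)


# Hypothetical: 150 patients in monthly clusters of about 12, ICC assumed.
for icc in (0.01, 0.05, 0.10):
    ess = effective_sample_size(150, 12, icc)
    print(f"ICC {icc:.2f}: design effect {design_effect(12, icc):.2f}, "
          f"effective n about {ess:.0f}")
```

With an ICC of zero the two designs are equivalent; any positive ICC reduces the effective sample size, which is the sense in which individual randomization gives the highest power.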
Individual randomization has problems, but gives you the highest power
 Person on the ground, for every case, defining block Y/N
 Other option: everyone gets a block in January, no one gets a block in February, and flip flop
Next step: Learning Health System to drill down trial design
General lead time on VICTR grant: 8 weeks
2023 November 9
Stephen Camarata, Hearing and Speech Sciences
Analysis of RCT data on speech, language, literacy, and hearing.
 Advances in cochlear implant technology; improve auditory signals
Strictly randomized crossover study: N = 48, one withdrew after baseline -> 47
Treatment intervention: is there a benefit to cochlear implant tune-up in hearing status? Does this predict performance changes?
 Problem: a lot of measures, will need to subset
Measures NOT amalgamated in the literature; kept distinct
 Pitch aggregation method
15 measures of spectroresolution
Q1: Is there an improvement? Q2: what predicts that improvement?
Frank: desirability of outcome ranking (DOOR): which treatment resulted in the most desirable outcome? No clinical interpretation (it is relative), but projected into one dimension
 Rank-difference test -> signed-rank test, respects pairing
Randomization test or a priori way to aggregate variables that share same variance?
 PCA, sparse PCA (combo of clustering and PCA): find out which variables move together across kids
SAP drafted, first 20 participants unblinded (regression-oriented model)
 So many variables that family-wise error will be a problem
Frank: two challenges
1) number of parameters
2) unblinded part of data: once unblinded, you need to use statistical plan without modification
Univariate approach worth exploring
Another research question: given standard scores & percentile ranks
Z-scores make assumptions: they assume the reference population was selected at random
 SD needs to be meaningful (measure must be symmetrically distributed)
Percentiling is like assessing speed of runner relative to other runners rather than using a stopwatch
Frank's preference is to do things on raw data scale (raw scores will have different meaning for different ages)
Raw score regardless of age is like a stopwatch
Jennifer Duke (Samira Shojaee), Interventional Pulmonary
Consideration of multistate analysis for endpoints in an RCT looking at patient outcomes of chest tube flushing vs no flushing.
No RCTs evaluating role of regular chest tube flushing in the setting of pleural space infection for optimal drainage and treatment outcomes
Many studies of pleural space infection do not report a chest tube flush protocol
Hypothesis: regular flushing of catheters leads to early tube removal
Target recruitment: N = 96
Stat considerations:
 Multistate model for time to event analysis
 Cumulative failure of therapy/drainage curve will be calculated from the Kaplan-Meier method and compared using log-rank test
 MV linear regression to evaluate treatment effect on outcomes measured 1 week after treatment randomization adjusting for measurements at baseline and covariates
tPA group vs non-tPA group; chest tube = clogged or patent; chest tube + tPA was considered treatment failure or not
8-state transition model with 17 total transitions: some states extremely unlikely
Original plan: endpoint = time to chest tube removal
 time to something good is only interrupted by something bad (never good)
 time to event is hard to interpret, hard to handle censoring
Put patient paths in a hierarchy (how bad was the worst thing that happened to a patient in a given day)
 One parameter: odds ratio, measures odds of transitioning to something worse
 Ordinal transition model requires clinical consensus on what is better and what is worse
 Bad = level X or worse
Ordinal transition model: rare states are not a problem
Next step: order levels for transition model by severity
What is frequency after you take overrides into account?
Literature is missing several states (Jen developed CRF so information is captured on daily basis)
 Be clear on primary endpoint
 Proportion of patients getting level 3 or level 4 outcome? State levels you want to calculate probabilities of from multistate model
Frank may seek permission to share Vancomycin protocol (or at least statistical analysis portion) with the investigators
2023 November 2
Garrett Booth, Pathology
Preliminary stage.
I am seeking feedback on the feasibility of performing a meta-analysis on gender award conferral rates. To date, several studies have been conducted demonstrating gender inequities in different medical society award conferral rates.
For example, Am J Clin Pathol. 2022 Oct 6;158(4):499-505. doi: 10.1093/ajcp/aqac076, which highlighted significant gender inequities within my field of practice, pathology.
My primary aim is not to clear up a controversy, rather I would like to see if it is possible to collect (and how best to collect) data on the effect sizes and compare them across different studies that have looked at gender inequities in medical awards.
Garrett Booth, Frank Harrell, and Cass Johnson in attendance:
· Can effect size of authorship and authorship attribution be used in a meta-analysis?
· Does double-counting of folks on different clinical practice guidelines count?
· Frank: There is nothing about this context that makes meta-analysis less useful than for a medical attribute, for example.
· Time-oriented flowchart; would that look like a denominator where a woman enters as a pathologist, and we are interested in what happens from there?
o Garrett: Primary question is more how equitable is representation across all these smaller fields of study. Meta-analysis could look at authorship attributes and ideally determine the effect of the bias.
· Frank: If available, longitudinal data is some of the most effective data (apart from randomized); or, if age of each person (compared w/ average age of entering the profession) could be determined, perhaps use that in place of longitudinal data?
o Unlikely to have that information right now.
· Cass emailed previous investigators on 11/2/2023 for information on the meta-analysis management software they used and will reach out as soon as a response is received. If needed, can also provide an example template for how meta-analysis data may be organized so that it can be easily used in statistical analysis.
· Additional resource on meta-analysis: Doing Meta-Analysis in R (bookdown.org)
Ashley Leech (Shawn Garbett), Health Policy
I would like feedback on the analysis below.
d. Research Design. The study design will be a retrospective observational cohort study of pregnant individuals with either a diagnosis of opioid use disorder or evidence of medication use for opioid use disorder at least three months prior to pregnancy. We will require continuous enrollment three months prior to pregnancy and up to 28 days postpartum. For sensitivity analyses, we intend to relax the study/continuous enrollment criteria for the pre-pregnancy period, and how we define the threshold of pre-/post-pregnancy based on estimated LMP and gestation. Our research goal is to estimate the effect of medication switching from pre- to post-pregnancy on the incidence rate of adverse events related to opioid use disorder and pregnancy complications. Our design will answer the following research question: What are the potential risks of switching OUD medications from pre-pregnancy to pregnancy for (1) Individuals who switch “down” (i.e., methadone to buprenorphine), (2) Individuals who switch “up” (i.e., buprenorphine to methadone), (3) Individuals who stopped MOUD from pre-pregnancy to pregnancy (defined as a medication gap greater than 14 days); and (4) Individuals who did not switch medication from pre-pregnancy to pregnancy; measured up to gestational week 19.
e. Exposure. The exposure for Aim 2 will be individuals who had any switch in medications from pre-pregnancy to pregnancy (from three months before pregnancy through gestational week 19), stratified by: (1) Exposure (a): Individuals who switch “down” (i.e., methadone to buprenorphine) and (2) Exposure (b): Individuals who switch “up” (i.e., buprenorphine to methadone). The comparator groups will include (3) Individuals who stopped MOUD from pre-pregnancy to pregnancy (defined as a medication gap greater than 14 days); and (4) Individuals who did not switch medication from pre-pregnancy to pregnancy. We will take an intent-to-treat approach where we will focus on an individual’s first switch due to our preliminary data findings showing that downstream switches could be a result of the first switching decision.
f. Outcome. The outcomes will be measured from week 20 through the neonatal period (within 28 days post-delivery). We will measure two outcomes: (1) Complications of OUD; and (2) Complications of pregnancy. Complications of OUD will be defined as overdose, hospitalizations, infections such as endocarditis, abscess, osteomyelitis, and maternal death, while complications of pregnancy will be defined as hemorrhage, primigravida cesarean section, preeclampsia, and chorioamnionitis.
g. Covariates. A priori covariates in our model will be measured 90 days prior to estimated LMP (pre-pregnancy period) and will include demographic variables such as age, race/ethnicity, income, zip code/census divisions, eligibility group code, pharmacotherapy dose prior to switching, days between previous medication to switched medication, median travel distance to medication prescriber prior to switching (Aim 1), opioid use disorder severity (e.g., opioid-related emergency department and inpatient visits), non-opioid substance use, other mental health conditions, chronic comorbidities, and prenatal care engagement. We will account for relapse-related indicators such as opioid-related emergency department and inpatient visits and gaps in medication use prior to the first medication switch (operationalized as a binary yes/no variable). Additionally, we will account for the following factors in our analysis: The number of medication switches during the exposure period, whether individuals who stopped MOUD in early pregnancy later switched back to MOUD at any point during the exposure period, and the “type of switches”, i.e., antagonist or partial agonist to partial or full agonist; full or partial agonist to partial or antagonist.
h. Analysis. We will begin by performing several descriptive analyses to gain a better understanding of our sample cohort. These analyses will include examining demographic and other characteristics, quantifying the individuals within each exposure group, and, for those who switched medications, calculating the median number of switches during the exposure period. We will then employ a propensity score method with overlap weighting to balance important patient characteristics across treatment groups. We expect that our medication comparator groups could be quite different, and overlap weighting is particularly advantageous when comparator groups are initially very different. Following our propensity score design, we will report both risk ratios and results from an extended Cox proportional hazards survival model, specifically the Prentice, Williams, and Peterson (PWP) model (total and gap times since the previous event), to account for recurrent and competing outcome events. Reporting both risk ratios and time-to-event outcomes accounting for competing events provides a more comprehensive understanding of the relative risk of an event occurring across groups, while also providing detailed information about when events occur and the impact/interplay of multiple potential events over time.
Ashley Leech, Shawn Garbett, Frank Harrell, and Cass Johnson in attendance:
· Analysis: Propensity score weighting, Cox proportional hazards with recurrent / competing events
· Feedback from Frank:
o May be two reasons to change medication:
§Planned change vs. reactionary change
§Is causal analysis needed to get rid of this feedback loop?
· Andrew Spieker, Bryan Shepherd (and a few others) specialize in causal inference within the department. We could ask someone to join clinic.
o Internal time-dependent covariates are present here.
§External version would be crossover study where everyone must switch drug at a certain time.
§Interpretation is made more difficult with internal time-dependent covariates.
§If covariates aren’t updated frequently enough, what we are trying to learn from our change variable will be difficult to interpret.
§Propensity adjustment may not be sufficient for that.
· Ashley: Propensity score weighting was chosen because exposure groups may be vastly different. Ashley wanted to account for that via several covariates (about 10). Sample size is estimated at 70,000, but many have not switched at all (lots of 0’s).
o Static Propensity score: if we’re looking at baseline or characteristics at one point in time, then we are not accounting for people switching back and forth (increasing dosage and then decreasing, for example).
§The present situation is more dynamic; time-dependent covariates are important.
§Confounders would need to be measured within days / weeks of switch
o How do you want to word your conclusion? We learned something that gives the recipe for required changes to effect better outcomes (causal), or a non-causal conclusion?
§Frank suggests going without the propensity score weighting
§Miguel Hernan (Observational studies analyzed like randomized experiments: an application to postmenopausal hormone therapy and coronary heart disease, PMC (nih.gov)) could serve as a great case study to look into. Can get same results as RCT if time-dependent covariates were well understood and updated frequently.
· He also has a great book on causal inference (Causal Inference: What If, Miguel Hernan's Faculty Website, Harvard T.H. Chan School of Public Health)
2023 October 26
Mert Demirci (Annet Kirabo), Nephrology
We would like to add a substudy to our R01-funded main study investigating health equity (race differences in salt sensitivity), and I will be the primary investigator. I submitted my grant to VICTR for funding.
As a new research fellow, I have limited knowledge of biostatistics. I would like to discuss biostatistical analysis options for our small sample size substudy.
Main study conducted with NIH funding for three years. N = 24 (6 black, 18 white), still enrolling
 Recent published paper shows no difference in race (but enrolling black patients is difficult)
 Limited budget
 Existing research shows black people are more salt-sensitive
 Add substudy to target enrollment of black patients
New aim 1
Chose Mann-Whitney U test because of small sample. Power analysis?
Frank: Issues investigators are experiencing are very common
 Fisher: When p-value is large, you need more data. P-values do not handle small sample sizes well
You already know blacks are more salt-sensitive. The question is how much
 Use confidence intervals
With small sample size, need to choose one parameter. "Put all your eggs in one basket"
 Calculate single confidence interval for association you are most interested in
To cut margin of error in half, you need 4x as many participants
 Precision is proportional to the square root of the sample size
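The 4x rule follows from the margin of error of a mean being z x SD / sqrt(n). A minimal sketch, with an arbitrary SD of 1 and the current N of 24 used purely for illustration:

```python
from math import sqrt


def moe_mean(sd, n, z=1.96):
    """Approximate 0.95 margin of error for a mean (normal approximation)."""
    return z * sd / sqrt(n)


base = moe_mean(sd=1.0, n=24)       # current enrollment in the notes
quadrupled = moe_mean(sd=1.0, n=96)  # four times as many participants
print(f"n=24: {base:.3f}, n=96: {quadrupled:.3f}, ratio {base / quadrupled:.1f}")
```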
Also interested in female vs male
Width of CI will narrow with more data, but center will move around
A precisely measured variable lends itself better to small sample sizes than a variable that is not precisely measured (race)
Mert: if we run some correlation studies between ADMA and salt sensitivity, can we compare groups?
 Frank: no free lunch. To have MOE of about +/- 0.1 for a correlation, you need 400 patients. Would need even more patients to compare groups
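The 400-patient figure can be checked with the Fisher z-transformation interval for a correlation coefficient. This is a sketch assuming a Pearson correlation near zero, where the interval is widest; the function name is hypothetical.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def correlation_ci(r, n, conf=0.95):
    """Confidence interval for a correlation via Fisher's z-transformation."""
    z = NormalDist().inv_cdf(0.5 + conf / 2)
    half_width = z / sqrt(n - 3)
    center = atanh(r)
    return tanh(center - half_width), tanh(center + half_width)


for n in (24, 100, 400):
    low, high = correlation_ci(0.0, n)
    print(f"n={n}: r = 0 has limits ({low:.2f}, {high:.2f})")
```

At n = 24 the limits are roughly +/- 0.4, so an observed correlation in a sample that size is compatible with almost any true value of practical interest.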
Annet: Come up with genes related to nitric oxide
Frank: only way to separate race and sex would be to have a balanced dataset (difficult to achieve)
 Boost sample size with repeated measurements? Study patients under different conditions
False discovery rate irrelevant without false non-discovery rate
 Procedure could have zero power to detect genetic characteristics on account of FNDR
Research with small sample size is tough
hbiostat.org/bbr
 One chapter on statistical inference
 One chapter on high-dimensional work, check robustness of findings
We are planning a pragmatic RCT for flushing with saline vs no flushing in patients with infected pleural spaces requiring chest tube placement. Primary outcome will be days until chest tube removal. Primary question would be to review plan of multistate analysis for data.
2023 October 19
Marina Aweeda (Alexander Langerman), Otolaryngology
We are working on a video-based coaching project to teach a surgical procedure to residents. We are interested in asking about validated survey tools (such as OSATS, OCHRA, SURG-TLX) and which ones would be the most appropriate for statistical analysis. We’d also like to ask for recommendations on what types of data to collect for the statistical analysis.
No formal feedback system to give residents
Video-based coaching intervention (highlight reel of most important steps of operation)
 Was this useful, feasible, repeatable?
Unable to power study with max N = 10 residents (up to 8 attending)
 Simplest thing you can learn (proportion of residents who responded a certain way)
 Minimum N to estimate a proportion to within +/- 0.1: about 96
 Cannot generalize to population of residents with reduced sample (margin of error would be wide)
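The figure of about 96 corresponds to a 0.95 margin of error of +/- 0.1 for a proportion in the worst case (p = 0.5). A short sketch of that calculation, and of what the same formula gives for 10 residents:

```python
from math import sqrt
from statistics import NormalDist


def n_for_proportion(moe, p=0.5, conf=0.95):
    """Sample size giving margin of error `moe` for a proportion (Wald)."""
    z = NormalDist().inv_cdf(0.5 + conf / 2)
    return z ** 2 * p * (1 - p) / moe ** 2


def moe_for_proportion(n, p=0.5, conf=0.95):
    """Margin of error for a proportion estimated from n subjects (Wald)."""
    z = NormalDist().inv_cdf(0.5 + conf / 2)
    return z * sqrt(p * (1 - p) / n)


print(f"n for +/- 0.1: {n_for_proportion(0.1):.0f}")
print(f"margin of error with 10 residents: {moe_for_proportion(10):.2f}")
```

The Wald formula is itself crude at n = 10; it is used here only to show the order of magnitude of the uncertainty.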
Can learn a little bit (quantify uncertainty with uncertainty intervals)
Which validated survey tools to use?
Accounting for observer variability is important
Ideal set up with different observers: each resident scored by several different observers
 Minimize observer variability by averaging
If one observer scores all participants: scoring is uniform within and between residents, but results depend on that single observer; still preferred over a different observer for each resident
Timing of intervention would give away whether participant was in pre or post
 Could hold off on evals completely until the end
Some experts that could help with that (sociology dept)
OSATS: quantitative data (number score)
OCHRA: scored on performance in surgery
SURG-TLX: evaluates mental workload, how demanding, etc.
One way ANOVA used?
With N = 12, statistical test would be more misleading than helpful
Better and safer to quantify what you have with confidence limits
 Confidence limits penalize for small sample size
 Wouldn't talk about power; instead use confidence limits for mean differences pre & post
VICTR voucher almost not needed, but could help with some stats if needed
 90 hours, $5k grant
Also VICTR studio  free, invite experts from other fields
Design ideas:
 Increase sample size, could give more options
 Randomize residents -> half get intervention, half don't -> parallel group randomized trial
 Could attribute differences in groups to the intervention
 Strongest way to determine that intervention caused the result
Hybrid approach: delay intervention, assess people randomized to have intervention late at same time you assess someone randomized to have the intervention earlier
Keep focus on feasibility with reduced sample size (shy away from p-values, use confidence limits)
How are data being collected?
 Survey data in REDCap
Could bring draft of the REDCap to another clinic to get statistician input
2023 September 28
Jennifer Quinde (Carissa Cascio), Psychiatry
We have zero-inflated/semi-continuous data to model. Our DV (dependent variable) is a percentage of video frames during which a participant’s facial expression likelihood was above a set threshold in response to two stimuli conditions. One of the conditions elicited little to no facial activity and thereby our output for those trials is 0.
Are hurdle models better than zero-inflated models? Are there alternative models to consider?
Two stimulus temperatures (warm/hot) and record facial expressions
Record muscle activity in the face (engagement)
Plotted median engagement against pain rating
 Potential three-way interactions between groups
Zero-inflated data: how to best model?
 Has threshold of 25% been validated? Would take a lot of data. Assumes discontinuity, does not capture close calls
What is the area under the curve when you are in some danger zone?
 AUC captures severity of hemoglobin and how long you stay in the bad zone
Difference between being correlated with the truth and fully reflecting the truth
 Consequence of using threshold when assumption of discontinuity does not hold: artifacts & edge effects, amplify measurement error
 Removing threshold = more powerful statistical analysis
Ordinal regression will handle any distribution and will give you probability of being above a given threshold via backend calculation
No need to transform Y when using semiparametric model
Use robust sandwich variance estimator
Model with rms package
Emily Wooder (Amelia Maiga), Surgery/Acute Care Surgery
We are using trauma video review to analyze the impact of communication patterns on the resuscitation efficiency of bleeding trauma patients. Specific exposures of interest include the percent of overlapping communication during key moments. We would specifically like input into an analysis plan that may inform our study design and data collection plan. This is a VMS RI project that I am supervising. Mentor confirmed.
Interested in overlapping communication, interruptions during EMS handoff, and speaking during BP measurement
Outcome: total resuscitation time (wheels in to wheels out of trauma bay)
Want to do linear regression
Add age to covariates?
Frame as survival problem? Time to successful resuscitation?
 Right censoring if no resuscitation
Assumption: Resuscitation is independent of dying
Covariates are not defined during resuscitation itself
 Avoid circularity  separate effects
Several other discrete timepoints between when patient comes in and when patient leaves
Regarding competing risk of death  plan is to exclude those who die
 Whether a participant will be excluded cannot be defined at time zero
Time period of wheels in to wheels out: 25 minutes +/-
Capture extent of shock at fixed time would be key
First five minutes = landmark observation period
 To qualify for analysis, have to survive for at least five minutes
 Those who don't survive 5 minutes likely came in deceased
Failed resuscitation in trauma bay takes ~ 20 minutes
Predicting participant's status five minutes from now
Want to collect data so all options are on the table
 Collect data so that you don't make decisions you can't undo
2023 September 14
Ashley Leech (Shawn Garbett), Health Policy
The study design is a retrospective observational cohort study of pregnant individuals with a substance use disorder. Our design will answer the following research questions: (1) What are the potential risks of switching opioid use disorder medications from pre-pregnancy to pregnancy for (Group a) antagonist or partial agonist to partial or full agonist; (Group b) full or partial agonist to partial or antagonist, up to gestational week 19; and (2) for new initiators in pregnancy, what are the potential risks of switching medications any time during pregnancy for (Group a) antagonist or partial agonist to partial or full agonist; (Group b) full or partial agonist to partial or antagonist. The comparator group will encompass individuals who have zero switches or switch less frequently; we will include the effect size of the switching count.
Exposures. The study exposure will be the count of medication switching; either from (a) to (b) and/or (b) to (a); from pre-pregnancy to pregnancy up to gestational week 19 and anytime during pregnancy. The exposure time will be censored by the observation period (accounting for the exposure time in the model).
Outcomes. The primary outcomes will include medication discontinuation, severe maternal complications, all-cause hospitalizations, and ED visits.
Questions:
 Confirm whether a negative binomial regression is feasible, especially given sample size constraints.
 How to phrase a power analysis statement given what we know from previous analyses:

In TN, 25% of 14,000 pregnant individuals with opioid use disorder or on medications for OUD had at least one medication fill, which = 3,500 people in total.

Based on our persistence study, 6% of individuals switched from buprenorphine to naltrexone over the one-year study period

Based on our 3,500 pregnant individuals on medication, this roughly equates to 210 individuals (but this is just considering buprenorphine to naltrexone and no other combinations, including methadone).

If our Medicaid sample includes 10 states, this would roughly equate to 2,100 individuals with a medication switch during pregnancy.
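The chain of estimates above can be written out explicitly. The inputs are the figures quoted in the question; the variable names are made up, and the 10-state extrapolation carries the questioner's own assumption that every state resembles TN in size and switching rate.

```python
# Figures quoted in the question above (integer percent arithmetic).
tn_pregnant_oud = 14_000    # pregnant individuals with OUD or on MOUD in TN
pct_with_fill = 25          # percent with at least one medication fill
pct_switched = 6            # percent switching buprenorphine -> naltrexone
n_states = 10               # assumption: every state resembles TN

on_medication = tn_pregnant_oud * pct_with_fill // 100
switchers_tn = on_medication * pct_switched // 100
switchers_all_states = switchers_tn * n_states

print(on_medication, switchers_tn, switchers_all_states)
```

Because the 6% covers only buprenorphine-to-naltrexone switches, the count of individuals with any switch may be larger than this sketch suggests.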
K1 award based on decision model; putting in for R01 soon
Two aims: using large Medicaid sample to answer questions
Question today: sample size concern. What design maximizes info?
Care about effect of switching medication relating to several outcomes
Negative binomial regression (more individuals that don't switch; 5-8% that do switch)
 Extra parameter accounting for overdispersion
Poisson regression? Have more parameters than Poisson would need
Count outcomes could have 0-1 inflation
Rule of thumb for sample size? Frank might have one
Features to adjust for in regression (given observational study):
 if time-fixed confounder, add as covariate in model
 Covariates: co-therapy dose
Possible trigger: relapse
For treatment switches: interested in looking at both number of treatment switches & high-low vs low-high
Treatment covariate feedback loops? Might not be relevant in this field
Covariates will need to be measured at different times during the exposure period (the pregnancy)
Time-varying covariates
 Multiple rows for each participant where covariate corresponds to a value at a given time
 Would this work for a negative binomial model?
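The "multiple rows per participant" layout can be sketched as follows. Each row covers an interval (start, stop], carries the covariate value in force during that interval, and flags whether an event occurred at its end. The field names, the dose covariate, and the week-based time scale are hypothetical choices for illustration.

```python
def counting_process_rows(pid, follow_up_end, covariate_changes, event_times):
    """Split one participant's follow-up into (start, stop] rows.

    covariate_changes: list of (time, value) sorted by time, first at time 0.
    event_times: times of recurrent events within follow-up.
    """
    cuts = sorted(
        {t for t, _ in covariate_changes if 0 < t < follow_up_end}
        | {t for t in event_times if 0 < t <= follow_up_end}
        | {follow_up_end}
    )
    events = set(event_times)
    rows, start = [], 0
    for stop in cuts:
        # Covariate value in force at the start of this interval.
        value = [v for t, v in covariate_changes if t <= start][-1]
        rows.append({"id": pid, "start": start, "stop": stop,
                     "dose": value, "event": int(stop in events)})
        start = stop
    return rows


# Hypothetical participant: dose changes at week 10, events at weeks 6 and 14.
rows = counting_process_rows(1, 19, [(0, 8), (10, 16)], [6, 14])
for row in rows:
    print(row)
```

Start-stop rows like these are the usual input to recurrent-event survival models; a negative binomial model would instead typically use one count per participant with exposure time as an offset, which is part of why fixed-at-baseline adjustment is the easier fit there.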
Survival model that would allow for recurrent events
Mixed model would depend on how data are structured
Bryan: negative binomial model might not use the data most efficiently, but could be more interpretable to reviewers
Recurrent event survival analysis is a little more complicated: Andersen-Gill model
 More power, account for timevarying covariates
Benefits of time-varying covariates:
 Adjusting for time-varying as if fixed leaves you susceptible to confounder-treatment feedback loops
Continuous variable dichotomized = 5% of information you had before
Negative binomial could work, need to adjust for covariates
 No closed-form formula for effect size/power; could estimate using sims (use preliminary data)
 PASSED R package has negative binomial for 2-sample; not quite what we're looking for
Recurrent event model would have more power, but would be more complicated & might be more difficult to interpret
Zero-inflated regression options: adds parameter to adjust for lots of zeroes
2023 August 31
William Tucker (Whitney Gannon, Matthew Bacchetta, Jonathan Casey), Cardiac Surgery
We plan to conduct a retrospective observational study examining doses of anticoagulation in patients supported with venoarterial extracorporeal membrane oxygenation during lung transplantation. Over the past 5 years, anticoagulation dosing for this population has evolved to be considerably less. We hypothesize that lower anticoagulation dosing is associated with the presence of thromboembolic complications and correlated with a decrease in blood transfusion requirements. In particular, we are interested in examining dose-response relationships between anticoagulation dose during surgery and blood transfusions required.
Research Q: What is the optimal dose of heparin for intraoperative support?
Hypothesis: Heparin exposure associated with blood product transfusions during lung transplantation supported on venoarterial ECMO
Inclusion: bilateral lung transplant between 2018 and July 2023, intraoperative VA ECMO support, age > 18. Several exclusions...
Exposures: heparin bolus dosing, heparin drip rate, ...
outcomes of interest: blood products transfused, thromboembolic events
Flowsheet: N of > 180
Pharmacology of heparin nailed down such that, for example, optimum dose is known as a function of body mass?
 In lung transplant surgery, ACT is not checked & additional heparin not provided based on any ACTs to follow
Aim of project: quality of patient's overall outcome with regard to coagulation-related outcomes
With lots of events, you might find different doses optimize different events
 Advantageous to have ordinal scale outcome to assess what dose makes patients do well overall
 Clinical overrides, increases power
For dose x, did patients have worse outcomes than dose y?
Patient could have two different bad outcomes: ignored if only looking at one outcome, not if you use ordinal scale
Ordinal scale could have as many levels as you want
Many ways to break ties to make scale more clinically relevant with more power
Ordinal scale won't be disturbed by infrequent bad outcomes
Good to crowdsource: REDCap survey where clinicians vote/rank outcomes
 Declare winners based on small groups of runners, then put all of that together
Accounting for confounding: include covariates in the model
 Clinical experience super important to identifying confounders: which features/lab values do clinical experts use?
Could assess dose over time (apply smoother)
 Cultural shift to less and less heparin over time
Likewise assess how ordinal scale changes over time
Will reviewers want ordinal scale validated? Sometimes, but Frank says alternative is almost always worse
 Some reviewers do not like the PO assumption in the PO model... often they are already making worse assumptions (Frank blog articles)
Time zero: patients have to make it to this time to make it into the study
ECMO is to some extent a time-dependent covariate
Patients have to get ECMO to be in analysis: have to live long enough to get ECMO, so time zero might be initiation with ECMO
Bolus given just before cannulation; for all intents and purposes, a simultaneous act
With sample of 150 or so and rarity of events, would descriptive paper be more effective?
Frank: descriptive analysis suffers the most from sample size, causes measures to be noisy
Is sample big enough to do most powerful analysis?
More uniform the scale, more statistical info
Not slam dunk: depends on signal to noise ratio
What will ordinal scale look like?
 Example 1: some lab value + 10 times the number of transfusions needed (transfusions puts you in a different part of the scale, more transfusions = bad)
 Example 2: 0-50 did not need angina meds; 51-89 did need angina meds; 90 = death
Don't need to write out every level, but define zones
 Hierarchical scale: if participant has one minor event and one major event, assumes you don't care about the minor event
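A hierarchical scale with zones, as described above, can be sketched as a small scoring function. The zones, the cap of 49 units, and the choice of outcomes below are hypothetical and only illustrate the mechanics; the actual ordering would come from clinician ranking (for example the REDCap survey suggested earlier).

```python
def ordinal_outcome(died, thromboembolic_event, units_transfused):
    """Map one patient's outcomes to a single ordinal level (higher = worse).

    Zones (illustrative only):
      0-49   no thromboembolic event, ranked by units transfused
      50-99  thromboembolic event, ties broken by units transfused
      100    death
    """
    if died:
        return 100                      # death overrides every other outcome
    units = min(units_transfused, 49)   # cap so the zones cannot overlap
    if thromboembolic_event:
        return 50 + units
    return units

# A patient with an event and 3 units ranks worse than any event-free patient
print(ordinal_outcome(False, True, 3), ordinal_outcome(False, False, 20))
```

Because only the ordering of levels matters to an ordinal model, the exact numbers assigned to each zone are arbitrary as long as worse states receive higher values.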
VICTR voucher: 90 hours of biostat help, finishes in a year
Expect that data will be complete.
Comorbidities not much of an issue here (maybe BMI): not the practice
 BMI can't be extreme: have to be healthy enough to receive transplant
2023 August 17
Alexandra AbuShmais (Ivelin Georgiev), Pathology Microbiology Immunology
We obtained ~2700 B cell receptor sequences specific for several antigens that we have grouped into 5 different categories (by viral family). We have analyzed the data with respect to several sequence features (V gene usage, HC:LC pairs, somatic hypermutation, CDRH3 length) and want to compare the data across the viral family categories. Need to know which statistical tests to use. The numbers of sequences in each category are not equal.
Data generation phase completed
2700 sequences across 10 individuals: no inter-donor analysis
 Uneven with respect to participant and viral group
If studying left & right eye of a patient, patient contributes an experimental unit of roughly 1.5
 Minimum of 10 experimental units here (10 participants)
How much do receptor sequences correlate within the participant?
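The "roughly 1.5 experimental units" remark for two eyes can be related to within-participant correlation through the usual design-effect approximation, m / (1 + (m - 1) * ICC) effective units for a cluster of m correlated observations. The sketch below assumes an exchangeable correlation; the ICC of 1/3 is simply the value that reproduces 1.5 for two eyes and is not a figure from the discussion, and the ICC for sequences within a donor is unknown.

```python
def effective_units(cluster_sizes, icc):
    """Approximate number of independent units in clustered data.

    cluster_sizes: number of observations per participant
    icc:           assumed intraclass correlation (exchangeable structure)
    """
    return sum(m / (1 + (m - 1) * icc) for m in cluster_sizes)

# Two eyes from one patient with ICC = 1/3 count as about 1.5 units
print(effective_units([2], 1 / 3))

# 2700 sequences spread evenly over 10 donors, for a range of assumed ICCs
for icc in (0.0, 0.05, 0.5, 1.0):
    print(icc, round(effective_units([270] * 10, icc), 1))
```

With any appreciable correlation the effective number of units falls quickly toward the number of participants, which is the point of the "minimum of 10 experimental units" note.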
Single-cell data, multidimensional
Ivelin: sequences are independent, some correlation for sequences coming from the same participants
Germline gene usage clustered by viral group specificity
 Relative frequencies
Stacked bar chart: percent of antigen-specific repertoire x IGHV
Hypothesis testing is the aim
Difficult to conceptualize generalizability beyond the single individual
Most of the data is dominated by 3-4 individuals
Could reproduce stacked bar chart by individual
Other ways to present proportions than stacked bar chart?
Unfamiliar with single-cell data
 Many Biostat faculty members work on single-cell data. Could invite some on a special Thursday morning
2023 July 27
Brittney Snyder (Tina Hartert, Pingsheng Wu), Department of Medicine, Division of Allergy, Pulmonary and Critical Care Medicine
We are performing a retrospective, population-based cohort study utilizing an administrative database (TennCare). We use a marginal structural model with inverse probability of treatment and censoring weights to estimate the effect of an asthma medication on influenza. Each individual’s follow-up time is divided into periods to capture changes in medication usage, etc., so an individual may have multiple rows and corresponding weights in our dataset. How can we create a weighted Table 1 to assess for covariate balance if individuals could have multiple weights? Would it make sense to use a self-controlled design as a sensitivity analysis to mitigate issues of potential unmeasured confounding?
Exposure: LMA protected periods
Outcome: severe influenza illness
Analysis: restricted to time within influenza seasons based on regional virologic surveillance
MSMs with IPTW and IPCWs were used to estimate the effect of LMA use on severe flu illness
1 year of continuous follow up to assess baseline vars followed by flu season
Want to control for vars changing over time that can be both mediators and confounders
Estimating weights: calculated stabilized IPTWs and IPCWs for each person period using methods from Fewell et al
 Logistic regression
 Numerator models included only non-time-varying covariates
 Denominator models included time-varying and non-time-varying covariates
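The weight construction described above can be sketched for a single person. This assumes the usual cumulative-product form of stabilized weights, in which each period contributes the ratio of the numerator-model probability to the denominator-model probability of the treatment actually received. The function name and inputs are hypothetical; the predicted probabilities would come from the two logistic regressions, and censoring weights would be built the same way and multiplied in.

```python
def stabilized_weights(treated, p_num, p_den):
    """Cumulative stabilized IPT weights for one person's ordered periods.

    treated: 0/1 treatment indicator in each period
    p_num:   P(treated = 1) per period from the numerator model
    p_den:   P(treated = 1) per period from the denominator model
    """
    weights, running = [], 1.0
    for a, pn, pd in zip(treated, p_num, p_den):
        num = pn if a else 1 - pn   # probability of the observed treatment
        den = pd if a else 1 - pd
        running *= num / den        # weights accumulate across periods
        weights.append(running)
    return weights

# One person, two periods: treated in the first, untreated in the second
print(stabilized_weights([1, 0], [0.5, 0.5], [0.25, 0.5]))  # [2.0, 2.0]
```

Since the weight in a given row depends on the person's whole history up to that period, each person-period row carries its own weight, which is why a single weighted Table 1 is awkward to define.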
Inverse probability weighted model
Reviewer suggested weighted table 1
 Question: should table be presented by person-period instead of by person-season?
 Issue: does not account for # of periods per person -> could bias
Table 1 should not have been included in submission
 Conditions backwards on outcome: not beneficial in cohort study
Just do baseline characteristics for population (no columns)
Literature does not show precedent for Table 1 with marginal structural model
Measures of relative explained variation
Table of odds ratios for LMA use
 Calendar time is important
Figure with odds ratios
Good to look for alternatives to weighted analyses (lose a lot of power)
What goes wrong when you do a standard time dependent covariate analysis such that you need weights and MSM?
 You have variables acting as confounders and mediators (danged if you do, danged if you don't)
Landmark analysis: keep starting the clock over and doing covariate analyses
Case-crossover study: within-subject design
Matching weights: create population like you're matched, don't lose a lot of efficiency
 Matching equivalent for marginal structural model?
Could include in Markov model whether participant had flu in the previous year
Adrienne Marler (Lea Davis, Kevin Niswender), Pediatric Endocrinology, GME
We will be performing a retrospective analysis of lipid levels in transgender adults, comparing those who are on gender-affirming hormone therapy (GAHT) to those who have never been on it. We would ideally track changes in lipids over time, but I am concerned our population of adults never on GAHT will be too small to make a meaningful comparison. We’ve alternatively discussed a cross-sectional model or using individuals as their own controls. I also hope to minimize confounders that independently impact lipid levels: age, weight, diabetes status, smoking. I am requesting assistance with planning this analysis & creating a statistical model. I also plan to apply for a VICTR voucher for continued assistance with analysis.
H1: identify additional variables that may impact lipid metabolism as well as perceived risks of benefits of sex/gender minority research.
H2: Transgender men on GAHT will have higher rates of and more severe dyslipidemia than transgender men not on GAHT
H3: Transgender women on GAHT will have higher rates of and more severe dyslipidemia than transgender women not on GAHT
Experimental group: trans adults on GAHT for 2+years
Control: trans adults not on therapy
Outcomes: cholesterol levels (continuous)
Known & anticipated limitations:
 Currently do not know how many TG adults never on GAHT have had lipid panels at VUMC
 Inconsistent documentation of risk factors (smoking, alc use, etc)
 GAHT divided into masculinizing and feminizing therapy, as regimes vary
 After 2 years, patients more likely on maintenance dosing
 Psychotropic medications may also variably impact lipids
 Different regimens, duration of therapy
Inclusion: 18+ adults self-identifying as transgender/nonbinary
Exclusion: known familial hypercholesterolemia, no lipid panels at VUMC
Covariates: age, race, weight, smoking status, alcohol use, A1c
Analysis:
1) longitudinal, retrospective
2) cross-sectional, retrospective
Design Qs:
What is longitudinal time zero?
 Baseline, before starting therapy for those in experimental group, first time at VUMC for control group
For patient landing at nine months, can still get good info (just would have incomplete profile from time zero to nine months)
Characterize longitudinal profile with as much resolution as data allow, rather than require two years for entry into study
 Mixed effects model may not reflect correlation pattern within patient (maybe more serial correlation)
Continuous time
If few people are followed for seven years, these people could have a lot of leverage/influence on estimates
One tool: variogram (semivariogram): calculate correlation between all pairs of measurements from same patient (uses all available data); assumes correlation is isotropic
Mixed effects model assumes variogram is flat
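An empirical semivariogram of the kind mentioned above can be sketched as follows: for every pair of measurements from the same patient, take half the squared difference of their residuals and average these by time lag. The input layout (`patient_id`, `time`, `residual`) and the bin width are assumptions for illustration; residuals would come from whatever mean model is fitted to the lipid values.

```python
from collections import defaultdict
from itertools import combinations

def empirical_variogram(records, bin_width=1.0):
    """records: iterable of (patient_id, time, residual).

    Returns {lag_bin_start: mean of 0.5 * (r_i - r_j) ** 2} over all
    within-patient pairs whose time lag falls in that bin.
    """
    by_patient = defaultdict(list)
    for pid, t, r in records:
        by_patient[pid].append((t, r))
    sums, counts = defaultdict(float), defaultdict(int)
    for obs in by_patient.values():
        for (t1, r1), (t2, r2) in combinations(obs, 2):
            lag_bin = int(abs(t1 - t2) // bin_width) * bin_width
            sums[lag_bin] += 0.5 * (r1 - r2) ** 2
            counts[lag_bin] += 1
    return {b: sums[b] / counts[b] for b in sorted(sums)}

# Toy data: two patients, times in years
data = [("a", 0, 1.0), ("a", 1, 3.0), ("a", 2, 1.0), ("b", 0, 0.0), ("b", 1, 2.0)]
print(empirical_variogram(data))  # {1.0: 2.0, 2.0: 0.0}
```

A roughly flat curve across lags is what a random-intercept model implies; a curve that rises with lag points toward serial correlation that such a model would miss.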
Getting correlation structure right has a # of advantages (small one is software runs faster)
Include variables like `on psychotropic meds`, `started on lipid-lowering Rx` in analysis?
Often discourage folks from matching on similar covariates (throwing away info)
Interested in applying for VICTR voucher. Another clinic meeting would be helpful to continue brainstorming
Voucher includes analysis work. Investigators & analyst finalize analysis plan together at the beginning
Involve statistician early
2023 July 20
Alison Swartz (Heidi Silver), Gastroenterology
We are using a curated SFRN dataset from the synthetic derivative of 886,899 patients who have cleaned weights and BMIs over the period of time from 1997-2020 (23 years).
We plan to: a) identify weight cyclers versus non-weight cyclers (weight stable or weight loser); b) determine the characteristics of being a weight cycler versus non-weight cycler; c) determine the demographic and clinical risk factors that predict being a weight cycler; and d) determine the cardiovascular and other medical outcomes of being a weight cycler. We will define a weight cycle as gain and subsequent loss of 10% from weight extrema (or vice versa).
Questions:
 How to prepare for VICTR voucher application for biostatistics support for the project?
 If approved for a VICTR voucher, what is the process of requesting statistics support?
 Do you have any memory-efficient recommendations for computation of this very large dataset?
 What are your suggestions for optimizing usage of this large dataset?
 Any particular pitfalls or concerns when using large datasets?
Little experience with gigantic datasets
Initial analysis done in Python; interested in using R for stats
We have weights, heights, BMI, etc.
Weight cyclers vs non-weight cyclers: longitudinal
 Debate on whether weight cycling increases risk
 Frank: weight cycling has likely been defined, but not validated (no consensus on definition)
 Ask more general question: "what extent of weight cycling correlates with what extent of clinical outcome?"
Characterize weight cyclers vs non-weight cyclers
 Frank: don't want to label as cyclers vs non-cyclers; eliminates close calls
 How frequently is weight measured?
Cross-correlation problem or "predict the future" problem?
 More interested in landmark study (the latter)
23 years of data, though scattered
 Qualification period
 In other study, those who cycled more frequently were shown to have worse outcomes
10 randomly chosen weights for each year
Citation: Frank Harrell & Shi Huang paper on weight
Can prespecify contests; just don't throw the kitchen sink (multiplicity problem)
Bryan: longitudinal model and looking at association with time-varying covariates
 Concern: generalizability when restricting sample to folks with ten measures in 10 years of data (subject matter expertise to decide cutoff)
How much does collinearity between lagged variables present issues?
Need to demonstrate dose-response relationship
Two dimensions of weight volatility: Gini's mean difference
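Gini's mean difference is the average absolute difference over all pairs of values, which makes it a natural per-patient summary of weight volatility that does not require labeling anyone a cycler. A minimal sketch on one patient's weights follows; the example values are made up.

```python
from itertools import combinations

def gini_mean_difference(values):
    """Mean absolute difference over all distinct pairs of values."""
    if len(values) < 2:
        raise ValueError("need at least two measurements")
    pairs = list(combinations(values, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)

# Hypothetical weights (kg) for one patient across visits
print(gini_mean_difference([82.0, 90.5, 84.0, 95.0]))
```

This pairwise version is quadratic in the number of measurements per patient, which is fine patient by patient but would call for a sort-based formula if applied to very long series.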
"What is the 90th percentile of changes, adjusted for height?"
Prepare data with date, weight, and height for each measure
Frank's book: Search "R workflow"
To keep in mind with large datasets: possibility of numeric overflow
Avoid p-values with extremely large N -> use relative explained variation in outcome
data.table package handles memory really well
Use clinical considerations to cut down on pool of variables
Appropriate variable transformation: not always clear
VICTR studio: any biostatisticians to invite?
 Shi Huang & Frank
Ask who has experience with height data
How to settle on granularity of weight measure? Year, 6 months, etc?
 Most patients are outpatient
If gaps in the data, process to correct for uneven gap times?
Need adjustment for variation in gaps
Hexagonal binning
For longitudinal analysis: spaghetti plots are useful (take random sample)
How many measurements there are on each patient, nonparametric curve, clusters patients (catch data problems)
We have performed a preliminary PheWAS study in collaboration with VICTR that has linked two SNPs in the GRM7 gene with a PheWAS code for neurofibromatosis, and we hypothesize that this gene is a genetic modifier in the NF1 background. There are now 2,482 neurofibromatosis patients in Vanderbilt’s BioVU database, and approximately 416 have a banked DNA sample of high quality. We plan to amplify and sequence the rs9870680/rs779710 SNPs from these individuals at Azenta. Using this larger cohort of data samples will allow for a more robust association between GRM7 rs9870680/rs779710 and NF1. To maximize this sample size and increase the rigor of our analyses, we plan to sequence all available BioVU samples associated with the ICD-9/10 code for NF (237.7/Q85.0), including patients of all genders, races, and age ranges. This would amount to a total of 416 samples. We would then like to mine BioVU patient data to determine if patients of specific genotypes at these SNPs are enriched for specific comorbidities in NF1, most notably learning disabilities and ADHD as we know that the GRM7 gene is linked to learning and memory. We need assistance in determining how to mine patient records from a statistical viewpoint.
2023 July 13
Luis Okamoto, Clinical Pharmacology
This is our follow-up biostats clinic to the one we attended on 6/22/23. We are designing a pilot trial to test the efficacy of guanfacine treatment for chronic fatigue syndrome patients. We would like biostatistics support for a VRR.
Objective: gather preliminary data on efficacy of guanfacine vs placebo on hyperadrenergic ME/CFS
Hypothesize that treatment with central sympatholytic guanfacine will improve fatigue & function (disability)
No biomarker that has efficiently identified this subset of patients with high sympatholytic activity
Study design: Proposed double-blind, randomized, crossover study with placebo vs guanfacine
 Enrichment withdrawal design
Responder analysis: assess perceived response to guanfacine for POTS symptoms
 see which clinical characteristics were associated with improvement
Improved approach: correlation between scores & different clinical characteristics
Positive trend between PGIC change in hyperadrenergic symptoms frequency
 Frank: add rank correlation coefficient
 Bubble plot to represent sample size for coincident points
T score is normalized
P-value is "too easy": include rank correlation coefficient instead (ranks can be compared, p-values cannot)
Plan to use Wilcoxon rank-sum test in primary study
Are relationships strong enough to base clinical trial on?
How to leverage information to identify who should be treated?
 Not seeing "smoking gun" in plots
Bryan: does guanfacine lend itself to study with respect to blinding? For instance, can participant identify whether they are being treated with drug or placebo based on symptoms?
Italo: guanfacine does make you drowsy... but drug that makes you drowsy can improve fatigue
Washout period: two weeks is good balance between getting rid of drug and study feasibility
 Patients tend to be reluctant to being taken off the drug
Ideal to put confidence intervals when doing correlations
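The suggestion to report rank correlations with confidence intervals can be sketched with Spearman's rho and a percentile bootstrap. This is one reasonable way to get an interval, not necessarily the method intended in clinic; the number of resamples and the seed are arbitrary choices, and the data in the example are made up.

```python
import random

def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def bootstrap_ci(x, y, n_boot=2000, level=0.95, seed=1):
    """Percentile bootstrap interval for Spearman's rho, resampling patients."""
    rng = random.Random(seed)
    n, stats = len(x), []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        bx, by = [x[i] for i in idx], [y[i] for i in idx]
        if len(set(bx)) > 1 and len(set(by)) > 1:  # skip constant resamples
            stats.append(spearman(bx, by))
    stats.sort()
    lo = stats[round((1 - level) / 2 * n_boot)]
    hi = stats[round((1 + level) / 2 * n_boot) - 1]
    return lo, hi

# Hypothetical scores for 8 patients
symptom = [1, 2, 2, 3, 4, 5, 6, 7]
change = [0, 1, 3, 2, 3, 5, 4, 6]
print(round(spearman(symptom, change), 3), bootstrap_ci(symptom, change, n_boot=500))
```

With samples as small as those in a pilot study the interval will be wide, which is itself useful information about how much the correlation can support.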
2023 June 22
Luis Okamoto, Clinical Pharmacology
We are planning a pilot study to test the efficacy of guanfacine treatment in patients with chronic fatigue syndrome or postural orthostatic tachycardia syndrome. Our preliminary data suggests a subset of patients with central sympathetic activation could benefit from treatment with a central sympatholytic like guanfacine. We would like to discuss a statistical analysis plan for our study.
No diagnostic test/FDA-approved treatment for ME/CFS (myalgic encephalomyelitis/chronic fatigue syndrome).
In previous work, hyperadrenergic phenotype associated with more severe disease & greater autonomic symptoms with sympathetic overmodulation and lowest quality of life
Clonidine has been tried as treatment, but all studies have failed to improve fatigue & function
 Potential reasons for failure: not selecting for hyperadrenergic ME/CFS patients & side effects of clonidine
We propose central sympatholytic therapy with guanfacine would be effective for treatment of fatigue & function in CFS patients with phenotype
 Preliminary study conducted: patients rate their impression of change
 26% nonresponders, 74% responders (to guanfacine)
Found improvement was related to the frequency and severity of hyperadrenergic symptoms & ...
Compared to nonresponders, overall improvement associated with improvement in ...
 Similarities between responders & nonresponders in age, disease duration, post-exertional malaise, impairment in sleep & memory, hyperadrenergic symptoms, orthostatic intolerance (responders tended to have lower baseline tolerance)
 Responders tended to have more severe fatigue
Predictors of response to guanfacine: head-up tilt (HUT) and Valsalva maneuver
 Responders: > DBP increase at 1 & 3 min of HUT, > BP increase during late phase 2 of Valsalva
Target population: CFS patients with hyperadrenergic phenotype
Overall goal: grant proposal to assess efficacy of guanfacine for the treatment of CFS symptoms
Objectives: conduct small study to gather preliminary data
 Efficacy of guanfacine (vs placebo) on hyperadrenergic ME/CFS
 To estimate sample size & power
Study Design: double-blind, randomized, crossover study with placebo vs guanfacine added to standard of care for 2 weeks
 Enrichment withdrawal design (suggested by FDA for initial assessment of efficacy of ampreloxetine in improving OH in patients with autonomic failure)
 Outcomes assessed at home (questionnaires, actigraphy, orthostatic vitals)
Study design advantages: patients already identified, minimal involvement from patient, clear stopping rules, low-cost alternative
Outcomes assessed at second week of treatment: fatigue (CIS, primary outcome), POTS & CFS symptoms & PGIC, subjective function (SF-36), objective function
Analysis: outcomes assessed in the 2nd week: placebo vs guanfacine
 Hypothesis: fatigue score (CIS, primary outcome) after 2 weeks of placebo > guanfacine
 Assessed via Wilcoxon signed rank test or paired t-test
Frank concerns: problems with patients not remembering very well
 Measuring change is difficult because it is dependent on patient's initial reading
Not good to base responder analysis on change
 Responder analysis = minimum information analysis (doesn't capture close calls)
 Responder analysis with 19 patients = noise (need ~ 180 for anything meaningful)
Italo: Need more preliminary data
Frank: Need high signal:noise ratio, especially with small sample size (would need to demonstrate dose-response relationship: rank correlation)
 Difficult to distinguish minimal to no change
Delta scores boxplots need to be revisited (raw data display = scatterplot)
 Boxplots are combining unlikes
 No need to categorize responders  basic rank correlation and scatterplots
Responders & nonresponders is a low-information measure; "what is rank correlation between disease duration & amount of global change?" would be better
 Use six levels on x-axis, you may find that nonresponders in best nonresponse group are more similar to lower end of responders than to the next group of nonresponders
 If so, this would cast doubt on the choice of threshold
Italo Challenge: some patients are already on the drug and believe it is helping
 There aren't any patients who are on the drug and don't believe it is helping
Frank Challenge: threshold was decided on without validating it was the right one
Design aspects are state of the art
 Washout period for observational phase (study team determined two weeks was sufficient)
 Patients who have been miserable for so long are reluctant to commit more than two weeks for washout
 Frank mentioned need for washout period before crossover
Wilcoxon rank difference test better for paired data than ordinary Wilcoxon: same p-value even if you transformed outcome
FDA says there are too few randomized withdrawal studies
 No run-in period, randomly tell half the folks they can't take drug anymore
Patient-oriented outcomes: great payoff in terms of power when outcome has high resolution
2023 June 15
Leon Scott, Orthopaedics
"The impact of weight loss programs on the survival of a native joint."
P: Patients with BMI 35 at the time of diagnosis of knee osteoarthritis. I: Weight loss. C: Usual care. O:
Aim 1: To evaluate the impact of weight loss on the time to joint arthroplasty. Hypothesis: Subjects that demonstrate weight loss over time will have a statistically significant reduction in joint arthroplasties over time. The analysis will include multivariate analysis adjusted for age, gender, prior arthroscopy, radiographic grading of osteoarthritis, unilateral vs bilateral disease, and percentage of weight lost.
 Cox PH model
Aim 2: To determine whether certain interventions can predict the maximal amount of weight lost. Hypothesis: Patients who underwent bariatric surgery will demonstrate the greatest amount of weight loss. A statistical difference between groups of usual care, diet counseling, pharmacologic intervention, and bariatric surgical intervention will be measured, possibly using a Repeated Measures Analysis of Variance.
 5% total weight loss = significant change
Aim 3: To determine if weight loss improved patient-reported measures of knee pain and function. Hypothesis: Patients who achieve weight loss will have a statistically significant change in patient outcome measures including the PROMIS Physical Function test, NRS pain, and KOOS Jr. The data will be adjusted for the percentage of weight lost & time to achieve weight loss. I think this statistical measurement will employ a nonparametric regression test.
Time zero = time when diagnosis was first entered into the chart
 Someone could have osteoarthritis that started a while ago: one's "time zero" might not be really time zero
 No way to extrapolate when the person's symptoms began
Trade off of doing controlled study on few patients vs uncontrolled study on many patients
Circularity between function and weight: can it be disentangled?
VICTR studio: multidisciplinary, could avoid certain pitfalls
 Could help define tight criteria to extract from EHR & tighten aims
Loss of 10 pounds for person who is carrying 100 pounds of extra weight means less than a person carrying around less extra weight
Generalize aims to look at absolute weight at time zero and other times
Kevin Seitz (Jonathan Casey), Allergy, Pulmonary, and Critical Care Medicine
We are conducting a secondary analysis of a cluster-randomized cluster-crossover trial (PILOT trial, PI: Matt Semler, Biostats: Li Wang), with a subgroup of patients. Given the complexity of the statistical analysis, we are seeking a VICTR voucher to fund the data analysis. We would like to get a quote and notes for submission of the VICTR voucher. Mentor confirmed.
Interested in route to getting VICTR voucher
Subgroup analysis of the PILOT trial among survivors of cardiac arrest
Parent trial: cluster-randomized (entire ICU randomized to a treatment group for two months), cluster-crossover clinical trial
All adults who received MV in medical ICU at VUMC 7/18-8/21
Intervention: 3 groups for target of oxygenation (lower, intermediate, higher)
Secondary analysis: subgroup from PILOT (patients who survived cardiac arrest prior to enrollment, N = 339)
 Primary outcome: 28-day in-hospital mortality
 Secondary outcome: survival to hospital discharge with a favorable neurologic outcome
Analysis Plan:
 Assess separation between groups in SpO2
 Analyze primary outcome: logistic regression with independent covariates of group assignment and time
 Analyze secondary outcome: logistic regression with independent covariates of group assignment and time
 Test for effect modification by characteristics of cardiac arrest
How are intra-unit correlations handled?
 In PILOT study, only adjusted for period cluster
 18 time clusters (3 groups, 6 times, order is random)
Ordinal outcome always has a place to put death
"Multivariate" vs "Multivariable"
Don't use outcome from original study (don't use death 1; can't summarize results using median)
 Median is designed for truly continuous variables (bad with ties)
 Analyze raw data (what is your status on a given day? On ventilator, dead, not on ventilator, etc.)
VICTR: one size fits all, $5000, 90 hours over a year
2023 June 8
Brian Hou (Lauren Porras), Department of Orthopedics & Sports Medicine
Evidence-based recommendations are needed to define which patients, if any, should be considered at risk for these short-term hyperglycemic episodes, and to evaluate the long-term effects on glucose levels after a single administration of corticosteroid. The purpose of this study is to look at how a diabetic person’s blood glucose levels change over time with a steroid medicine injection. It is believed that steroids may briefly elevate a person’s blood sugar levels in the immediate time period after receiving a steroid injection. Significantly high blood sugar levels may be dangerous and can lead to a range of effects from fatigue and vomiting to confusion and coma.
The Shade Tree Clinic (STC) is a comprehensive, free health clinic run by Vanderbilt University medical students for Nashville residents with limited resources. The Shade Tree Orthopedic Clinic is a subspecialty clinic at the STC that provides quality care for acute and chronic orthopedic conditions. At the STC Orthopedic Clinic, corticosteroid injections are heavily relied upon as a treatment for management of pain and other orthopedic conditions given that the patient population often lacks access to surgical treatment.
The purpose of this study is to assess the glycemic effects of methylprednisone in patients with diabetes, prediabetes, or no diabetes given the utility of both types of injections. Our questions are determining an adequate sample size for the study, as well as help with a VICTR grant. Mentor confirmed.
Attendance:
Brian Hou, Lauren Porras, Frank Harrell, Cassie Johnson
Meeting Notes:
May look to balance diabetic and nondiabetic patients. We may realistically end up with more nondiabetic patients.
Looking at change from baseline may be less meaningful than “If we adjust for baseline blood glucose (as covariate, perhaps nonlinearly), for a given starting point, where do you end up after injection?”
Looking at absolute glucose, instead of change in glucose. We may look at slope or area-under-the-curve, in this case.
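An area-under-the-curve summary of absolute glucose can be computed per patient with the trapezoidal rule, which also copes with unevenly spaced measurements. The sketch below is illustrative only; the time unit (days since injection) and the example values are assumptions.

```python
def glucose_auc(times, glucose):
    """Trapezoidal area under one patient's glucose curve.

    times:   measurement times in increasing order (e.g., days since injection)
    glucose: absolute glucose values at those times
    """
    pairs = list(zip(times, glucose))
    return sum((t2 - t1) * (g1 + g2) / 2
               for (t1, g1), (t2, g2) in zip(pairs, pairs[1:]))

def time_averaged_glucose(times, glucose):
    """AUC divided by follow-up length, so patients with different spans compare."""
    return glucose_auc(times, glucose) / (times[-1] - times[0])

# Hypothetical patient: baseline, day 1, and day 3 readings (mg/dL)
print(glucose_auc([0, 1, 3], [100, 140, 120]))            # 380.0
print(time_averaged_glucose([0, 1, 3], [100, 140, 120]))  # about 126.7
```

Either summary can then serve as the outcome in a model that adjusts for baseline glucose, in line with the note about avoiding change scores.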
Descriptive tool to start with – spaghetti plot. Allows you to view raw data trajectories without assuming anything.
Question from Dr. Porras: Diabetic patients that we are enrolling are well controlled, so will likely have baseline blood glucose that are similar to nondiabetics. So are we really answering the primary question regarding glucose sensitivity?
Likely approach – interact baseline with diabetes status. Requires a larger sample size, but addresses this concern.
Sample size: could be a concern solely from the orthopedic side. Could get primary care involved, or could change the question to just include nondiabetics. IRB preferred an observational study, though Frank warns that not having a control group can require a leap of faith when it comes to results.
If an observational study is required, Frank would require a clinic at Vanderbilt where they have protocolized the collection of fasting blood glucose among the population getting these injections. (Blood glucose values are reliably collected without missing many measurements)
If this data isn’t already being collected in a clinical setting, may need to pay for nondiabetics via grant.
Dr. Porras provides the following paper:
Systemic effects of epidural and intra-articular glucocorticoid injections in diabetic and nondiabetic patients - ScienceDirect
Frank: To learn a lot about these types of relationships, a minimum of 70 patients would be required. For correlations, this number may be closer to 300.
2023 May 25
Pingsheng Wu, Medicine/Allergy, Pulmonary, and Critical Care Medicine
We are trying to use multiple imputation for a key variable that defines the start of follow-up time. It has 40% missing. In a proportion of subjects, we do have another variable that informs this missing variable (it sets a boundary such that the missing variable can only take certain values).
Outcome is RSV LRTI, exposure is RSV immunoprophylaxis
Causal inference is of interest
Missingness of LOS related to year (decreases with later birth years), GA (less missing for older GA), BW (less missing for larger BW), NICU admit (less missing for no NICU)
Measure exposure to RSV immunoprophylaxis:
 children born Apr-Oct: every 30 days during the winter RSV season for a max 5 doses
 children born Nov-Mar: every 30 days starting at birth hospital discharge
N = 15248, 34% missing LOS
 n = 1915 have Down syndrome, 29% missing LOS
First outpatient visit/care is informative
 Earliest of these sets boundary on discharge date for birth hospitalization: the discharge must have occurred before their earliest healthcare date
Can data structure for imputation model differ from data structure for analytic model?
How do we incorporate a boundary on LOS into the MI model?
 `aregImpute` for when data are NOT longitudinal
 Predictive mean matching  find suitable donors
 Can specify exclusion of donors that don't meet date condition
 Check mice package to see if they already have such an option
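The idea of excluding donors that violate the date condition can be sketched as a hand-rolled predictive mean matching step. This is not the `aregImpute` or `mice` interface, and whether either package offers such an option is left open above; the function name, arguments, and the choice of 5 nearest donors are all hypothetical.

```python
import random

def pmm_impute_bounded(pred_missing, upper_bound, donors, k=5, rng=None):
    """Predictive mean matching for one missing LOS with a known upper bound.

    pred_missing: predicted LOS for the recipient from the imputation model
    upper_bound:  largest LOS compatible with the first outpatient visit date
    donors:       list of (predicted LOS, observed LOS) for complete cases
    """
    rng = rng or random.Random()
    # Keep only donors whose observed LOS respects the recipient's bound
    eligible = [(abs(p - pred_missing), obs) for p, obs in donors
                if obs <= upper_bound]
    if not eligible:
        raise ValueError("no donor satisfies the bound")
    eligible.sort(key=lambda d: d[0])       # closest predicted values first
    return rng.choice(eligible[:k])[1]      # draw one of the k nearest donors

# Hypothetical donors: (predicted LOS, observed LOS) in days
donors = [(3.0, 3), (4.0, 5), (10.0, 12), (11.0, 9), (20.0, 25)]
print(pmm_impute_bounded(10.5, upper_bound=10, donors=donors, k=2,
                         rng=random.Random(1)))
```

Repeating the draw across multiple imputations, each with freshly estimated predictions, is what would carry the uncertainty through to the analytic model.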
Consult Stef van Buuren's "Bible"
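A minimal sketch of the donor-exclusion idea, not the `aregImpute` or mice implementation. It assumes each subject already has a predicted mean from some imputation model, and that the boundary is an upper limit on LOS (for example, days from birth to the first outpatient visit). The function name, arguments, and the choice of sampling from the `k` closest donors are illustrative assumptions.

```python
import random

def pmm_impute_bounded(pred_missing, upper_bound, donors, k=3, rng=None):
    """Impute one missing LOS by predictive mean matching with a boundary.

    pred_missing: predicted mean for the subject with missing LOS
    upper_bound:  largest admissible LOS, or None if no boundary is known
    donors:       list of (predicted_mean, observed_los) for complete cases
    k:            number of closest eligible donors to sample from
    """
    rng = rng or random.Random()
    # Exclude donors whose observed value violates the date condition.
    eligible = [d for d in donors if upper_bound is None or d[1] <= upper_bound]
    if not eligible:
        return None  # no admissible donor; the caller must decide what to do
    # Rank the remaining donors by closeness of predicted means.
    eligible.sort(key=lambda d: abs(d[0] - pred_missing))
    return rng.choice(eligible[:k])[1]

# Hypothetical donors: (predicted mean LOS, observed LOS in days)
donors = [(3.0, 3), (4.0, 10), (5.0, 4), (20.0, 25)]
imputed = pmm_impute_bounded(4.1, upper_bound=5, donors=donors, k=2,
                             rng=random.Random(0))
```

Here the donor with observed LOS of 10 days is the closest match on predicted mean but is dropped because it exceeds the subject's boundary of 5 days. Repeating the draw for each imputed dataset keeps the multiple-imputation structure.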
"Reason for missing is more important than proportion of missing"
If administrative missingness, MI would be less scary
 Assumption of MI: missingness depends on things that are measured
Develop logistic regression model for P(LOS missing)
 Include variables you hope are irrelevant to make sure they are
"Redistribution to the right"
Alexandra Flemington (Eddie Qian), Internal Medicine
Using the PILOT study, which assessed low, intermediate, and high oxygen goals and mortality, but now restratifying to look at the effect in patients with anemia. We would like guidance on the best biostatistical analysis strategy.
Oxygen saturation levels relating to invasive mechanical ventilation
Primary outcome: days free of MV & days alive
Conduct effect modification analysis: see if difference between groups is the same if we stratify by anemia
Planning to do secondary analysis using hemoglobin as a continuous variable
 See if difference in groups by hemoglobin level
PO model with interaction effect for time to control for seasonality
Frank: don't say "stratify"; say "effect modification"
Would look at hemoglobin upon enrollment
Another project could take hospital course as baseline and do landmark analysis
 Everyone in the hospital for x amount of days: describe by hemoglobin at presentation, standard deviation, and slope over time
"Response feature analysis"
 fully conditional
Time-dependent covariate analysis: always updating, results are more difficult to interpret
Spline functions & knots
 Use AIC to determine # of knots to use
Cindy: if landmark analysis, seek expert for interpretation
Bill: Fine as long as the conditional population is of interest: patients must be healthy enough to make it through that initial period of time
Planning to pursue VICTR voucher
2023 May 11
Erica Carballo (Courtney Penn), Gynecologic Oncology
We seek to estimate the annual percentage of patients with advanced-stage epithelial ovarian cancer in the United States who are eligible for and will derive benefit from PARP inhibitor therapy based on US FDA-approved indications. We will compare the rates of eligibility and expected benefit, then analyze these trends over time.
We accomplished the above using similar methods to this JAMA Oncology article:
https://pubmed.ncbi.nlm.nih.gov/29710180/ We don't understand how they did their statistics/sensitivity analysis. I'd be happy to send our data/methods so far before or after the meeting. I have an accepted abstract to a regional meeting, but this was without anything past basic descriptive statistics. We will need a more robust analysis for a publication.
% Benefitting over 2 years progression-free survival vs % Benefitting overall survival
Hoping to conduct a sensitivity analysis
Frank: In the absence of a Bayesian analysis, pay more attention to CI rather than point estimates
 Equated benefit with % benefitting: can't tell patient-level benefit without crossover study
 With parallel group design, can't make same determination (certain % of patients benefitted)
 Without knowledge of heterogeneity of treatment effect, default conclusion would be that everyone is benefitting at least a little bit (100%)
Going from "benefit" to "% benefitting" is a big difference
Progression at two years could be random
Can use Wilson CI: what is the probability that a person is eligible?
Response rate of the drug: imaging findings before & after treatment if certain criteria are met
Set up like a pre-post assessment: pretty noisy, better if tumor size is less random and measured accurately
Refining question: interested in who got the drug, and how many are benefitting
Response probabilities have a lot of noise
Looking at different publications
Get standard errors algebraically: (23 - 8.6) / 1.96; square root of sum of squares of the two standard errors * 1.96
Can't do analysis with hazard ratio: relative instantaneous risk of having event
Program "digitizer": can reproduce KM curves from publications
"Treatment difference", "efficacy estimate"
 Avoid "% benefitting"
Don't segregate study based on p-value being small or large
Two-year overall survival: mortality is too low
Progression-free survival: discussion on datamethods.org
 Use state-transition model (Markov)
 One weakness: what to do with non-related death?
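The algebra for backing out standard errors from published intervals can be sketched as follows. This assumes 23 is a point estimate and 8.6 is its lower 95% limit (the notes do not say; if they are the two limits of the interval, the divisor would be 2 x 1.96 instead). The second standard error is a hypothetical value for illustration, and the two estimates are assumed independent.

```python
import math

Z95 = 1.96  # normal critical value for a two-sided 95% interval

def se_from_ci(estimate, lower, z=Z95):
    """Back out a standard error from an estimate and its lower 95% limit."""
    return (estimate - lower) / z

def diff_margin(se1, se2, z=Z95):
    """95% margin of error for a difference of two independent estimates."""
    return z * math.sqrt(se1 ** 2 + se2 ** 2)

se_a = se_from_ci(23.0, 8.6)   # the (23 - 8.6) / 1.96 step from the notes
se_b = 5.0                     # hypothetical second standard error
margin = diff_margin(se_a, se_b)
```

The difference between the two published estimates, plus or minus `margin`, gives an approximate 95% interval for the treatment difference.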
2023 May 4
Nicholas Ward, Kim Petrie, Abby Brown, Biomedical Research Education and Training (BRET)
We are looking to follow up on our previous meeting (April 20, 2023) to refine our analysis strategy for two parts of our project.
First, in our prior discussion, we discussed the strategy of analyzing time spent in postdoctoral training using cumulative incidence curves. One issue we are trying to consider is how to handle alumni who graduated and pursued more than one postdoctoral training position. We know that some researchers do address recurrence in time-to-event analyses, but that this may make our own analysis more complicated. As an alternative, for those who have pursued postdoctoral training, we are considering a less complicated, but still very informative measure: the time to first non-training position. This is a one-time event that would eliminate the need to consider more complex analyses involving recurrence (i.e., the scenario of multiple postdoctoral training positions). We are hoping to discuss the data we have, and what the best option(s) for analysis would be.
Second, we were advised to use simple logistic regression to analyze whether or not a downward trend existed in graduates pursuing postdocs immediately after graduation. We have modeled this using year of graduation as the independent variable and choice to pursue a postdoc as the dependent variable (coded as either 0 for not pursuing a postdoc, and 1 for pursuing a postdoc). Given that we only need to detect a significant trend in these data, we are hoping to understand which analysis outputs are needed to support this when writing up our results (e.g., odds ratios with CIs, likelihood ratio test, area under ROC curve, goodness of fit).
Main Q: When looking across years, do we see a decline in the percent of students pursuing a postdoc immediately after graduation? -> simple logistic regression
 Appears model fit and utility for prediction isn't fantastic, but decreasing trend is statistically significant
B1 = 0.9659 (0.9468, 0.9852) -> significant
LR p-value better than Wald p-value (LR p-values behave better)
Superimpose predicted values from the model on first plot
When grouping students by career goals at graduation, do we see a difference in length of postdoc training? -> Time-to-event analysis like cumulative incidence curves; allows for right-censored data
 Accounting for recurrent events and discontinuous risk intervals (someone who takes a break after a postdoc)
Another complication: someone who seeks out a shorter postdoc after a really long one
How many have graduated, and how many have > 1 postdoc? If starting a second postdoc is extremely rare, that would simplify things (use right censoring)
Multistate model allows you to estimate things in more interesting ways
 Could call second postdoc a different state (adds more parameters and makes model more unstable)
Event used in time to event = conclusion of postdoc
 If use time to first nontraining position, model would be agnostic to how many postdocs, any gaps
 Atypical scenarios could arise
How many students start postdoc before graduation? Small, but non-negligible #; problem is that you would be in different states at the same time
 What is time 0?
Making event = time of defense could solve issues
 Could solve problem of student who takes faculty position after graduation
 MD-PhD students differ greatly from PhD-only students and will not be included
Could do state-transition model over continuous time; with discrete time, you would have to determine time unit (in this case, month)
State transition model handles censoring when records for participant stop appearing
P(transitioning in month 13 | postdoc at month 12): transition probabilities that are natural from the data
Stacked bar charts
How to handle folks with unknown career goals? Baseline variable: use side-by-side stacked bar charts
Resources/packages for state transition models: mstate or msm
Challenging to put multiple confidence bands on graph with multiple survival curves; would want to make separate plots
hbiostat.org/attach/z.pdf
 model all transitions; put it all together
 jointly model with two time variables: internal and external
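The discrete-time transition probabilities discussed above can be estimated directly from monthly records. This is a minimal sketch with hypothetical state names and data; it pools all months together (a time-homogeneous assumption), whereas a real analysis with mstate or msm would let the probabilities depend on time and covariates.

```python
from collections import Counter, defaultdict

def transition_probabilities(sequences):
    """Estimate P(state next month | state this month) from monthly records.

    sequences: one list of states per person, one entry per month. A
    sequence simply ends when the person's records stop (censoring).
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    return {
        state: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for state, c in counts.items()
    }

# Hypothetical monthly histories
histories = [
    ["postdoc", "postdoc", "postdoc", "job"],
    ["postdoc", "postdoc", "job", "job"],
    ["postdoc", "postdoc"],  # censored while still a postdoc
]
probs = transition_probabilities(histories)
```

A censored person contributes transitions up to the last observed month and nothing afterward, which is how the state transition approach handles records that stop appearing.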
2023 April 27
Shelby Meier, Alex Cheng, VICTR
We've recently finished a study investigating the effects of promised compensation offers on participant enrollment and study task completion. Participants were given information about what joining the entirely remote study would involve, and were then asked if they would join the study based on a randomly generated compensation offer between $0 and $50 ($5 increments). If they joined the study, they were asked to download the MyCap study app and complete up to 30 tasks over a 2-week period. One set of tasks was administered twice, and two other sets of tasks were administered daily. We also collected data on participant experience in the study. We have drafted a publication, but we would like feedback on the best way to analyze and present some of the data we collected.
Data has already been collected
Aim 1: Determine rate of study enrollment by level of compensation offered
Aim 2: Determine rate of study task adherence by level of compensation
Weekly three-question checklist, two daily tasks
Participants recruited through ResearchMatch
9,986 invites, 492 answered, 412 enrolled, 284 completed study (those who enrolled but did not download the app excluded)
$ amount revealed after answering invite
"There is some element of deception in this study"
Primary Q: Does compensation have an effect on enrollment?
 Proportion of participants enrolled in the study by Loess regression curve (Frank suggests using overall confidence bands rather than individual; use Wilson CI rather than the default)
Interpreting plot: would need to run a statistical test to determine if there is a trend; could use logistic regression model if this is of interest, test if coefficient is zero
Is there a different effect observed between income groups?
 Why was split made at $65k? No natural split
 Conduct logistic regression with two variables (promised compensation and categorical variables of income bucket)
 Predict how much of enrollment can be predicted from compensation amount, adjusted for income
Is there a different effect observed between racial groups? (self-reported)
Enrollment was almost too high to learn from
Appropriate by necessity -> need to limit # of variables
Put income groups, sex, age, and three buckets for race/ethnicity into logistic regression model
 Conduct some form of redundancy analysis to determine if any variables are inseparable
Does compensation have any effect on task completion?
Curve increases from $0 to $20, but eventually plateaus
 Wilson CI does not go above 100%
 Proportions are noisy because they are based on a tenth of the sample size
In previous study, Loess was chosen over logistic regression
Loess is really good with confidence bands, but doesn't give an overall assessment of "flatness"
 Just use confidence bands, no need for dots
Logistic regression supersedes Loess, just need to make sure it does not assume too much
Dots are medians, which cannot be used unless you have truly continuous variable (ties are a problem)
Want to make analysis more unified; don't want to use loess to visualize and logistic regression to analyze
Is there a difference in completion rates between low-frequency and high-frequency tasks?
Use quadratic effect for compensation (use compensation and square of compensation in the model)
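For reference, a short sketch of the Wilson interval mentioned above, using the standard formula with a 1.96 critical value. The function name is illustrative; the example reuses the 412 enrolled out of 492 who answered.

```python
import math

def wilson_ci(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    # Clamp only to guard against floating-point rounding at the edges.
    return max(0.0, center - half), min(1.0, center + half)

low, high = wilson_ci(412, 492)      # enrolled among those who answered
edge_low, edge_high = wilson_ci(40, 40)  # hypothetical group where everyone completed
```

Unlike the default Wald interval, the limits stay inside [0, 1] even when the observed proportion is 100%, which matters for the per-compensation-level proportions that are based on roughly a tenth of the sample.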
2023 April 20
Nicholas Ward, Kim Petrie, Biomedical Research Education and Training (BRET)
We are analyzing how long alumni from our biomedical graduate programs spend in postdoctoral training. For these analyses, we are grouping the data based on two reference points: 1) the student's career goal as identified at graduation, or 2) the career that the student ultimately ended up in 10 years after graduation. For each of these two reference points, we are looking to see if length of postdoctoral training differed between 6 different career paths. We hope to make pairwise comparisons between each of the career paths. We have a cohort of 325 students who identified a career goal at graduation (group 1 from above), and 509 students for whom we know the career outcome 10 years after graduation (group 2 from above). There are 214 students who belong to both of these groups. We are looking for advice on how to make these multiple comparisons given that some students belong to both of these groups, while others only belong to one of the two groups. We also have a second data set detailing by graduation year how many students pursued a postdoc (percentage calculated as # pursuing postdoc divided by total # of graduates). We seem to note a downward trend in these percentages. We hope to examine direction and statistical significance of the trend over time. Is it appropriate to use a Mann-Kendall test on the percentages or is another approach more advisable?
Interested in career outcomes of postdocs
Q1: In terms of length of time in postdoc:
 Do we see difference between groups when grouping by a student's career goal at graduation?
 Do we see a difference between groups when grouping by the student's actual career outcome 10 years after graduation?
Q2: looking across years do we see a decline in the percent of students pursuing a postdoc immediately after graduation?
Five categories at graduation: Academic Research, For-profit research, govt/nonprofit research, AMO, undecided
Cass: Do you collect data on people whose career path changed dramatically?
Nick: We have data collected at 1, 3, 5, and 10 years; we do see lots of folks make that transition
Documentation exists for how particular careers map to the specified categories
For some trainees, we have goal at graduation, but do not yet have Y10 career: not yet 10 years post-PhD
 Need to distinguish between missing you should have obtained (NA) versus missing because not yet 10 years post-PhD (administrative missingness)
 Treat as right-censored
Kaplan-Meier curve => Cumulative incidence curve
Create extra category for administrative missingness
111 with goal at grad known but not Y10 career (cohort C), 295 with Y10 career known but not goal at grad (cohort D), and 214 with both known
Goal: within the two cohorts, determine if there are differences in postdoc length
Cohort C is the one that would be most affected by right censoring
Time-to-event problem: try to have one analysis per question
 How did probability of having postdoc position vary with time and with goal at graduation?
Call those who didn't have exit survey a group: bookkeeping and to complete the picture
Equal opportunity surveying a concern
Think about raw data rather than groups: 2, 3+, 3, 4+, etc.
Analysis conditional on start of postdoc
7 cumulative incidence curves, each with their own colored confidence bands
Transitioning to Q2... do we see decrease in % of postdocs in more recent years?
Do we see folks taking a break after graduation before postdoc? Some, but not frequent
Raw data would be year of graduation and yes/no => produce plot to visualize time trend with confidence bands
 could also use logistic regression to answer a variety of questions
 time to postdoc would be right censored for those who do not have Y10 career
multistate model with time to event analysis as a special case
Market forces change over time: unemployment, job openings, economic conditions (time-dependent covariates)
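A small sketch of how a cumulative incidence curve handles right-censored trainees, computed as one minus the Kaplan-Meier estimate. It assumes a single event type (end of postdoc) with no competing events, and the data are hypothetical; confidence bands are omitted.

```python
from itertools import groupby

def cumulative_incidence(times, events):
    """1 - Kaplan-Meier estimate at each event time.

    times:  follow-up time per person (e.g. months since postdoc start)
    events: 1 if the postdoc ended at that time, 0 if right-censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    for t, group in groupby(data, key=lambda r: r[0]):
        flags = [e for _, e in group]
        d = sum(flags)  # events observed at time t
        if d:
            surv *= 1 - d / at_risk
            curve.append((t, 1 - surv))
        at_risk -= len(flags)  # events and censored both leave the risk set
    return curve

# Hypothetical: four trainees, one censored at time 3
curve = cumulative_incidence([2, 3, 3, 5], [1, 0, 1, 1])
```

The censored trainee counts in the denominator up to time 3 and then drops out, rather than being discarded or treated as having finished.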
2023 April 13
Eesha Singh (Matthew Meriweather), Neurology
I am applying to access the Get with the Guidelines stroke database to analyze the correlation between certain socioeconomic variables and diagnostic testing done as part of patient workup. Mentor confirmed.
Health equity/access
Frank: possible some folks are not communicative, making interaction shorter?
Eesha: Health literacy could have something to do with it. Literacy measure & provider characteristics & patient address/social vulnerability index not available (zip code is available, but could capture a wide range of socioeconomic statuses)
Goals are descriptive  is there a correlation/disparity
Study population: all patients who present with ischemic/hemorrhagic stroke; 2000 hospitals, n = 5 million patient records
 Would shy away from hypothesis testing with a sample size that big; do estimation instead
 Magnitude of correlation is of interest, less so the p-value
Secondary analysis: variable clustering (see which variables run together; helps to understand true dimensionality)
 Helps to identify if some variables are restatements of others
Correlate number of tests ordered with anything you're interested in
Stroke workup should not vary significantly by ethnicity/economic status
Group by suspected mechanism
Another type of analysis: characterize how people of a certain ethnicity differ from others
 For a particular ethnicity, if older folks do not seek out care, age distribution for that ethnicity would differ from others
 Predict ethnicity of a person based on other covariates
Neurology dept does not have a biostat person at present
VICTR voucher is an option if you don't want to do analysis yourself (preferable)
 ~12 week timeline to get started; don't apply too early; can extend, but we try to discourage this
 One thing that can hold up a voucher: data cleaning
Deadline for analysis: 9/29/2023, expected to present 2/2024
 Should start no later than end of July
2023 April 6
Lyndsay Nelson (Lindsay Mayberry), Medicine/General Internal Med and Public Health
As part of an upcoming grant submission, we are proposing to evaluate the effectiveness and implementation of a mobile health intervention for diabetes self-management support. We will implement the intervention in 12 clinic sites in middle TN. We would like to review our plans to evaluate effectiveness (effects on HbA1c) using a pre-post/interrupted time series design. Mentor confirmed.
REACH study: text messaging intervention, overcoming barriers to medication adherence
Results are published: improved adherence and A1c (for those with more room to improve)
 Effect waned over time; clinicians indicated results were "over the top effective"
12 clinical sites interested in implementing; outcome will be A1c improvement
Consistent body of evidence that text message intervention improves adherence
Q: Can clinics implement this, and what would it look like?
We think the best thing is an implementation study (hybrid type 2)
Clinics are opposed to randomization; they don't want to retest given the overwhelming body of evidence
Even short term reductions are meaningful
Robert: Population was more homogeneous (smaller range of A1c), so could get away with a simpler model
Frank: Impact of 0.5 reduction is dependent on baseline measure
Figure: In control group, saw some regression toward the mean
Frank: mean might not be the best summary measure for A1c
 Worry for loss of effect from figure
Robert: mechanism of action is removing barriers to adherence
 Not unusual to see measures regress to baseline after making progress (akin to person trying to quit smoking)
 Potential need for repeated exposures for sustained behavioral change
REACH intervention effect size: overlay histogram and fit spline
Frank: excitement over subgroup needs to be tempered by context of bigger picture
Robert: what would be most persuasive analysis of data when patients self-select?
 Time Zero = informative
Frank: measuring attention & engagement is not sufficient to show treatment is working, but it is necessary
Robert: measure of engagement: people can choose to respond whether they took meds or not
Robert: want model that can capture +/- 6 month comparison
 By design: ignore values +/- 2 months from start
 Include calendar time in model to account for seasonal effects
 Worried about medications a participant is prescribed at time A1c measure is taken
Frank: How far back from enrollment will A1c measurements go?
Robert: Could get several years for participant who has been going to that clinic for that length of time
 Would be surprised if 90% did not have at least a year
Frank: emphasize slopes or averages?
 Concern of dropout from text messaging or participants not returning to clinic for A1c measurement
Hanging hat on A1c = risky
"We want to use the best available model for observational studies of this type to characterize long-term A1c success of a group of patients, accounting for past history of A1c"
 What is A1c a function of?
Lindsay: Do we leverage participants in clinic who don't sign up?
Robert: No; people who sign up are distinct from those who do not
Andrew: I would still be interested in having that information, even if we're not incorporating in the model
The problem: no signup date
 Frank: If you can infer what it would have been +/- a few weeks, could give basis for Andrew's comparison
McKenzie: What about participants who did not sign up, but then did a few weeks later?
Instrumental variable analysis
2023 March 30
Ronak Mistry (Benjamin Tillman), Hematology/Oncology
We are trying to determine if the quantitative D-dimer and fibrinogen improve as platelets improve in patients with heparin-induced thrombocytopenia. Assistance with determining correlation of individual patient data and then composite data. Mentor confirmed.
Can serologic markers help to predict platelet improvement?
 Which of the two serologic markers (fibrinogen, D-dimer) can best predict improvement?
 Determine correlation coefficient for each?
13 patients, repeated measures
Interested in markers as being a preview of platelets (can we reliably say if fibrinogen improves, then platelets will also improve)
 What is the largest lag such that the correlation is preserved (no worse than 0.6)? How far of a look-ahead can you get?
 Is the most recent serologic value more informative, or the trajectory/slope?
Multivariable regression: calculate R^2 for fibrinogen and D-dimer, assess whether majority of correlation for one with platelets is already accounted for with the other
"Cross-correlation analysis": spaghetti plots & regression/R^2, looking at various lags
Day to day measurements, few gaps
Knowing why data are missing is important
Exclusion criteria: negative HIT assay
Going to get the ball rolling on a VICTR voucher
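The lag question can be sketched for a single patient as below: correlate the marker on day t with platelets on day t + lag, and look for the largest lag where the correlation stays at or above 0.6. The series are hypothetical, daily measurements without gaps are assumed, and with 13 patients the per-patient results would still need to be combined in a way that respects the repeated measures.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def lagged_correlation(marker, platelets, lag):
    """Correlate marker on day t with platelets on day t + lag (one patient)."""
    if lag == 0:
        return pearson(marker, platelets)
    return pearson(marker[:-lag], platelets[lag:])

# Hypothetical daily values for one patient
marker = [1, 2, 3, 4, 5, 6]
platelets = [5, 7, 10, 20, 30, 40]
by_lag = {lag: lagged_correlation(marker, platelets, lag) for lag in range(4)}
```

Each extra day of lag drops one pair of observations, so correlations at long lags rest on very few points per patient.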
2023 March 16
Kai Wang (Andrea Birch), Radiology
We would like help to analyze the text within our survey responses, and further help to analyze our other data we collected from the survey. Survey has 125 responses. Mentor confirmed.
Wrote grant last year with intention of enhancing services to women of color
Questionnaire to ascertain demographic info & open-ended questions; distributed at churches, nail salons, grocery stores, sorority, etc.
Enrollees all African American women ages 40-64 (no disease history required)
4 Nashville zip codes where population is > 40% black
Qualtrics used to analyze data
How many received an invitation for the survey?
Self-selection/nonresponse bias an issue
 Respondents with Vanderbilt email are more likely to receive care at Vanderbilt (stratify)
Could collapse zip code to another dimension (rurality of zip code, median income of household, distance from center of zip code to closest center of excellence/Vanderbilt)
 Translate from categorical to numerical scale
Open field questions: "Is there anything else you would like for us to know?"
 Required field to complete the survey
Rank correlation coefficient for age ranked in order 1-5
 Don't stress pvalues
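A minimal sketch of a rank correlation for ordered survey categories such as the age groups above. Spearman's coefficient is used here as one common choice (the notes do not name a specific coefficient), with average ranks for ties since ordered categories produce many of them. The data are hypothetical.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical: age group (1-5) against an ordinal survey item (1-4)
age_group = [1, 2, 2, 3, 4, 5, 5]
item = [1, 1, 2, 2, 3, 4, 3]
rho = spearman(age_group, item)
```

Consistent with the advice not to stress p-values, the magnitude of `rho` is the quantity of interest.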
Yaa Kumah-Crystal, Biomedical Informatics
I am doing some general evaluation about ChatGPT and its reliability in science. I was curious how it would do with a statistical question. I provided some information and it gave me an answer but I am not sure if the answer is actually correct or if it just “sounds good” as these tools are known to “hallucinate.” Can someone let me know if this analysis is appropriate or completely made up? I know that there are nuances that should be taken to account when choosing a statistical test like how the data is distributed, etc. – but overall is the approach and explanation given by ChatGPT below generally reasonable, or completely wrong?
Search "ChatGPT" on datamethods.org to get Frank's thoughts; prone to leading questions
 Frank's experience: fast code for getting the wrong answer
Dr. Kumah-Crystal conducted a chi square test using ChatGPT
Assess whether student performed significantly worse on any one of five categories of questions
ChatGPT ignored chi square test's independence assumption
"Significant" language does not belong in a null hypothesis
2023 March 2
Emily Harriott (Laurie Cutting, Laura Barquero), Neuroscience
I have a dataset of 202 participants, combined across 2 studies. I want to predict standardized reading and math assessment scores using metrics of white matter structure for 18 white matter tracts (1 measure of structure for each tract, 18 tracts). Currently, I am using lasso regressions (throw all 18 tracts into the model, see which ones are important predictors, then put those selected predictors into a normal OLS regression model) to do this. I am not sure if I am using these regressions correctly, or even if I should be using them at all (should I use ridge or elastic net or relaxed lasso instead?). If I am to use lasso regressions as I have been doing, I am particularly concerned about if I did them correctly because I was told that the coefficients from the lasso regressions should match the coefficients from the OLS regressions of the selected predictors, and they do not match.
I am conducting these analyses for a poster presentation at a conference at the end of March. Mentor confirmed.
Research Q: Which tract(s) as measured by FA best predicts reading and math scores?
 Build regression model (ridge, lasso, elastic net, or something else?)
 Ridge more reliable than lasso & elastic net
 Fatal flaw of lasso: trying to do selection => low probability of selecting right signals, not enough information
4 outcome measures: 2 lower level (reading, math) & 2 higher level (reading & math)
Multicollinearity in tracts
Can investigate redundancy and correlation (see Frank's analysis file)
Explore PCA
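On the question of why lasso coefficients did not match the OLS refit: a penalized fit shrinks coefficients, so a match is not expected. The sketch below uses the textbook special case of orthonormal predictors, where the lasso solution of (1/2)||y - Xb||^2 + lam * ||b||_1 is soft-thresholding of the OLS coefficient and the ridge solution of (1/2)||y - Xb||^2 + (lam/2) * ||b||^2 is proportional shrinkage. The 18 tracts are correlated, so this illustrates the mechanism only; the function names and numbers are hypothetical.

```python
import math

def lasso_orthonormal(b_ols, lam):
    """Lasso coefficient under orthonormal predictors: soft-thresholding."""
    shrunk = max(abs(b_ols) - lam, 0.0)
    return math.copysign(shrunk, b_ols)

def ridge_orthonormal(b_ols, lam):
    """Ridge coefficient under orthonormal predictors: proportional shrinkage."""
    return b_ols / (1 + lam)

# Hypothetical OLS coefficients for three tracts
ols = [0.8, -0.2, 0.35]
lasso = [lasso_orthonormal(b, 0.3) for b in ols]
ridge = [ridge_orthonormal(b, 1.0) for b in ols]
```

The lasso sets the small coefficient exactly to zero and pulls the others toward zero by `lam`, while ridge keeps every predictor and shrinks all of them, which fits the note that ridge avoids the selection step.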
2023 February 23
Terrin Tamati, Otolaryngology
I would like to analyze performance (accuracy, response time) on a behavioral assessment in our patient group. I plan to use mixed models, and I'd like to ask about the appropriate fixed and random effects structure.
Sentence verification test (48 sentences: half true, half false): measure accuracy and response time
Interested in effect of talker on accuracy and effect of talker on response time
Manipulate who speaker was (three male and three female)
Logistic mixed effects model
Sample size: 13
Set up as repeated measure (don't want to treat as completely new instance)
 Subject ID
Random effects: something you can't control (different sites)
Worried about overfitting
Look into VICTR voucher/R clinic (https://biostat.app.vumc.org/wiki/Main/RClinic) for coding questions
Steven Allon, General Internal Medicine and Public Health
Our study is a multicenter, comparative efficacy cluster RCT of different journal club formats for internal medicine residents. Primary outcomes will be (1) subjective engagement and (2) critical appraisal skills.
Goals: 1. Revise existing subjective engagement questionnaire with your input to ensure unbiased questions and appropriate scaling. 2. Discuss process for achieving content validity in questionnaire (feedback from manuscript reviewer from previous trial). 3. Discuss process for creating and validating a novel measure to assess critical appraisal skills that can be integrated into existing journal club sessions.
Critical appraisal by Berlin questionnaire
Tools and outcome measures we care about don't exist; our aim is to create some!
Five centers, 140 participants; all sites using gamified journal club curriculum
Hypothesis 1: do different JC formats yield different levels of engagement?
Hypothesis 2: do different JC formats yield different improvements in critical appraisal?
VICTR voucher an option
Design looks reasonable (no obvious methodologic flaws)
Put out an email to connect with someone to consult with for Education research
No measure of JC engagement in literature
CREATE framework: used to capture engagement
 Psychologist would be helpful in assessing appropriateness of instrument design
Embed questions throughout journal club proceedings
Randomization for which site begins with which format
Co-primary endpoints
REDCap app for cross-site data collection
2023 February 16
Jacob Jo (Douglas Terry), Neurosurgery
Hope to discuss a new project idea. Currently configuring the methods section. In short, we hope to see how variability in Post Concussion Symptom Severity Scores (PCSS) (0-128, over 22 questions/symptoms) affects outcomes. Mentor confirmed.
Questionnaire administered after concussion: physical, cognitive, sleep, psychological
PCSS has proven robust in terms of recovery time from concussion
How does variability affect outcomes? 6 x 6 + 0 x 16 = 36, but 2 x 15 + 1 x 6 + 0 x 1 = 36
 A few bad symptoms or lots of mild symptoms: which profile is worse?
Formulate scale in different ways  have a contest on which one predicts time to recovery the best
Idea: Hierarchical scale with clinical overrides
Analysis ideas:
 Variable clustering analysis: rank correlation (determine which questions have overlapping information)
 Predict total in forward/stepwise fashion
 Principal components/sparse principal components
Instead of scoring each item 1-6, score as S1-S6
 Run regression analysis, solve for S's to derive appropriate weighting
 Compete with original scale to see if more predictive
Five additional opportunities: use adjusted R^2
Redundancy analysis: which question answers can be predicted by other question answers?
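The two symptom profiles from the arithmetic above (6 x 6 + 0 x 16 versus 2 x 15 + 1 x 6 + 0 x 1) can be compared directly. This sketch shows that the totals agree while simple variability summaries do not; the particular summaries chosen are illustrative candidates for the "contest", not an established scoring method.

```python
from statistics import pstdev

# 22 items each; both profiles sum to 36
few_severe = [6] * 6 + [0] * 16
many_mild = [2] * 15 + [1] * 6 + [0] * 1

def summarize(items):
    """Total score plus a few candidate measures of symptom variability."""
    return {
        "total": sum(items),
        "n_nonzero": sum(1 for s in items if s > 0),
        "max": max(items),
        "sd": pstdev(items),
    }

summary_severe = summarize(few_severe)
summary_mild = summarize(many_mild)
```

Any of these summaries could be entered alongside the total when predicting time to recovery to see whether the profile adds information beyond the total.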
2023 February 2
Stacy McIntyre (James Tolle), Pulmonary Critical Care
Retrospective analysis of outcomes of transitions of care from the pediatric to the adult CF clinic within our institution. Outcomes include primarily lung function (FEV1), BMI, and number of hospitalizations/exacerbations in the year prior compared to the year after transition. Mentor confirmed.
Sample is in last five years
~ 85% transitioned at age 18 (a few a little older, a few a little younger); we know age of transition for everyone
Make data long: Column A = patient identifier, Column B = Date, Column C on = measures
Spaghetti plot: you see all data (no summary involved)
Most participants will have E/T/I date, which increases lung function
 Only a few participants not eligible by genotype
 We will know when they started and assume compliance (rate of noncompliance = very low)
 Include two variables: did they use medication (yes/no) and if yes, date
N = 60; type of modeling and conclusions are limited
 Can say, "Of people with lung function that declined __ amount, __ % ..."
 "This is what we observed in our population"
No one uninsured in the population, so question would be is private insurance > Medicaid
Stratify across variables of interest
If you have dates of clinic visits/hospitalizations, calculate intervals between them
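A small sketch of the long layout described above (patient identifier, date, then measures) and of computing intervals between consecutive visits. The rows and the FEV1 values are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Long format: one row per patient per visit (id, visit date, FEV1 % predicted)
rows = [
    ("P1", date(2020, 1, 1), 80.0),
    ("P1", date(2020, 4, 1), 78.0),
    ("P2", date(2021, 3, 1), 90.0),
    ("P1", date(2020, 1, 31), 79.0),
]

def visit_intervals(rows):
    """Days between consecutive visits for each patient."""
    by_patient = defaultdict(list)
    for patient_id, visit_date, *_ in rows:
        by_patient[patient_id].append(visit_date)
    intervals = {}
    for patient_id, dates in by_patient.items():
        dates.sort()
        intervals[patient_id] = [(b - a).days for a, b in zip(dates, dates[1:])]
    return intervals

gaps = visit_intervals(rows)
```

The same long table feeds a spaghetti plot directly: one line per patient identifier, date on the x-axis, measure on the y-axis.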
2023 January 26
Megan Wright (Jessica Gillon), Pharmacy
My project is looking at the impact of requiring stop dates on antibiotic orders at the children's hospital. We have 4 time periods that we are looking at and would like help understanding what statistical tests should be run. Mentor confirmed.
Raw data: individual dose administered; what it was indicated for and what day it was given
Gap where things are uncertain; remove that time interval from analysis
Duration varies by season; COVID also happened during all of this
 Seasonal variation is smooth, COVID is more discontinuous
Use general regression model which accounts for all forces in play
 On/off variable that is discontinuous (are you before policy change or after policy change)
You're interested in signal after subtracting off other effects
Want to pull all data, not just June-November
 Might be unlucky on what months you pulled
Days from a starting point for the whole study (more resolution)
Time trends: seasonal, but could have other trends (staff adherence)
Reason for weird ANOVA variances: possibly outliers
Apply for VICTR voucher?
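One way to sketch the regression idea from this session: model the outcome on days since study start with a smooth seasonal term, a linear trend, and an on/off indicator for the policy change. Everything below is an assumption for illustration: the data are simulated, the policy day and coefficients are invented, and a single yearly sine/cosine pair stands in for whatever seasonal shape the real data show.

```python
import math
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def design_row(day, policy_day):
    """Intercept, yearly sine and cosine, trend in years, policy on/off."""
    angle = 2 * math.pi * day / 365.25
    after_policy = 1.0 if day >= policy_day else 0.0
    return [1.0, math.sin(angle), math.cos(angle), day / 365.25, after_policy]


def fit_ols(rows, y):
    """Ordinary least squares via the normal equations."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear(xtx, xty)


# Simulated weekly mean antibiotic duration (days); all numbers hypothetical.
POLICY_DAY = 540
TRUE_COEF = [7.0, 1.0, 0.5, 0.2, -1.5]  # last entry is the policy effect
rng = random.Random(1)
days = list(range(0, 1096, 7))
X = [design_row(d, POLICY_DAY) for d in days]
y = [sum(b * x for b, x in zip(TRUE_COEF, row)) + rng.gauss(0, 0.2) for row in X]

coef = fit_ols(X, y)
policy_effect = coef[4]  # estimated shift after subtracting season and trend
```

The policy coefficient is the signal left after the other effects are subtracted off. A COVID indicator or a more flexible seasonal curve could be added as further columns in `design_row`.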
Bradley Hall (Lauren Connor), Plastic Surgery
We would like to compare suture technique for patients undergoing a surgical procedure to determine if there is any difference in outcomes between the two techniques. Some believe that one technique may lead to higher complication rates, but we do not believe that is the case. And if that's true then that technique would have a number of other benefits including less time in the OR, less cost, less risk to providers, and potentially fewer postoperative issues. Mentor confirmed, did not attend.
Equivalence trial, or noninferiority (both high sample sizes, latter slightly lower); looking for 5% difference
How many would we need in each study arm?
Yes/No => lowest resolution, need highest sample
 N = 100-200; rules out Yes/No entirely (no close calls)
get consensus on outcome severity => hierarchical levels; get more out of your sample size
 Want levels to be well populated
Not appropriate for VICTR voucher
RCTs very involved; should get a dedicated statistician
Statisticians should be embedded from beginning of the study to the end
How do you use REDCap for randomization? 1:1? Blocked?
Next steps:
 Involve dept chair: have him/her contact Yu Shyr
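A rough sketch of why a yes/no outcome needs so many patients, using the standard normal-approximation formula for a non-inferiority comparison of two proportions. The 10% control complication rate, one-sided alpha of 0.025, and 80% power are assumptions made for illustration; only the 5% margin comes from the discussion.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm_noninferiority(p_control, margin, alpha=0.025, power=0.80):
    """Approximate patients per arm, assuming both arms share the same true rate.

    Uses n = (z_alpha + z_power)^2 * 2 * p * (1 - p) / margin^2 with a
    one-sided alpha. This is a planning approximation, not a substitute
    for a statistician's design calculation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_power = NormalDist().inv_cdf(power)
    variance = 2 * p_control * (1 - p_control)
    return ceil((z_alpha + z_power) ** 2 * variance / margin ** 2)


# Hypothetical inputs: 10% complication rate, 5 percentage point margin.
n_binary = n_per_arm_noninferiority(p_control=0.10, margin=0.05)
# A higher assumed complication rate pushes the requirement up further.
n_binary_higher_rate = n_per_arm_noninferiority(p_control=0.20, margin=0.05)
```

Under these assumptions the requirement is several hundred patients per arm, well beyond N = 100-200, which is the motivation for an ordinal outcome with well-populated severity levels.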
2023 January 19
Stacy McIntyre (James Tolle), Pulmonary Critical Care
Our project is a retrospective chart review analyzing outcomes of cystic fibrosis (CF) patients who have transitioned from our pediatric clinic to our adult CF clinic. We would like to discuss biostats needed to evaluate associations between patient factors and outcomes. Mentor confirmed.
Sample of about 60
"Port CF" database: de-identified Excel sheet in OneDrive; prospective registry
Data need to be cleaned (remove comments)
Biggest burning question: Do outcomes change after transition?
Analyze BMI using age as a covariate; subtract out the effect of age to look at effects of other vars
Try to avoid percentile approach; makes too many assumptions (linearity, normality)
 Stay as close to raw data as possible; pull original BMI data
Would learn a lot from spaghetti plot (red before transition, green after); can see data gaps
Tall and thin data set; date + BMI
Some patients will have less data after transition (one year) => analysis will account for that
Problem with mean change = regression to the mean (someone could have good or bad day or measurement error)
Testing of significance must account for data pairing
Big picture: look at time continuously when possible
Living situation data not well defined; employment status, student status, & health insurance best we have (must deal with missingness, though)
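A small simulation of the regression-to-the-mean problem noted above, assuming nothing truly changes between two measurements. The BMI mean, spread, and measurement error are hypothetical values, not estimates from the registry.

```python
import random
from statistics import mean

rng = random.Random(2023)
N = 5000
TRUE_SD, ERROR_SD = 4.0, 2.0  # hypothetical between-patient and day-to-day spread

# Each patient has a stable true BMI; both measurements add independent noise.
true_bmi = [rng.gauss(22.0, TRUE_SD) for _ in range(N)]
first = [t + rng.gauss(0, ERROR_SD) for t in true_bmi]
second = [t + rng.gauss(0, ERROR_SD) for t in true_bmi]

# Whole sample: mean change is near zero, as it should be.
overall_change = mean(b - a for a, b in zip(first, second))

# Patients selected because their first value was in the lowest tenth.
cutoff = sorted(first)[N // 10]
low_group = [(a, b) for a, b in zip(first, second) if a <= cutoff]
low_group_change = mean(b - a for a, b in low_group)
```

The selected group appears to improve even though no true change occurred, which is why mean change in a subgroup chosen on its baseline value is hard to interpret.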
Leon Scott, Orthopaedic Surgery
I want to set up a pilot study evaluating the effect of a low-energy diet (LED) intervention on measures related to weight, osteoarthritis, hypertension, and diabetes. The pilot study is to (a) test the intervention on a small scale before requesting funding for a sufficiently powered study and (b) ensure I have the infrastructure to execute the more extensive study effectively. My question for Biostatistics Clinic is, "are the statistical measures in my specific aims appropriate?"
Specific Aims and Hypothesis Aim 1: To evaluate the effect of an LED diet intervention, including pre-prepared meals, on weight. Hypothesis: Subjects will demonstrate a clinically significant reduction in weight (15%) at 12 weeks compared to their baseline. Approach: This aim is designed to compare mean differences in weight at the onset and endpoint of the study. As such, the data is from two paired datasets. The mean weight difference will have a normal distribution derived from a parametric variable. A paired t-test will be used to measure the difference between groups. Secondary outcomes will be the proportion of subjects that reach a 10% and 20% weight loss threshold. Too few subjects will be included to perform regression analysis of which variables (e.g., gender, the initial level of obesity, age) predict meeting those weight loss thresholds. This is a pilot study with five subjects. In the future, the sample size will be powered to determine a difference with a beta error of 0.2 using a % weight loss standard deviation of 3.9%.
Aim 2: To evaluate the effect of diet intervention on knee osteoarthritis patient-reported outcomes measures. Hypothesis: Subjects will demonstrate a clinically significant improvement in the Visual Analog Score (VAS) for pain (2 points) at 12 weeks compared to their baseline. Approach: This study compares paired mean differences of pre- and post-VAS scores at the onset and endpoint of the study. Since the mean differences of a nonparametric VAS score have a nonnormal distribution, a Wilcoxon Rank-Sum Test will be used to measure the difference. Similar evaluation will be performed for secondary outcomes of KOOS subscales, WOMAC, and SF-12 scales. This is a pilot study with five subjects. In the future, the sample size will be powered to determine a difference with a beta error of 0.2 using a VAS standard deviation of 1.1.
Aim 3: To evaluate if a LED diet intervention has a clinically significant change in markers of hypertension and T2D. Hypothesis: Between the onset of the study and the conclusion, subjects will experience improvements in systolic blood pressure, diastolic blood pressure, and HgbA1C. Regarding blood pressure, we hypothesize that 100% of the subjects will experience a 50% improvement in their baseline systolic and diastolic blood pressures and the goal of 120/80 mmHg. This compares baseline and endpoint datasets in a single population with nonnormal distribution since we are evaluating proportions that meet a blood pressure goal, not the blood pressure numbers themselves. The statistical measure that will show significant change is a Wilcoxon Rank-Sum test. Regarding the HgbA1C, we hypothesize an average 1.0% point change at three months. Our statistical measure for if the group demonstrates this degree of change is a paired t-test (the expected standard deviation for a 1.29% change is 1.32%).
Aim 4: To evaluate if a LED diet intervention, including pre-prepared meals, reduces the proportion of patients using non-protocol interventions. Hypothesis: The proportion of subjects that use a non-protocol intervention (e.g., oral/topical NSAIDs, other oral/topical analgesics, corticosteroid injections in the previous three months, braces, units of short-acting insulin & units of long-acting insulin, etc.) will be lower and reach statistical significance (p<0.05) after the intervention compared to those subjects' paired values at the onset of the study. Approach: This aim is to compare two proportions at various time points with two data sets. Statistical significance will be measured using the McNemar test. Furthermore, using non-protocol interventions will be evaluated to see if they predict the clinical changes in patient-reported outcomes and weight using logistic regression.
Sample size of 300 at minimum for binary outcomes
 Scatterplot?
Paired t-test would be fine; Wilcoxon signed-rank test more robust
BMI > 35 to be included in study
 Regression to the mean an issue (caught person on good/bad day, measurement error)
Admit patients if maintained stable BMI for a particular period of time? Unstable correlates with having higher/lower BMI...
Fidelity to the diet is less of a concern
Example: reliability of self-reported food intake = not good
Recalling intake might increase possibility of cheating
Signed rank test will work for aim 2
Wilcoxon rank difference test good for paired data: same p-value no matter how you transform the data; robust
Rank difference test for aim 3
 Make patient their own control; pre-post, not against 120/80
 Wait x minutes, measure; wait, measure; use same instrument, keep other factors constant
Meals will be delivered to participant's house
2023 January 12
Doug Bryant (W. Evan Rivers), Physical Medicine and Rehabilitation
Endoscopic rhizotomy systematic review: follow-up from last meeting on May 5th, 2022. Data collection is complete, would like to further discuss VICTR application process. Mentor confirmed.
Six studies identified that met inclusion criteria; these were determined to meet acceptable standard of care
Preprocedural screening
Any population-level differences across the studies should be adjusted for
Are studies randomized? Of the ones with comparison groups, one was randomized, others were cohort studies
Next steps: assess amalgamated effect
Meta-analysis can properly account for study-to-study variation
Simple pooled analysis (CI will be falsely narrow)
 How should you weight?
Accounting for Time Zero across studies
Randomized trials = the best
"Surgery before?" could be key covariate
Nail down grouping and modeling in further dialogue
 Make goals, candidate studies, and assessed outcomes clear
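One common answer to the weighting question is inverse-variance weighting, with a random-effects version (DerSimonian-Laird) that adds between-study variance so the interval is not falsely narrow. The sketch below assumes each study supplies an effect estimate and its standard error. The six values are invented for illustration and are not from the reviewed studies.

```python
from math import sqrt


def pooled_fixed(effects, ses):
    """Inverse-variance pooled estimate, ignoring between-study variation."""
    weights = [1 / s ** 2 for s in ses]
    estimate = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    return estimate, sqrt(1 / sum(weights))


def pooled_random(effects, ses):
    """DerSimonian-Laird random-effects estimate, standard error, and tau^2."""
    weights = [1 / s ** 2 for s in ses]
    fixed, _ = pooled_fixed(effects, ses)
    # Cochran's Q measures how much the studies disagree beyond chance.
    q = sum(w * (e - fixed) ** 2 for w, e in zip(weights, effects))
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)
    # Between-study variance is added to each study's own variance.
    weights_re = [1 / (s ** 2 + tau2) for s in ses]
    estimate = sum(w * e for w, e in zip(weights_re, effects)) / sum(weights_re)
    return estimate, sqrt(1 / sum(weights_re)), tau2


# Hypothetical change in pain score and standard error for six studies.
effects = [-1.8, -2.5, -0.9, -3.1, -1.2, -2.0]
ses = [0.5, 0.7, 0.4, 0.9, 0.6, 0.8]

fixed_est, fixed_se = pooled_fixed(effects, ses)
random_est, random_se, tau2 = pooled_random(effects, ses)
```

The random-effects standard error is never smaller than the fixed-effect one. This approach does not by itself adjust for population-level differences or for the mix of randomized and cohort designs.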
VICTR voucher good for a year
"Spreadsheet from hell" on website: things to avoid
Brett Kroncke, Medicine/Clin Pharm
Testing genetic features' ability to predict risk of cardiac events. Recorded data are age at first event, frequency of subsequent events (some are age at subsequent event), and use (start date and duration) of controlling medication. I would like to use these data to evaluate the ability of genetic features to predict these events, controlling for other clinical features and use of medications.
Predict risk of event given carrier heterozygous status
Control for known clinical features: corrected QT interval, age at first event, event rate, age at all subsequent events, age at beta blocker use, ...
100 people might have genetic variant (no variation in that marker)
 Like dose-response relationship without linearity assumption
How to handle multiple, distinct events (largely one type: syncope)?
 State transition model, allowing patients to move in and out of various states
If data are sparse (most don't have syncope, and if they do, only once)
 Cox model (time to event)
Frank: "why your collaborator is wrong":
https://hbiostat.org/glossary
 Look under "dependent variable" and click on the "other information" tab under that.
Explore imputation approaches
Key issue in arrhythmia research: access to EKG or summary of EKG
 Barrier will be quite high, but payoff could be worth it
· Analysis: Propensity score weighting
· Cox proportional hazards: recurrent / competing events
· Frank: May be two reasons to change medication:
o Planning, vs. reaction
o Is causal analysis needed to get rid of this feedback loop?
· Frank: internal time-dependent covariates are what is present here.
o External version would be crossover study where everyone must switch drug at a certain time.
o Interpretation is harder with internal.
o If covariates aren’t updated frequently enough, what we are trying to learn from our change variable will be difficult to interpret
o Propensity adjustment may not be sufficient for that.
§Investigator: propensity was chosen because exposure groups may be vastly different. Ashley wanted to account for that via a number of covariates (about 10). Sample size is estimated at 70,000, but many have not switched at all (lots of 0’s).
§Static Propensity score: looking at baseline or characteristics at one point in time, then we are not accounting for people switching back and forth (increasing dosage and then decreasing, for example).
§This situation is more dynamic; time-dependent covariates are important.
§Frank: How do you want to word your conclusion? We learned something that gives the recipe for required changes to effect better outcomes (causal), or a noncausal conclusion?
§Goal of observational study is understanding the system;
§Frank suggests going without the propensity score: Miguel Hernan (observational… 2008), cohort from observational data, estrogen in case of hormone therapy or not. Can get same results as RCT if time-dependent covariates were well understood and updated frequently.
§He also has a great book on observational data
§Confounders would need to be measured within days / weeks of switch
§Andrew Spieker, Bryan Shepherd (a few others as well) specialize in causal inference. We could ask someone to join clinic.
§Medical natural experiments: surgical conferences covered by junior surgeons while away. What happens to patient outcomes during this is a natural experiment (for example)
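A sketch of how follow-up could be laid out so that medication switching is handled as a time-dependent covariate rather than a baseline one: each patient's follow-up is split into start-stop intervals at every switch. The function name, the day-based time scale, and the assumption that everyone starts off the drug are illustrative choices, not details from the study.

```python
def split_follow_up(follow_up_end, event, switch_times):
    """Return (start, stop, on_drug, event) rows for one patient.

    on_drug starts at 0 and flips at each switch time. The event flag is
    attached only to the final interval. Times are days from time zero.
    """
    rows = []
    start, on_drug = 0, 0
    for t in sorted(switch_times):
        if 0 < t < follow_up_end:
            rows.append((start, t, on_drug, 0))
            start, on_drug = t, 1 - on_drug
    rows.append((start, follow_up_end, on_drug, event))
    return rows


# Hypothetical patient: starts drug at day 30, stops at day 60, event at day 100.
switcher = split_follow_up(follow_up_end=100, event=1, switch_times=[30, 60])
# Hypothetical patient who never switches and is censored at day 50.
never_switched = split_follow_up(follow_up_end=50, event=0, switch_times=[])
```

Rows in this start-stop form are what time-dependent Cox software generally expects. Other time-varying confounders would need their own columns, updated at each interval, and this layout alone does not resolve the feedback-loop concern raised above.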