Clinical and Health Research Clinic

Archived notes: 2022, 2021, 2020, 2019, 2018, 2017, 2016, 2015, 2014, and before.

Current Notes (2023)

2023 December 7

Tara Helmer, VICTR

We are working with a PI who wants to use My Health at Vanderbilt to recruit patients for her survey study. We would like to randomize patients to receiving the study invitation via a central mechanism (MHAV Recruitment Requests) or direct messaging by the study team (similar to clinical communications). We often get asked whether one mechanism is more effective than the other and want to explore this research-on-research opportunity.

In attendance: Frank Harrell, Terri Scott, Tara Helmer, Jackson Resser, Cass Johnson

Is it feasible to study the difference in response rates between MHAV recruitment request and MHAV messages?

Caveats: Patients can manage notifications for both, and may set different preferences for each. Notifications may look different, and message appearance would be different as well.

Study of interest: Impact of Restricted Medication Access on Care of Multiple Myeloma Patients (PI: Autumn Zuckerman). About 90 patients will be enrolled. PI asked if there was difference in response rate between the two mechanisms.

Patients would need a MHAV account.

Can it be determined whether a patient has viewed the message? Response rates may be highly dependent on patient notification settings, which we may not be able to confirm.

We can determine when the message was sent and when a patient showed or refused interest in the study.

Tara may consider randomization of whether a patient receives message via recruitment request or MHAV message. Currently, the choice is largely determined by study team size and available time, as MHAV messages are more manual than the recruitment request.

All patients will come from the same report (must be marked OK to contact). Randomization could occur from there. Eligibility criteria would have already been accounted for (fairly basic inclusion / exclusion criteria for this study). Also possible additional screening is required.

Also limited to "active" patients (per Epic classification).

Frank also notes on sample size: if you recruit 90, the more even the split, the better. The margin of error for estimating the difference may be around 0.15 (15%) at that size; not very specific, but it may still give useful insight.
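A quick sketch of where a margin of error like that comes from (the assumed response rates below are illustrative, not from the study):

```python
from math import sqrt

def moe_diff(p1: float, p2: float, n_per_arm: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a difference in two proportions."""
    return z * sqrt(p1 * (1 - p1) / n_per_arm + p2 * (1 - p2) / n_per_arm)

# 90 patients split evenly into two arms of 45; response rates are assumptions
print(round(moe_diff(0.15, 0.15, 45), 3))  # 0.148 -- near the 15% quoted
print(round(moe_diff(0.50, 0.50, 45), 3))  # 0.207 -- worst case
```

Under either assumption the interval is wide, consistent with the point that 90 patients gives only a rough estimate of the difference.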

Goals would be to have a response for investigators as they determine how they would like the study to be messaged; VICTR may also want a manuscript as a result.

As long as denominators stay constant, the information should provide meaning.

2023 November 16

Kelly Vittetoe (Alexander Gelbard), Otolaryngology

Clinical trial studying the effect of peripheral nerve block on post-operative opioid requirements after head and neck cancer resection with free flap reconstruction

Current pain management: narcotics/IV meds

Investigate if treatment will reduce pain scores and overall satisfaction

Items for today: sample size, optimal study design

150 free flaps a year

Best way to randomize

Literature review: effect of IV tylenol

Frank: previous studies provide measures of patient to patient variability

- Power of two sample t-test depends on standard deviation of the outcome

- More variability, power goes down, more sample size needed

- Test-retest unreliability noise - bad characteristic for outcome

Want response to have little variability within the patient

Is administration of morphine standardized?

- Pain score correlates with dosage

Pain level is logical primary outcome, secondary could be narcotic utilization

- Pain score would be reliably captured in EMR (0-10 scale; pain of 1-7 = oxy5, 8-10 = oxy10)

- Want to be able to distinguish level 7 from level 10

Placebo effect: surprisingly absent for a lot of medical outcomes

Reliability of pain assessment and ability to be assessed by blind observer

- Bed-side nurses rate the pain scale and administer the medicine

- In most patients, nurse will not know whether block was done or not

Improve power: measure calibration between patients

Covariate adjustment (no downside to adjusting for a covariate if it turns out it doesn't predict anything):

- neuro assessment to adjust for risk

- cancer site pain

- age

- smoking status (most would be smokers and drinkers)

Sample size calculation using:

- relative frequencies of pain level

- Minimal clinically important difference in pain level (the difference you don't want to miss)

Group-level randomization vs participant-level randomization

- intracluster correlation coefficient

Will need at least 20 clusters in a cluster randomized trial

- Fewer leaves you susceptible to chance imbalance
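The cost of cluster randomization is often summarized by the design effect, 1 + (m − 1) × ICC, where m is the cluster size and ICC is the intracluster correlation coefficient; a minimal sketch with made-up numbers:

```python
def design_effect(cluster_size: int, icc: float) -> float:
    """Variance inflation of cluster randomization relative to
    individual randomization: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc

# Illustrative values only: clusters of 15 patients with ICC = 0.15
# inflate the required sample size roughly 3-fold
print(design_effect(15, 0.15))
```

This inflation is one reason individual randomization, when workable, gives the highest power.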

Individual randomization has problems, but gives you the highest power

- Person on the ground, for every case, defining block Y/N

- Other option: everyone gets a block in January, no one gets a block in February, and flip flop

Next step: Learning Health System to drill down trial design

General lead time on VICTR grant: 8 weeks

2023 November 9

Stephen Camarata, Hearing and Speech Sciences

Analysis of RCT data on speech, language, literacy, and hearing data.

- Advances in cochlear implant technology; improve auditory signals

Strictly randomized crossover study -- N = 48, one withdrew after baseline -> 47

Treatment intervention: is there a benefit to cochlear implant tune-up in hearing status? Does this predict performance changes?

- Problem: a lot of measures, will need to subset

Measures NOT amalgamated in the literature; kept distinct

- Pitch aggregation method

15 measures of spectro-resolution

Q1: Is there an improvement? Q2: what predicts that improvement?

Frank: desirability of outcome ranking (DOOR) -- which treatment resulted in the most desirable outcome? No clinical interpretation -- relative, but projected into one dimension

- Rank-difference test > signed rank-test, respects pairing

Randomization test or a priori way to aggregate variables that share same variance?

- PCA, sparse PCA (combo of clustering and PCA) - find out which variables move together across kids

SAP drafted, first 20 participants unblinded (regression-oriented model)

- So many variables that family-wise error will be a problem

Frank: two challenges

1) number of parameters

2) unblinded part of data -- once unblinded, you need to use statistical plan without modification

Univariate approach worth exploring

Another research question: given standard scores & percentile ranks

Z-scores make assumptions: they assume the reference population was selected at random

- SD needs to be meaningful (measure must be symmetrically distributed)

Percentiling is like assessing speed of runner relative to other runners rather than using a stopwatch

Frank's preference is to do things on raw data scale (raw scores will have different meaning for different ages)

Raw score regardless of age is like a stopwatch

Jennifer Duke (Samira Shojaee), Interventional Pulmonary

Consideration of multi-state analysis for endpoints in an RCT looking at patient outcomes of chest tube flushing vs no flushing.

No RCTs evaluating role of regular chest tube flushing in the setting of pleural space infection for optimal drainage and treatment outcomes

Many studies of pleural space infection do not report a chest tube flush protocol

Hypothesis: regular flushing of catheters leads to early tube removal

Target recruitment: N = 96

Stat considerations:

- Multi-state model for time to event analysis

- Cumulative failure of therapy/drainage curve will be calculated from the Kaplan-Meier method and compared using log-rank test

- MV linear regression to evaluate treatment effect on outcomes measured 1 week after treatment randomization adjusting for measurements at baseline and covariates

tPA group vs non-tPA group; chest tube = clogged or patent; chest tube ± tPA was considered treatment failure or not

8-state transition model with 17 total transitions -- some states extremely unlikely

Original plan: endpoint = time to chest tube removal

- time to something good is only interrupted by something bad (never good)

- time to event is hard to interpret, hard to handle censoring

Put patient paths in a hierarchy (how bad was the worst thing that happened to a patient in a given day)

- One parameter: odds ratio, measures odds of transitioning to something worse

- Ordinal transition model requires clinical consensus on what is better and what is worse

- Bad = level X or worse

Ordinal transition model -- rare states are not a problem

Next step: order levels for transition model by severity

What is frequency after you take overrides into account?

Literature is missing several states (Jen developed CRF so information is captured on daily basis)

- Be clear on primary endpoint

- Proportion of patients getting level 3 or level 4 outcome? State levels you want to calculate probabilities of from multi-state model

- Frank may seek permission to share Vancomycin protocol (or at least statistical analysis portion) with the investigators

2023 November 2

Garrett Booth, Pathology

Preliminary stage.
I am seeking feedback on the feasibility of performing a meta-analysis on gender award conferral rates. To date, several studies have been conducted demonstrating gender inequities in different medical society award conferral rates.
For example, Am J Clin Pathol. 2022 Oct 6;158(4):499-505. doi: 10.1093/ajcp/aqac076, which highlighted significant gender inequities within my field of practice, pathology.
My primary aim is not to clear up a controversy, rather I would like to see if it is possible to collect (and how best to collect) data on the effect sizes and compare them across different studies that have looked at gender inequities in medical awards.

Garrett Booth, Frank Harrell, and Cass Johnson in attendance:

· Can effect size of authorship and authorship attribution be used in a meta-analysis?

· Does double-counting of folks on different clinical practice guidelines count?

· Frank: There is nothing about this context that makes meta-analysis less useful than for a medical attribute, for example.

· Time-oriented flowchart; would that look like a denominator where a woman enters as a pathologist, and we are interested in what happens from there?

o Garrett: Primary question is more how equitable is representation across all these smaller fields of study. Meta-analysis could look at authorship attributes and ideally determine the effect of the bias.

· Frank: If available, longitudinal data is some of the most effective data (apart from randomized); or, if age of each person (compared w/ average age of entering the profession) could be determined, perhaps use that in place of longitudinal data?

o Unlikely to have that information right now.

· Cass emailed previous investigators on 11/2/2023 for information on the meta-analysis management software they used and will reach out as soon as a response is received. If needed, can also provide an example template for how meta-analysis data may be organized so that it can be easily used in statistical analysis.

· Additional resource on meta-analysis: Doing Meta-Analysis in R

Ashley Leech (Shawn Garbett), Health Policy

I would like feedback on the analysis below.

d. Research Design. The study design will be a retrospective observational cohort study of pregnant individuals with either a diagnosis of opioid use disorder or evidence of medication use for opioid use disorder at least three months prior to pregnancy. We will require continuous enrollment three months prior to pregnancy and up to 28 days postpartum. For sensitivity analyses, we intend to relax the study/continuous enrollment criteria for the pre-pregnancy period, and how we define the threshold of pre/post-pregnancy based on estimated LMP and gestation. Our research goal is to estimate the effect of medication switching from pre- to post-pregnancy on the incidence rate of adverse events related to opioid use disorder and pregnancy complications. Our design will answer the following research question: What are the potential risks of switching OUD medications from pre-pregnancy to pregnancy for (1) Individuals who switch “down” (i.e., methadone to buprenorphine), (2) Individuals who switch “up” (i.e., buprenorphine to methadone), (3) Individuals who stopped MOUD from pre-pregnancy to pregnancy (defined as a medication gap greater than 14 days); and (4) Individuals who did not switch medication from pre-pregnancy to pregnancy; measured up to gestational week 19.

e. Exposure. The exposure for Aim 2 will be individuals who had any switch in medications from pre-pregnancy to pregnancy (from three months before pregnancy through gestational week 19), stratified by: (1) Exposure (a): Individuals who switch “down” (i.e., methadone to buprenorphine) and (2) Exposure (b): Individuals who switch “up” (i.e., buprenorphine to methadone). The comparator groups will include, (3) Individuals who stopped MOUD from pre-pregnancy to pregnancy (defined as a medication gap greater than 14 days); and (4) Individuals who did not switch medication from pre-pregnancy to pregnancy. We will take an intent-to-treat approach where we will focus on an individual’s first switch due to our preliminary data findings showing that downstream switches could be a result of the first switching decision.

f. Outcome. The outcomes will be measured from week 20 through the neonatal period (within 28 days post-delivery). We will measure two outcomes: (1) Complications of OUD; and (2) Complications of pregnancy. Complications of OUD will be defined as overdose, hospitalizations, infections such as endocarditis, abscess, osteomyelitis, and maternal death, while complications of pregnancy will be defined as hemorrhage, primigravida cesarean section, preeclampsia, and chorioamnionitis.

g. Covariates. A priori covariates in our model will be measured 90 days prior to estimated LMP (pre-pregnancy period) and will include demographic variables such as age, race/ethnicity, income, zipcode/census divisions, eligibility group code, pharmacotherapy dose prior to switching, days between previous medication to switched medication, median travel distance to medication prescriber prior to switching (Aim 1), opioid use disorder severity (e.g., opioid-related emergency department and inpatient visits), nonopioid substance use, other mental health conditions, chronic comorbidities, and prenatal care engagement. We will account for relapse-related indicators such as opioid-related emergency department and inpatient visits and gaps in medication use prior to the first medication switch (operationalized as a binary yes/no variable). Additionally, we will account for the following factors in our analysis: The number of medication switches during the exposure period, whether individuals who stopped MOUD in early pregnancy later switched back to MOUD at any point during the exposure period, and the “type of switches”, i.e., antagonist or partial agonist to partial or full agonist; full or partial agonist to partial or antagonist.

h. Analysis. We will begin by performing several descriptive analyses to gain a better understanding of our sample cohort. These analyses will include examining demographic and other characteristics, quantifying the individuals within each exposure group, and, for those who switched medications, calculating the median number of switches during the exposure period. We will then employ a propensity score method with overlap weighting to balance important patient characteristics across treatment groups. Since we expect that our medication comparator groups could be quite different, overlap weighting is particularly advantageous when comparator groups are initially very different. Following our propensity score design, we will report both risk ratios and an extended Cox proportional hazards survival model, particularly, the Prentice, Williams, and Peterson (PWP) (total and gap times since the previous event), to account for recurrent and competing outcome events. Reporting both risk ratios and time-to-event outcomes accounting for competing events provides a more comprehensive understanding of the relative risk of an event occurring across groups, while also providing detailed information about when events occur and the impact/interplay of multiple potential events over time.
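As a small illustration of the overlap-weighting step described above (the propensity scores and treatment indicators here are made up):

```python
import numpy as np

# Hypothetical fitted propensity scores e(x) and treatment indicator z
e = np.array([0.2, 0.5, 0.8, 0.9])
z = np.array([1, 0, 1, 0])

# Overlap weights: treated units get 1 - e(x), controls get e(x).
# Units whose propensity is near 0 or 1 (poor overlap) are down-weighted.
w = np.where(z == 1, 1 - e, e)
print(w)  # [0.8 0.5 0.2 0.9]
```

The weighted samples emphasize the region where the comparator groups genuinely overlap, which is why the method is attractive when groups are initially very different.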

Ashley Leech, Shawn Garbett, Frank Harrell, and Cass Johnson in attendance:

· Analysis: Propensity score weighting, Cox proportional hazards with recurrent / competing events

· Feedback from Frank:

o May be two reasons to change medication:

§Planned change vs. reactionary change

§Is causal analysis needed to get rid of this feedback loop?

· Andrew Spieker, Bryan Shepherd (and a few others) specialize in causal inference within the department. We could ask someone to join clinic.

o Internal time-dependent covariates are present here.

§External version would be crossover study where everyone must switch drug at a certain time.

§Interpretation is made more difficult with internal time-dependent covariates.

§If covariates aren’t updated frequently enough, what we are trying to learn from our change variable will be difficult to interpret.

§Propensity adjustment may not be sufficient for that.

· Ashley: Propensity score weighting was chosen because exposure groups may be vastly different. Ashley wanted to account for that via several covariates (about 10). Sample size is estimated at 70,000, but many have not switched at all (lots of 0’s).

o Static Propensity score: if we’re looking at baseline or characteristics at one point in time, then we are not accounting for people switching back and forth (increasing dosage and then decreasing, for example).

§The present situation is more dynamic; time-dependent covariates are important.

§Confounders would need to be measured within days / weeks of switch

o How do you want to word your conclusion? We learned something that gives the recipe for required changes to affect better outcomes (causal), or a non-causal conclusion?

§Frank suggests going without the propensity score weighting

§Miguel Hernan's paper "Observational studies analyzed like randomized experiments: an application to postmenopausal hormone therapy and coronary heart disease" could serve as a great case study to look into. Can get same results as an RCT if time-dependent covariates were well understood and updated frequently.

· He also has a great book on causal inference: Causal Inference: What If (Miguel Hernan, Harvard T.H. Chan School of Public Health)

2023 October 26

Mert Demirci (Annet Kirabo), Nephrology

We would like to add a substudy to our R01-funded main study investigating health equity (race differences in salt sensitivity), and I will be the primary investigator. I submitted my grant to VICTR for funding.

As a new research fellow, I have limited knowledge of biostatistics. I would like to discuss biostatistical analysis options for our small sample size substudy.

Main study conducted with NIH funding for three years. N = 24 (6 black, 18 white), still enrolling

- Recently published paper shows no difference by race (but enrolling black patients is difficult)

- Limited budget

- Existing research shows black people are more salt-sensitive

- Add substudy to target enrollment of black patients

New aim 1

Chose Mann-Whitney U Test because small sample. Power analysis?

Frank: Issues investigators are experiencing are very common

- Fisher: When p-value is large, you need more data. P-values do not handle small sample sizes well

You already know black patients are more salt-sensitive. The question is how much

- Use confidence intervals

With small sample size, need to choose one parameter. "Put all your eggs in one basket"

- Calculate single confidence interval for association you are most interested in

To cut margin of error in half, you need 4x as many participants

- Precision is proportional to the square root of the sample size
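The "4x as many participants" rule follows directly from the margin of error shrinking as 1/√n; a quick check with an illustrative proportion:

```python
from math import sqrt

def moe(p: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a single proportion."""
    return z * sqrt(p * (1 - p) / n)

# Quadrupling the sample size halves the margin of error
print(round(moe(0.5, 100), 3))  # 0.098
print(round(moe(0.5, 400), 3))  # 0.049
```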

Also interested in female vs male

Width of CI will narrow with more data, but center will move around

A precisely measured variable lends itself better to small sample sizes than a variable that is not precisely measured (race)

Mert: if we run some correlation studies between ADMA and salt sensitivity, can we compare groups?

- Frank: no free lunch -- to have an MOE of +/- 0.1, you need 400 patients. Would need even more patients for that

Annet: Come up with genes related to nitric oxide

Frank: only way to separate race and sex would be to have a balanced dataset (difficult to achieve)

- Boost sample size with repeated measurements? Study patients under different conditions

False discovery rate irrelevant without false non-discovery rate

- Procedure could have zero power to detect genetic characteristics on account of FNDR

Research with small sample size is tough

- One chapter on statistical inference

- One chapter on high-dimensional work, check robustness of findings

Canceled (Jennifer Duke (Samira Shojaee), Pulmonary and Critical Care)

We are planning a pragmatic RCT for flushing with saline vs no flushing in patients with infected pleural spaces requiring chest tube placement. Primary outcome will be days until chest tube removal. Primary question would be to review plan of multistate analysis for data.

2023 October 19

Marina Aweeda (Alexander Langerman), Otolaryngology

We are working on a video-based coaching project to teach a surgical procedure to residents. We are interested in asking about validated survey tools (such as OSATS, OCHRA, SURG-TLX) and which ones would be the most appropriate for statistical analysis. We’d also like to ask for recommendations on what types of data to collect for the statistical analysis.

No formal feedback system to give residents

Video-based coaching intervention (highlight reel of most important steps of operation)

- Was this useful, feasible, repeatable?

Unable to power study with max N = 10 residents (up to 8 attendings)

- Simplest thing you can learn (proportion of residents who responded a certain way)

- Minimum to estimate proportion ~ 96

- Cannot generalize to population of residents with reduced sample (margin of error would be wide)
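The "~96" figure is the standard worst-case sample size for estimating a proportion to within ±10 percentage points; a sketch:

```python
def n_for_proportion(moe: float, p: float = 0.5, z: float = 1.96) -> float:
    """Sample size needed to estimate a proportion with a given 95%
    margin of error; p = 0.5 is the worst (most variable) case."""
    return (z / moe) ** 2 * p * (1 - p)

print(round(n_for_proportion(0.10)))  # 96
```

With only 10-12 participants, the achievable margin of error for a proportion is several times wider, which is why generalization is off the table.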

Can learn a little bit (quantify uncertainty with uncertainty intervals)

Which validated survey tools to use?

Accounting for observer variability is important

Ideal set up with different observers: each resident scored by several different observers

- Minimize observer variability by averaging

If one observer rates all participants: uniformity within and between residents; this is preferred over a different observer for each resident

Timing of intervention would give away whether participant was in pre or post

- Could hold off on evals completely until the end

Some experts that could help with that (sociology dept)

OSATS: quantitative data (number score)

OCHRA: scored on performance in surgery

SURG-TLX: evaluates mental workload, how demanding, etc.

One way ANOVA used?

With N = 12, statistical test would be more misleading than helpful

Better and safer to quantify what you have with confidence limits

- Confidence limits penalize for small sample size

- Wouldn't talk about power; instead use confidence limits for mean differences pre & post

VICTR voucher almost not needed, but could help with some stats if needed

- 90 hours, $5k grant

Also VICTR studio -- free, invite experts from other fields

Design ideas:

- Increase sample size, could give more options

- Randomize residents -> half get intervention, half don't -> parallel group randomized trial

- Could attribute differences in groups to the intervention

- Strongest way to determine that intervention caused the result

Hybrid approach: delay intervention, assess people randomized to have intervention late at same time you assess someone randomized to have the intervention earlier

Keep focus on feasibility with reduced sample size (shy away from p-values, use confidence limits)

How are data being collected?

- Survey data in REDCap

Could bring draft of the REDCap to another clinic to get statistician input

2023 September 28

Jennifer Quinde (Carissa Cascio), Psychiatry

We have zero-inflated/semi-continuous data to model. Our DV (dependent variable) is a percentage of video frames during which a participant’s facial expression likelihood was above a set threshold in response to two stimuli conditions. One of the conditions elicited little to no facial activity and thereby our output for those trials is 0.

Are hurdle models better than zero-inflated models? Are there alternative models to consider?

Two stimulus temperatures (warm/hot) and record facial expressions

Record muscle activity in the face (engagement)

Plotted median engagement against pain rating

- Potential three way interactions between groups

Zero-inflated data - how to best model?

- Has threshold of 25% been validated? Would take a lot of data. Assumes discontinuity, does not capture close calls

What is the area under the curve when you are in some danger zone?

- AUC captures severity of hemoglobin and how long you stay in the bad zone

Difference between being correlated with the truth and fully reflecting the truth

- Consequence of using threshold when assumption of discontinuity does not hold: artifacts & edge effects, amplify measurement error

- Removing threshold = more powerful statistical analysis

Ordinal regression will handle any distribution and will give you probability of being above a given threshold via back-end calculation
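A sketch of that back-end calculation under a proportional-odds model (all coefficients below are hypothetical, not fitted to any data):

```python
from math import exp

def expit(x: float) -> float:
    return 1 / (1 + exp(-x))

# Hypothetical proportional-odds fit: logit P(Y >= j | x) = alpha_j + beta * x
alphas = {1: 2.0, 2: 0.5, 3: -1.0}  # one intercept per threshold (made up)
beta = 0.8                          # made-up slope for the predictor

def p_at_or_above(level: int, x: float) -> float:
    """P(Y >= level | x): any threshold comes 'for free' from one model."""
    return expit(alphas[level] + beta * x)

print(round(p_at_or_above(2, 1.0), 3))  # 0.786
```

The full analysis is modeled on the raw ordinal outcome; exceedance probabilities for any cutoff (such as the former 25% rule) are then derived, rather than building the cutoff into the data.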

No need to transform Y when using semi-parametric model

Use robust sandwich variance estimator

Model with rms package

Emily Wooder (Amelia Maiga), Surgery/Acute Care Surgery

We are using trauma video review to analyze the impact of communication patterns on the resuscitation efficiency of bleeding trauma patients. Specific exposures of interest include the percent of overlapping communication during key moments. We would specifically like input into an analysis plan that may inform our study design and data collection plan. This is a VMS RI project that I am supervising. Mentor confirmed.

Interested in overlapping communication, interruptions during EMS handoff, and speaking during BP measurement

Outcome: total resuscitation time (wheels in to wheels out of trauma bay)

Want to do linear regression

Add age to covariates?

Frame as survival problem? Time to successful resuscitation?

- Right censoring if no resuscitation

Assumption: Resuscitation is independent of dying

Covariates are not defined during resuscitation itself

- Avoid circularity - separate effects

Several other discrete timepoints between when patient comes in and when patient leaves

Regarding competing risk of death -- plan is to exclude those who die

- Whether a participant will be excluded cannot be defined at time zero

Time period of wheels in to wheels out: 25 minutes +/-

Capture extent of shock at fixed time would be key

First five minutes = landmark observation period

- To qualify for analysis, have to survive for at least five minutes

- Those who don't survive 5 minutes likely came in deceased

Failed resuscitation in trauma bay takes ~ 20 minutes

Predicting participant's status five minutes from now

Want to collect data so all options are on the table

- Collect data so that you don't make decisions you can't undo

2023 September 14

Ashley Leech (Shawn Garbett), Health Policy

The study design is a retrospective observational cohort study of pregnant individuals with a substance use disorder. Our design will answer the following research questions: (1) What are the potential risks of switching opioid use disorder medications from pre-pregnancy to pregnancy for (Group a) antagonist or partial agonist to partial or full agonist; (Group b) full or partial agonist to partial or antagonist, up to gestational week 19; and (2) for new initiators in pregnancy, what are the potential risks of switching medications any time during pregnancy for (Group a) antagonist or partial agonist to partial or full agonist; (Group b) full or partial agonist to partial or antagonist. The comparator group will encompass individuals who have zero switches or switch less frequently; we will include the effect size of the switching count.

Exposures. The study exposure will be the count of medication switching; either from (a) to (b) and/or (b) to (a); from pre-pregnancy to pregnancy up to gestational week 19 and anytime during pregnancy. The exposure time will be censored by the observation period (accounting for the exposure time in the model).

Outcomes. The primary outcomes will include medication discontinuation, severe maternal complications, all-cause hospitalizations, and ED visits.

  1. Confirm whether a negative binomial regression is feasible, especially given sample size constraints.
  2. How to phrase a power analysis statement given what we know from previous analyses:
  • In TN, 25% of 14,000 pregnant individuals with opioid use disorder or on medications for OUD had at least one medication fill, i.e., 3,500 people in total.

  • Based on our persistence study, 6% of individuals switched from buprenorphine to naltrexone over the one-year study period

  • Based on our 3,500 pregnant individuals on medication, this roughly equates to 210 individuals (but this is just considering buprenorphine to naltrexone and no other combinations, including methadone).

  • If our Medicaid sample includes 10 states, this would roughly equate to 2,100 individuals with a medication switch during pregnancy.

K1 award based on decision model; putting in for R01 soon

Two aims: using large medicaid sample to answer questions

Question today: sample size concern -- what design maximizes info?

Care about effect of switching medication relating to several outcomes

Negative binomial regression (more individuals that don't switch; 5-8% that do switch)

- Extra parameter accounting for overdispersion

Poisson regression? Have more parameters than Poisson would need

Count outcomes could have 0-1 inflation

Rule of thumb for sample size? Frank might have one

Features to adjust for in regression (given observational study):

- if time-fixed confounder, add as covariate in model

- Covariates: cotherapy dose

Possible trigger: relapse

For treatment switches: interested in looking at both number of treatment switches & high-low vs low-high

Treatment covariate feedback loops? Might not be relevant in this field

Covariates will need to be measured at different times during the exposure period (the pregnancy)

Time-varying covariates

- Multiple rows for each participant where covariate corresponds to a value at a given time

- Would this work for a negative binomial model?

Survival model that would allow for recurrent events

Mixed model would depend on how data are structured

Bryan: negative binomial model might not use the data most efficiently, but could be more interpretable to reviewers

Recurrent event survival analysis is a little more complicated -- Andersen-Gill model

- More power, account for time-varying covariates

Benefits of time-varying covariates:

- Adjusting for time-varying as if fixed leaves you susceptible to confounder treatment feedback loops

Continuous variable dichotomized = 5% of information you had before

Negative binomial could work, need to adjust for covariates

- No closed form formula for effect size/power; could estimate using sims (use preliminary data)

- The PASSED R package has negative binomial power for the 2-sample case; not quite what we're looking for
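A minimal sketch of the simulation approach, using a Wald test on the log rate ratio with a delta-method variance (all parameter values are placeholders; a real analysis would fit the full covariate-adjusted negative binomial model):

```python
import numpy as np

def nb_draw(rng, mu, disp, size):
    """Negative binomial (NB2) counts via a gamma-Poisson mixture:
    mean mu, variance mu + disp * mu**2."""
    return rng.poisson(rng.gamma(1 / disp, disp * mu, size))

def sim_power(n_per_group, rate_ratio, base_mu=0.5, disp=1.0,
              n_sims=1000, seed=0):
    """Simulated power of a two-group comparison of NB count outcomes."""
    rng = np.random.default_rng(seed)
    reject = 0
    for _ in range(n_sims):
        y0 = nb_draw(rng, base_mu, disp, n_per_group)
        y1 = nb_draw(rng, base_mu * rate_ratio, disp, n_per_group)
        m0, m1 = y0.mean(), y1.mean()
        if m0 == 0 or m1 == 0:
            continue  # log of a zero mean undefined; count as non-rejection
        # delta-method variance of log(mean) in each group
        v = (y0.var(ddof=1) / (n_per_group * m0**2)
             + y1.var(ddof=1) / (n_per_group * m1**2))
        if abs(np.log(m1) - np.log(m0)) / np.sqrt(v) > 1.96:
            reject += 1
    return reject / n_sims
```

For example, `sim_power(100, 2.0)` estimates the power to detect a doubling of the event rate with 100 per group under these made-up settings; preliminary data would replace the placeholder rates and dispersion.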

Recurrent event model would have more power, but would be more complicated & might be more difficult to interpret

Zero-inflated regression options: adds parameter to adjust for lots of zeroes

2023 August 31

William Tucker (Whitney Gannon, Matthew Bacchetta, Jonathan Casey), Cardiac Surgery

We plan to conduct a retrospective observational study examining doses of anticoagulation in patients supported with venoarterial extracorporeal membrane oxygenation during lung transplantation. Over the past 5 years, anticoagulation dosing for this population has evolved to be considerably less. We hypothesize that lower anticoagulation dosing is associated with the presence of thromboembolic complications and correlated with a decrease in blood transfusion requirements. In particular, we are interested in examining dose-response relationships between anticoagulation dose during surgery and blood transfusions required.

Research Q: What is the optimal dose of heparin for intraoperative support?

Hypothesis: Heparin exposure associated with blood product transfusions during lung transplantation supported on venoarterial ECMO

Inclusion: bilateral lung transplant between 2018 and July 2023, intraoperative VA ECMO support, age > 18. Several exclusions...

Exposures: heparin bolus dosing, heparin drip rate, ...

outcomes of interest: blood products transfused, thromboembolic events

Flowsheet -- N of > 180

Pharmacology of heparin nailed down such that, for example, optimum dose is known as a function of body mass?

- In lung transplant surgery, ACT is not checked & additional heparin not provided based on any ACTs to follow

Aim of project: quality of patient's overall outcome with regard to coagulation-related outcomes

With lots of events, you might find different doses optimize different events

- Advantageous to have ordinal scale outcome to assess what dose makes patients do well overall

- Clinical overrides, increases power

For dose x, did patients have worse outcomes than dose y?

Patient could have two different bad outcomes -- ignored if only looking at one outcome, not if you use ordinal scale

Ordinal scale could have as many levels as you want

Many ways to break ties to make scale more clinically relevant with more power

Ordinal scale won't be disturbed by infrequent bad outcomes

Good to crowdsource -- REDCap survey where clinicians vote/rank outcomes

- Declare winners based on small groups of runners, then put all of that together

Accounting for confounding: include covariates in the model

- Clinical experience super important to identifying confounders -- which features/lab values do clinical experts use?

Could assess dose over time (apply smoother)

- Cultural shift to less and less heparin over time

Likewise assess how ordinal scale changes over time

Will reviewers want ordinal scale validated? Sometimes, but Frank says alternative is almost always worse

- Some reviewers do not like the PO assumption in the PO model... often they already make worse assumptions (see Frank's blog articles)

Time zero: patients have to make it to this time to make it into the study

ECMO is to some extent a time-dependent covariate

Patients have to get ECMO to be in analysis -- have to live long enough to get ECMO, so time zero might be initiation with ECMO

Bolus given just before cannulation -- for all intents and purposes, a simultaneous act

With sample of 150 or so and rarity of events, would descriptive paper be more effective?

Frank: descriptive analysis suffers the most from sample size, causes measures to be noisy

Is sample big enough to do most powerful analysis?

More uniform the scale, more statistical info

Not slam dunk: depends on signal to noise ratio

What will ordinal scale look like?

- Example 1: some lab value + 10 times the number of transfusions needed (transfusions puts you in a different part of the scale, more transfusions = bad)

- Example 2: 0 - 50 did not need angina meds ; 51 - 89 did need angina meds; 90 = death

Don't need to write out every level, but define zones

- Hierarchical scale -- if participant has one minor event and one major event, assumes you don't care about the minor event
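The zone idea above can be made concrete in a few lines; every cutoff, zone boundary, and event below is a hypothetical placeholder for illustration, not a proposed scale:

```python
def ordinal_outcome(died, thromboembolic_event, n_transfusions):
    # Hypothetical hierarchical ordinal scale (illustrative zones only):
    # the worst event dominates; within a zone, more transfusions = worse.
    #   0-49: no major event, level rises with transfusions
    #  50-89: thromboembolic event, level rises with transfusions
    #     90: death
    if died:
        return 90
    base = 50 if thromboembolic_event else 0
    return base + min(n_transfusions, 39)  # cap so zones never overlap
```

Note the hierarchy: a patient with both a minor and a major event lands in the major-event zone, exactly the "assumes you don't care about the minor event" behavior described above.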

VICTR voucher -- 90 hours of biostat help, finishes in a year

Expect that data will be complete.

Comorbidities not much of an issue here (maybe BMI) -- not the practice

- BMI can't be extreme -- have to be healthy enough to receive transplant

2023 August 17

Alexandra Abu-Shmais (Ivelin Georgiev), Pathology Microbiology Immunology

We obtained ~2700 B cell receptor sequences specific for several antigens that we have grouped into 5 different categories (by viral family). We have analyzed the data with respect to several sequence features (V gene usage, HC:LC pairs, somatic hypermutation, CDRH3 length) and want to compare the data across the viral family categories. We need to know which statistical tests to use. The number of sequences in each category is not equal.

Data generation phase completed

2700 sequences across 10 individuals -- no interdonor analysis

- uneven with respect to participant and viral group

If studying left & right eye of a patient, patient contributes an experimental unit of roughly 1.5

- Minimum of 10 experimental units here (10 participants)

How much do receptor sequences correlate within the participant?
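The "two eyes count as roughly 1.5 experimental units" remark corresponds to the standard design-effect calculation; rho below is an assumed intraclass correlation (two eyes give 1.5 units when rho = 1/3):

```python
def effective_units(m, rho):
    # Effective number of independent units contributed by one participant
    # with m equally correlated measurements (intraclass correlation rho):
    # m / (1 + (m - 1) * rho), the inverse of the usual design effect
    return m / (1 + (m - 1) * rho)
```

The same formula shows why 2700 sequences from 10 donors is far closer to 10 units than 2700 if within-donor correlation is high.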

Single-cell data, multidimensional

Ivelin: sequences are independent, some correlation for sequences coming from the same participants

Germ-line gene usage clustered by viral group specificity

- Relative frequencies

Stacked bar chart -- percent of antigen specific repertoire x IGHV

Hypothesis testing is the aim

Difficult to conceptualize generalizability beyond the single individual

Most of the data is dominated by 3-4 individuals

Could reproduce stacked bar chart by individual

Other ways to present proportions than stacked bar chart?

Unfamiliar with single-cell data

- many Biostat faculty members work on single cell data. Could invite some on a special Thursday morning

2023 July 27

Brittney Snyder (Tina Hartert, Pingsheng Wu), Department of Medicine, Division of Allergy, Pulmonary and Critical Care Medicine

We are performing a retrospective, population-based cohort study utilizing an administrative database (TennCare). We use a marginal structural model with inverse probability of treatment and censoring weights to estimate the effect of an asthma medication on influenza. Each individual’s follow-up time is divided into periods to capture changes in medication usage, etc., so an individual may have multiple rows and corresponding weights in our dataset. How can we create a weighted Table 1 to assess for covariate balance if individuals could have multiple weights? Would it make sense to use a self-controlled design as a sensitivity analysis to mitigate issues of potential unmeasured confounding?

Exposure: LMA protected periods

Outcome: severe influenza illness

Analysis: restricted to time within influenza seasons based on regional virologic surveillance

MSMs with IPTW and IPCWs were used to estimate the effect of LMA use on severe flu illness

1 year of continuous follow up to assess baseline vars followed by flu season

Want to control for vars changing over time that can be both mediators and confounders

Estimating weights: calculated stabilized IPTWs and IPCWs for each person period using methods from Fewell et al

- Logistic regression

- Numerator models included only non-time varying covariates

- Denominator models included time-varying and non-time varying covariates
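A minimal sketch of the stabilized-weight arithmetic described above. In practice the probabilities come from the fitted numerator and denominator logistic models; the values used here are made up for illustration:

```python
def stabilized_weight(treated, p_num, p_denom):
    # Stabilized IPTW for one person-period:
    #   numerator   = P(treatment | non-time-varying covariates)
    #   denominator = P(treatment | time-varying + non-time-varying covariates)
    if treated:
        return p_num / p_denom
    return (1 - p_num) / (1 - p_denom)

def cumulative_weight(rows):
    # With person-period rows (Fewell et al.-style data), a period's overall
    # weight is the product of the period-specific weights up to that time
    w, out = 1.0, []
    for treated, p_num, p_denom in rows:
        w *= stabilized_weight(treated, p_num, p_denom)
        out.append(w)
    return out
```

The multiplicity of weights per person (one per period) is exactly what makes a conventional weighted Table 1 awkward, as discussed below.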

Inverse probability weighted model

Reviewer suggested weighted table 1

- Question: should table be presented by person-period instead of by person-season?

- Issue: does not account for # of periods per person -> could bias

Table 1 should not have been included in submission

- Conditions backwards on outcome -- not beneficial in cohort study

Just do baseline characteristics for population (no columns)

Literature does not show precedent for Table 1 with marginal structural model

Measures of relative explained variation

Table of odds ratios for LMA use

- Calendar time is important

Figure with odds ratios

Good to look for alternatives to weighted analyses (lose a lot of power)

What goes wrong when you do a standard time dependent covariate analysis such that you need weights and MSM?

- You have variables acting as confounders and mediators (danged if you do, danged if you don't)

Landmark analysis -- keep starting the clock over and doing covariate analyses

Case crossover study -- within subject design

Matching weights -- create population like you're matched, don't lose a lot of efficiency

- Matching equivalent for marginal structural model?

Could include in Markov model whether participant had flu in the previous year

Adrienne Marler (Lea Davis, Kevin Niswender), Pediatric Endocrinology, GME

We will be performing a retrospective analysis of lipid levels in transgender adults, comparing those who are on gender-affirming hormone therapy (GAHT) to those who have never been on it. We would ideally track changes in lipids over time, but I am concerned our population of adults never on GAHT will be too small to make meaningful comparison. We have alternatively discussed a cross-sectional model or using individuals as their own controls. I also hope to minimize confounders that independently impact lipid levels - age, weight, diabetes status, smoking. I am requesting assistance with planning this analysis & creating a statistical model. I also plan to apply for a VICTR voucher for continued assistance with analysis.

H1: identify additional variables that may impact lipid metabolism as well as perceived risks of benefits of sex/gender minority research.

H2: Transgender men on GAHT will have higher rates of and more severe dyslipidemia than transgender men not on GAHT

H3: Transgender women on GAHT will have higher rates of and more severe dyslipidemia than transgender women not on GAHT

Experimental group: trans adults on GAHT for 2+years

Control: trans adults not on therapy

Outcomes: cholesterol levels (continuous)

Known & anticipated limitations:

- currently do not know how many TG adults never on GAHT have had lipid panels at VUMC

- Inconsistent documentation of risk factors (smoking, alc use, etc)

- GAHT divided into masculinizing and feminizing therapy, as regimes vary

- After 2 years, patients more likely on maintenance dosing

- Psychotropic medications may also variably impact lipids

- Different regimens, duration of therapy

Inclusion: 18+ adults self-identifying as transgender/nonbinary

Exclusion: known familial hypercholesterolemia, no lipid panels at VUMC

Covariates: age, race, weight, smoking status, alcohol use, A1c


1) longitudinal, retrospective

2) cross-sectional, retrospective

Design Qs:

What is longitudinal time zero?

- Baseline, before starting therapy for those in experimental group, first time at VUMC for control group

For patient landing at nine months, can still get good info (just would have incomplete profile from time zero to nine months)

Characterize longitudinal profile with as much resolution as data allow, rather than require two years for entry into study

- Mixed effects model may not reflect correlation pattern within patient (maybe more serial correlation)

Continuous time

If few people are followed for seven years, these people could have a lot of leverage/influence on estimates

One tool: variogram (semi-variogram) -- calculate correlation between all pairs of measurements from same patient (uses all available data) -- assumes correlation is isotropic

Mixed effects model assumes variogram is flat

Getting correlation structure right has a # of advantages (small one is software runs faster)
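A bare-bones empirical semivariogram along the lines described: pool half the squared differences over all within-patient pairs, binned by time lag. As noted, this assumes isotropy (only the absolute lag matters):

```python
from collections import defaultdict

def semivariogram(series, bin_width=1.0):
    # series: {patient_id: [(time, value), ...]}
    # Uses all available data: every within-patient pair contributes
    # 0.5 * (y_i - y_j)^2 at lag |t_i - t_j|, averaged within lag bins
    bins = defaultdict(list)
    for obs in series.values():
        for i in range(len(obs)):
            for j in range(i + 1, len(obs)):
                lag = abs(obs[i][0] - obs[j][0])
                bins[round(lag / bin_width)].append(
                    0.5 * (obs[i][1] - obs[j][1]) ** 2)
    return {b * bin_width: sum(v) / len(v) for b, v in sorted(bins.items())}
```

A flat curve is consistent with the compound-symmetry correlation a standard random-intercept mixed model assumes; a rising curve suggests serial correlation instead.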

Include variables like `on psychotropic meds`, `started on lipid-lowering Rx` in analysis?

Often discourage folks from matching on similar covariates (throwing away info)

Interested in applying for VICTR voucher. Another clinic meeting would be helpful to continue brainstorming

Voucher includes analysis work. Investigators & analyst finalize analysis plan together at the beginning

Involve statistician early

2023 July 20

Alison Swartz (Heidi Silver), Gastroenterology

We are using a curated SFRN dataset from the synthetic derivative of 886,899 patients who have cleaned weights and BMIs over the period of time from 1997-2020 (23 years).
We plan to: a) identify weight cyclers versus non weight cyclers (weight stable or weight loser); b) determine the characteristics of being a weight cycler versus non-weight cycler; c) determine the demographic and clinical risk factors that predict being a weight cycler; and d) determine the cardiovascular and other medical outcomes of being a weight cycler. We will define a weight cycle as gain and subsequent loss of 10% from weight extrema (or vice versa).
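One plausible operationalization of the 10%-from-extrema definition, assuming a chronologically ordered weight series per patient; the counting convention (each confirmed reversal = one cycle event) is an assumption the team would need to fix, and Frank's caution below about labeling cyclers vs non-cyclers still applies:

```python
def count_weight_cycles(weights, threshold=0.10):
    # Count reversals of at least `threshold` (fractional change) measured
    # from the running extremum: a >=10% gain followed by a >=10% loss,
    # or vice versa, counts as one cycle event.
    extreme = weights[0]   # running extremum since the last turning point
    direction = 0          # +1 gaining, -1 losing, 0 not yet established
    cycles = 0
    for w in weights[1:]:
        if direction != 1 and w >= extreme * (1 + threshold):
            if direction == -1:
                cycles += 1        # loss followed by >=10% regain
            direction, extreme = 1, w
        elif direction != -1 and w <= extreme * (1 - threshold):
            if direction == 1:
                cycles += 1        # gain followed by >=10% loss
            direction, extreme = -1, w
        elif direction == 1:
            extreme = max(extreme, w)
        elif direction == -1:
            extreme = min(extreme, w)
    return cycles
```

A continuous volatility summary (e.g., Gini's mean difference of weights, discussed below) avoids the close-call problem of this dichotomy.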

  1. How to prepare for VICTR voucher application for biostatistics support for the project?
  2. If approved for a VICTR voucher, what is the process of requesting statistics support?
  3. Do you have any memory-efficient recommendations for computation of this very large dataset?
  4. What are your suggestions for optimizing usage of this large dataset?
  5. Any particular pitfalls or concerns when using large datasets?
Little experience with gigantic datasets

Initial analysis done in Python -- interested in using R for stats

We have weights, heights, BMI, etc.

Weight cyclers vs non-weight cyclers -- longitudinal

- Debate on whether weight cycling increases risk

- Frank: weight cycling has likely been defined, but not validated (no consensus on definition)

- Ask more general question: "what extent of weight cycling correlates with what extent of clinical outcome?"

Characterize weight cyclers vs non-weight cyclers

- Frank: don't want to label as cyclers vs non-cyclers -- eliminates close calls

- How frequently is weight measured?

Cross-correlation problem or "predict the future" problem?

- More interested in landmark study (the latter)

23 years of data, though scattered

- Qualification period

- In other study, those who cycled more frequently were shown to have worse outcomes

10 randomly-chosen weights for each year

Citation: Frank Harrell & Shi Huang paper on weight

Can pre-specify contrasts -- just don't throw in the kitchen sink (multiplicity problem)

Bryan: longitudinal model and looking at association with time-varying covariates

- Concern: generalizability when restricting sample to folks with ten measures in 10 years of data (subject matter expertise to decide cut-off)

How much does collinearity between lagged variables present issues?

Need to demonstrate dose-response relationship

Two dimensions of weight volatility -- Gini's mean difference

"What is the 90th percentile of changes, adjusted for height?"

Prepare data with date, weight, and height for each measure

Frank's book: Search "R workflow"

To keep in mind with large datasets: possibility of numeric overflow

Avoid p-values with extremely large N -> use relative explained variation in outcome

data.table package handles memory really well

Use clinical considerations to cut down on pool of variables

Appropriate variable transformation -- not always clear

VICTR studio: any biostatisticians to invite?

- Shi Huang & Frank

Ask who has experience with height data

How to settle on granularity of weight measure? Year, 6 months, etc?

- Most patients are outpatient

If gaps in the data, process to correct for uneven gap times?

Need adjustment for variation in gaps

Hexagonal binning

For longitudinal analysis: spaghetti plots are useful (take random sample)

How many measurements there are on each patient, non-parametric curve, clusters patients (catch data problems)

Colleen Niswender, Pharmacology -- NO SHOW

We have performed a preliminary PheWAS study in collaboration with VICTR that has linked two SNPs in the GRM7 gene with a PheWAS code for neurofibromatosis, and we hypothesize that this gene is a genetic modifier in the NF1 background. There are now 2,482 neurofibromatosis patients in Vanderbilt’s BioVU database, and approximately 416 have a banked DNA sample of high quality. We plan to amplify and sequence the rs9870680/rs779710 SNPs from these individuals at Azenta. Using this larger cohort of data samples will allow for a more robust association between GRM7 rs9870680/rs779710 and NF1. To maximize this sample size and increase the rigor of our analyses, we plan to sequence all available BioVU samples associated with the ICD.9/10 code for NF (237.7/Q85.0), including patients of all genders, races, and age ranges. This would amount to a total of 416 samples. We would then like to mine BioVU patient data to determine if patients of specific genotypes at these SNPs are enriched for specific co-morbidities in NF1, most notably learning disabilities and ADHD as we know that the GMR7 gene is linked to learning and memory. We need assistance in determining how to mine patient records from a statistical viewpoint.

2023 July 13

Luis Okamoto, Clinical Pharmacology

This is our follow-up biostats clinic to the one we attended on 6/22/23. We are designing a pilot trial to test the efficacy of guanfacine treatment for chronic fatigue syndrome patients. We would like biostatistics support for a VRR.

Objective: gather preliminary data on efficacy of guanfacine vs placebo on hyperadrenergic ME/CFS

Hypothesize that treatment with central sympatholytic guanfacine will improve fatigue & function (disability)

No biomarker has efficiently identified this subset of patients with high sympathetic activity

Study design: Proposed double-blind, randomized, crossover study with placebo vs guanfacine

- Enrichment withdrawal design

Responder analysis: assess perceived response to guanfacine for POTS symptoms

- see which clinical characteristics were associated with improvement

Improved approach: correlation between scores & different clinical characteristics

Positive trend between PGIC and change in hyperadrenergic symptom frequency

- Frank: add rank correlation coefficient

- Bubble plot to represent sample size for coincident points

T score is normalized

P-value is "too easy" -- include rank correlation coefficient instead (ranks can be compared, p-values cannot)
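The rank correlation suggested above (Spearman's rho, using midranks for ties) is straightforward to compute; a stdlib-only sketch:

```python
def midranks(xs):
    # Ranks 1..n, with tied values sharing the average (mid) rank
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(xs):
        j = i
        while j + 1 < len(xs) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    # Spearman's rho = Pearson correlation of the midranks
    rx, ry = midranks(x), midranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx)
           * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den if den > 0 else float("nan")  # undefined if constant
```

Unlike a p-value, rho is comparable across analyses, which is the point made above.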

Plan to use Wilcoxon rank-sum test in primary study

Are relationships strong enough to base clinical trial on?

How to leverage information to identify who should be treated?

- Not seeing "smoking gun" in plots

Bryan: does guanfacine lend itself to study with respect to blinding? For instance, can participant identify whether they are being treated with drug or placebo based on symptoms?

Italo: guanfacine does make you drowsy... but drug that makes you drowsy can improve fatigue

Washout period -- two weeks is good balance between getting rid of drug and study feasibility

- Patients tend to be reluctant to being taken off the drug

Ideal to put confidence intervals when doing correlations

2023 June 22

Luis Okamoto, Clinical Pharmacology

We are planning a pilot study to test the efficacy of guanfacine treatment in patients with chronic fatigue syndrome or postural orthostatic tachycardia syndrome. Our preliminary data suggests a subset of patients with central sympathetic activation could benefit from treatment with a central sympatholytic like guanfacine. We would like to discuss a statistical analysis plan for our study.

No diagnostic test/FDA-approved treatment for ME/CFS (myalgic encephalomyelitis/chronic fatigue syndrome).

In previous work, hyperadrenergic phenotype associated with more severe disease & greater autonomic symptoms with sympathetic overmodulation and lowest quality of life

Clonidine has been tried as treatment, but all studies have failed to improve fatigue & function

- Potential reasons for failure: not selecting for hyperadrenergic ME/CFS patients & side effects of clonidine

We propose central sympatholytic therapy with guanfacine would be effective for treatment of fatigue & function in CFS patients with phenotype

- Preliminary study conducted: patients rate their impression of change

- 26% non-responders, 74% responders (to guanfacine)

Found improvement was related to the frequency and severity of hyperadrenergic symptoms & ...

Compared to non-responders, overall improvement associated with improvement in ...

- Similarities between responders & non-responders in age, disease duration, post exertional malaise, impairment in sleep & memory, hyperadrenergic symptoms, orthostatic intolerance (responders tended to have lower baseline tolerance)

- Responders tended to have more severe fatigue

Predictors of response to guanfacine: head-up tilt (HUT) and Valsalva maneuver

- Responders: > DBP increase at 1 & 3 min of HUT, > BP increase during late phase 2 of Valsalva

Target population: CFS patients with hyperadrenergic phenotype

Overall goal: grant proposal to assess efficacy of guanfacine for the treatment of CFS symptoms

Objectives: conduct small study to gather preliminary data

- Efficacy of guanfacine (vs placebo) on hyperadrenergic ME/CFS

- To estimate sample size & power

Study Design: double-blind, randomized, cross-over study with placebo vs guanfacine added to standard of care for 2 weeks

- Enrichment withdrawal design (suggested by FDA for initial assessment of efficacy of ampreloxetine in improving OH in patients with autonomic failure)

- Outcomes assessed at home (questionnaires, actigraphy, orthostatic vitals)

Study design advantages: patients already identified, minimal involvement from patient, clear stopping rules, low-cost alternative

Outcomes assessed at second week of treatment: fatigue (CIS, primary outcome), POTS & CFS symptoms & PGIC, subjective function (SF-36), objective function

Analysis: outcomes assessed on the 2nd week outcomes: placebo vs guanfacine

- Hypothesis: fatigue score (CIS, primary outcome) after 2 weeks of placebo > guanfacine

- Assessed via Wilcoxon signed rank test or paired t-test

Frank concerns: problems with patients not remembering very well

- Measuring change is difficult because it is dependent on patient's initial reading

Not good to base responder analysis on change

- Responder analysis = minimum information analysis (doesn't capture close calls)

- Responder analysis with 19 patients = noise (need ~ 180 for anything meaningful)

Italo: Need more preliminary data

Frank: Need high signal:noise ratio, especially with small sample size (would need to demonstrate dose-response relationship -- rank correlation)

- Difficult to distinguish minimal to no change

Delta scores boxplots need to be revisited (raw data display = scatterplot)

- Boxplots are combining unlikes

- No need to categorize responders -- basic rank correlation and scatterplots

Responders & non-responders is low information measure -- "what is rank correlation between disease duration & amount of global change?" would be better

- Use six levels on x-axis, you may find that non-responders in best non-response group are more similar to lower end of responders than to the next group of non-responders

- If so, this would cast doubt on the choice threshold

Italo Challenge: some patients are already on the drug and believe it is helping

- There aren't any patients who are on the drug and don't believe it is helping

Frank Challenge: threshold was decided on without validating it was the right one

Design aspects are state of the art

- Washout period for observational phase (study team determined two weeks was sufficient)

- Patients who have been miserable for so long are reluctant to commit more than two weeks for washout

- Frank mentioned need for washout period before crossover

Wilcoxon signed-rank (rank difference) test better for paired data than ordinary Wilcoxon -- same p-value even if you transform the outcome
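A sketch of the paired Wilcoxon test statistic with a normal approximation, assuming the usual conventions (zero differences dropped, midranks for tied absolute differences); real analyses would use an exact implementation:

```python
import math

def wilcoxon_signed_rank(x, y):
    # Paired Wilcoxon: rank |x_i - y_i| (dropping zeros, averaging ties),
    # sum the ranks of positive differences (W+), and return W+ with its
    # large-sample z statistic
    d = [a - b for a, b in zip(x, y) if a != b]
    n = len(d)
    absd = [abs(v) for v in d]
    order = sorted(range(n), key=lambda i: absd[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and absd[order[j + 1]] == absd[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    w_plus = sum(r for r, v in zip(ranks, d) if v > 0)
    mu = n * (n + 1) / 4
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mu) / sigma if sigma > 0 else 0.0
    return w_plus, z
```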

FDA says there are too few randomized withdrawal studies

- No run-in period, randomly tell half the folks they can't take drug anymore

Patient-oriented outcomes -- great payoff in terms of power when outcome has high resolution

2023 June 15

Leon Scott, Orthopaedics

"The impact of weight loss programs on the survival of a native joint."

P: Patients with BMI 35 at the time of diagnosis of knee osteoarthritis. I: Weight loss. C: Usual care. O:
Aim 1: To evaluate the impact of weight loss on the time to joint arthroplasty. Hypothesis: Subjects who demonstrate weight loss over time will have a statistically significant reduction in joint arthroplasties over time. The analysis will include multivariable analysis adjusted for age, gender, prior arthroscopy, radiographic grading of osteoarthritis, unilateral vs bilateral disease, and percentage of weight lost.

- Cox PH model

Aim 2: To determine whether certain interventions can predict the maximal amount of weight lost. Hypothesis: Patients who underwent bariatric surgery will demonstrate the greatest amount of weight loss. A statistical difference between groups of usual care, diet counseling, pharmacologic intervention, and bariatric surgical intervention will be measured, possibly using a Repeated Measures Analysis of Variance.

- 5% total weight loss = significant change

Aim 3: To determine if weight loss improved patient-reported measures of knee pain and function. Hypothesis: Patients who achieve weight loss will have a statistically significant change in patient outcome measures including the PROMIS Physical Function test, NRS pain, and KOOS Jr. The data will be adjusted for percentage of weight lost & time to achieve weight loss; I think this statistical measurement will employ a non-parametric regression test.

Time zero = time when diagnosis was first entered into the chart

- Someone could have osteoarthritis that started a while ago -- one's "time zero" might not be really time zero

- No way to extrapolate when the person's symptoms began

Trade off of doing controlled study on few patients vs uncontrolled study on many patients

Circularity between function and weight -- can it be disentangled?

VICTR studio -- multidisciplinary, could avoid certain pitfalls

- Could help define tight criteria to extract from EHR & tighten aims

Loss of 10 pounds for person who is carrying 100 pounds of extra weight means less than a person carrying around less extra weight

Generalize aims to look at absolute weight at time zero and other times

Kevin Seitz (Jonathan Casey), Allergy, Pulmonary, and Critical Care Medicine

We are conducting a secondary-analysis of a cluster-randomized cluster-crossover trial (PILOT trial, PI: Matt Semler, Biostats: Li Wang), with a subgroup of patients. Given the complexity of the statistical analysis, we are seeking a VICTR voucher to fund the data analysis. We would like to get a quote and notes for submission of the VICTR voucher. Mentor confirmed.

Interested in route to getting VICTR voucher

Subgroup analysis of the PILOT trial among survivors of cardiac arrest

Parent trial -- cluster-randomized (entire ICU randomized to a treatment group for two months), cluster-crossover clinical trial

All adults who received MV in medical ICU at VUMC 7/18-8/21

Intervention -- 3 groups for target of oxygenation (lower, intermediate, higher)

Secondary analysis -- subgroup from PILOT (patients who survived cardiac arrest prior to enrollment, N = 339)

- Primary outcome: 28 day in-hospital mortality

- Secondary outcome: survival to hospital discharge with a favorable neurologic outcome

Analysis Plan:

- Assess separation between groups in SpO2

- Analyze primary outcome -- logistic regression with independent covariates of group assignment and time

- Analyze secondary outcome -- logistic regression with independent covariates of group assignment and time

- Test for effect modification by characteristics of cardiac arrest

How are intra-unit correlations handled?

- In PILOT study, only adjusted for period cluster

- 18 time clusters (3 groups, 6 times, order is random)

Ordinal outcome always has a place to put death

"Multivariate" vs "Multivariable"

Don't use outcome from original study (don't code death as -1; can't summarize results using median)

- Median is designed for truly continuous variables (bad with ties)

- Analyze raw data (what is your status on a given day? On ventilator, dead, not on ventilator, etc.)

VICTR: one size fits all, $5000, 90 hours over a year

2023 June 8

Brian Hou (Lauren Porras), Department of Orthopedics & Sports Medicine

Evidence-based recommendations are needed to define which patients, if any, should be considered at risk for these short-term hyperglycemic episodes, as well as to evaluate the long-term effects on glucose levels after a single administration of corticosteroid. The purpose of this study is to look at how a diabetic person’s blood glucose levels change over time with a steroid medicine injection. It is believed that steroids may briefly elevate a person’s blood sugar levels in the immediate time period after receiving a steroid injection. Significantly high blood sugar levels may be dangerous and can lead to a range of effects from fatigue and vomiting to confusion and coma.

The Shade Tree Clinic (STC) is a comprehensive, free health clinic run by Vanderbilt University medical students for Nashville residents with limited resources. The Shade Tree Orthopedic Clinic is a subspecialty clinic at the STC that provides quality care for acute and chronic orthopedic conditions. At the STC Orthopedic Clinic, corticosteroid injections are heavily relied upon as a treatment for management of pain and other orthopedic conditions given that the patient population often lacks access to surgical treatment.

The purpose of this study is to assess the glycemic effects of methylprednisone in patients with diabetes, pre-diabetes, or no diabetes given the utility of both types of injections. Our questions are determining an adequate sample size for the study, as well as help with a VICTR grant. Mentor confirmed.

Attendance: Brian Hou, Lauren Porras, Frank Harrell, Cassie Johnson

Meeting Notes:

May look to balance diabetic and non-diabetic patients. We may realistically end up with more non-diabetic patients.

Looking at change from baseline may be less meaningful than “If we adjust for baseline blood glucose (as covariate, perhaps non-linearly), for a given starting point, where do you end up after injection?”. Looking at absolute glucose, instead of change in glucose. We may look at slope or area-under-the-curve, in this case.

Descriptive tool to start with – spaghetti plot. Allows you to view raw data trajectories without assuming anything.

Question from Dr. Porras: Diabetic patients that we are enrolling are well controlled, so will likely have baseline blood glucose that are similar to non-diabetics. So are we really answering the primary question regarding glucose sensitivity?

Likely approach – interact baseline with diabetes status. Requires a larger sample size, but addresses this concern.

Sample size: could be a concern solely from the orthopedic side. Could get primary care involved, or could change the question to just include non-diabetics. IRB preferred an observational study, though Frank warns that not having a control group can require a leap of faith when it comes to results.

If an observational study is required, Frank would want a clinic at Vanderbilt that has protocolized the collection of fasting blood glucose among the population getting these injections (i.e., blood glucose measurements are reliably collected without many missing).

If this data isn’t already being collected in a clinical setting, may need to pay for non-diabetics via grant.

Dr. Porras provides the following paper: Systemic effects of epidural and intra-articular glucocorticoid injections in diabetic and non-diabetic patients - ScienceDirect

Frank: To learn a lot about these types of relationships, a minimum of 70 patients would be required. For correlations, this number may be closer to 300.

2023 May 25

Pingsheng Wu, Medicine/Allergy, Pulmonary, and Critical Care Medicine

We are trying multiple imputation of a key variable used to define/determine the start of follow-up time. It has 40% missingness. For a proportion of subjects, we do have another variable that informs this missing variable (it sets a boundary such that the missing variable can take only certain values).

Outcome is RSV LRTI, exposure is RSV immunoprophylaxis

Causal inference is of interest

Missingness of LOS related to year (decreases with later birth years), GA (less missing for older GA), BW (less missing for larger BW), NICU admit (less missing for no NICU)

Measure exposure to RSV immunoprophylaxis:

- children born Apr-Oct: every 30 days during the winter RSV season for a max 5 doses

- children born Nov-Mar: every 30 days starting at birth hospital discharge

N = 15248, 34% missing LOS

- n = 1915 have Down syndrome, 29% missing LOS

First out-patient visit/care is informative

- Earliest of these sets boundary on discharge date for birth hospitalization - the discharge must have occurred before their earliest healthcare date

Can data structure for imputation model differ from data structure for analytic model?

How do we incorporate a boundary on LOS into the MI model?

- `aregImpute` for when data are NOT longitudinal

- Predictive mean matching - find suitable donors

- Can specify exclusion of donors that don't meet date condition

- Check mice package to see if they already have such an option
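As a rough sketch of the donor-exclusion idea (not the actual `aregImpute` / `mice` machinery; function name and data are hypothetical):

```python
# Predictive mean matching with a restricted donor pool
# (illustrative sketch only -- not the aregImpute / mice implementation).
# Each missing LOS has an upper bound: the birth-hospitalization
# discharge must precede the first outpatient visit, so LOS < bound.

def pmm_impute(pred_missing, pred_obs, obs_values, bound, k=5):
    """Impute one missing LOS: among observed records whose value
    respects the bound, find the k closest predicted means and
    donate an observed value (here, deterministically the closest;
    a real MI run would sample among the k donors)."""
    candidates = [(abs(p - pred_missing), v)
                  for p, v in zip(pred_obs, obs_values) if v < bound]
    if not candidates:
        raise ValueError("no donor satisfies the bound")
    candidates.sort()
    return candidates[0][1]

# Hypothetical predicted means and observed LOS values (days)
pred_obs = [2.1, 3.0, 4.2, 6.5, 9.0]
obs_los = [2, 3, 4, 7, 10]
imputed = pmm_impute(pred_missing=5.0, pred_obs=pred_obs,
                     obs_values=obs_los, bound=6)
```

Without the bound, the donor with predicted mean 6.5 (LOS = 7 days) would be a close match; the restriction removes it from the pool.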

Consult Stef van Buuren's "Bible"

"Reason for missing is more important than proportion of missing"

If administrative missingness, MI would be less scary

- Assumption of MI: missingness depends on things that are measured

Develop logistic regression model for P(LOS missing)

- Include variables you hope are irrelevant to make sure they are

"Redistribution to the right"

Alexandra Flemington (Eddie Qian), Internal Medicine

Using the PILOT study, which assessed low, intermediate, and high oxygen goals and mortality, but now restratifying to look at the effect in patients with anemia. We would like guidance on the best biostatistical analysis strategy.

Oxygen saturation levels relating to invasive mechanical ventilation

Primary outcome: days free of MV & days alive

Conduct effect modification analysis -- see if difference between groups is the same if we stratify by anemia

Planning to do secondary analysis using hemoglobin as a continuous variable

- See if difference in groups by hemoglobin level

PO model with interaction effect for time to control for seasonality

Frank: "don't say stratify -> effect modification"

Would look at hemoglobin upon enrollment

Another project could take hospital course as baseline and do landmark analysis

- Everyone in the hospital for x amount of days -- describe by hemoglobin at presentation, standard deviation, and slope over time

"Response feature analysis"

- fully conditional

Time-dependent covariate analysis -- always updating, results are more difficult to interpret

Spline functions & knots

- Use AIC to determine # of knots to use

Cindy: if landmark analysis, seek expert for interpretation

Bill: As long as the conditional population is of interest -- must be healthy enough to make it through that initial period of time

Planning to pursue VICTR voucher

2023 May 11

Erica Carballo (Courtney Penn), Gynecologic Oncology

We seek to estimate the annual percentage of patients with advanced-stage epithelial ovarian cancer in the United States who are eligible for and will derive benefit from PARP inhibitor therapy based on US FDA-approved indications. We will compare the rates of eligibility and expected benefit, then analyze these trends over time.

We accomplished the above using similar methods to this JAMA Oncology article: We don't understand how they did their statistics/sensitivity analysis. I'd be happy to send our data/ methods so far before or after the meeting. I have an accepted abstract to a regional meeting, but this was without anything past basic descriptive statistics. We will need a more robust analysis for a publication.

% benefitting for 2-year progression-free survival vs % benefitting for overall survival

Hoping to conduct a sensitivity analysis

Frank: In the absence of a Bayesian analysis, pay more attention to CI rather than point estimates

- Equated benefit with % benefitting -- can't tell patient-level benefit without cross-over study

- With parallel group design, can't make same determination (certain % of patients benefitted)

- Without knowledge of heterogeneity of treatment effect, default conclusion would be that everyone is benefitting at least a little bit (100%)

Benefit -> benefitting is a big difference

Progression at two years could be random

Can use Wilson CI -- what is the probability that a person is eligible?

Response rate of the drug -- imaging findings before & after treatment if certain criteria are met

Set up like a pre-post assessment -- pretty noisy, better if tumor size is less random and measured accurately

Refining question: interested in who got the drug, and how many are benefitting

Response probabilities have a lot of noise

Looking at different publications

Get standard errors algebraically: e.g., SE = (23 - 8.6) / 1.96 when 23 is a point estimate and 8.6 one of its 95% CI limits; for a difference of two estimates, the 95% CI half-width is 1.96 × the square root of the sum of squares of the two standard errors
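In code, the algebra looks like this (23 and 8.6 echo the numbers mentioned; the second arm's values are made up for illustration):

```python
import math

def se_from_ci(estimate, lower, z=1.96):
    """Standard error implied by a symmetric 95% CI limit."""
    return (estimate - lower) / z

se1 = se_from_ci(23.0, 8.6)   # first published estimate and CI limit
se2 = se_from_ci(18.0, 6.0)   # second arm (hypothetical numbers)

# 95% CI for the difference of the two estimates
diff = 23.0 - 18.0
halfwidth = 1.96 * math.sqrt(se1 ** 2 + se2 ** 2)
ci = (diff - halfwidth, diff + halfwidth)
```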

Can't do analysis with hazard ratio -- relative instantaneous risk of having event

Program "digitizer" -- can reproduce K-M curves from publications

"Treatment difference", "efficacy estimate"

- Avoid "% benefitting"

Don't segregate study based on p-value being small or large

Two-year overall survival -- mortality is too low

Progression-free survival -- discussion:

- Use state-transition model -- Markov

- One weakness -- what to do with non-related death?

2023 May 4

Nicholas Ward, Kim Petrie, Abby Brown, Biomedical Research Education and Training (BRET)

We are looking to follow up on our previous meeting (April 20, 2023) to refine our analysis strategy for two parts of our project.

First, in our prior discussion, we discussed the strategy of analyzing time spent in postdoctoral training using cumulative incidence curves. One issue we are trying to consider is how to handle alumni who graduated and pursued more than one postdoctoral training position. We know that some researchers do address recurrence in time-to-event analyses, but that this may make our own analysis more complicated. As an alternative, for those who have pursued postdoctoral training, we are considering a less complicated, but still very informative measure: the time to first non-training position. This is a one-time event that would eliminate the need to consider more complex analyses involving recurrence (i.e., the scenario of multiple postdoctoral training positions). We are hoping to discuss the data we have, and what the best option(s) for analysis would be.

Second, we were advised to use simple logistic regression to analyze whether or not a downward trend existed in graduates pursuing postdocs immediately after graduation. We have modeled this using year of graduation as the independent variable and choice to pursue a postdoc as the dependent variable (coded as either 0 for not pursuing a postdoc, and 1 for pursuing a postdoc). Given our need only to detect a significance in trend in this data, we are hoping to understand which analysis outputs are needed to support this when writing up our results (e.g., odds ratios with CIs, likelihood ratio test, area under ROC curve, goodness of fit).

Main Q: When looking across years, do we see a decline in the percent of students pursuing a postdoc immediately after graduation? --> simple logistic regression

- Appears model fit and utility for prediction isn't fantastic, but decreasing trend is statistically significant

B1 corresponds to an odds ratio of 0.9659 per year (95% CI: 0.9468, 0.9852) -> CI excludes 1, so the downward trend is significant

LR p-value better than Wald p-value (LR p-values behave better)

Superimpose predicted values from the model on the first plot
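Reading 0.9659 as an odds ratio per year of graduation, the implied size of the trend can be computed directly (simple arithmetic, assuming the OR scale):

```python
# The fitted value 0.9659 (95% CI 0.9468-0.9852) read as an odds
# ratio per year of graduation: what does that trend imply?
or_per_year = 0.9659

pct_drop_per_year = (1 - or_per_year) * 100      # ~3.4% lower odds per year
or_per_decade = or_per_year ** 10                # compounded over 10 years
pct_drop_per_decade = (1 - or_per_decade) * 100  # ~29% lower odds
```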

When grouping students by career goals at graduation, do we see a difference in length of postdoc training? --> Time to event analysis like cumulative incidence curves; allows for right-censored data

- Accounting for recurrent events and discontinuous risk intervals (someone who takes a break after a postdoc)

Another complication: someone who seeks out a shorter post-doc after a really long one

How many have graduated, and how many have > 1 postdoc? If starting a second postdoc is extremely rare, that would simplify things (use right censoring)

Multistate model allows you to estimate things in more interesting ways

- Could call second postdoc a different state (adds more parameters and makes model more unstable)

Event used in time to event = conclusion of postdoc

- If use time to first non-training position, model would be agnostic to how many postdocs, any gaps

- Atypical scenarios could arise

How many students start postdoc before graduation? Small, but non-negligible #; problem is that you would be in different states at the same time

- What is time 0?

Making event = time of defense could solve issues

- Could solve problem of student who takes faculty position after graduation

- MD-PhD students differ greatly from PhD only and will not be included

Could do state-transition model over continuous time; with discrete, you would have to determine time unit (in this case, month)

State transition model handles censoring when records for participant stop appearing

P(transitioning in month 13 | post doc at month 12) -- transition probabilities that are natural from the data
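Those transition probabilities can be tabulated straight from monthly state records; a toy sketch with hypothetical histories:

```python
from collections import Counter

# Empirical monthly transition probabilities from state records
# (hypothetical data: one state per person per month).
# States: 'P' = postdoc, 'N' = non-training position
histories = [
    ['P', 'P', 'N'],
    ['P', 'N', 'N'],
    ['P', 'P', 'P'],
    ['P', 'P', 'N'],
]

counts = Counter()
for h in histories:
    for a, b in zip(h, h[1:]):
        counts[(a, b)] += 1

def transition_prob(a, b):
    """P(in state b next month | in state a this month)."""
    from_a = sum(v for (x, _), v in counts.items() if x == a)
    return counts[(a, b)] / from_a

p_leave = transition_prob('P', 'N')  # monthly probability of exiting postdoc
```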

Stacked bar charts

How to handle folks with unknown career goals? Baseline variable -- Use side by side stacked bar charts

Resources/packages for state transition models: mstate or msm

Challenging to put multiple confidence bands on graph with multiple survival curves -- would want to make separate plots

- model all transitions; put it all together

- jointly model with two time variables -- internal and external

2023 April 27

Shelby Meier, Alex Cheng, VICTR

We've recently finished a study investigating the effects of promised compensation offers on participant enrollment and study task completion. Participants were given information about what joining the entirely remote study would involve, and were then asked if they would join the study based on a randomly generated compensation offer between $0 and $50 ($5 increments). If they joined the study, they were asked to download the MyCap study app and complete up to 30 tasks over a 2 week period. One set of tasks was administered twice, and two other sets of tasks were administered daily. We also collected data on participant experience in the study. We have drafted a publication, but we would like feedback on the best way to analyze and present some of the data we collected.

Data has already been collected

Aim 1: Determine rate of study enrollment by level of compensation offered

Aim 2: Determine rate of study task adherence by level of compensation

Weekly three question checklist, two daily tasks

Participants recruited through ResearchMatch

9,986 invites, 492 answered, 412 enrolled, 284 completed study (those who enrolled but did not download the app excluded)

$ amount revealed after answering invite

"There is some element of deception in this study"

Primary Q: Does compensation have an effect on enrollment?

- Proportion of participants enrolled in the study by Loess regression curve (Frank suggests overall confidence bands rather than pointwise ones, and the Wilson CI rather than the default)
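The Wilson interval Frank recommends is easy to compute directly (toy counts; the formula is standard):

```python
import math

def wilson_ci(k, n, z=1.96):
    """Wilson score interval for a binomial proportion; stays
    inside [0, 1] even when k is 0 or n."""
    p = k / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half, center + half

lo, hi = wilson_ci(50, 100)  # e.g., 50 enrolled of 100 offered (toy numbers)
```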

Interpreting plot: would need to run a statistical test to determine if there is a trend; could use logistic regression model if this is of interest, test if coefficient is zero

Is there a different effect observed between income groups?

- Why was split made at $65k? No natural split

- Conduct logistic regression with two variables (promised compensation and categorical variables of income bucket)

- Predict how much of enrollment can be predicted from compensation amount, adjusted for income

Is there a different effect observed between racial groups? (self-reported)

Enrollment was almost too high to learn from

Appropriate by necessity -> need to limit # of variables

Put income groups, sex, age, and three buckets for race/ethnicity into logistic regression model

- Conduct some form of redundancy analysis to determine if any variables are inseparable

Does compensation have any effect on task completion?

Curve increases from 0-20, but eventually plateaus

- Wilson CI does not go above 100%

- Proportions are noisy because they are based on a tenth of the sample size

In previous study, Loess was chosen over logistic regression

Loess is really good with confidence bands, but doesn't give an overall assessment of "flatness"

- Just use confidence bands, no need for dots

Logistic regression supersedes Loess, just need to make sure it does not assume too much

Dots are medians, which cannot be used unless you have truly continuous variable (ties are a problem)

Want to make analysis more unified; don't want to use loess to visualize and logistic regression to analyze

Is there a difference in completion rates between low-frequency and high-frequency tasks

Use quadratic effect for compensation (use compensation and square of compensation in the model)

2023 April 20

Nicholas Ward, Kim Petrie, Biomedical Research Education and Training (BRET)

We are analyzing how long alumni from our biomedical graduate programs spend in postdoctoral training. For these analyses, we are grouping the data based on two reference points: 1) the student's career goal as identified at graduation, or 2) the career that the student ultimately ended up in 10 years after graduation. For each of these two reference points, we are looking to see if length of postdoctoral training differed between 6 different career paths. We hope to make pairwise comparisons between each of the career paths. We have a cohort of 325 students who identified a career goal at graduation (group 1 from above), and 509 students for whom we know the career outcome 10 years after graduation (group 2 from above). There are 214 students who belong to both of these groups. We are looking for advice on how to make these multiple comparisons given that some students belong to both of these groups, while others only belong to one of the two groups. We also have a second data set detailing by graduation year how many students pursued a postdoc (percentage calculated as # pursuing postdoc divided by total # of graduates). We seem to note a downward trend in these percentages. We hope to examine direction and statistical significance of the trend over time. Is it appropriate to use a Mann-Kendall test on the percentages or is another approach more advisable?

Interested in career outcomes of postdocs

Q1: In terms of length of time in postdoc:

- Do we see difference between groups when grouping by a student's career goal at graduation?

- Do we see a difference between groups when grouping by the student's actual career outcome 10 years after graduation?

Q2: looking across years do we see a decline in the percent of students pursuing a postdoc immediately after graduation?

Five categories at graduation: Academic Research, For-profit research, govt/non-profit research, AMO, undecided

Cass: Do you collect data on people whose career path changed dramatically?

Nick: We have data collected at 1,3,5,10 years -- we do see lots of folks make that transition

Documentation exists for how particular careers map to the specified categories

For some trainees, we have goal at graduation, but do not yet have Y10 career -- not 10 years post-PhD

- Need to distinguish between missing you should have obtained (NA) versus missing because not yet 10 years post-PhD (administrative missingness)

- Treat as right-censored

Kaplan-Meier curve => Cumulative incidence curve
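A bare-bones product-limit estimate, to show how right-censored trainees enter the calculation (toy records, not program data):

```python
# Product-limit (Kaplan-Meier) survival estimate, pure-Python sketch.
# Each record: (months_in_postdoc, event) where event=1 means the
# postdoc ended and event=0 means right-censored (e.g., not yet
# 10 years post-PhD).
def kaplan_meier(records):
    records = sorted(records)
    surv, curve, i = 1.0, [], 0
    while i < len(records):
        t = records[i][0]
        events = sum(1 for tt, e in records if tt == t and e == 1)
        at_risk = sum(1 for tt, _ in records if tt >= t)
        if events:
            surv *= 1 - events / at_risk
            curve.append((t, surv))
        i += sum(1 for tt, _ in records if tt == t)
    return curve

curve = kaplan_meier([(12, 1), (18, 0), (24, 1), (24, 1), (36, 0)])
# Cumulative incidence of ending the postdoc = 1 - survival
```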

Create extra category for administrative missingness

111 with goal at grad known but not Y10 career (cohort C), 295 with Y10 career known but not goal at grad (cohort D), and 214 with both known

Goal: within the two cohorts, determine if there are differences in postdoc length

Cohort C is the one that would be most affected by right censoring

Time to event problem -- try to have one analysis per question

- How did probability of having postdoc position vary with time and with goal at graduation?

Call those who didn't have exit survey a group -- bookkeeping and to complete the picture

Equal opportunity surveying a concern

Think about raw data rather than groups: 2, 3+, 3, 4+, etc.

Analysis conditional on start of post-doc

7 cumulative incidence curves, each with their own colored confidence bands

Transitioning to Q2... do we see decrease in % of postdocs in more recent years?

Do we see folks taking a break after graduation before postdoc? Some, but not frequent

Raw data would be year of graduation and yes/no => produce plot to visualize time trend with confidence bands

- could also use logistic regression to answer a variety of questions

- time to postdoc would be right censored for those who do not have Y10 career

multistate model with time to event analysis as a special case

Market forces change over time: unemployment, job openings, economic conditions -- time-dependent covariates

2023 April 13

Eesha Singh (Matthew Meriweather), Neurology

I am applying to access the Get with the Guidelines stroke database to analyze the correlation between certain socioeconomic variables and diagnostic testing done as part of patient workup. Mentor confirmed.

Health equity/access

Frank: possible some folks are not communicative, making interaction shorter?

Eesha: Health literacy could have something to do with it. Literacy measure & provider characteristics & patient address/social vulnerability index not available (zip code is available, but could capture a wide range of socioeconomic statuses)

Goals are descriptive -- is there a correlation/disparity

Study population -- All patients presenting with ischemic/hemorrhagic stroke: 2000 hospitals, n = 5 million patient records

- Would shy away from hypothesis testing with a sample size that big; do estimation instead

- Magnitude of correlation is of interest, less so the p-value

Secondary analysis: variable clustering -- see which variables run together, helps to understand true dimensionality

- Helps to identify if some variables are restatements of others

Correlate number of tests ordered with anything you're interested in

Stroke workup should not vary significantly by ethnicity/economic status

Group by suspected mechanism

Another type of analysis: characterize how people of a certain ethnicity differ from others

- For a particular ethnicity, if older folks do not seek out care, age distribution for that ethnicity would differ from others

- Predict ethnicity of a person based on other covariates

Neurology dept does not have a biostat person at present

VICTR voucher is an option if you don't want to do analysis yourself (preferable)

- ~12 week timeline to get started; don't apply too early - can extend, but we try to discourage this

- One thing that can hold up a voucher: data cleaning

Deadline for analysis: 9/29/2023, expected to present 2/2024

- Should start no later than end of July

2023 April 6

Lyndsay Nelson (Lindsay Mayberry), Medicine/General Internal Med and Public Health

As part of an upcoming grant submission, we are proposing to evaluate the effectiveness and implementation of a mobile health intervention for diabetes self-management support. We will implement the intervention in 12 clinic sites in middle TN. We would like to review our plans to evaluate effectiveness (effects on HbA1c) using a pre-post/interrupted time series design. Mentor confirmed.

REACH study -- text messaging intervention, overcoming barriers to medication adherence

Results are published -- improved adherence and A1c (for those with more room to improve)

- Effect waned over time -- clinicians indicated results were "over the top effective"

12 clinical sites interested in implementing -- outcome will be A1c improvement

Consistent body of evidence that text message intervention improves adherence

Q: Can clinics implement this, and what would it look like?

We think the best thing is an implementation study (hybrid type 2)

Clinics are opposed to randomization -- don't want to retest based on overwhelming body of evidence

Even short term reductions are meaningful

Robert: Population was more homogeneous (smaller range of A1c), so could get away with a simpler model

Frank: Impact of 0.5 reduction is dependent on baseline measure

Figure: In control group, saw some regression toward the mean

Frank: mean might not be the best summary measure for A1c

- Worry for loss of effect from figure

Robert: mechanism of action is removing barriers to adherence

- Not unusual to see measures regress to baseline after making progress (akin to a person trying to quit smoking)

- Potential need for repeated exposures for sustained behavioral change

REACH intervention effect size -- overlay histogram and fit spline

Frank: excitement over subgroup needs to be tempered by context of bigger picture

Robert: what would be most persuasive analysis of data when patients self-select?

- Time Zero = informative

Frank: measure attention & engagement is not sufficient to show treatment is working, but it is necessary

Robert: measure of engagement -- people can choose to respond whether they took meds or not

Robert: want model that can capture +/- 6 month comparison

- By design: ignore values +/- 2 months from start

- Include calendar time in model to account for seasonal effects

- Worried about medications a participant is prescribed at time A1c measure is taken

Frank: How far back from enrollment will A1c measurements go?

Robert: Could get several years for participant who has been going to that clinic for that length of time

- Would be surprised if 90% did not have at least a year

Frank: emphasize slopes or averages?

- Concern of dropout from text-messaging or don't return to clinic for A1c measurement

Hanging hat on A1c = risky

"We want to use the best available model for observational studies of this type to characterize long-term A1c success of a group of patients, accounting for past history of A1c"

- What is A1c a function of?

Lindsay: Do we leverage participants in clinic who don't sign up?

Robert: No; people who sign up are distinct from those who do not

Andrew: I would still be interested in having that information, even if we're not incorporating in the model

The problem: no sign-up date

- Frank: If you can infer what it would have been +/- a few weeks, could give basis for Andrew's comparison

McKenzie: What about participants who did not sign up, but then did a few weeks later?

Instrumental variable analysis

2023 March 30

Ronak Mistry (Benjamin Tillman), Hematology/Oncology

We are trying to determine if the quantitative D-dimer and fibrinogen improve as platelets improve in patients with heparin-induced thrombocytopenia. Assistance with determining correlation of individual patient data and then composite data. Mentor confirmed.

Can serologic markers help to predict platelet improvement?

- Which of the two serologic markers (fibrinogen, D-dimer) can best predict improvement?

- Determine correlation coefficient for each?

13 patients, repeated measures

Interested in markers as being a preview of platelets (can we reliably say if fibrinogen improves, then platelets will also improve)

- What is the largest lag such that the correlation is preserved (no worse than 0.6)? How far of a look-ahead can you get?

- Is the most recent serologic value more informative, or the trajectory/slope?

Multivariable regression -- calculate R^2 for fibrinogen and D-dimer, assess whether majority of correlation for one with platelets is already accounted for with the other

"Cross-correlation analysis" -- spaghetti plots & regression/R^2, looking at various lags
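A sketch of the lagged-correlation idea on toy series (the lag loop asks the look-ahead question: how far ahead can the marker see?):

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return sxy / (sx * sy)

def lagged_corr(marker, platelets, lag):
    """Correlate today's marker value with platelets `lag` days later."""
    return pearson(marker[:len(marker) - lag], platelets[lag:])

# Toy series where platelets echo the marker two days later
marker = [1, 3, 2, 5, 4, 6, 5, 7]
platelets = [0, 0] + marker[:-2]

best = max(range(4), key=lambda k: lagged_corr(marker, platelets, k))
```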

Day to day measurements, few gaps

Knowing why data are missing is important

Exclusion criteria: negative HIT assay

Going to get the ball rolling on a VICTR voucher

2023 March 16

Kai Wang (Andrea Birch), Radiology

We would like help to analyze the text within our survey responses, and further help to analyze our other data we collected from the survey. Survey has 125 responses. Mentor confirmed.

Wrote grant last year with intention of enhancing services to women of color

Questionnaire to ascertain demographic info & open-ended questions; distributed at churches, nail salons, grocery stores, sorority, etc.

Enrollees all African American women ages 40-64 (no disease history required)

4 Nashville zip codes where population is > 40% black

Qualtrics used to analyze data

How many received an invitation for the survey?

Self-selection/non-response bias an issue

- Respondents with Vanderbilt email are more likely to receive care at Vanderbilt (stratify)

Could collapse zip code to another dimension (rurality of zip code, median income of household, distance from center of zip code to closest center of excellence/Vanderbilt)

- Translate from categorical to numerical scale

Open field questions: "Is there anything else you would like for us to know?"

- Required field to complete the survey

Rank correlation coefficient for age ranked in order 1-5

- Don't stress p-values
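A rank correlation with midranks for ties can be computed by hand; toy data stand in for the age bands coded 1-5 and an ordinal response:

```python
def ranks(values):
    """Midranks: tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(values):
        j = i
        while j + 1 < len(values) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman rank correlation (Pearson correlation of midranks)."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx = my = (n + 1) / 2  # midranks always sum to n(n+1)/2
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    dx = sum((a - mx) ** 2 for a in rx) ** 0.5
    dy = sum((b - my) ** 2 for b in ry) ** 0.5
    return num / (dx * dy)

rho = spearman([1, 2, 2, 3, 4, 5], [10, 20, 20, 30, 50, 40])
```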

Yaa Kumah-Crystal, Biomedical Informatics

I am doing some general evaluation about ChatGPT and its reliability in science. I was curious how it would do with a statistical question. I provided some information and it gave me an answer but I am not sure if the answer is actually correct or if it just “sounds good” as these tools are known to “hallucinate.” Can someone let me know if this analysis is appropriate or completely made up? I know that there are nuances that should be taken to account when choosing a statistical test like how the data is distributed, etc. – but overall is the approach and explanation given by ChatGPT below generally reasonable, or completely wrong?

Search "ChatGPT" on to get Frank's thoughts -- prone to leading questions

- Frank's experience: fast code for getting the wrong answer

Dr. Kumah-Crystal conducted a chi square test using ChatGPT

Assess whether student performed significantly worse on any one of five categories of questions

ChatGPT ignored chi square test's independence assumption

"Significant" language does not belong in a null hypothesis
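For reference, the chi-square statistic for a five-category correct/incorrect table can be computed directly (toy counts; note the clinic's caveat that repeated answers from one student violate the test's independence assumption):

```python
# Chi-square test of homogeneity: did performance differ across
# five question categories? Columns: [correct, incorrect].
# Counts are hypothetical.
table = [
    [8, 2],
    [7, 3],
    [9, 1],
    [6, 4],
    [5, 5],
]

grand = sum(sum(row) for row in table)
col_totals = [sum(row[j] for row in table) for j in range(2)]

chi2 = 0.0
for row in table:
    row_total = sum(row)
    for j, obs in enumerate(row):
        exp = row_total * col_totals[j] / grand  # expected count
        chi2 += (obs - exp) ** 2 / exp

df = (len(table) - 1) * (2 - 1)  # (rows - 1) * (cols - 1) = 4
```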

2023 March 2

Emily Harriott (Laurie Cutting, Laura Barquero), Neuroscience

I have a dataset of 202 participants, combined across 2 studies. I want to predict standardized reading and math assessment scores using metrics of white matter structure for 18 white matter tracts (1 measure of structure for each tract, 18 tracts). Currently, I am using lasso regressions (throw all 18 tracts into the model, see which ones are important predictors, then put those selected predictors into a normal OLS regression model) to do this. I am not sure if I am using these regressions correctly, or even if I should be using them at all (should I use ridge or elastic net or relaxed lasso instead?). If I am to use lasso regressions as I have been doing, I am particularly concerned about if I did them correctly because I was told that the coefficients from the lasso regressions should match the coefficients from the OLS regressions of the selected predictors, and they do not match. I am conducting these analyses for a poster presentation at a conference at the end of March. Mentor confirmed.

Research Q: Which tract(s) as measured by FA best predicts reading and math scores?

- Build regression model (ridge, lasso, elastic net, or something else?)

- Ridge more reliable than lasso & elastic net

- Fatal flaw of lasso: trying to do selection => low probability of selecting right signals, not enough information

4 outcome measures: 2 lower level (reading, math) & 2 higher level (reading & math)

Multicollinearity in tracts

Can investigate redundancy and correlation (see Frank's analysis file)

Explore PCA

2023 February 23

Terrin Tamati, Otolaryngology

I would like to analyze performance (accuracy, response time) among a behavioral assessment among our patient group. I plan to use mixed models, and I'll like to ask about the appropriate fixed and random effects structure.

Sentence verification test (48 sentences: half true, half false) -- measure accuracy and response time

Interested in effect of talker on accuracy and effect of talker on response time

Manipulate who speaker was (three male and three female)

Logistic mixed effects model

Sample size: 13

Set up as repeated measure (don't want to treat as completely new instance)

- Subject ID

Random effects: something you can't control (different sites)

Worried about overfitting

Look into VICTR voucher / R clinic for coding questions

Steven Allon, General Internal Medicine and Public Health

Our study is a multicenter, comparative efficacy cluster RCT of different journal club formats for internal medicine residents. Primary outcomes will be (1) subjective engagement and (2) critical appraisal skills.

Goals: 1. Revise existing subjective engagement questionnaire with your input to ensure unbiased questions and appropriate scaling. 2. Discuss process for achieving content validity in questionnaire (feedback from manuscript reviewer from previous trial). 3. Discuss process for creating and validating a novel measure to assess critical appraisal skills that can be integrated into existing journal club sessions.

Critical appraisal by Berlin questionnaire

Tools and outcome measures we care about don't exist -- our aim is to create some!

Five centers, 140 participants -- all sites using gamified journal club curriculum

Hypothesis 1: do different JC formats yield different levels of engagement?

Hypothesis 2: do different JC formats yield different improvements in critical appraisal?

VICTR voucher an option

Design looks reasonable (no obvious methodologic flaws)

Put out an email to connect with someone to consult with for Education research

No measure of JC engagement in literature

CREATE framework: used to capture engagement

- Psychologist would be helpful in assessing appropriateness of instrument design

Embed questions throughout journal club proceedings

Randomization for which site begins with which format

Co-primary endpoints

REDCap app for cross-site data collection

2023 February 16

Jacob Jo (Douglas Terry), Neurosurgery

Hope to discuss a new project idea. Currently configuring the methods section. In short, we hope to see how variability in Post Concussion Symptom Severity Scores (PCSS) (0-128, over 22 questions/symptoms) affect outcomes. Mentor confirmed.

Questionnaire administered after concussion -- physical, cognitive, sleep, psychological

PCSS has proven robust in terms of recovery time from concussion

How does variability affect outcomes? 6 x 6 + 0 x 16 = 36, but 2 x 15 + 1 x 6 + 0 x 1 = 36

- A few bad symptoms or lots of mild symptoms: which profile is worse?
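The two profiles from the example have the same total but very different shapes; simple summaries make the difference explicit (22 items scored 0-6, per the PCSS):

```python
# Two symptom profiles with identical PCSS totals (22 items, 0-6 each)
# but very different shapes, from the example discussed in clinic.
profile_a = [6] * 6 + [0] * 16            # a few severe symptoms
profile_b = [2] * 15 + [1] * 6 + [0] * 1  # many mild symptoms

def summarize(profile):
    return {
        "total": sum(profile),
        "n_endorsed": sum(1 for s in profile if s > 0),
        "max_severity": max(profile),
    }

a, b = summarize(profile_a), summarize(profile_b)
```

Both totals are 36, yet one patient endorses 6 symptoms and the other 21; competing scale formulations could use such summaries as predictors of time to recovery.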

Formulate scale in different ways -- have a contest on which one predicts time to recovery the best

Idea: Hierarchical scale with clinical overrides

Analysis ideas:

- Variable clustering analysis -- rank correlation (determine which questions have overlapping information)

- Predict total in forward/stepwise fashion

- Principal components / sparse principal components

Instead of scoring each item 1-6, score as S1-S6

- Run regression analysis, solve for S's to derive appropriate weighting

- Compete with original scale to see if more predictive

Five additional opportunities: use adjusted R^2

Redundancy analysis: which question answers can be predicted by other question answers?

2023 February 2

Stacy McIntyre (James Tolle), Pulmonary Critical Care

Retrospective analysis of outcomes of transitions of care from the pediatric to the adult CF clinic within our institution. Outcomes include primarily lung function (FEV1), BMI, and number of hospitalizations/exacerbations in the year prior compared to the year after transition. Mentor confirmed.

Sample is in last five years

~ 85% transitioned at age 18 (a few a little older, a few a little younger) -- we know age of transition for everyone

Make data long: Column A = patient identifier, Column B = Date, Column C on = measures

Spaghetti plot: you see all data (no summary involved)
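A sketch of the suggested long format and spaghetti plot, with made-up values (the `fev1` and `bmi` column names are placeholders):

```python
import pandas as pd
import matplotlib
matplotlib.use("Agg")           # headless backend for scripting
import matplotlib.pyplot as plt

# Toy records: Column A = patient identifier, B = date, C on = measures
df = pd.DataFrame({
    "id":   ["p1", "p1", "p1", "p2", "p2"],
    "date": pd.to_datetime(["2019-01-05", "2019-07-01", "2020-02-10",
                            "2019-03-12", "2020-03-20"]),
    "fev1": [85, 80, 78, 92, 88],
    "bmi":  [21.0, 20.5, 20.7, 23.1, 22.8],
})
# Tall-and-thin version: one row per patient/date/measure
long = df.melt(id_vars=["id", "date"], var_name="measure", value_name="value")

# Spaghetti plot: one line per patient, no summarizing
for pid, g in df.groupby("id"):
    plt.plot(g["date"], g["fev1"], marker="o", label=pid)
plt.ylabel("FEV1 (% predicted)")
plt.legend()
plt.savefig("spaghetti.png")
```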

Most participants will have an E/T/I (elexacaftor/tezacaftor/ivacaftor) start date, which increases lung function

- Only a few participants not eligible by genotype

- We will know when they started and assume compliance (rate of non-compliance = very low)

- Include two variables -- did they use medication (yes/no) and if yes, date

N = 60; type of modeling and conclusions are limited

- Can say, "Of people with lung function that declined _ amount, __% ..."

- "This is what we observed in our population"

No one uninsured in the population -- so the question would be whether private insurance > Medicaid

Stratify across variables of interest

If you have dates of clinic visits/hospitalizations, calculate intervals between them

2023 January 26

Megan Wright (Jessica Gillon), Pharmacy

My project is looking at the impact of requiring stop dates on antibiotic orders at the children's hospital. We have 4 time periods that we are looking at and would like help understanding what statistical tests should be run. Mentor confirmed.

Raw data: individual dose administered; what it was indicated for and what day it was given

Gap where things are uncertain; remove that time interval from analysis

Duration varies by season; COVID also happened during all of this

- Seasonal variation is smooth, COVID is more discontinuous

Use general regression model which accounts for all forces in play

- On/off variable that is discontinuous (are you before policy change or after policy change)

You're interested in signal after subtracting off other effects
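A hedged sketch of such a model: an interrupted-time-series style regression with a smooth seasonal term, a secular trend, and a before/after policy indicator (all data simulated; effect sizes are invented):

```python
import numpy as np

rng = np.random.default_rng(2)
days = np.arange(730)                       # two years of daily data
policy = (days >= 365).astype(float)        # 0 before change, 1 after
season = np.sin(2 * np.pi * days / 365.25)  # smooth seasonal force
# Simulated antibiotic duration: seasonal swing plus a -1 day policy effect
y = 7 + 1.5 * season - 1.0 * policy + rng.normal(scale=1.0, size=days.size)

# Regression accounting for all forces in play, as discussed
X = np.column_stack([np.ones_like(days, dtype=float),
                     days / 365.25,                       # secular trend
                     np.sin(2 * np.pi * days / 365.25),   # seasonality
                     np.cos(2 * np.pi * days / 365.25),
                     policy])                             # on/off variable
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(round(beta[-1], 2))   # estimated policy effect, near the true -1
```

The policy coefficient is the "signal after subtracting off other effects"; pulling all months of data (not just June-November) is what lets the seasonal terms be estimated.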

Want to pull all data, not just June-November

- Might be unlucky on what months you pulled

Days from a starting point for the whole study (more resolution)

Time trends: seasonal, but could have other trends (staff adherence)

Reason for weird ANOVA variances -- possibly outliers

Apply for VICTR voucher?

Bradley Hall (Lauren Connor), Plastic Surgery

We would like to compare suture techniques for patients undergoing a surgical procedure to determine if there is any difference in outcomes between the two techniques. Some believe that one technique may lead to higher complication rates, but we do not believe that is the case. If so, that technique would have a number of other benefits, including less time in the OR, lower cost, less risk to providers, and potentially fewer postoperative issues. Mentor confirmed, did not attend.

Equivalence trial, or non-inferiority (both high sample sizes, latter slightly lower); looking for 5% difference

How many would we need in each study arm?

Yes/No => lowest resolution, need highest sample

- N = 100-200; rules out Yes/No entirely (no close calls)
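For intuition on why a yes/no outcome demands a large sample, a normal-approximation calculation (the 10% complication rate is an assumption for illustration, not a study estimate):

```python
from scipy.stats import norm

def n_per_arm(p, margin, alpha=0.025, power=0.80):
    """Normal-approximation sample size per arm for a non-inferiority
    comparison of two proportions with common rate p and margin `margin`."""
    z = norm.ppf(1 - alpha) + norm.ppf(power)
    return (z ** 2) * 2 * p * (1 - p) / margin ** 2

# Assumed 10% complication rate, 5% non-inferiority margin (illustrative)
print(round(n_per_arm(0.10, 0.05)))   # hundreds per arm
```

This is far beyond N = 100-200, which is why the clinic suggested an ordinal (hierarchical) outcome to get more out of the sample size.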

Get consensus on outcome severity => hierarchical levels; get more out of your sample size

- Want levels to be well populated

Not appropriate for VICTR voucher

RCTs very involved; should get a dedicated statistician

Statisticians should be embedded from beginning of the study to the end

How do you use REDCap for randomization? 1:1? Blocked?
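Blocked 1:1 allocation of the kind REDCap's randomization module supports can be sketched as follows (illustrative only, not REDCap's actual algorithm):

```python
import random

def permuted_blocks(n, block_size=4, arms=("A", "B"), seed=42):
    """1:1 allocation in permuted blocks: within every block of
    `block_size`, each arm appears equally often, in random order."""
    rng = random.Random(seed)
    seq = []
    while len(seq) < n:
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)
        seq.extend(block)
    return seq[:n]

alloc = permuted_blocks(20)
print(alloc.count("A"), alloc.count("B"))   # balanced 10 / 10
```

Blocking guarantees the arms never drift far out of balance even if enrollment stops early.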

Next steps:

- Involve dept chair: have him/her contact Yu Shyr

2023 January 19

Stacy McIntyre (James Tolle), Pulmonary Critical Care

Our project is a retrospective chart review analyzing outcomes of cystic fibrosis (CF) patients who have transitioned from our pediatric clinic to our adult CF clinic. We would like to discuss biostats needed to evaluate associations between patient factors and outcomes. Mentor confirmed.

Sample of about 60

"Port CF" database -- deidentified excel sheet in OneDrive -- prospective registry

Data need to be cleaned (remove comments)

Biggest burning question: Do outcomes change after transition?

Analyze BMI using age as a covariate; subtract out the effect of age to look at effects of other vars

Try to avoid percentile approach; makes too many assumptions (linearity, normality)

- Stay as close to raw data as possible; pull original BMI data

Would learn a lot from spaghetti plot (red before transition, green after); can see data gaps

Tall and thin data set; date + BMI

Some patients will have less data after transition (one year) -> analysis will account for that

Problem with mean change = regression to the mean (someone could have good or bad day or measurement error)

Testing of significance must account for data pairing

Big picture: look at time continuously when possible

Living situation data not well defined; employment status, student status, & health insurance best we have (must deal with missingness, though)

Leon Scott, Orthopaedic Surgery

I want to set up a pilot study evaluating the effect of a low-energy diet (LED) intervention on measures related to weight, osteoarthritis, hypertension, and diabetes. The pilot study is to (a) test the intervention on a small scale before requesting funding for a sufficiently powered study and (b) ensure I have the infrastructure to execute the more extensive study effectively. My question for Biostatistics Clinic is, "are the statistical measures in my specific aims appropriate?"

Specific Aims and Hypothesis Aim 1: To evaluate the effect of an LED diet intervention, including pre-prepared meals, on weight. Hypothesis: Subjects will demonstrate a clinically significant reduction in weight (15%) at 12 weeks compared to their baseline. Approach: This aim is designed to compare mean differences in weight at the onset and endpoint of the study. As such, the data is from two paired datasets. The mean weight difference will have a normal distribution derived from a parametric variable. A paired t-test will be used to measure the difference between groups. Secondary outcomes will be the proportion of subjects that reach a 10% and 20% weight loss threshold. Too few subjects will be included to perform regression analysis of which variables (e.g., gender, the initial level of obesity, age) predict meeting those weight loss thresholds. This is a pilot study with five subjects. In the future, the sample size will be powered to determine a difference with a beta-error of 0.2 using a % weight loss standard deviation of 3.9%.

Aim 2: To evaluate the effect of diet intervention on knee osteoarthritis patient-reported outcomes measures. Hypothesis: Subjects will demonstrate a clinically significant improvement in the Visual Analog Score (VAS) for pain (2 points) at 12 weeks compared to their baseline. Approach: This study compares paired mean differences of pre- and post-VAS scores at the onset and endpoint of the study. Since the mean differences of a non-parametric VAS score have a non-normal distribution, a Wilcoxon-Rank-Sum Test will be used to measure the difference. Similar evaluation will be performed for secondary outcomes of KOOS sub-scales, WOMAC, and SF-12 scales. This is a pilot study with five subjects. In the future, the sample size will be powered to determine a difference with a beta-error of 0.2 using a VAS standard deviation of 1.1.

Aim 3: To evaluate if a LED diet intervention has a clinically significant change in markers of hypertension and T2D. Hypothesis: Between the onset of the study and the conclusion, subjects will experience improvements in systolic blood pressure, diastolic blood pressure, and HgbA1C. Regarding blood pressure, we hypothesize that 100% of the subjects will experience a 50% improvement in their baseline systolic and diastolic blood pressures and the goal of 120/80 mmHg. This compares baseline and endpoint datasets in a single population with non-normal distribution since we are evaluating proportions that meet a blood pressure goal, not the blood pressure numbers themselves. The statistical measure that will show significant change is a Wilcoxon-Rank-Sum test. Regarding the HgbA1C, we hypothesize an average 1.0% point change at three months. Our statistical measure for if the group demonstrates this degree of change is a paired t-test (the expected standard deviation for a 1.29% change is 1.32%).

Aim 4: To evaluate if a LED diet intervention, including pre-prepared meals, reduces the proportion of patients using non-protocol interventions. Hypothesis: The proportion of subjects that use a non-protocol intervention (e.g., oral/topical NSAIDs, other oral/topical analgesics, corticosteroid injections in the previous three months, braces, units of short-acting insulin & units of long-acting insulin, etc.) will be lower and reach statistical significance (p<0.05) after the intervention compared to those subjects paired values at the onset of the study. Approach: This aim is to compare two proportions at various time points with two data sets. Statistical significance will be measured using the McNemar test. Furthermore, using non-protocol interventions will be evaluated to see if they predict the clinical changes in patient-reported outcomes and weight using logistic regression.

Sample size of 300 at minimum for binary outcomes

- Scatterplot?

Paired t-test would be fine; Wilcoxon signed rank test more robust
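Both tests on simulated paired data (values invented); the signed-rank test respects the pairing while dropping the normality assumption on the differences:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
baseline = rng.normal(100, 10, size=30)           # e.g., weight in kg
followup = baseline - rng.normal(5, 4, size=30)   # simulated 12-week change

t_res = stats.ttest_rel(baseline, followup)       # paired t-test
w_res = stats.wilcoxon(baseline - followup)       # Wilcoxon signed-rank
print(t_res.pvalue, w_res.pvalue)
```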

BMI > 35 to be included in study

- Regression to the mean an issue (caught person on good/bad day, measurement error)

Admit patients only if they have maintained a stable BMI for a particular period of time? Instability correlates with having a higher/lower BMI...

Fidelity to the diet is less of a concern

Example: reliability of self-reported food intake = not good

Recalling intake might increase possibility of cheating

Signed rank test will work for aim 2

Wilcoxon rank difference test good for paired data -- same p-value no matter how you transform the data; robust

Rank difference test for aim 3

- Make patient their own control; pre-post, not against 120/80

- Wait x minutes, measure; wait, measure; use same instrument, keep other factors constant

Meals will be delivered to participant's house

2023 January 12

Doug Bryant (W. Evan Rivers), Physical Medicine and Rehabilitation

Endoscopic rhizotomy systematic review - follow up from last meeting on May 5th, 2022. Data collection is complete, would like to further discuss VICTR application process. Mentor confirmed.

Six studies identified that met inclusion criteria - these were determined to meet acceptable standard of care

Pre-procedural screening

Any population-level differences across the studies should be adjusted for

Are studies randomized? Of the ones with comparison groups, one was randomized, others were cohort studies

Next steps: assess amalgamated effect

Meta analysis can properly account for study to study variation

Simple pooled analysis (CI will be falsely narrow)

- How should you weight?
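A sketch of why random-effects meta-analysis widens the interval relative to simple inverse-variance pooling (DerSimonian-Laird method; the six study effect sizes are invented):

```python
import numpy as np

def dersimonian_laird(est, se):
    """Random-effects pooled estimate (DerSimonian-Laird), with the
    fixed-effect (simple inverse-variance pooled) SE for contrast."""
    est, se = np.asarray(est, float), np.asarray(se, float)
    w = 1 / se ** 2
    fixed = np.sum(w * est) / np.sum(w)
    q = np.sum(w * (est - fixed) ** 2)           # heterogeneity statistic
    k = len(est)
    tau2 = max(0.0, (q - (k - 1)) /
               (np.sum(w) - np.sum(w ** 2) / np.sum(w)))
    w_re = 1 / (se ** 2 + tau2)                  # add between-study variance
    pooled = np.sum(w_re * est) / np.sum(w_re)
    return pooled, np.sqrt(1 / np.sum(w_re)), np.sqrt(1 / np.sum(w))

# Hypothetical effects from six studies (values are made up)
est = [0.4, 0.1, 0.6, 0.3, 0.9, 0.2]
se = [0.15, 0.20, 0.10, 0.25, 0.30, 0.18]
pooled, se_re, se_fixed = dersimonian_laird(est, se)
print(pooled, se_re, se_fixed)   # random-effects SE is never smaller
```

Ignoring between-study variation (tau^2) is exactly how the simple pooled CI ends up falsely narrow.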

Accounting for Time Zero across studies

Randomized trials = the best

"Surgery before?" could be key covariate

Nail down grouping and modeling in further dialogue

- Make goals, candidate studies, and assessed outcomes clear

VICTR voucher good for a year

"Spreadsheet from hell" on website: things to avoid

Brett Kroncke, Medicine/Clin Pharm

Testing genetic features' ability to predict risk of cardiac events. Recorded data are age at first event, frequency of subsequent events (some are age at subsequent event), and use (start date and duration) of controlling medication. I would like to use these data to evaluate the ability of genetic features to predict these events, controlling for other clinical features and use of medications.

Predict risk of event given carrier heterozygous status

Control for known clinical features: corrected QT interval, age at first event, event rate, age at all subsequent events, age at Beta blocker use, ...

100 people might have genetic variant (no variation in that marker)

- Like dose-response relationship without linearity assumption

How to handle multiple, distinct events (largely one type: syncope)?

- State transition model, allowing patients to move in and out of various states

If data are sparse (most don't have syncope, and if they do, only once)

- Cox model (time to event)

Frank: "why your collaborator is wrong":

- Look under "dependent variable" and click on the "other information" tab under that.

Explore imputation approaches

Key issue in arrhythmia research: access to EKG or summary of EKG

- Barrier will be quite high, but payoff could be worth it

- Analysis: propensity score weighting

- Cox proportional hazards: recurrent / competing events

- Frank: there may be two reasons to change medication:

  - Planning vs. reaction

  - Is causal analysis needed to get rid of this feedback loop?

- Frank: internal time-dependent covariates are what is present here.

  - The external version would be a crossover study where everyone must switch drug at a certain time.

  - Interpretation is harder with internal covariates.

  - If covariates aren't updated frequently enough, what we are trying to learn from our change variable will be difficult to interpret.

  - Propensity adjustment may not be sufficient for that.

    - Investigator: propensity was chosen because exposure groups may be vastly different. Ashley wanted to account for that via a number of covariates (about 10). Sample size is estimated at 70,000, but many have not switched at all (lots of 0's).

    - A static propensity score (looking at baseline characteristics, or characteristics at one point in time) does not account for people switching back and forth (increasing dosage and then decreasing, for example).

    - This situation is more dynamic; time-dependent covariates are important.

    - Frank: How do you want to word your conclusion? That we learned something that gives the recipe for required changes to effect better outcomes (causal), or a non-causal conclusion?

    - The goal of an observational study is understanding the system.

    - Frank suggests going without the propensity score: see Miguel Hernan (observational… 2008), a cohort from observational data, estrogen hormone therapy or not. One can get the same results as an RCT if time-dependent covariates are well understood and updated frequently.

    - He also has a great book on observational data.

    - Confounders would need to be measured within days / weeks of the switch.

    - Andrew Spieker and Bryan Shepherd (a few others as well) specialize in causal inference. We could ask someone to join clinic.

    - Medical natural experiments: surgical conferences are covered by junior surgeons while senior surgeons are away; what happens to patient outcomes during that period is a natural experiment (for example).
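A minimal numpy-only sketch of static propensity-score weighting as discussed above (simulated data; this illustrates the baseline method only and does not address the time-dependent covariate problem Frank raised):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 4000
x = rng.normal(size=(n, 2))                     # two baseline confounders
p_treat = 1 / (1 + np.exp(-(x[:, 0] - 0.5 * x[:, 1])))
treat = rng.binomial(1, p_treat)
y = 1.0 * treat + x[:, 0] + rng.normal(size=n)  # true treatment effect = 1

# Fit a logistic propensity model by Newton-Raphson (no extra libraries)
X = np.column_stack([np.ones(n), x])
beta = np.zeros(3)
for _ in range(25):
    p = 1 / (1 + np.exp(-X @ beta))
    grad = X.T @ (treat - p)
    hess = (X * (p * (1 - p))[:, None]).T @ X
    beta += np.linalg.solve(hess, grad)

ps = 1 / (1 + np.exp(-X @ beta))                # estimated propensity scores
w = np.where(treat == 1, 1 / ps, 1 / (1 - ps))  # inverse-probability weights
ate = (np.average(y[treat == 1], weights=w[treat == 1])
       - np.average(y[treat == 0], weights=w[treat == 0]))
print(round(ate, 2))   # close to the true effect of 1
```

The unweighted difference in means would be biased here because treatment tracks the confounder; the weighting removes that static confounding, but not confounding that evolves after baseline.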
Topic revision: r1030 - 07 Dec 2023, CassieJohnson
