We are working with a PI who wants to use My Health at Vanderbilt to recruit patients for her survey study. We would like to randomize patients to receiving the study invitation via a central mechanism (MHAV Recruitment Requests) or direct messaging by the study team (similar to clinical communications). We often get asked if one mechanism is more effective than another and want to explore this research-on-research opportunity.
In attendance: Frank Harrell, Terri Scott, Tara Helmer, Jackson Resser, Cass Johnson
Is it feasible to study the difference in response rates between MHAV recruitment request and MHAV messages?
Caveats: Patients can manage notifications for both, and may set different preferences for each. Notifications may look different, and message appearance would be different as well.

Study of interest: Impact of Restricted Medication Access on Care of Multiple Myeloma Patients (PI: Autumn Zuckerman). About 90 patients will be enrolled. PI asked if there was a difference in response rate between the two mechanisms.
Patients would need an MHAV account.
Can it be determined whether a patient has viewed the message? Response rates may be highly dependent on patient notification settings, which we may not be able to confirm.
We can determine when the message was sent and when a patient showed or refused interest in the study.
Tara may consider randomization of whether a patient receives message via recruitment request or MHAV message. Currently, the choice is largely determined by study team size and available time, as MHAV messages are more manual than the recruitment request.
All patients will come from the same report (must be marked OK to contact). Randomization could occur from there. Eligibility criteria would have already been accounted for (fairly basic inclusion / exclusion criteria for this study). Also possible additional screening is required.
Also limited to "active" patients (per Epic classification).
Frank also notes on sample size: if you recruit 90, the more even split you can get the better. The margin of error for estimating the difference may be around .15 (15%) at that size; not very specific, but may still give useful insight.
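The ~0.15 margin of error can be sanity-checked with the usual normal approximation for a difference of two proportions; a minimal sketch (the response rates below are illustrative assumptions, not study data):

```python
from math import sqrt
from statistics import NormalDist

def moe_diff(p1, p2, n1, n2, conf=0.95):
    """Half-width of a normal-approximation CI for p1 - p2."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return z * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# 90 patients split evenly between the two mechanisms;
# the assumed response rates drive the answer
print(round(moe_diff(0.30, 0.30, 45, 45), 3))  # roughly 0.19
print(round(moe_diff(0.15, 0.15, 45, 45), 3))  # roughly 0.15
```

The margin depends on the (unknown) response rates, which is consistent with the note that the estimate is "not very specific."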
Goals would be to have a response for investigators as they determine how they would like the study to be messaged; VICTR may also want a manuscript as a result.
As long as denominators stay constant, the information should provide meaning.

Clinical trial studying effect of peripheral nerve block on post-operative opioid requirements after head and neck cancer resection with free flap reconstruction
Current pain management: narcotics/IV meds
Investigate if treatment will reduce pain scores and overall satisfaction
Items for today: sample size, optimal study design
150 free flaps a year
Best way to randomize
Literature review: effect of IV tylenol
Frank: previous studies provide measures of patient to patient variability
- Power of two sample t-test depends on standard deviation of the outcome
- More variability, power goes down, more sample size needed
- Test-retest unreliability noise - bad characteristic for outcome
Want response to have little variability within the patient
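Frank's point that power falls as outcome SD rises can be made concrete with the standard normal-approximation sample-size formula for a two-sample comparison; a rough sketch (alpha and power are conventional defaults, not values from the discussion):

```python
from statistics import NormalDist

def n_per_group(sd, delta, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sample comparison of means:
    n = 2 * ((z_alpha/2 + z_power) * sd / delta)^2 (normal approximation)."""
    z = NormalDist().inv_cdf
    za, zb = z(1 - alpha / 2), z(power)
    return 2 * ((za + zb) * sd / delta) ** 2

# doubling the outcome SD quadruples the required sample size
print(n_per_group(sd=1.0, delta=1.0))  # about 15.7 per group
print(n_per_group(sd=2.0, delta=1.0))  # about 62.8 per group
```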
Is administration of morphine standardized?
- Pain score correlates with dosage
Pain level is logical primary outcome, secondary could be narcotic utilization
- Pain score would be reliably captured in EMR (0-10 scale; pain of 1-7 = oxy5, 8-10 = oxy10)
- Want to be able to distinguish level 7 from level 10
Placebo effect: surprisingly absent for a lot of medical outcomes
Reliability of pain assessment and ability to be assessed by blind observer
- Bed-side nurses rate the pain scale and administer the medicine
- In most patients, nurse will not know whether block was done or not
Improve power: measure calibration between patients
Covariate adjustment (no downside to adjusting for a covariate if it turns out it doesn't predict anything):
- neuro assessment to adjust for risk- cancer site pain
- age
- smoking status (most would be smokers and drinkers)
Sample size calculation using:
- relative frequencies of pain level
- Minimal clinically important difference in pain level (the difference you don't want to miss)
Group-level randomization vs participant-level randomization
- intracluster correlation coefficient
Will need at least 20 clusters in a cluster randomized trial
- Fewer leaves you susceptible to chance imbalance
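The intracluster correlation point can be quantified with the usual design effect, 1 + (m - 1) * ICC; a small sketch (cluster size and ICC are made-up illustrations):

```python
def design_effect(cluster_size, icc):
    """Variance inflation factor from cluster (group-level) randomization."""
    return 1 + (cluster_size - 1) * icc

def effective_n(total_n, cluster_size, icc):
    """Number of independent patients' worth of information."""
    return total_n / design_effect(cluster_size, icc)

# e.g. 20 clusters of 10 patients with a modest ICC of 0.05:
print(design_effect(10, 0.05))     # 1.45
print(effective_n(200, 10, 0.05))  # about 138 patients' worth of information
```

Individual randomization corresponds to cluster size 1, design effect 1.0 -- the highest-power option, as noted.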
Individual randomization has problems, but gives you the highest power
- Person on the ground, for every case, defining block Y/N
- Other option: everyone gets a block in January, no one gets a block in February, and flip flop
Next step: Learning Health System to drill down trial design
General lead time on VICTR grant: 8 weeks

Analysis of RCT data on speech, language, literacy, and hearing data.
- Advances in cochlear implant technology; improve auditory signals
Strictly randomized crossover study -- N = 48, one withdrew after baseline -> 47
Treatment intervention: is there a benefit to cochlear implant tune-up in hearing status? Does this predict performance changes?
- Problem: a lot of measures, will need to subset
Measures NOT amalgamated in the literature; kept distinct
- Pitch aggregation method
15 measures of spectro-resolution
Q1: Is there an improvement? Q2: what predicts that improvement?
Frank: desirability of outcome ranking (DOOR) -- which treatment resulted in the most desirable outcome? No clinical interpretation -- relative, but projected into one dimension
- Rank-difference test > signed rank-test, respects pairing
Randomization test or a priori way to aggregate variables that share same variance?
- PCA, sparse PCA (combo of clustering and PCA) - find out which variables move together across kids
SAP drafted, first 20 participants unblinded (regression-oriented model)
- So many variables that family-wise error will be a problem
Frank: two challenges
1) number of parameters
2) unblinded part of data -- once unblinded, you need to use statistical plan without modification
Univariate approach worth exploring
Another research question: given standard scores & percentile ranks
Z-scores make assumptions: assumes reference population selected at random
- SD needs to be meaningful (measure must be symmetrically distributed)
Percentiling is like assessing speed of runner relative to other runners rather than using a stopwatch
Frank's preference is to do things on raw data scale (raw scores will have different meaning for different ages)
Raw score regardless of age is like a stopwatch

Consideration of multi-state analysis for endpoints in an RCT looking at patient outcomes of chest tube flushing vs no flushing.
No RCTs evaluating role of regular chest tube flushing in the setting of pleural space infection for optimal drainage and treatment outcomes
Many studies of pleural space infection do not report a chest tube flush protocol
Hypothesis: regular flushing of catheters leads to early tube removal
Target recruitment: N = 96
Stat considerations:
- Multi-state model for time to event analysis
- Cumulative failure of therapy/drainage curve will be calculated from the Kaplan-Meier method and compared using log-rank test
- MV linear regression to evaluate treatment effect on outcomes measured 1 week after treatment randomization adjusting for measurements at baseline and covariates
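The Kaplan-Meier estimate in the plan is simple enough to sketch by hand; a minimal implementation on toy data (not study data):

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimates; events: 1 = event, 0 = censored.
    Returns a list of (time, S(t)) at each distinct event time."""
    data = sorted(zip(times, events))
    surv, out, i = 1.0, [], 0
    while i < len(data):
        t = data[i][0]
        d = sum(e for tt, e in data if tt == t)     # events at time t
        n_t = sum(1 for tt, _ in data if tt >= t)   # at risk just before t
        if d > 0:
            surv *= 1 - d / n_t
            out.append((t, surv))
        i += sum(1 for tt, _ in data if tt == t)    # skip past all rows at t
    return out

# toy chest-tube removal times in days; the 0 marks a censored patient
print(kaplan_meier([2, 3, 3, 5, 7], [1, 1, 0, 1, 1]))
```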
tPA group vs non-tPA group; chest tube = clogged or patent; chest tube +- tpA was considered treatment failure or not
8 state transition model with 17 total transitions -- some states extremely unlikely
Original plan: endpoint = time to chest tube removal
- time to something good is only interrupted by something bad (never good)
- time to event is hard to interpret, hard to handle censoring
Put patient paths in a hierarchy (how bad was the worst thing that happened to a patient in a given day)
- One parameter: odds ratio, measures odds of transitioning to something worse
- Ordinal transition model requires clinical consensus on what is better and what is worse
- Bad = level X or worse
Ordinal transition model -- rare states are not a problem
Next step: order levels for transition model by severity
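The daily "worst state" construction might look like the following; the state names and severity ordering here are placeholders for the clinical-consensus ordering that is the next step above:

```python
# Hypothetical severity ordering (illustration only; the real ordering
# comes from the clinical-consensus exercise described above)
SEVERITY = {"removed": 0, "patent": 1, "flushed": 2, "clogged": 3,
            "tPA given": 4, "surgery": 5, "death": 6}

def daily_worst(events_by_day):
    """Collapse each day's recorded events to the worst (highest-severity) state,
    giving one ordinal outcome per patient-day."""
    return [max(day, key=SEVERITY.get) for day in events_by_day]

print(daily_worst([["patent"], ["patent", "clogged"], ["tPA given", "patent"]]))
# ['patent', 'clogged', 'tPA given']
```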
What is frequency after you take overrides into account?
Literature is missing several states (Jen developed CRF so information is captured on daily basis)
- Be clear on primary endpoint
- Proportion of patients getting level 3 or level 4 outcome? State levels you want to calculate probabilities of from multi-state model
- Frank may seek permission to share Vancomycin protocol (or at least statistical analysis portion) with the investigators

Garrett Booth, Frank Harrell, and Cass Johnson in attendance:
· Can effect size of authorship and authorship attribution be used in a meta-analysis?
· Does double-counting of folks on different clinical practice guidelines count?
· Frank: There is nothing about this context that makes meta-analysis less useful than for a medical attribute, for example.
· Time-oriented flowchart; would that look like a denominator where a woman enters as a pathologist, and we are interested in what happens from there?
o Garrett: Primary question is more how equitable is representation across all these smaller fields of study. Meta-analysis could look at authorship attributes and ideally determine the effect of the bias.
· Frank: If available, longitudinal data is some of the most effective data (apart from randomized); or, if age of each person (compared w/ average age of entering the profession) could be determined, perhaps use that in place of longitudinal data?
o Unlikely to have that information right now.
· Cass emailed previous investigators on 11/2/2023 for information on the meta-analysis management software they used and will reach out as soon as a response is received. If needed, can also provide an example template for how meta-analysis data may be organized so that it can be easily used in statistical analysis.
· Additional resource on meta-analysis: Welcome! | Doing Meta-Analysis in R (bookdown.org)

I would like feedback on the analysis below.
d. Research Design. The study design will be a retrospective observational cohort study of pregnant individuals with either a diagnosis of opioid use disorder or evidence of medication use for opioid use disorder at least three months prior to pregnancy. We will require continuous enrollment three months prior to pregnancy and up to 28 days postpartum. For sensitivity analyses, we intend to relax the study/continuous enrollment criteria for the pre-pregnancy period, and to vary how we define the threshold of pre/post-pregnancy based on estimated LMP and gestation. Our research goal is to estimate the effect of medication switching from pre- to post-pregnancy on the incidence rate of adverse events related to opioid use disorder and pregnancy complications. Our design will answer the following research question: What are the potential risks of switching OUD medications from pre-pregnancy to pregnancy for (1) Individuals who switch down (i.e., methadone to buprenorphine), (2) Individuals who switch up (i.e., buprenorphine to methadone), (3) Individuals who stopped MOUD from pre-pregnancy to pregnancy (defined as a medication gap greater than 14 days); and (4) Individuals who did not switch medication from pre-pregnancy to pregnancy; measured up to gestational week 19.
e. Exposure. The exposure for Aim 2 will be individuals who had any switch in medications from pre-pregnancy to pregnancy (from three months before pregnancy through gestational week 19), stratified by: (1) Exposure (a): Individuals who switch down (i.e., methadone to buprenorphine) and (2) Exposure (b): Individuals who switch up (i.e., buprenorphine to methadone). The comparator groups will include: (3) Individuals who stopped MOUD from pre-pregnancy to pregnancy (defined as a medication gap greater than 14 days); and (4) Individuals who did not switch medication from pre-pregnancy to pregnancy. We will take an intent-to-treat approach where we will focus on an individual's first switch due to our preliminary data findings showing that downstream switches could be a result of the first switching decision.
f. Outcome. The outcomes will be measured from week 20 through the neonatal period (within 28 days post-delivery). We will measure two outcomes: (1) Complications of OUD; and (2) Complications of pregnancy. Complications of OUD will be defined as overdose, hospitalizations, infections such as endocarditis, abscess, osteomyelitis, and maternal death, while complications of pregnancy will be defined as hemorrhage, primigravida cesarean section, preeclampsia, and chorioamnionitis.
g. Covariates. A priori covariates in our model will be measured 90 days prior to estimated LMP (pre-pregnancy period) and will include demographic variables such as age, race/ethnicity, income, zipcode/census divisions, eligibility group code, pharmacotherapy dose prior to switching, days between previous medication to switched medication, median travel distance to medication prescriber prior to switching (Aim 1), opioid use disorder severity (e.g., opioid-related emergency department and inpatient visits), nonopioid substance use, other mental health conditions, chronic comorbidities, and prenatal care engagement. We will account for relapse-related indicators such as opioid-related emergency department and inpatient visits and gaps in medication use prior to the first medication switch (operationalized as a binary yes/no variable). Additionally, we will account for the following factors in our analysis: The number of medication switches during the exposure period, whether individuals who stopped MOUD in early pregnancy later switched back to MOUD at any point during the exposure period, and the type of switches, i.e., antagonist or partial agonist to partial or full agonist; full or partial agonist to partial or antagonist.
h. Analysis. We will begin by performing several descriptive analyses to gain a better understanding of our sample cohort. These analyses will include examining demographic and other characteristics, quantifying the individuals within each exposure group, and, for those who switched medications, calculating the median number of switches during the exposure period. We will then employ a propensity score method with overlap weighting to balance important patient characteristics across treatment groups. Since we expect that our medication comparator groups could be quite different, we chose overlap weighting, which is particularly advantageous when comparator groups are initially very different. Following our propensity score design, we will report both risk ratios and an extended Cox proportional hazards survival model, particularly, the Prentice, Williams, and Peterson (PWP) (total and gap times since the previous event), to account for recurrent and competing outcome events. Reporting both risk ratios and time-to-event outcomes accounting for competing events provides a more comprehensive understanding of the relative risk of an event occurring across groups, while also providing detailed information about when events occur and the impact/interplay of multiple potential events over time.
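The PWP gap-time model expects the data in counting-process form, one row per at-risk interval, stratified by event number. A sketch of that restructuring for a single patient (event times are hypothetical):

```python
def pwp_gap_rows(event_times, followup_end):
    """One row per at-risk interval for the PWP gap-time layout:
    (event_number, gap_start=0, gap_stop, status). Status 1 = event, 0 = censored."""
    rows, prev = [], 0.0
    for k, t in enumerate(sorted(event_times), start=1):
        rows.append((k, 0.0, t - prev, 1))   # gap since the previous event
        prev = float(t)
    if prev < followup_end:                  # censored interval after last event
        rows.append((len(event_times) + 1, 0.0, followup_end - prev, 0))
    return rows

# a patient with events at weeks 5 and 12, followed to week 20
print(pwp_gap_rows([5, 12], 20))
# [(1, 0.0, 5.0, 1), (2, 0.0, 7.0, 1), (3, 0.0, 8.0, 0)]
```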
Ashley Leech, Shawn Garbett, Frank Harrell, and Cass Johnson in attendance:
· Analysis: Propensity score weighting, Cox proportional hazards with recurrent / competing events
· Feedback from Frank:
o May be two reasons to change medication:
§ Planned change vs. reactionary change
§ Is causal analysis needed to get rid of this feedback loop?
· Andrew Spieker, Bryan Shepherd (a few others as well) specializing in causal inference within the department. We could ask someone to join clinic.
o Internal time-dependent covariates are present here.
§ External version would be crossover study where everyone must switch drug at a certain time.
§ Interpretation is made more difficult with internal time-dependent covariates.
§ If covariates aren't updated frequently enough, what we are trying to learn from our change variable will be difficult to interpret.
§ Propensity adjustment may not be sufficient for that.
· Ashley: Propensity score weighting was chosen because exposure groups may be vastly different. Ashley wanted to account for that via several covariates (about 10). Sample size is estimated at 70,000, but many have not switched at all (lots of 0s).
o Static propensity score: if we're looking at baseline or characteristics at one point in time, then we are not accounting for people switching back and forth (increasing dosage and then decreasing, for example).
§ The present situation is more dynamic; time-dependent covariates are important.
§ Confounders would need to be measured within days / weeks of switch
o How do you want to word your conclusion? We learned something that gives the recipe for required changes to affect better outcomes (causal), or a non-causal conclusion?
§ Frank suggests going without the propensity score weighting
§ Miguel Hernan (Observational studies analyzed like randomized experiments: an application to postmenopausal hormone therapy and coronary heart disease - PMC (nih.gov)) could serve as a great case study to look into. Can get same results as RCT if time-dependent covariates were well understood and updated frequently.
· He also has a great book on causal inference (Causal Inference: What If (the book) | Miguel Hernan's Faculty Website | Harvard T.H. Chan School of Public Health)

We would like to add a substudy to our R01-funded main study investigating health equity (race differences in salt sensitivity), and I will be the primary investigator. I submitted my grant to VICTR for funding.
As a new research fellow, I have limited knowledge of biostatistics. I would like to discuss biostatistical analysis options for our small sample size substudy.
Main study conducted with NIH funding for three years. N = 24 (6 black, 18 white), still enrolling
- Recent published paper shows no difference in race (but enrolling black patients is difficult)
- Limited budget
- Existing research shows black people are more salt-sensitive
- Add substudy to target enrollment of black patients
New aim 1
Chose Mann-Whitney U Test because small sample. Power analysis?
Frank: Issues investigators are experiencing are very common
- Fisher: When p-value is large, you need more data. P-values do not handle small sample sizes well
You already know blacks are more salt-sensitive. The question is how much
- Use confidence intervals
With small sample size, need to choose one parameter. "Put all your eggs in one basket"
- Calculate single confidence interval for association you are most interested in
To cut margin of error in half, you need 4x as many participants
- Precision is proportional to the square root of sample size
Also interested in female vs male
Width of CI will narrow with more data, but center will move around
Precisely measured variables lends itself better to small sample sizes than a variable that is not precisely measured (race)
Mert: if we run some correlation studies between ADMA and salt sensitivity, can we compare groups?
- Frank: no free lunch -- to have MOE of +/- 0.1, you need 400 patients. Would need even more patients for that
Annet: Come up with genes related to nitric oxide
Frank: only way to separate race and sex would be to have a balanced dataset (difficult to achieve)
- Boost sample size with repeated measurements? Study patients under different conditions
False discovery rate irrelevant without false non-discovery rate
- Procedure could have zero power to detect genetic characteristics on account of FNDR
Research with small sample size is tough
hbiostat.org/bbr
- One chapter on statistical inference
- One chapter on high-dimensional work, check robustness of findings

We are working on a video-based coaching project to teach a surgical procedure to residents. We are interested in asking about validated survey tools (such as OSATS, OCHRA, SURG-TLX) and which ones would be the most appropriate for statistical analysis. We'd also like to ask for recommendations on what types of data to collect for the statistical analysis.
No formal feedback system to give residents
Video-based coaching intervention (highlight reel of most important steps of operation)
- Was this useful, feasible, repeatable?
Unable to power study with max N = 10 residents (up to 8 attending)
- Simplest thing you can learn (proportion of residents who responded a certain way)
- Minimum to estimate proportion ~ 96
- Cannot generalize to population of residents with reduced sample (margin of error would be wide)
Can learn a little bit (quantify uncertainty with uncertainty intervals)
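The "~96" figure follows from the worst-case (p = 0.5) margin-of-error formula for a proportion; a quick check:

```python
from statistics import NormalDist

def n_for_proportion(moe, p=0.5, conf=0.95):
    """n needed so a proportion's CI half-width is at most moe
    (p = 0.5 is the worst case)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return (z / moe) ** 2 * p * (1 - p)

print(round(n_for_proportion(0.10)))  # 96  -- the "~96" in the notes
print(round(n_for_proportion(0.05)))  # 384 -- halving the MOE takes 4x the n
```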
Which validated survey tools to use?
Accounting for observer variability is important
Ideal set up with different observers: each resident scored by several different observers
- Minimize observer variability by averaging
If one observer for all participants: uniformity within and between residents, but preferred over different observer for each resident
Timing of intervention would give away whether participant was in pre or post
- Could hold off on evals completely until the end
Some experts that could help with that (sociology dept)
OSATS: quantitative data (number score)
OCHRA: scored on performance in surgery
SURG-TLX: evaluates mental workload, how demanding, etc.
One way ANOVA used?
With N = 12, statistical test would be more misleading than helpful
Better and safer to quantify what you have with confidence limits
- Confidence limits penalize for small sample size
- Wouldn't talk about power; instead use confidence limits for mean differences pre & post
VICTR voucher almost not needed, but could help with some stats if needed
- 90 hours, $5k grant
Also VICTR studio -- free, invite experts from other fields
Design ideas:
- Increase sample size, could give more options
- Randomize residents -> half get intervention, half don't -> parallel group randomized trial
- Could attribute differences in groups to the intervention
- Strongest way to determine that intervention caused the result
Hybrid approach: delay intervention, assess people randomized to have intervention late at same time you assess someone randomized to have the intervention earlier
Keep focus on feasibility with reduced sample size (shy away from p-values, use confidence limits)
How are data being collected?
- Survey data in REDCap
Could bring draft of the REDCap to another clinic to get statistician input

We have zero-inflated/semi-continuous data to model. Our DV (dependent variable) is a percentage of video frames during which a participant's facial expression likelihood was above a set threshold in response to two stimuli conditions. One of the conditions elicited little to no facial activity and thereby our output for those trials is 0.
Are hurdle models better than zero-inflated models? Are there alternative models to consider?
Two stimulus temperatures (warm/hot) and record facial expressions
Record muscle activity in the face (engagement)
Plotted median engagement against pain rating
- Potential three way interactions between groups
Zero-inflated data - how to best model?
- Has threshold of 25% been validated? Would take a lot of data. Assumes discontinuity, does not capture close calls
What is the area under the curve when you are in some danger zone?
- AUC captures severity of hemoglobin and how long you stay in the bad zone
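The area-under-the-curve alternative to a hard 25% cutoff could be computed roughly as below (trapezoids on the clipped signal; the trace is invented). This keeps both how far and how long the signal stays in the "danger zone," instead of a yes/no at each frame:

```python
def auc_above(times, values, threshold):
    """Trapezoidal area of the curve above a threshold; excess is zero
    outside the danger zone. Crossing points are not interpolated, so this
    is a slight approximation near the edges."""
    excess = [max(v - threshold, 0.0) for v in values]
    area = 0.0
    for i in range(1, len(times)):
        area += (excess[i - 1] + excess[i]) / 2 * (times[i] - times[i - 1])
    return area

# toy engagement trace sampled once per second, threshold 25 (%)
print(auc_above([0, 1, 2, 3, 4], [10, 30, 40, 26, 5], 25))  # 21.0
```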
Difference between being correlated with the truth and fully reflecting the truth
- Consequence of using threshold when assumption of discontinuity does not hold: artifacts & edge effects, amplify measurement error
- Removing threshold = more powerful statistical analysis
Ordinal regression will handle any distribution and will give you probability of being above a given threshold via back-end calculation
No need to transform Y when using semi-parametric model
Use robust sandwich variance estimator
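The "back-end calculation" of a threshold exceedance probability from a fitted ordinal (proportional odds) model is just an inverse logit of the relevant intercept plus the linear predictor; a sketch with made-up coefficients:

```python
from math import exp

def prob_at_or_above(alpha_j, beta, x):
    """P(Y >= level j | x) from a proportional-odds fit:
    expit(alpha_j + beta * x). alpha_j and beta here are invented numbers,
    standing in for coefficients estimated by the model."""
    return 1 / (1 + exp(-(alpha_j + beta * x)))

# hypothetical fit: intercept for the 'engagement >= 25%' cut, treatment effect beta
print(round(prob_at_or_above(alpha_j=-0.4, beta=0.9, x=1), 3))  # 0.622
```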
Model with rms package

We are using trauma video review to analyze the impact of communication patterns on the resuscitation efficiency of bleeding trauma patients. Specific exposures of interest include the percent of overlapping communication during key moments. We would specifically like input into an analysis plan that may inform our study design and data collection plan. This is a VMS RI project that I am supervising. Mentor confirmed.
Interested in overlapping communication, interruptions during EMS handoff, and speaking during BP measurement
Outcome: total resuscitation time (wheels in to wheels out of trauma bay)
Want to do linear regression
Add age to covariates?
Frame as survival problem? Time to successful resuscitation?
- Right censoring if no resuscitation
Assumption: Resuscitation is independent of dying
Covariates are not defined during resuscitation itself
- Avoid circularity - separate effects
Several other discrete timepoints between when patient comes in and when patient leaves
Regarding competing risk of death -- plan is to exclude those who die
- Whether a participant will be excluded cannot be defined at time zero
Time period of wheels in to wheels out: 25 minutes +/-
Capture extent of shock at fixed time would be key
First five minutes = landmark observation period
- To qualify for analysis, have to survive for at least five minutes
- Those who don't survive 5 minutes likely came in deceased
Failed resuscitation in trauma bay takes ~ 20 minutes
Predicting participant's status five minutes from now
Want to collect data so all options are on the table
- Collect data so that you don't make decisions you can't undo

The study design is a retrospective observational cohort study of pregnant individuals with a substance use disorder. Our design will answer the following research questions: (1) What are the potential risks of switching opioid use disorder medications from pre-pregnancy to pregnancy for (Group a) antagonist or partial agonist to partial or full agonist; (Group b) full or partial agonist to partial or antagonist, up to gestational week 19; and (2) for new initiators in pregnancy, what are the potential risks of switching medications any time during pregnancy for (Group a) antagonist or partial agonist to partial or full agonist; (Group b) full or partial agonist to partial or antagonist. The comparator group will encompass individuals who have zero switches or switch less frequently; we will include the effect size of the switching count.
Exposures. The study exposure will be the count of medication switching; either from (a) to (b) and/or (b) to (a); from pre-pregnancy to pregnancy up to gestational week 19 and anytime during pregnancy. The exposure time will be censored by the observation period (accounting for the exposure time in the model).
Outcomes. The primary outcomes will include medication discontinuation, severe maternal complications, all-cause hospitalizations, and ED visits.
Questions:

In TN, 25% of 14,000 pregnant individuals with opioid use disorder or on medications for OUD had at least one medication fill, which = 3,500 people in total.
Based on our persistence study, 6% of individuals switched from buprenorphine to naltrexone over the one-year study period
Based on our 3,500 pregnant individuals on medication, this roughly equates to 210 individuals (but this is just considering buprenorphine to naltrexone and no other combinations, including methadone).
If our Medicaid sample includes 10 states, this would roughly equate to 2,100 individuals with a medication switch during pregnancy.
Two aims: using large medicaid sample to answer questions
Question today: sample size concern -- what design maximizes info?
Care about effect of switching medication relating to several outcomes
Negative binomial regression (more individuals that don't switch; 5-8% that do switch)
- Extra parameter accounting for overdispersion
Poisson regression? Have more parameters than Poisson would need
Count outcomes could have 0-1 inflation
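The extra negative-binomial parameter is what lets the variance exceed the mean (Var = mu + mu^2/k, vs. Var = mu for Poisson). A quick simulation check using the gamma-Poisson mixture representation of the NB (mu and k are arbitrary illustration values):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, k = 2.0, 0.8   # mean and dispersion; smaller k = more overdispersion
# negative binomial as a gamma-Poisson mixture: lambda ~ Gamma(k, mu/k)
lam = rng.gamma(shape=k, scale=mu / k, size=100_000)
draws = rng.poisson(lam)
print(draws.mean())  # close to mu
print(draws.var())   # close to mu + mu**2 / k = 7.0, well above mu
```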
Rule of thumb for sample size? Frank might have one
Features to adjust for in regression (given observational study):
- if time-fixed confounder, add as covariate in model
- Covariates: cotherapy dose
Possible trigger: relapse
For treatment switches: interested in looking at both number of treatment switches & high-low vs low-high
Treatment covariate feedback loops? Might not be relevant in this field
Covariates will need to be measured at different times during the exposure period (the pregnancy)
Time-varying covariates
- Multiple rows for each participant where covariate corresponds to a value at a given time
- Would this work for a negative binomial model?
Survival model that would allow for recurrent events
Mixed model would depend on how data are structured
Bryan: negative binomial model might not use the data most efficiently, but could be more interpretable to reviewers
Recurrent event survival analysis is a little more complicated -- Andersen-Gill model
- More power, account for time-varying covariates
Benefits of time-varying covariates:
- Adjusting for time-varying as if fixed leaves you susceptible to confounder treatment feedback loops
Continuous variable dichotomized = 5% of information you had before
Negative binomial could work, need to adjust for covariates
- No closed form formula for effect size/power; could estimate using sims (use preliminary data)
- PASSED R package has negative binomial for 2-sample; not quite what we're looking for
Recurrent event model would have more power, but would be more complicated & might be more difficult to interpret
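The simulation route to power (given no closed form for the adjusted NB model) might be sketched as follows; here a rank-sum test stands in for the model-based test, NB counts are generated as a gamma-Poisson mixture, and all rates and dispersions are invented, not preliminary data:

```python
import numpy as np
from scipy.stats import mannwhitneyu

def sim_power(n_per_arm, mu0, mu1, k, n_sims=500, alpha=0.05, seed=1):
    """Crude simulated power for comparing overdispersed switch counts
    between two groups (rank-sum test as a stand-in for the NB model)."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_sims):
        y0 = rng.poisson(rng.gamma(k, mu0 / k, n_per_arm))  # gamma-Poisson = NB
        y1 = rng.poisson(rng.gamma(k, mu1 / k, n_per_arm))
        if mannwhitneyu(y0, y1, alternative="two-sided").pvalue < alpha:
            hits += 1
    return hits / n_sims

print(sim_power(n_per_arm=200, mu0=0.3, mu1=0.6, k=0.5))
```

Swapping in the intended NB regression (with covariates) for the rank-sum test inside the loop gives the power for the actual planned analysis.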
Zero-inflated regression options: adds parameter to adjust for lots of zeroes

We plan to conduct a retrospective observational study examining doses of anticoagulation in patients supported with venoarterial extracorporeal membrane oxygenation during lung transplantation. Over the past 5 years, anticoagulation dosing for this population has evolved to be considerably less. We hypothesize that lower anticoagulation dosing is associated with the presence of thromboembolic complications and correlated with a decrease in blood transfusion requirements. In particular, we are interested in examining dose-response relationships between anticoagulation dose during surgery and blood transfusions required.
Research Q: What is the optimal dose of heparin for intraoperative support?
Hypothesis: Heparin exposure associated with blood product transfusions during lung transplantation supported on venoarterial ECMO
Inclusion: bilateral lung transplant between 2018 and July 2023, intraoperative VA ECMO support, age > 18. Several exclusions...
Exposures: heparin bolus dosing, heparin drip rate, ...
outcomes of interest: blood products transfused, thromboembolic events
Flowsheet -- N of > 180
Pharmacology of heparin nailed down such that, for example, optimum dose is known as a function of body mass?
- In lung transplant surgery, ACT is not checked & additional heparin not provided based on any ACTs to follow
Aim of project: quality of patient's overall outcome with regard to coagulate-related outcomes
With lots of events, you might find different doses optimize different events
- Advantageous to have ordinal scale outcome to assess what dose makes patients do well overall
- Clinical overrides, increases power
For dose x, did patients have worse outcomes than dose y?
Patient could have two different bad outcomes -- ignored if only looking at one outcome, not if you use ordinal scale
Ordinal scale could have as many levels as you want
Many ways to break ties to make scale more clinically relevant with more power
Ordinal scale won't be disturbed by infrequent bad outcomes
Good to crowdsource -- REDCap survey where clinicians vote/rank outcomes
- Declare winners based on small groups of runners, then put all of that together
Accounting for confounding: include covariates in the model
- Clinical experience super important to identifying confounders -- which features/lab values do clinical experts use?
Could assess dose over time (apply smoother)
- Cultural shift to less and less heparin over time
Likewise assess how ordinal scale changes over time
Will reviewers want ordinal scale validated? Sometimes, but Frank says alternative is almost always worse
- Some reviewers do not like the PO assumption in the PO model... often they are already making worse assumptions (see Frank's blog articles)
Time zero: patients have to make it to this time to make it into the study
ECMO is to some extent a time-dependent covariate
Patients have to get ECMO to be in analysis -- have to live long enough to get ECMO, so time zero might be initiation with ECMO
Bolus given just before cannulation -- for all intents and purposes, a simultaneous act
With a sample of 150 or so and the rarity of events, would a descriptive paper be more effective?
Frank: descriptive analysis suffers the most from small sample size, which causes measures to be noisy
Is sample big enough to do most powerful analysis?
More uniform the scale, more statistical info
Not slam dunk: depends on signal to noise ratio
What will ordinal scale look like?
- Example 1: some lab value + 10 times the number of transfusions needed (transfusions puts you in a different part of the scale, more transfusions = bad)
- Example 2: 0 - 50 did not need angina meds ; 51 - 89 did need angina meds; 90 = death
Don't need to write out every level, but define zones
- Hierarchical scale -- if participant has one minor event and one major event, assumes you don't care about the minor event
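The hierarchical-scale idea above can be sketched in code. This is a minimal illustration with made-up zone boundaries and a made-up tie-breaking rule (transfusion count), not the scale the study team will finalize:

```python
# Sketch of a hierarchical ordinal outcome (hypothetical levels/weights):
# the worst event a patient experiences determines the zone; within a zone,
# more transfusions = worse (tie-breaking for more power).

def ordinal_level(died, thromboembolic_event, n_transfusions):
    """Map a patient's events to a single ordinal level (higher = worse)."""
    if died:
        return 100                           # death occupies the top of the scale
    if thromboembolic_event:
        return 50 + min(n_transfusions, 10)  # major-event zone; transfusions break ties
    return min(n_transfusions, 10)           # no major event: transfusions only

# A patient with one minor event (transfusions) AND one major event lands in
# the major-event zone -- the minor event only breaks ties within that zone.
patients = [
    dict(died=False, thromboembolic_event=False, n_transfusions=0),
    dict(died=False, thromboembolic_event=False, n_transfusions=4),
    dict(died=False, thromboembolic_event=True,  n_transfusions=4),
    dict(died=True,  thromboembolic_event=True,  n_transfusions=9),
]
levels = [ordinal_level(**p) for p in patients]
print(levels)  # strictly worsening across these four example patients
```

Note you don't need to enumerate every level explicitly (per the "define zones" comment); any monotone mapping with the right ordering works for a rank-based or PO analysis.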
VICTR voucher -- 90 hours of biostat help, finishes in a year
Expect that data will be complete.
Comorbidities not much of an issue here (maybe BMI) -- not the practice
- BMI can't be extreme -- have to be healthy enough to receive transplant

We obtained ~2700 B cell receptor sequences specific for several antigens that we have grouped into 5 different categories (by viral family). We have analyzed the data with respect to several sequence features (V gene usage, HC:LC pairs, somatic hypermutation, CDRH3 length) and want to compare the data across the viral family categories. We need to know which statistical tests to use. The number of sequences in each category is not equal.
Data generation phase completed
2700 sequences across 10 individuals -- no interdonor analysis
- uneven with respect to participant and viral group
If studying left & right eye of a patient, patient contributes an experimental unit of roughly 1.5
- Minimum of 10 experimental units here (10 participants)
How much do receptor sequences correlate within the participant?
Single-cell data, multidimensional
Ivelin: sequences are independent, some correlation for sequences coming from the same participants
Germ-line gene usage clustered by viral group specificity
- Relative frequencies
Stacked bar chart -- percent of antigen specific repertoire x IGHV
Hypothesis testing is the aim
Difficult to conceptualize generalizability beyond the single individual
Most of the data is dominated by 3-4 individuals
Could reproduce stacked bar chart by individual
Other ways to present proportions than stacked bar chart?
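Since sequences are correlated within participant and there are only ~10 experimental units, one way to get honest uncertainty for a usage proportion is a participant-level (cluster) bootstrap: resample participants, not sequences. A minimal sketch with hypothetical per-participant counts:

```python
import random

# Hypothetical data: per-participant (uses, total) counts of sequences using
# a given germline V gene. Resampling whole participants respects the
# within-participant correlation that sequence-level resampling would ignore.
clusters = [(12, 40), (3, 10), (50, 200), (8, 25), (1, 5),
            (20, 60), (5, 30), (9, 15), (2, 8), (14, 50)]

def prop(cl):
    uses = sum(c[0] for c in cl)
    total = sum(c[1] for c in cl)
    return uses / total

rng = random.Random(42)
boots = []
for _ in range(2000):
    resampled = [rng.choice(clusters) for _ in clusters]  # participants, with replacement
    boots.append(prop(resampled))
boots.sort()
ci = (boots[int(0.025 * len(boots))], boots[int(0.975 * len(boots))])
print(round(prop(clusters), 3), [round(x, 3) for x in ci])
```

With only 10 clusters this interval is itself rough, which echoes the generalizability concern above.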
Unfamiliar with single-cell data
- Many Biostat faculty members work on single-cell data. Could invite some on a special Thursday morning

Exposure: LMA protected periods
Outcome: severe influenza illness
Analysis: restricted to time within influenza seasons based on regional virologic surveillance
MSMs with IPTW and IPCWs were used to estimate the effect of LMA use on severe flu illness
1 year of continuous follow up to assess baseline vars followed by flu season
Want to control for vars changing over time that can be both mediators and confounders
Estimating weights: calculated stabilized IPTWs and IPCWs for each person period using methods from Fewell et al
- Logistic regression
- Numerator models included only non-time varying covariates
- Denominator models included time-varying and non-time varying covariates
Inverse probability weighted model
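The stabilized-weight construction described above can be written out for a single person-period. The probabilities below are illustrative placeholders standing in for fitted values from the numerator and denominator logistic models (per-period weights would then be multiplied cumulatively over a person's follow-up):

```python
# Stabilized IPTW for one person-period (illustrative numbers, not study data).
# p_num : P(treated | non-time-varying covariates)        -- numerator model
# p_den : P(treated | time-varying + non-time-varying)    -- denominator model
def stabilized_weight(treated, p_num, p_den):
    if treated:
        return p_num / p_den
    return (1 - p_num) / (1 - p_den)

# A treated period where time-varying covariates made treatment likely is
# down-weighted; an "unexpected" treatment is up-weighted.
w_expected   = stabilized_weight(True,  p_num=0.30, p_den=0.60)  # ≈ 0.5
w_unexpected = stabilized_weight(True,  p_num=0.30, p_den=0.10)  # ≈ 3.0
w_control    = stabilized_weight(False, p_num=0.30, p_den=0.20)  # ≈ 0.875
print(w_expected, w_unexpected, w_control)
```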
Reviewer suggested weighted table 1
- Question: should table be presented by person-period instead of by person-season?
- Issue: does not account for # of periods per person -> could bias
Table 1 should not have been included in submission
- Conditions backwards on outcome -- not beneficial in cohort study
Just do baseline characteristics for population (no columns)
Literature does not show precedent for Table 1 with marginal structural model
Measures of relative explained variation
Table of odds ratios for LMA use
- Calendar time is important
Figure with odds ratios
Good to look for alternatives to weighted analyses (lose a lot of power)
What goes wrong when you do a standard time dependent covariate analysis such that you need weights and MSM?
- You have variables acting as confounders and mediators (danged if you do, danged if you don't)
Landmark analysis -- keep starting the clock over and doing covariate analyses
Case crossover study -- within subject design
Matching weights -- create population like you're matched, don't lose a lot of efficiency
- Matching equivalent for marginal structural model?
Could include in Markov model whether participant had flu in the previous year

We will be performing a retrospective analysis of lipid levels in transgender adults, comparing those who are on gender-affirming hormone therapy (GAHT) to those who have never been on it. We would ideally track changes in lipids over time, but I am concerned our population of adults never on GAHT will be too small to make meaningful comparisons. We've alternatively discussed a cross-sectional model or using individuals as their own controls. I also hope to minimize confounders that independently impact lipid levels - age, weight, diabetes status, smoking. I am requesting assistance with planning this analysis & creating a statistical model. I also plan to apply for a VICTR voucher for continued assistance with analysis.
H1: identify additional variables that may impact lipid metabolism, as well as perceived risks and benefits of sex/gender minority research.
H2: Transgender men on GAHT will have higher rates of and more severe dyslipidemia than transgender men not on GAHT
H3: Transgender women on GAHT will have higher rates of and more severe dyslipidemia than transgender women not on GAHT
Experimental group: trans adults on GAHT for 2+years
Control: trans adults not on therapy
Outcomes: cholesterol levels (continuous)
Known & anticipated limitations:
- currently do not know how many TG adults never on GAHT have had lipid panels at VUMC
- Inconsistent documentation of risk factors (smoking, alc use, etc)
- GAHT divided into masculinizing and feminizing therapy, as regimens vary
- After 2 years, patients more likely on maintenance dosing
- Psychotropic medications may also variably impact lipids
- Different regimens, duration of therapy
Inclusion: 18+ adults self-identifying as transgender/nonbinary
Exclusion: known familial hypercholesterolemia, no lipid panels at VUMC
Covariates: age, race, weight, smoking status, alcohol use, A1c
Analysis:
1) longitudinal, retrospective
2) cross-section, retrospective
Design Qs:
What is longitudinal time zero?
- Baseline, before starting therapy for those in experimental group, first time at VUMC for control group
For a patient entering at nine months, we can still get good info (just an incomplete profile from time zero to nine months)
Characterize longitudinal profile with as much resolution as data allow, rather than require two years for entry into study
- Mixed effects model may not reflect correlation pattern within patient (maybe more serial correlation)
Continuous time
If few people are followed for seven years, these people could have a lot of leverage/influence on estimates
One tool: variogram (semi-variogram) -- calculate correlation between all pairs of measurements from same patient (uses all available data) -- assumes correlation is isotropic
Mixed effects model assumes variogram is flat
Getting correlation structure right has a # of advantages (small one is software runs faster)
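The variogram idea above can be sketched directly: for every within-patient pair of residuals, take half the squared difference and bin by time lag. The residuals below are made up for illustration; a flat curve is consistent with the mixed-effects (random-intercept) correlation structure, while a rising curve suggests serial correlation:

```python
from collections import defaultdict

# Empirical semivariogram sketch (hypothetical residuals from a fitted model):
# patient -> list of (time_in_years, residual)
data = {
    "A": [(0, 0.2), (1, 0.1), (2, -0.4), (3, -0.5)],
    "B": [(0, -0.1), (1, 0.0), (2, 0.3)],
    "C": [(0, 0.5), (2, 0.4), (4, -0.6)],
}

pairs = defaultdict(list)
for obs in data.values():
    for i in range(len(obs)):
        for j in range(i + 1, len(obs)):
            lag = abs(obs[j][0] - obs[i][0])          # isotropy: only |lag| matters
            pairs[lag].append(0.5 * (obs[j][1] - obs[i][1]) ** 2)

# Average within each lag; uses ALL available pairs, so unbalanced data is fine.
semivariogram = {lag: sum(v) / len(v) for lag, v in sorted(pairs.items())}
print(semivariogram)
```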
Include variables like `on psychotropic meds`, `started on lipid-lowering Rx` in analysis?
Often discourage folks from matching on similar covariates (throwing away info)
Interested in applying for VICTR voucher. Another clinic meeting would be helpful to continue brainstorming
Voucher includes analysis work. Investigators & analyst finalize analysis plan together at the beginning
Involve statistician early

Initial analysis done in Python -- interested in using R for stats
We have weights, heights, BMI, etc.
Weight cyclers vs non-weight cyclers -- longitudinal
- Debate on whether weight cycling increases risk
- Frank: weight cycling has likely been defined, but not validated (no consensus on definition)
- Ask more general question: "what extent of weight cycling correlates with what extent of clinical outcome?"
Characterize weight cyclers vs non-weight cyclers
- Frank: don't want to label as cyclers vs non-cyclers -- eliminates close calls
- How frequently is weight measured?
Cross-correlation problem or "predict the future" problem?
- More interested in landmark study (the latter)
23 years of data, though scattered
- Qualification period
- In other study, those who cycled more frequently were shown to have worse outcomes
10 randomly-chosen weights for each year
Citation: Frank Harrell & Shi Huang paper on weight
Can pre-specify contests -- just don't throw the kitchen sink (multiplicity problem)
Bryan: longitudinal model and looking at association with time-varying covariates
- Concern: generalizability when restricting sample to folks with ten measures in 10 years of data (subject matter expertise to decide cut-off)
How much does collinearity between lagged variables present issues?
Need to demonstrate dose-response relationship
Two dimensions of weight volatility -- Gini's mean difference
"What is the 90th percentile of changes, adjusted for height?"
Prepare data with date, weight, and height for each measure
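Both volatility summaries mentioned above are easy to compute from a prepared series. A minimal sketch on one patient's hypothetical weights (height adjustment omitted for brevity):

```python
from itertools import combinations

# One patient's weight series in kg (hypothetical, in time order):
weights = [92.0, 95.5, 91.0, 97.2, 94.0, 99.1]

# Gini's mean difference: mean absolute difference over all pairs of measurements
n = len(weights)
gmd = sum(abs(a - b) for a, b in combinations(weights, 2)) / (n * (n - 1) / 2)

# 90th percentile of absolute visit-to-visit changes (crude nearest-rank style)
changes = sorted(abs(b - a) for a, b in zip(weights, weights[1:]))
idx = min(len(changes) - 1, int(round(0.9 * (len(changes) - 1))))
p90 = changes[idx]
print(round(gmd, 2), round(p90, 1))
```

In practice both summaries would be computed per patient and carried into the outcome model as continuous volatility measures, avoiding the cycler/non-cycler dichotomy.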
Frank's book: Search "R workflow"
To keep in mind with large datasets: possibility of numeric overflow
Avoid p-values with extremely large N -> use relative explained variation in outcome
The data.table package handles memory really well
Use clinical considerations to cut down on pool of variables
Appropriate variable transformation -- not always clear
VICTR studio: any biostatisticians to invite?
- Shi Huang & Frank
Ask who has experience with height data
How to settle on granularity of weight measure? Year, 6 months, etc?
- Most patients are outpatient
If gaps in the data, process to correct for uneven gap times?
Need adjustment for variation in gaps
Hexagonal binning
For longitudinal analysis: spaghetti plots are useful (take random sample)
How many measurements there are on each patient, non-parametric curve, clusters patients (catch data problems)

This is our follow-up biostats clinic to the one we attended on 6/22/23. We are designing a pilot trial to test the efficacy of guanfacine treatment for chronic fatigue syndrome patients. We would like biostatistics support for a VRR.
Objective: gather preliminary data on efficacy of guanfacine vs placebo on hyperadrenergic ME/CFS
Hypothesize that treatment with central sympatholytic guanfacine will improve fatigue & function (disability)
No biomarker has efficiently identified this subset of patients with high sympathetic activity
Study design: Proposed double-blind, randomized, crossover study with placebo vs guanfacine
- Enrichment withdrawal design
Responder analysis: assess perceived response to guanfacine for POTS symptoms
- see which clinical characteristics were associated with improvement
Improved approach: correlation between scores & different clinical characteristics
Positive trend between PGIC and change in hyperadrenergic symptom frequency
- Frank: add rank correlation coefficient
- Bubble plot to represent sample size for coincident points
T score is normalized
P-value is "too easy" -- include rank correlation coefficient instead (ranks can be compared, p-values cannot)
Plan to use Wilcoxon rank-sum test in primary study
Are relationships strong enough to base clinical trial on?
How to leverage information to identify who should be treated?
- Not seeing "smoking gun" in plots
Bryan: does guanfacine lend itself to study with respect to blinding? For instance, can participant identify whether they are being treated with drug or placebo based on symptoms?
Italo: guanfacine does make you drowsy... but drug that makes you drowsy can improve fatigue
Washout period -- two weeks is good balance between getting rid of drug and study feasibility
- Patients tend to be reluctant to being taken off the drug
Ideal to put confidence intervals when doing correlations

We are planning a pilot study to test the efficacy of guanfacine treatment in patients with chronic fatigue syndrome or postural orthostatic tachycardia syndrome. Our preliminary data suggest a subset of patients with central sympathetic activation could benefit from treatment with a central sympatholytic like guanfacine. We would like to discuss a statistical analysis plan for our study.
No diagnostic test/FDA-approved treatment for ME/CFS (myalgic encephalomyelitis/chronic fatigue syndrome).
In previous work, hyperadrenergic phenotype associated with more severe disease & greater autonomic symptoms with sympathetic overmodulation and lowest quality of life
Clonidine has been tried as treatment, but all studies have failed to improve fatigue & function
- Potential reasons for failure: not selecting for hyperadrenergic ME/CFS patients & side effects of clonidine
We propose central sympatholytic therapy with guanfacine would be effective for treatment of fatigue & function in CFS patients with phenotype
- Preliminary study conducted: patients rate their impression of change
- 26% non-responders, 74% responders (to guanfacine)
Found improvement was related to the frequency and severity of hyperadrenergic symptoms & ...
Compared to non-responders, overall improvement associated with improvement in ...
- Similarities between responders & non-responders in age, disease duration, post-exertional malaise, impairment in sleep & memory, hyperadrenergic symptoms, orthostatic intolerance (responders tended to have lower baseline tolerance)
- Responders tended to have more severe fatigue
Predictors of response to guanfacine: head-up tilt (HUT) and Valsalva maneuver
- Responders: > DBP increase at 1 & 3 min of HUT, > BP increase during late phase 2 of Valsalva
Target population: CFS patients with hyperadrenergic phenotype
Overall goal: grant proposal to assess efficacy of guanfacine for the treatment of CFS symptoms
Objectives: conduct small study to gather preliminary data
- Efficacy of guanfacine (vs placebo) on hyperadrenergic ME/CFS
- To estimate sample size & power
Study Design: double-blind, randomized, cross-over study with placebo vs guanfacine added to standard of care for 2 weeks
- Enrichment withdrawal design (suggested by FDA for initial assessment of efficacy of ampreloxetine in improving OH in patients with autonomic failure)
- Outcomes assessed at home (questionnaires, actigraphy, orthostatic vitals)
Study design advantages: patients already identified, minimal involvement from patient, clear stopping rules, low-cost alternative
Outcomes assessed at second week of treatment: fatigue (CIS, primary outcome), POTS & CFS symptoms & PGIC, subjective function (SF-36), objective function
Analysis: outcomes assessed on the 2nd week outcomes: placebo vs guanfacine
- Hypothesis: fatigue score (CIS, primary outcome) after 2 weeks of placebo > guanfacine
- Assessed via Wilcoxon signed rank test or paired t-test
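A minimal version of the paired signed-rank computation, on toy fatigue scores (not study data): rank the absolute within-patient differences (zeros dropped, ties given average ranks) and sum the ranks where placebo exceeded guanfacine.

```python
# Toy paired fatigue scores for 8 crossover patients (hypothetical numbers):
placebo    = [42, 38, 50, 47, 44, 39, 52, 46]
guanfacine = [35, 36, 45, 47, 40, 41, 44, 40]

def signed_rank(x, y):
    """W+ statistic of the Wilcoxon signed-rank test (zeros dropped, average ranks)."""
    d = [a - b for a, b in zip(x, y) if a != b]
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))
    ranks = [0.0] * len(d)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1                 # average rank for tied |d|
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    w_plus = sum(r for r, di in zip(ranks, d) if di > 0)
    return w_plus, len(d)

w_plus, n_pairs = signed_rank(placebo, guanfacine)
print(w_plus, n_pairs)   # compare w_plus to its null mean n*(n+1)/4
```

In practice one would use an established implementation (e.g., `scipy.stats.wilcoxon` or R's `wilcox.test(..., paired = TRUE)`) rather than this sketch; the point is what the statistic measures.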
Frank concerns: problems with patients not remembering very well
- Measuring change is difficult because it is dependent on patient's initial reading
Not good to base responder analysis on change
- Responder analysis = minimum information analysis (doesn't capture close calls)
- Responder analysis with 19 patients = noise (need ~ 180 for anything meaningful)
Italo: Need more preliminary data
Frank: Need high signal:noise ratio, especially with small sample size (would need to demonstrate dose-response relationship -- rank correlation)
- Difficult to distinguish minimal to no change
Delta scores boxplots need to be revisited (raw data display = scatterplot)
- Boxplots are combining unlikes
- No need to categorize responders -- basic rank correlation and scatterplots
Responders & non-responders is low information measure -- "what is rank correlation between disease duration & amount of global change?" would be better
- Use six levels on x-axis, you may find that non-responders in best non-response group are more similar to lower end of responders than to the next group of non-responders
- If so, this would cast doubt on the choice threshold
Italo Challenge: some patients are already on the drug and believe it is helping
- There aren't any patients who are on the drug and don't believe it is helping
Frank Challenge: threshold was decided on without validating it was the right one
Design aspects are state of the art
- Washout period for observational phase (study team determined two weeks was sufficient)
- Patients who have been miserable for so long are reluctant to commit more than two weeks for washout
- Frank mentioned need for washout period before crossover
Wilcoxon rank difference test better for paired data than ordinary Wilcoxon -- same p-value even if you transformed the outcome
FDA says there are too few randomized withdrawal studies
- No run-in period, randomly tell half the folks they can't take drug anymore
Patient-oriented outcomes -- great payoff in terms of power when outcome has high resolution

"The impact of weight loss programs on the survival of a native joint."
P: Patients with BMI 35 at the time of diagnosis of knee osteoarthritis. I: Weight loss. C: Usual care. O: - Cox PH model
Aim 2: To determine whether certain interventions can predict the maximal amount of weight lost. Hypothesis: Patients who underwent bariatric surgery will demonstrate the greatest amount of weight loss. A statistical difference between groups of usual care, diet counseling, pharmacologic intervention, and bariatric surgical intervention will be measured, possibly using a repeated-measures analysis of variance.
- 5% total weight loss = significant change
Aim 3: To determine if weight loss improved patient-reported measures of knee pain and function. Hypothesis: Patients who achieve weight loss will have a statistically significant change in patient outcome measures including the PROMIS Physical Function test, NRS pain, and KOOS Jr. The data will be adjusted for the percentage of weight lost & time to achieve weight loss; I think this statistical measurement will employ a non-parametric regression test.
Time zero = time when diagnosis was first entered into the chart
- Someone could have osteoarthritis that started a while ago -- one's "time zero" might not be really time zero
- No way to extrapolate when the person's symptoms began
Trade off of doing controlled study on few patients vs uncontrolled study on many patients
Circularity between function and weight -- can it be disentangled?
VICTR studio -- multidisciplinary, could avoid certain pitfalls
- Could help define tight criteria to extract from EHR & tighten aims
Loss of 10 pounds for person who is carrying 100 pounds of extra weight means less than a person carrying around less extra weight
Generalize aims to look at absolute weight at time zero and other times

We are conducting a secondary analysis of a cluster-randomized, cluster-crossover trial (PILOT trial, PI: Matt Semler, Biostats: Li Wang), with a subgroup of patients. Given the complexity of the statistical analysis, we are seeking a VICTR voucher to fund the data analysis. We would like to get a quote and notes for submission of the VICTR voucher. Mentor confirmed.
Interested in route to getting VICTR voucher
Subgroup analysis of the PILOT trial among survivors of cardiac arrest
Parent trial -- cluster-randomized (entire ICU randomized to a treatment group for two months), cluster-crossover clinical trial
All adults who received MV in medical ICU at VUMC 7/18-8/21
Intervention -- 3 groups for target of oxygenation (lower, intermediate, higher)
Secondary analysis -- subgroup from PILOT (patients who survived cardiac arrest prior to enrollment, N = 339)
- Primary outcome: 28-day in-hospital mortality
- Secondary outcome: survival to hospital discharge with a favorable neurologic outcome
Analysis Plan:
- Assess separation between groups in SpO2
- Analyze primary outcome -- logistic regression with independent covariates of group assignment and time
- Analyze secondary outcome -- logistic regression with independent covariates of group assignment and time
- Test for effect modification by characteristics of cardiac arrest
How are intra-unit correlations handled?
- In PILOT study, only adjusted for period cluster
- 18 time clusters (3 groups, 6 times, order is random)
Ordinal outcome always has a place to put death
"Multivariate" vs "Multivariable"
Don't use outcome from original study (don't use death -1; can't summarize results using median)
- Median is designed for truly continuous variables (bad with ties)
- Analyze raw data (what is your status on a given day? On ventilator, dead, not on ventilator, etc.)
VICTR: one size fits all, $5000, 90 hours over a year

Evidence-based recommendations are needed to define which patients, if any, should be considered at risk for these short-term hyperglycemic episodes, as well as to evaluate the long-term effects on glucose levels after a single administration of corticosteroid. The purpose of this study is to look at how a diabetic person's blood glucose levels change over time with a steroid medicine injection. It is believed that steroids may briefly elevate a person's blood sugar levels in the immediate time period after receiving a steroid injection. Significantly high blood sugar levels may be dangerous and can lead to a range of effects from fatigue and vomiting to confusion and coma.
The Shade Tree Clinic (STC) is a comprehensive, free health clinic run by Vanderbilt University medical students for Nashville residents with limited resources. The Shade Tree Orthopedic Clinic is a subspecialty clinic at the STC that provides quality care for acute and chronic orthopedic conditions. At the STC Orthopedic Clinic, corticosteroid injections are heavily relied upon as a treatment for management of pain and other orthopedic conditions given that the patient population often lacks access to surgical treatment.
The purpose of this study is to assess the glycemic effects of methylprednisolone in patients with diabetes, pre-diabetes, or no diabetes, given the utility of both types of injections. Our questions concern determining an adequate sample size for the study, as well as help with a VICTR grant. Mentor confirmed.
Attendance: Brian Hou, Lauren Porras, Frank Harrell, Cassie Johnson
Meeting Notes:
May look to balance diabetic and non-diabetic patients. We may realistically end up with more non-diabetic patients.
Looking at change from baseline may be less meaningful than adjusting for baseline blood glucose (as a covariate, perhaps non-linearly): for a given starting point, where do you end up after injection? Look at absolute glucose instead of change in glucose. We may look at slope or area under the curve in this case.
Descriptive tool to start with: spaghetti plot. Allows you to view raw data trajectories without assuming anything.
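One way to reduce each patient's post-injection trajectory to a single summary, as suggested above, is the trapezoidal area under the curve. A sketch with hypothetical sampling times and glucose values:

```python
# Summarize one patient's post-injection glucose trajectory by trapezoidal AUC
# (hypothetical times in hours and glucose in mg/dL, not study data):
times   = [0, 2, 6, 24, 48]
glucose = [110, 180, 160, 130, 115]

auc = sum((t1 - t0) * (g0 + g1) / 2
          for t0, t1, g0, g1 in zip(times, times[1:], glucose, glucose[1:]))
mean_level = auc / (times[-1] - times[0])   # time-averaged glucose over follow-up
print(auc, round(mean_level, 1))
```

Dividing by the follow-up length puts AUC on the glucose scale, which keeps it comparable across patients with different sampling windows.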
Question from Dr. Porras: Diabetic patients that we are enrolling are well controlled, so will likely have baseline blood glucose that are similar to non-diabetics. So are we really answering the primary question regarding glucose sensitivity?
Likely approach: interact baseline with diabetes status. Requires a larger sample size, but addresses this concern.
Sample size: could be a concern solely from the orthopedic side. Could get primary care involved, or could change the question to just include non-diabetics. IRB preferred an observational study, though Frank warns that not having a control group can require a leap of faith when it comes to results.
If an observational study is required, Frank would require a clinic at Vanderbilt that has protocolized the collection of fasting blood glucose among the population getting these injections. (Blood glucose measurements are reliably collected without missing many.)
If this data isn't already being collected in a clinical setting, may need to pay for non-diabetics via grant.
Dr. Porras provides the following paper: Systemic effects of epidural and intra-articular glucocorticoid injections in diabetic and non-diabetic patients - ScienceDirect
Frank: To learn a lot about these types of relationships, a minimum of 70 patients would be required. For correlations, this number may be closer to 300.

We will try multiple imputation of a key variable to define/determine the start of follow-up time. It has 40% missing. In a proportion of subjects, we do have another variable that informs this missing variable (it sets a boundary such that the missing variable can only take certain values).
Outcome is RSV LRTI, exposure is RSV immunoprophylaxis
Causal inference is of interest
Missingness of LOS related to year (decreases with later birth years), GA (less missing for older GA), BW (less missing for larger BW), NICU admit (less missing for no NICU)
Measure exposure to RSV immunoprophylaxis:
- children born Apr-Oct: every 30 days during the winter RSV season for a max 5 doses
- children born Nov-Mar: every 30 days starting at birth hospital discharge
N = 15248, 34% missing LOS
- n = 1915 have down syndrome, 29% missing LOS
First out-patient visit/care is informative
- Earliest of these sets boundary on discharge date for birth hospitalization - the discharge must have occurred before their earliest healthcare date
Can data structure for imputation model differ from data structure for analytic model?
How do we incorporate a boundary on LOS into the MI model?
- `aregImpute` for when data are NOT longitudinal
- Predictive mean matching - find suitable donors
- Can specify exclusion of donors that don't meet date condition
- Check mice package to see if they already have such an option
Consult Stef van Buuren's "Bible"
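The donor-restriction idea above can be sketched in a few lines. This is a toy illustration of predictive mean matching with a boundary, not the `aregImpute`/mice machinery itself; all numbers and the `k`-nearest-donor rule are assumptions for the example:

```python
import random

# Predictive mean matching with a donor restriction: a missing birth-
# hospitalization LOS may only be imputed from donors whose observed LOS
# respects the patient's boundary (discharge before first outpatient visit).

# (predicted LOS from the imputation model, observed LOS) for complete cases:
donors = [(4.0, 3), (5.5, 6), (8.0, 7), (12.0, 14), (20.0, 25), (30.0, 28)]

def pmm_impute(pred_missing, max_los, k=3, rng=random.Random(1)):
    # keep only donors satisfying the boundary, then take the k closest
    # in predicted value and draw one observed LOS from that pool
    eligible = [(abs(p - pred_missing), los) for p, los in donors if los < max_los]
    eligible.sort()
    pool = [los for _, los in eligible[:k]]
    return rng.choice(pool)

# Patient with model-predicted LOS of 11 days whose first outpatient visit
# was on day 10 -- the imputed LOS must be < 10 regardless of the prediction:
imputed = pmm_impute(pred_missing=11.0, max_los=10)
print(imputed)
```

Unrestricted PMM would likely have drawn the closer donor with LOS 14; the boundary forces the draw into the feasible region, which is the behavior to look for (or request) in mice/`aregImpute`.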
"Reason for missing is more important than proportion of missing"
If administrative missingness, MI would be less scary
- Assumption of MI: missingness depends on things that are measured
Develop logistic regression model for P(LOS missing)
- Include variables you hope are irrelevant to make sure they are
"Redistribution to the right"

Using the PILOT study, which assessed low, intermediate, and high oxygen goals and mortality, but now restratifying to look at the effect in patients with anemia. We would like guidance on the best biostatistical analysis strategy.
Oxygen saturation levels relating to invasive mechanical ventilation
Primary outcome: days free of MV & days alive
Conduct effect modification analysis -- see if difference between groups is the same if we stratify by anemia
Planning to do secondary analysis using hemoglobin as a continuous variable
- See if difference in groups by hemoglobin level
PO model with interaction effect for time to control for seasonality
Frank: "don't say stratify -> effect modification"
Would look at hemoglobin upon enrollment
Another project could take hospital course as baseline and do landmark analysis
- Everyone in the hospital for x amount of days -- describe by hemoglobin at presentation, standard deviation, and slope over time
"Response feature analysis"
- fully conditional
Time-dependent covariate analysis -- always updating, results are more difficult to interpret
Spline functions & knots
- Use AIC to determine # of knots to use
Cindy: if landmark analysis, seek expert for interpretation
Bill: As long as the conditional population is of interest -- must be healthy enough to make it through that initial period of time
Planning to pursue VICTR voucher

We seek to estimate the annual percentage of patients with advanced-stage epithelial ovarian cancer in the United States who are eligible for and will derive benefit from PARP inhibitor therapy based on US FDA-approved indications. We will compare the rates of eligibility and expected benefit, then analyze these trends over time.
We accomplished the above using methods similar to this JAMA Oncology article: https://pubmed.ncbi.nlm.nih.gov/29710180/ We don't understand how they did their statistics/sensitivity analysis. I'd be happy to send our data/methods so far before or after the meeting. I have an accepted abstract at a regional meeting, but this was without anything past basic descriptive statistics. We will need a more robust analysis for a publication.
% benefitting over 2 years progression-free survival vs % benefitting overall survival
Hoping to conduct a sensitivity analysis
Frank: In the absence of a Bayesian analysis, pay more attention to CI rather than point estimates
- Equated benefit with % benefitting -- can't tell patient-level benefit without cross-over study
- With parallel group design, can't make same determination (certain % of patients benefitted)
- Without knowledge of heterogeneity of treatment effect, default conclusion would be that everyone is benefitting at least a little bit (100%)
Benefit -> benefitting is a big difference
Progression at two years could be random
Can use Wilson CI -- what is the probability that a person is eligible?
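The Wilson interval mentioned above is straightforward to compute directly; the 8-of-90 example below is hypothetical:

```python
import math

# Wilson 95% confidence interval for a proportion (e.g., probability that a
# patient is eligible) -- better behaved than the Wald interval near 0 or 1.
def wilson_ci(successes, n, z=1.96):
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

lo, hi = wilson_ci(8, 90)   # e.g., 8 of 90 patients eligible (made-up counts)
print(round(lo, 3), round(hi, 3))
```

Unlike the Wald interval, the Wilson interval never extends below 0 or above 1, which matters at the small proportions seen in eligibility estimates.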
Response rate of the drug -- imaging findings before & after treatment if certain criteria are met
Set up like a pre-post assessment -- pretty noisy, better if tumor size is less random and measured accurately
Refining question: interested in who got the drug, and how many are benefitting
Response probabilities have a lot of noise
Looking at different publications
Get standard errors algebraically: SE = (23 - 8.6) / 1.96; then the 95% CI half-width for the difference = 1.96 × the square root of the sum of squares of the two standard errors
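A worked version of that back-of-envelope algebra: recover each arm's standard error from a reported estimate and 95% CI limit, then combine the two independent SEs for the difference. The 23 and 8.6 come from the discussion; the second arm's numbers are invented for illustration:

```python
import math

# Back out an SE from a reported 95% CI half-width, then combine two
# independent SEs for a difference (second arm's numbers hypothetical):
se1 = (23 - 8.6) / 1.96        # arm 1: estimate minus lower CI limit, over 1.96
se2 = (30 - 18.0) / 1.96       # arm 2 (illustrative)
se_diff = math.sqrt(se1**2 + se2**2)
half_width_diff = 1.96 * se_diff   # 95% CI half-width for the difference
print(round(se1, 2), round(se_diff, 2), round(half_width_diff, 2))
```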
Can't do analysis with hazard ratio -- relative instantaneous risk of having event
Program "digitizer" -- can reproduce K-M curves from publications
"Treatment difference", "efficacy estimate"
- Avoid "% benefitting"
Don't segregate study based on p-value being small or large
Two-year overall survival -- mortality is too low
Progression free-survival -- discussion on datamethods.org
- Use state-transition model -- Markov
- One weakness -- what to do with non-related death?

Main Q: When looking across years, do we see a decline in percent of students pursuing a postdoc immediately after graduation? --> simple logistic regression
- Appears model fit and utility for prediction isn't fantastic, but decreasing trend is statistically significant
B1 = 0.9659 (0.9468, 0.9852) -> significant
LR p-value better than Wald p-value (LR p-values behave better)
Superimpose predicted values from the model on first plot
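The likelihood-ratio p-value preferred above is easy to compute once the two models are fit: twice the log-likelihood difference is chi-square with 1 df, and for 1 df the tail probability has a closed form. The log-likelihoods below are illustrative placeholders:

```python
import math

# LR test for the trend coefficient (hypothetical fitted log-likelihoods):
ll_null, ll_full = -210.4, -207.1       # model without / with the year term
G = 2 * (ll_full - ll_null)             # LR chi-square statistic, 1 df
p = math.erfc(math.sqrt(G / 2))         # P(chi2 with 1 df > G), via chi2_1 = Z^2
print(round(G, 2), round(p, 4))
```

The `erfc` shortcut works only for 1 df (a chi-square with 1 df is a squared standard normal); with more df one would use a chi-square survival function.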
When grouping students by career goals at graduation, do we see a difference in length of postdoc training? --> Time to event analysis like cumulative incidence curves; allows for right-censored data
- Accounting for recurrent events and discontinuous risk intervals (someone who takes a break after a postdoc)
Another complication: someone who seeks out a shorter post-doc after a really long one
How many have graduated, and how many have > 1 postdoc? If starting a second postdoc is extremely rare, that would simplify things (use right censoring)
Multistate model allows you to estimate things in more interesting ways
- Could call second postdoc a different state (adds more parameters and makes model more unstable)
Event used in time to event = conclusion of postdoc
- If use time to first non-training position, model would be agnostic to how many postdocs, any gaps
- Atypical scenarios could arise
How many students start postdoc before graduation? Small, but non-negligible #; problem is that you would be in different states at the same time
- What is time 0?
Making event = time of defense could solve issues
- Could solve problem of student who takes faculty position after graduation
- MD-PhD students differ greatly from PhD-only and will not be included
Could do state-transition model over continuous time; with discrete, you would have to determine time unit (in this case, month)
State transition model handles censoring when records for participant stop appearing
P(transitioning in month 13 | post doc at month 12) -- transition probabilities that are natural from the data
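Those month-to-month transition probabilities can be tabulated directly from the data. A minimal stdlib sketch on toy sequences; the state names and `transition_probs` helper are illustrative, not from the notes:

```python
from collections import Counter

def transition_probs(paths):
    """Estimate discrete-time transition probabilities from state sequences.

    paths: one list of monthly states per trainee, e.g.
    ["postdoc", "postdoc", "industry"]; a sequence that simply stops
    contributes no further transitions, so right censoring falls out naturally.
    """
    counts, totals = Counter(), Counter()
    for path in paths:
        for a, b in zip(path, path[1:]):
            counts[(a, b)] += 1   # observed transition a -> b
            totals[a] += 1        # person-months at risk in state a
    return {pair: n / totals[pair[0]] for pair, n in counts.items()}

probs = transition_probs([
    ["postdoc", "postdoc", "industry"],
    ["postdoc", "industry"],
    ["postdoc", "postdoc"],   # censored: record stops while still in postdoc
])
```

In practice the mstate or msm packages mentioned below do this with proper uncertainty quantification; the point here is only that the probabilities are natural summaries of the raw records.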
Stacked bar charts
How to handle folks with unknown career goals? Baseline variable -- Use side by side stacked bar charts
Resources/packages for state transition models: mstate or msm
Challenging to put multiple confidence bands on graph with multiple survival curves -- would want to make separate plots
hbiostat.org/attach/z.pdf
- model all transitions; put it all together
- jointly model with two time variables -- internal and external
Data has already been collected
Aim 1: Determine rate of study enrollment by level of compensation offered
Aim 2: Determine rate of study task adherence by level of compensation
Weekly three question checklist, two daily tasks
Participants recruited through ResearchMatch
9,986 invites, 492 answered, 412 enrolled, 284 completed study (those who enrolled but did not download the app excluded)
$ amount revealed after answering invite
"There is some element of deception in this study"
Primary Q: Does compensation have an effect on enrollment?
- Proportion of participants enrolled in the study by Loess regression curve (Frank suggests using overall confidence bands rather than individual; use Wilson CI rather than the default)
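For reference, the Wilson interval suggested above can be computed in a few lines. This is a sketch of the standard formula; `wilson_ci` is an illustrative name.

```python
import math

def wilson_ci(k, n, z=1.96):
    """Wilson score interval for a binomial proportion k/n (95% by default)."""
    p = k / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return centre - half, centre + half

lo, hi = wilson_ci(10, 10)   # even at 100% observed, the interval stays in [0, 1]
```

Unlike the default Wald interval, the Wilson interval never spills outside [0, 1], which matters for proportions near 0% or 100%.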
Interpreting plot: would need to run a statistical test to determine if there is a trend; could use logistic regression model if this is of interest, test if coefficient is zero
Is there a different effect observed between income groups?
- Why was split made at $65k? No natural split
- Conduct logistic regression with two variables (promised compensation and categorical variables of income bucket)
- Predict how much of enrollment can be predicted from compensation amount, adjusted for income
Is there a different effect observed between racial groups? (self-reported)
Enrollment was almost too high to learn from
Appropriate by necessity -> need to limit # of variables
Put income groups, sex, age, and three buckets for race/ethnicity into logistic regression model
- Conduct some form of redundancy analysis to determine if any variables are inseparable
Does compensation have any effect on task completion?
Curve increases from 0-20, but eventually plateaus
- Wilson CI does not go above 100%
- Proportions are noisy because they are based on a tenth of the sample size
In previous study, Loess was chosen over logistic regression
Loess is really good with confidence bands, but doesn't give an overall assessment of "flatness"
- Just use confidence bands, no need for dots
Logistic regression supersedes Loess, just need to make sure it does not assume too much
Dots are medians, which cannot be used unless you have truly continuous variable (ties are a problem)
Want to make analysis more unified; don't want to use loess to visualize and logistic regression to analyze
Is there a difference in completion rates between low-frequency and high-frequency tasks?
Use quadratic effect for compensation (use compensation and square of compensation in the model)
We are analyzing how long alumni from our biomedical graduate programs spend in postdoctoral training. For these analyses, we are grouping the data based on two reference points: 1) the student's career goal as identified at graduation, or 2) the career that the student ultimately ended up in 10 years after graduation. For each of these two reference points, we are looking to see if length of postdoctoral training differed between 6 different career paths. We hope to make pairwise comparisons between each of the career paths. We have a cohort of 325 students who identified a career goal at graduation (group 1 from above), and 509 students for whom we know the career outcome 10 years after graduation (group 2 from above). There are 214 students who belong to both of these groups. We are looking for advice on how to make these multiple comparisons given that some students belong to both of these groups, while others only belong to one of the two groups. We also have a second data set detailing by graduation year how many students pursued a postdoc (percentage calculated as # pursuing postdoc divided by total # of graduates). We seem to note a downward trend in these percentages. We hope to examine direction and statistical significance of the trend over time. Is it appropriate to use a Mann-Kendall test on the percentages or is another approach more advisable?
Interested in career outcomes of postdocs
Q1: In terms of length of time in postdoc:
- Do we see difference between groups when grouping by a student's career goal at graduation?
- Do we see a difference between groups when grouping by the student's actual career outcome 10 years after graduation?
Q2: looking across years do we see a decline in the percent of students pursuing a postdoc immediately after graduation?
Five categories at graduation: Academic Research, For-profit research, govt/non-profit research, AMO, undecided
Cass: Do you collect data on people whose career path changed dramatically?
Nick: We have data collected at 1,3,5,10 years -- we do see lots of folks make that transition
Documentation exists for how particular careers map to the specified categories
For some trainees, we have goal at graduation, but do not yet have Y10 career -- not 10 years post-PhD
- Need to distinguish between missing you should have obtained (NA) versus missing because not yet 10 years post-PhD (administrative missingness)
- Treat as right-censored
Kaplan-Meier curve => Cumulative incidence curve
Create extra category for administrative missingness
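With a single event type, the cumulative incidence curve is one minus the Kaplan-Meier survival estimate. A minimal stdlib sketch on toy postdoc-length data; the function name and data are illustrative:

```python
from collections import defaultdict

def kaplan_meier(times, events):
    """Kaplan-Meier survival curve; events: 1 = postdoc ended, 0 = censored."""
    deaths = defaultdict(int)
    for t, e in zip(times, events):
        if e:
            deaths[t] += 1
    surv, curve = 1.0, []
    for t in sorted(deaths):
        at_risk = sum(1 for tt in times if tt >= t)   # still under observation
        surv *= 1 - deaths[t] / at_risk
        curve.append((t, surv))   # cumulative incidence at t is 1 - surv
    return curve

# Toy data: postdoc lengths in years; the third trainee is still in postdoc
curve = kaplan_meier([1, 2, 2, 3], [1, 1, 0, 1])
```

The administratively missing trainees (not yet 10 years post-PhD) enter exactly like the censored record here: they contribute to the at-risk counts up to their last observed time.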
111 with goal at grad known but not Y10 career (cohort C), 295 with Y10 career known but not goal at grad (cohort D), and 214 with both known
Goal: within the two cohorts, determine if there are differences in postdoc length
Cohort C is the one that would be most affected by right censoring
Time to event problem -- try to have one analysis per question
- How did probability of having postdoc position vary with time and with goal at graduation?
Call those who didn't have exit survey a group -- bookkeeping and to complete the picture
Equal opportunity surveying a concern
Think about raw data rather than groups: 2, 3+, 3, 4+, etc.
Analysis conditional on start of post-doc
7 cumulative incidence curves, each with their own colored confidence bands
Transitioning to Q2... do we see decrease in % of postdocs in more recent years?
Do we see folks taking a break after graduation before postdoc? Some, but not frequent
Raw data would be year of graduation and yes/no => produce plot to visualize time trend with confidence bands
- could also use logistic regression to answer a variety of questions
- time to postdoc would be right censored for those who do not have Y10 career
multistate model with time to event analysis as a special case
Market forces change over time: unemployment, job openings, economic conditions -- time-dependent covariates
I am applying to access the Get with the Guidelines stroke database to analyze the correlation between certain socioeconomic variables and diagnostic testing done as part of patient workup. Mentor confirmed.
Health equity/access
Frank: possible some folks are not communicative, making interaction shorter?
Eesha: Health literacy could have something to do with it. Literacy measure & provider characteristics & patient address/social vulnerability index not available (zip code is available, but could capture a wide range of socioeconomic statuses)
Goals are descriptive -- is there a correlation/disparity
Study population -- All patients that present with ischemic/hemorrhagic stroke: 2000 hospitals, n = 5 million patient records
- Would shy away from hypothesis testing with a sample size that big; do estimation instead
- Magnitude of correlation is of interest, less so the p-value
Secondary analysis: variable clustering -- see which variables run together, helps to understand true dimensionality
- Helps to identify if some variables are restatements of others
Correlate number of tests ordered with anything you're interested in
Stroke workup should not vary significantly by ethnicity/economic status
Group by suspected mechanism
Another type of analysis: characterize how people of a certain ethnicity differ from others
- For a particular ethnicity, if older folks do not seek out care, age distribution for that ethnicity would differ from others
- Predict ethnicity of a person based on other covariates
Neurology dept does not have a biostat person at present
VICTR voucher is an option if you don't want to do analysis yourself (preferable)
- ~12 week timeline to get started; don't apply too early - can extend, but we try to discourage this
- One thing that can hold up a voucher: data cleaning
Deadline for analysis: 9/29/2023, expected to present 2/2024
- Should start no later than end of July
REACH study -- text messaging intervention, overcoming barriers to medication adherence
Results are published -- improved adherence and A1c (for those with more room to improve)
- Effect waned over time -- clinicians indicated results were "over the top effective"
12 clinical sites interested in implementing -- outcome will be A1c improvement
Consistent body of evidence that text message intervention improves adherence
Q: Can clinics implement this, and what would it look like?
We think the best thing is an implementation study (hybrid type 2)
Clinics are opposed to randomization -- don't want to retest based on overwhelming body of evidence
Even short term reductions are meaningful
Robert: Population was more homogeneous (smaller range of A1c), so could get away with a simpler model
Frank: Impact of 0.5 reduction is dependent on baseline measure
Figure: In control group, saw some regression toward the mean
Frank: mean might not be the best summary measure for A1c
- Worry for loss of effect from figure
Robert: mechanism of action is removing barriers to adherence
- Not unusual to see measures regress to baseline after making progress (akin to person trying to quit smoking)
- Potential need for repeated exposures for sustained behavioral change
REACH intervention effect size -- overlay histogram and fit spline
Frank: excitement over subgroup needs to be tempered by context of bigger picture
Robert: what would be most persuasive analysis of data when patients self-select?
- Time Zero = informative
Frank: measure attention & engagement is not sufficient to show treatment is working, but it is necessary
Robert: measure of engagement -- people can choose to respond whether they took meds or not
Robert: want model that can capture +/- 6 month comparison
- By design: ignore values +/- 2 months from start
- Include calendar time in model to account for seasonal effects
- Worried about medications a participant is prescribed at time A1c measure is taken
Frank: How far back from enrollment will A1c measurements go?
Robert: Could get several years for participant who has been going to that clinic for that length of time
- Would be surprised if 90% did not have at least a year
Frank: emphasize slopes or averages?
- Concern of dropout from text-messaging or don't return to clinic for A1c measurement
Hanging hat on A1c = risky
"We want to use the best available model for observational studies of this type to characterize long-term A1c success of a group of patients, accounting for past history of A1c"
- What is A1c a function of?
Lindsay: Do we leverage participants in clinic who don't sign up?
Robert: No; people who sign up are distinct from those who do not
Andrew: I would still be interested in having that information, even if we're not incorporating in the model
The problem: no sign-up date
- Frank: If you can infer what it would have been +/- a few weeks, could give basis for Andrew's comparison
McKenzie: What about participants who did not sign up, but then did a few weeks later? Instrumental variable analysis
We are trying to determine if the quantitative D-dimer and fibrinogen improve as platelets improve in patients with heparin-induced thrombocytopenia. Assistance with determining correlation of individual patient data and then composite data. Mentor confirmed.
Can serologic markers help to predict platelet improvement?
- Which of the two serologic markers (fibrinogen, D-dimer) can best predict improvement?
- Determine correlation coefficient for each?
13 patients, repeated measures
Interested in markers as being a preview of platelets (can we reliably say if fibrinogen improves, then platelets will also improve)
- What is the largest lag such that the correlation is preserved (no worse than 0.6)? How far of a look-ahead can you get?
- Is the most recent serologic value more informative, or the trajectory/slope?
Multivariable regression -- calculate R^2 for fibrinogen and D-dimer, assess whether majority of correlation for one with platelets is already accounted for with the other
"Cross-correlation analysis" -- spaghetti plots & regression/R^2, looking at various lags
Day to day measurements, few gaps
Knowing why data are missing is important
Exclusion criteria: negative HIT assay
Going to get the ball rolling on a VICTR voucher
We would like help to analyze the text within our survey responses, and further help to analyze the other data we collected from the survey. Survey has 125 responses. Mentor confirmed.
Wrote grant last year with intention of enhancing services to women of color
Questionnaire to ascertain demographic info & open-ended questions; distributed at churches, nail salons, grocery stores, sorority, etc.
Enrollees all African American women ages 40-64 (no disease history required)
4 Nashville zip codes where population is > 40% black
Qualtrics used to analyze data
How many received an invitation for the survey?
Self-selection/non-response bias an issue
- Respondents with Vanderbilt email are more likely to receive care at Vanderbilt (stratify)
Could collapse zip code to another dimension (rurality of zip code, median income of household, distance from center of zip code to closest center of excellence/Vanderbilt)
- Translate from categorical to numerical scale
Open field questions: "Is there anything else you would like for us to know?"
- Required field to complete the survey
Rank correlation coefficient for age ranked in order 1-5
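A rank correlation on the ordinal 1-5 age coding can be computed with average ranks for ties (Spearman's rho). A stdlib sketch on toy data; the function names and example values are illustrative:

```python
def average_ranks(xs):
    """Ranks 1..n, with tied values sharing their average rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(xs):
        j = i
        while j + 1 < len(xs) and xs[order[j + 1]] == xs[order[i]]:
            j += 1                         # extend over the tie group
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1  # average rank for the group
        i = j + 1
    return r

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den

# Toy example: age category 1-5 vs. an outcome that rises with age
rho = spearman([1, 2, 2, 3, 5], [3, 5, 5, 8, 9])
```

Because it uses only ranks, the coefficient is unchanged by how the 1-5 categories are spaced.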
- Don't stress p-values
Search "ChatGPT" on datamethods.org to get Frank's thoughts -- prone to leading questions
- Frank's experience: fast code for getting the wrong answer
Dr. Kumah-Crystal conducted a chi-square test using ChatGPT
Assess whether student performed significantly worse on any one of five categories of questions
ChatGPT ignored the chi-square test's independence assumption
"Significant" language does not belong in a null hypothesis
Research Q: Which tract(s) as measured by FA best predicts reading and math scores?
- Build regression model (ridge, lasso, elastic net, or something else?)
- Ridge more reliable than lasso & elastic net
- Fatal flaw of lasso: trying to do selection => low probability of selecting right signals, not enough information
4 outcome measures: 2 lower level (reading, math) & 2 higher level (reading & math)
Multicollinearity in tracts
Can investigate redundancy and correlation (see Frank's analysis file)
Explore PCA
I would like to analyze performance (accuracy, response time) on a behavioral assessment in our patient group. I plan to use mixed models, and I'd like to ask about the appropriate fixed and random effects structure.
Sentence verification test (48 sentences: half true, half false) -- measure accuracy and response time
Interested in effect of talker on accuracy and effect of talker on response time
Manipulated who the speaker was (three male and three female)
Logistic mixed effects model
Sample size: 13
Set up as repeated measure (don't want to treat as completely new instance)
- Subject ID
Random effects: something you can't control (different sites)
Worried about overfitting
Look into VICTR voucher/R clinic (https://biostat.app.vumc.org/wiki/Main/RClinic) for coding questions
Our study is a multicenter, comparative efficacy cluster RCT of different journal club formats for internal medicine residents. Primary outcomes will be (1) subjective engagement and (2) critical appraisal skills.
Goals: 1. Revise existing subjective engagement questionnaire with your input to ensure unbiased questions and appropriate scaling. 2. Discuss process for achieving content validity in questionnaire (feedback from manuscript reviewer from previous trial). 3. Discuss process for creating and validating a novel measure to assess critical appraisal skills that can be integrated into existing journal club sessions.
Critical appraisal by Berlin questionnaire
Tools and outcome measures we care about don't exist -- our aim is to create some!
Five centers, 140 participants -- all sites using gamified journal club curriculum
Hypothesis 1: do different JC formats yield different levels of engagement?
Hypothesis 2: do different JC formats yield different improvements in critical appraisal?
VICTR voucher an option
Design looks reasonable (no obvious methodologic flaws)
Put out an email to connect with someone to consult with for Education research
No measure of JC engagement in literature
CREATE framework: used to capture engagement
- Psychologist would be helpful in assessing appropriateness of instrument design
Embed questions throughout journal club proceedings
Randomization for which site begins with which format
Co-primary endpoints
REDCap app for cross-site data collection
Hope to discuss a new project idea. Currently configuring the methods section. In short, we hope to see how variability in Post Concussion Symptom Severity Scores (PCSS) (0-128, over 22 questions/symptoms) affects outcomes. Mentor confirmed.
Questionnaire administered after concussion -- physical, cognitive, sleep, psychological
PCSS has proven robust in terms of recovery time from concussion
How does variability affect outcomes? 6 x 6 + 0 x 16 = 36, but 2 x 15 + 1 x 6 + 0 x 1 = 36
- A few bad symptoms or lots of mild symptoms: which profile is worse?
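The two equal-total profiles above can be contrasted numerically: item-level spread separates them even though the PCSS totals match. A stdlib sketch; `profile_summary` is an illustrative name.

```python
import statistics

def profile_summary(nonzero_scores, n_items=22):
    """Total and item-level spread for a PCSS profile (0-6 per item).

    nonzero_scores lists the scored symptoms; the remaining items are 0.
    """
    full = nonzero_scores + [0] * (n_items - len(nonzero_scores))
    return sum(full), statistics.pstdev(full)

few_bad   = profile_summary([6] * 6)             # six symptoms scored 6
many_mild = profile_summary([2] * 15 + [1] * 6)  # many low-grade symptoms
```

Both profiles total 36, but the "few bad symptoms" profile has much higher across-item variability, so adding a spread measure to the total is one way to operationalize the question of which profile is worse.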
Formulate scale in different ways -- have a contest on which one predicts time to recovery the best
Idea: Hierarchical scale with clinical overrides
Analysis ideas:
- Variable clustering analysis -- rank correlation (determine which questions have overlapping information)
- Predict total in forward/stepwise fashion
- Principal components/sparse principal components
Instead of scoring each item 1-6, score as S1-S6
- Run regression analysis, solve for S's to derive appropriate weighting
- Compete with original scale to see if more predictive
Five additional opportunities: use adjusted R^2
Redundancy analysis: which question answers can be predicted by other question answers?
Retrospective analysis of outcomes of transitions of care from the pediatric to the adult CF clinic within our institution. Outcomes include primarily lung function (FEV1), BMI, and number of hospitalizations/exacerbations in the year prior compared to the year after transition. Mentor confirmed.
Sample is in last five years
~ 85% transitioned at age 18 (a few a little older, a few a little younger) -- we know age of transition for everyone
Make data long: Column A = patient identifier, Column B = Date, Column C on = measures
Spaghetti plot: you see all data (no summary involved)
Most participants will have E/T/I date, which increases lung function
- Only a few participants not eligible by genotype
- We will know when they started and assume compliance (rate of non-compliance = very low)
- Include two variables -- did they use medication (yes/no) and if yes, date
N = 60; type of modeling and conclusions are limited
- Can say, "Of people with lung function that declined _ amount, __% ..."
- "This is what we observed in our population"
No one uninsured in the population -- so question would be is private insurance > medicaid
Stratify across variables of interest
If you have dates of clinic visits/hospitalizations, calculate intervals between them
My project is looking at the impact of requiring stop dates on antibiotic orders at the children's hospital. We have 4 time periods that we are looking at and would like help understanding what statistical tests should be run. Mentor confirmed.
Raw data: individual dose administered; what it was indicated for and what day it was given
Gap where things are uncertain; remove that time interval from analysis
Duration varies by season; COVID also happened during all of this
- Seasonal variation is smooth, COVID is more discontinuous
Use general regression model which accounts for all forces in play
- On/off variable that is discontinuous (are you before policy change or after policy change)
You're interested in signal after subtracting off other effects
Want to pull all data, not just June-November
- Might be unlucky on what months you pulled
Days from a starting point for the whole study (more resolution)
Time trends: seasonal, but could have other trends (staff adherence)
Reason for weird ANOVA variances -- possibly outliers
Apply for VICTR voucher?
Equivalence trial, or non-inferiority (both high sample sizes, latter slightly lower); looking for 5% difference
How many would we need in each study arm?
Yes/No => lowest resolution, need highest sample
- N = 100-200; rules out Yes/No entirely (no close calls)
Get consensus on outcome severity => hierarchical levels; get more out of your sample size
- Want levels to be well populated
Not appropriate for VICTR voucher
RCTs very involved; should get a dedicated statistician
Statisticians should be embedded from beginning of the study to the end
How do you use REDCap for randomization? 1:1? Blocked?
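For reference, 1:1 blocked randomization (the scheme REDCap can implement) can be sketched in a few lines. Illustrative stdlib code; the arm labels, block size, and seed are placeholders, not study settings.

```python
import random

def blocked_assignments(n, block_size=4, arms=("A", "B"), seed=2023):
    """1:1 blocked randomization: every block holds equal counts of each arm."""
    assert block_size % len(arms) == 0
    rng = random.Random(seed)   # fixed seed makes the list reproducible
    out = []
    while len(out) < n:
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)      # random order within the block
        out.extend(block)
    return out[:n]

allocation = blocked_assignments(8)
```

Blocking guarantees the arms never drift far out of balance during enrollment, which matters for small per-site samples.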
Next steps:
- Involve dept chair: have him/her contact Yu Shyr
Our project is a retrospective chart review analyzing outcomes of cystic fibrosis (CF) patients who have transitioned from our pediatric clinic to our adult CF clinic. We would like to discuss biostats needed to evaluate associations between patient factors and outcomes. Mentor confirmed.
Sample of about 60
"Port CF" database -- deidentified excel sheet in OneDrive -- prospective registryData need to be cleaned (remove comments)
Biggest burning question: Do outcomes change after transition?
Analyze BMI using age as a covariate; subtract out the effect of age to look at effects of other vars
Try to avoid percentile approach; makes too many assumptions (linearity, normality)
- Stay as close to raw data as possible; pull original BMI data
Would learn a lot from spaghetti plot (red before transition, green after); can see data gaps
Tall and thin data set; date + BMI
Some patients will have less data after transition (one year) -> analysis will account for that
Problem with mean change = regression to the mean (someone could have good or bad day or measurement error)
Testing of significance must account for data pairing
Big picture: look at time continuously when possible
Living situation data not well defined; employment status, student status, & health insurance best we have (must deal with missingness, though)
I want to set up a pilot study evaluating the effect of a low-energy diet (LED) intervention on measures related to weight, osteoarthritis, hypertension, and diabetes. The pilot study is to (a) test the intervention on a small scale before requesting funding for a sufficiently powered study and (b) ensure I have the infrastructure to execute the more extensive study effectively. My question for Biostatistics Clinic is, "are the statistical measures in my specific aims appropriate?"
Specific Aims and Hypothesis Aim 1: To evaluate the effect of an LED diet intervention, including pre-prepared meals, on weight. Hypothesis: Subjects will demonstrate a clinically significant reduction in weight (15%) at 12 weeks compared to their baseline. Approach: This aim is designed to compare mean differences in weight at the onset and endpoint of the study. As such, the data is from two paired datasets. The mean weight difference will have a normal distribution derived from a parametric variable. A paired t-test will be used to measure the difference between groups. Secondary outcomes will be the proportion of subjects that reach a 10% and 20% weight loss threshold. Too few subjects will be included to perform regression analysis of which variables (e.g., gender, the initial level of obesity, age) predict meeting those weight loss thresholds. This is a pilot study with five subjects. In the future, the sample size will be powered to determine a difference with a beta-error of 0.2 using a % weight loss standard deviation of 3.9%.
Aim 2: To evaluate the effect of diet intervention on knee osteoarthritis patient-reported outcomes measures. Hypothesis: Subjects will demonstrate a clinically significant improvement in the Visual Analog Score (VAS) for pain (2 points) at 12 weeks compared to their baseline. Approach: This study compares paired mean differences of pre- and post-VAS scores at the onset and endpoint of the study. Since the mean differences off a non-parametric VAS score have a non-normal distribution, a Wilcoxon-Rank-Sum Test will be used to measure the difference. Similar evaluation will be performed for secondary outcomes of KOOS sub-scales, WOMAC, and SF-12 scales. This is a pilot study with five subjects. In the future, the sample size will be powered to determine a difference with a beta-error of 0.2 using a VAS standard deviation of 1.1.
Aim 3: To evaluate if a LED diet intervention has a clinically significant change in markers of hypertension and T2D. Hypothesis: Between the onset of the study and the conclusion, subjects will experience improvements in systolic blood pressure, diastolic blood pressure, and HgbA1C. Regarding blood pressure, we hypothesize that 100% of the subjects will experience a 50% improvement in their baseline systolic and diastolic blood pressures and the goal of 120/80 mmHg. This compares baseline and endpoint datasets in a single population with non-normal distribution since we are evaluating proportions that meet a blood pressure goal, not the blood pressure numbers themselves. The statistical measure that will show significant change is a Wilcoxon-Rank-Sum test. Regarding the HgbA1C, we hypothesize an average 1.0% point change at three months. Our statistical measure for if the group demonstrates this degree of change is a paired t-test (the expected standard deviation for a 1.29% change is 1.32%).
Aim 4: To evaluate if a LED diet intervention, including pre-prepared meals, reduces the proportion of patients using non-protocol interventions. Hypothesis: The proportion of subjects that use a non-protocol intervention (e.g., oral/topical NSAIDs, other oral/topical analgesics, corticosteroid injections in the previous three months, braces, units of short-acting insulin & units of long-acting insulin, etc.) will be lower and reach statistical significance (p<0.05) after the intervention compared to those subjects' paired values at the onset of the study. Approach: This aim is to compare two proportions at various time points with two data sets. Statistical significance will be measured using the McNemar test. Furthermore, using non-protocol interventions will be evaluated to see if they predict the clinical changes in patient-reported outcomes and weight using logistic regression.
Sample size of 300 at minimum for binary outcomes
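The McNemar comparison planned in Aim 4 uses only the two discordant counts, and with a pilot-sized sample the exact binomial version is preferable. A stdlib sketch; `mcnemar_exact` and the example counts are illustrative.

```python
import math

def mcnemar_exact(b, c):
    """Exact two-sided McNemar test from discordant pairs.

    b = subjects using a non-protocol intervention before but not after;
    c = after but not before. Under H0 the split is Binomial(b + c, 0.5).
    """
    n, k = b + c, min(b, c)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)   # double the smaller tail, capped at 1

p = mcnemar_exact(9, 1)   # e.g., 9 stopped an intervention vs. 1 started
```

Concordant pairs (same answer before and after) drop out of the test entirely, which is why small pilots can still be informative when most of the change runs one way.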
- Scatterplot?
Paired t-test would be fine; Wilcoxon signed rank test more robust
BMI > 35 to be included in study
- Regression to the mean an issue (caught person on good/bad day, measurement error)
Admit patients if maintained stable BMI for a particular period of time? Unstable correlates with having higher/lower BMI...
Fidelity to the diet is less of a concern
Example: reliability of self-reported food intake = not good
Recalling intake might increase possibility of cheating
Signed rank test will work for aim 2
Wilcoxon rank difference test good for paired data -- same p-value no matter how you transform the data; robust
Rank difference test for aim 3
- Make patient their own control; pre-post, not against 120/80
- Wait x minutes, measure; wait, measure; use same instrument, keep other factors constant
Meals will be delivered to participants' houses
Endoscopic rhizotomy systematic review - follow up from last meeting on May 5th, 2022. Data collection is complete, would like to further discuss VICTR application process. Mentor confirmed.
Six studies identified that met inclusion criteria - these were determined to meet acceptable standard of care
Pre-procedural screening
Any population-level differences across the studies should be adjusted for
Are studies randomized? Of the ones with comparison groups, one was randomized, others were cohort studies
Next steps: assess amalgamated effect
Meta-analysis can properly account for study-to-study variation
Simple pooled analysis (CI will be falsely narrow)
- How should you weight?
Accounting for Time Zero across studies
Randomized trials = the best
"Surgery before?" could be key covariate
Nail down grouping and modeling in further dialogue
- Make goals, candidate studies, and assessed outcomes clear
VICTR voucher good for a year
"Spreadsheet from hell" on website: things to avoidTesting genetic features' ability to predict risk of cardiac events. Recorded data are age at first event, frequency of subsequent events (some are age at subsequent event), and use (start date and duration) of controlling medication. I would like to use these data to evaluate the ability of genetic features to predict these events, controlling for other clinical features and use of medications.
Predict risk of event given carrier heterozygous status
Control for known clinical features: corrected QT interval, age at first event, event rate, age at all subsequent events, age at Beta blocker use, ...
100 people might have genetic variant (no variation in that marker)
- Like dose-response relationship without linearity assumption
How to handle multiple, distinct events (largely one type: syncope)?
- State transition model, allowing patients to move in and out of various states
If data are sparse (most don't have syncope, and if they do, only once)
- Cox model (time to event)
Frank: "why your collaborator is wrong": https://hbiostat.org/glossary- Look under "dependent variable" and click on the "other information" tab under that.
Explore imputation approaches
Key issue in arrhythmia research: access to EKG or summary of EKG
- Barrier will be quite high, but payoff could be worth it
Analysis: Propensity score weighting
Cox proportional hazards: recurrent / competing events
Frank: May be two reasons to change medication:
- Planning vs. reaction
- Is causal analysis needed to get rid of this feedback loop?
Frank: internal time-dependent covariates are what is present here.
- External version would be crossover study where everyone must switch drug at a certain time.
- Interpretation is harder with internal.
- If covariates aren't updated frequently enough, what we are trying to learn from our change variable will be difficult to interpret
- Propensity adjustment may not be sufficient for that.
Investigator: propensity was chosen because exposure groups may be vastly different. Ashley wanted to account for that via a number of covariates (about 10). Sample size is estimated at 70,000, but many have not switched at all (lots of 0s).
Static propensity score: looking at baseline or characteristics at one point in time, then we are not accounting for people switching back and forth (increasing dosage and then decreasing, for example).
This situation is more dynamic; time-dependent covariates are important.
Frank: How do you want to word your conclusion? We learned something that gives the recipe for required changes to affect better outcomes (causal), or a non-causal conclusion?
Goal of observational study is understanding the system
Frank suggests going without the propensity score: Miguel Hernan (observational 2008), cohort from observational data, estrogen in case of hormone therapy or not. Can get same results as RCT if time-dependent covariates were well understood and updated frequently.
- He also has a great book on observational data
Confounders would need to be measured within days / weeks of switch
Andrew Spieker, Bryan Shepherd (a few others as well) specializing in causal inference. We could ask someone to join clinic.
Medical natural experiments: while senior surgeons are away at surgical conferences, patients are covered by junior surgeons. What happens to patient outcomes during this is a natural experiment (for example)