Follow up on figures for review paper on OCNDS
Prism computed an exact P value (0.0568), which takes into account ties among values. Note that most other programs do not compute exact P values when there are tied values, but would instead report an approximate P value (0.0540).
- P-value threshold traditional, but problematic. No excuse for not using the exact calculation when it is available
- Wilcoxon is exact in that it accounts for ties
Show proportions rather than counts -- concerned with ratio of grey to black
Don't use asterisks -- show p-value to three decimal places
In writing: if p value is 0.01 -> evidence for difference in probability of response between the two groups (if binary outcome)
if p value right at 0.05 -> there is mild evidence...
if p value is 0.3 or something big -> there is limited evidence...
Count variable: want to show confidence limits rather than standard errors -- rectangle doesn't add any information (take away bar so confidence limits will show up)
Confidence interval going below zero highlights issue with parametric CIs -- doesn't go below by much so not a big deal (remove ns)
- Same issue with count variable -- bootstrap CI will work better than parametric
Bootstrap samples data multiple times -- CI will behave better (bootstrap "nonparametric" percentile confidence interval one of many variations)
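A minimal stdlib sketch of the nonparametric percentile bootstrap mentioned above, using made-up count data (function name and data are illustrative, not from the project):

```python
import random
import statistics

def bootstrap_percentile_ci(data, stat=statistics.mean, n_boot=10000, alpha=0.05, seed=1):
    """Nonparametric percentile bootstrap CI: resample the data with
    replacement many times and take empirical quantiles of the statistic."""
    rng = random.Random(seed)
    n = len(data)
    stats = sorted(stat([rng.choice(data) for _ in range(n)]) for _ in range(n_boot))
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

counts = [0, 1, 1, 2, 2, 3, 3, 4, 5, 8]  # hypothetical skewed count data
lo, hi = bootstrap_percentile_ci(counts)
```

Unlike a parametric interval, the percentile interval cannot extend below the smallest achievable value of the statistic, so for nonnegative counts it will not dip below zero.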
Don't use Fisher's exact test -- not accurate -> use Wilcoxon exact; ordinary Pearson chi-square is better than Fisher
I’m working on a grant proposal to study the financial burden on older ICU survivors using secondary data. We aim to compare financial burdens between ICU survivors and those hospitalized without ICU care, and identify factors associated with higher burdens. We’ll analyze individual data collected every other year before and after hospitalization until death. Financial burdens will be categorized as no burden, high burden, catastrophic burden, or death. Some covariates, such as rehospitalization during follow-up, will change over time. We’re seeking advice on the best statistical modeling approach for this study.
Case-control study design -- case = ICU admission, control = no ICU admission
Primary outcome: financial burden -- out of pocket expenses < 20% = no financial burden, 20% - 40% = high financial burden, >40% = catastrophic financial burden
Secondary outcome: mortality
Covariates...
Want to model longitudinal and investigate in-person difference
Problem with splitting resilience into quartiles -- outer quartiles are wide, but you count them as the same
- Make sure you analyze resilience continuously
Hypotheses worded asymmetrically -- non-ICU hospitalization controls -> hospitalization survivors
Want to control for number of prior hospitalizations and comorbidities (Elixhauser) -- want comorbidity index that is high resolution (not charlson)
Population restricted to age 65+ participants with medicare
Follow-up every other year -- measure cumulative new experience from last year
Issue with death: early death means reduced cost
Order and use in state transition model
- If you didn't die what was the cost? Order by non-fatal cost
Data is family-wide but death is one person
In most analyses, death is an absorbing state -- but not here, where family continues to get bills in the mail
Financial burden -- how many dollars spent in the last year relative to family income in the last year
- Putting ratio into categories lowers sample size -- thresholding loses a lot of power
Leave ratio continuous
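A hypothetical simulation of why thresholding loses power: dichotomizing a simulated expense/income ratio at an assumed 0.4 ("catastrophic") cutoff weakens its correlation with an outcome built to depend on the ratio (all numbers invented for illustration):

```python
import random
import statistics

rng = random.Random(42)
n = 500
burden = [rng.gauss(0.3, 0.15) for _ in range(n)]      # hypothetical expense/income ratios
outcome = [b * 2 + rng.gauss(0, 0.2) for b in burden]  # outcome truly driven by the ratio

def corr(x, y):
    """Pearson correlation, stdlib only."""
    mx, my = statistics.mean(x), statistics.mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

r_continuous = corr(burden, outcome)
# Thresholding discards all variation within each category
r_dichotomized = corr([1.0 if b > 0.4 else 0.0 for b in burden], outcome)
```

In this setup the dichotomized version of the same variable always shows a weaker association, which is the power loss the note warns about.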
VICTR studio
VICTR voucher with 90 hours of help -- takes a month or so for statistician to be assigned to our project
Systematic review of the prevalence of reporting of medical therapy in peripheral artery disease clinical trials. We want to know the percentage of trials that report baseline medical therapy at all, and of those trials that do report it, what is the percentage breakdown of the various medications that are taken (ie 30% of patients were on aspirin at time of trial).
We have a question regarding the best way to report and display these data. The purpose of this project is to identify the need for future investigators to report medical therapy data even in trials studying devices or interventions. Basically, it may add a new facet to their outcomes when medical therapy is reported in the patients they are studying.
Attendees: Frank Harrell, Geonte Jackson, Aaron Aday, Jackson Resser, Cass Johnson
Investigators are primarily interested in meta-regression. Chapter 8 of Welcome! | Doing Meta-Analysis in R (bookdown.org) does cover meta-regression.
35 studies; let's say 10 achieve a goal / try a therapy, then the effective sample size is close to 10. Therefore, the number of things you can look at to predict will be limited.
Wilson interval: proportions could be calculated during clinic, if there's a small enough number (5-10, perhaps).
Please note potential confounders that may be adjusted for: publication year, publication quality, investigator group.
Investigators plan to return when they get further along in the project; scheduling on Thursdays will be ideal so that Frank / Jackson / Cass are sitting in on clinic.
Select a time · Biostatistics Clinics (youcanbook.me)
I met with the clinic a few weeks ago about a project titled Neonatology Attitudes and Practices for Neonates with Kidney Failure. We were hoping to meet again to generate the randomization excel spreadsheet to import into REDCap and review the survey questions for any framing biases.
Forgo randomization -- send out survey without mortality data
Walked through survey on call
Capture age and years of practice as continuous variables rather than as categorical ranges
NICU level -- capture highest level
Survey will be sent to neonatologists in listserv
- N = 500 neonatologists in undefined number of centers
How much duplicate center-level information will each survey participant provide?
Survey looks a bit long -- survey fatigue is a danger
If partial surveys are recorded -- make sure key info like demographics are first -> do analysis based on those who bailed out, see if it was a random sample
- Participants who respond may be extremely passionate/dispassionate about the topic
Problem with listserv: can't tell how many had opportunity to respond
If response rate < 80% -> cause for worry
Correlational analysis -- focus survey on decision-making and attitudes
What incentives you can use to get people to respond
- Frank's two-dollar bill survey
Order demographics in survey from most to least important
Next steps when you have data: stat help from pediatrics dept
What types of variables will you be examining the correlation between? Binary, ordinal, etc.
- More levels of ordinal variable -> spearman is best choice
- Use rawest form of the data -> correlate matter of degree of one variable with matter of degree of another
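Spearman's correlation is just Pearson's correlation computed on midranks, which is what lets it handle ordinal data and ties; a stdlib sketch with made-up data:

```python
from statistics import mean

def avg_ranks(x):
    """Average (midrank) ranks, 1-based, handling ties."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    ranks = [0.0] * len(x)
    i = 0
    while i < len(x):
        j = i
        while j + 1 < len(x) and x[order[j + 1]] == x[order[i]]:
            j += 1  # extend the run of tied values
        r = (i + j) / 2 + 1  # midrank shared by the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = r
        i = j + 1
    return ranks

def pearson(x, y):
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def spearman(x, y):
    """Spearman rho = Pearson correlation of the midranks."""
    return pearson(avg_ranks(x), avg_ranks(y))
```

The more levels an ordinal variable has, the fewer ties the midranks must absorb, which is why more levels make Spearman work better.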
If something does not apply to someone, best to omit
The goal of our project is to understand the feasibility and acceptability of firearm safety counseling and utilization of storage devices when offered to families of children admitted to the Behavioral Health service at MCJCHV. I would appreciate assistance in optimizing our statistical analysis plan given this is a non-randomized pilot trial.
How useful and effective are secure storage counseling and secure storage devices?
Firearm owning guardians of youth admitted with mental health needs -- receive 5-10 brief injury prevention educational sessions
Aim 1: evaluate utilization
Aim 2: assess feasibility and acceptability
VICTR research proposal submitted a year ago
Do folks actually take these devices?
Primary measure: proportion of eligible families who self-report using chosen device at time of 3-month follow-up
Feasibility: compare enrollment rates, recruitment rates, completion rates
Big questions:
- improve rigor of SAP
- confirm appropriate data collection process
Expected sample of 180 guardians
Key word: "report" -- distinction between reported activity and actual activity (actually storing firearm securely?)
- Social desirability bias
Rural vs urban distinction
No gold standard way of asking survey questions -- pictures of firearms not feasible
Recognize biases that inherently lie in self-reported storage activity
Push confidence intervals -- not so much point estimates and p-values
Wilson interval method -- R package
Numerator: those who take a firearm storage device
Denominator: enrolled and had full opportunity to take storage device
wilson confidence interval when it's a regular proportion
Concordance between historical and current use -- __ confidence interval for difference in two paired proportions (McNemar's test). McNemar's: yes's vs no's
Frank's Hmisc R package binconf
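Hmisc::binconf computes this in R; for illustration, the Wilson score interval is easy to compute directly (Python sketch, with hypothetical numbers for the device-uptake proportion):

```python
import math

def wilson_ci(x, n, z=1.96):
    """Wilson score interval for a binomial proportion x/n (95% by default).
    Behaves better than the Wald interval for small n and extreme proportions."""
    p = x / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# Hypothetical example: 7 of 10 eligible families took a storage device
lo, hi = wilson_ci(7, 10)
```

Note the interval never extends below 0 or above 1, even when the observed proportion is 0 or 1, unlike the naive Wald interval.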
Prospective study -- pre period, post period 3 months later
- Cannot have any drop-outs -> drop-outs don't occur at random
Biased non-response to follow-up is fatal
Therapy for asthma
mHealth intervention -- receive interactive text messages
Compare step count among participants who receive text messages vs those who don't -- no cross-over component
Sample size of 32 patients (16 in each group) to detect an increase of 2488 steps after intervention assuming a standard deviation of 3241 steps (sd so wide that negative step counts are possible)
* Step count asymmetric -- sd may not be best measure of variation, nor normal distribution the best distribution for step count
* Floor effect
Sample size more effective if step counts measured daily
No established minimally important difference -- dig a little deeper/survey disinterested investigators
Increasing mcid until sample size is feasible -- not a good approach
Frank thinks there is a statistician assigned to allergy/pulmonary and critical care division
If no existing collaboration, VICTR voucher is an option -- disadvantage that you get a new statistician each time and slow to start
Issue with keeping people enrolled -- drop-out rate important issue
SMART (Sequential Multiple Assignment Randomized Trial) design -- used frequently for smoking cessation studies
8000 step goal -- asthma patients often fail to achieve that
Area that is ready for advancement -- VICTR studio, more organized and multidisciplinary clinic
* Could be useful if disciplinary literature is limited -- exposed to new ideas and alternative design methods
Utilizing a survey, we will gain insight into the current postnatal depression levels and coping efficacy of this patient population in OB/GYN Clinics at VUMC by employing the Brief-COPE and EPDS scales, respectively. We will further ascertain interest levels and recommended facilitation methods in this survey by utilizing predominately closed questions crafted by the research team. With the knowledge obtained from the patients within this population, we hope this project will provide valuable insight for the implementation of a future support group.
Our main questions surround statistical analysis as well as power calculations. We want to ensure that our survey questions and statistical analysis are adequate prior to starting data collection.
Attendees: Frank Harrell, Alexandra McKeown, Brighton Goodhue, Cass Johnson
Heart transplant allografts at VUMC are now procured and stored in a new manner as of July 2023. We would like to compare the impact of this new storage method to our historical method on how the hearts function immediately post-operatively as well as short-term outcomes.
New cooler (traferox) -- perception: longer duration, better preservation of hearts
Treatment for heart failure: heart transplant
Current standard of care: transport heart allograft from donor to recipient on ice -- recommended time limit is 4 hours (due to risk of primary graft dysfunction)
- 2021 study showed lungs transported at 10 degrees celsius had better outcomes -> same for hearts?
Traferox used starting on 2023-07-23 -- store hearts at 10 degrees
- N = 77 hearts -> 52 after exclusion criteria
Aim: compare prevalence of severe PGD in heart transplants
Study design: propensity match 3:1 control -- transplants from 2/2020 to 7/2023
Primary outcome: evidence of severe PGD
Secondary: markers of intraoperative and postoperative performance
Sub analyses: of all 10 degree hearts with an ischemic time > 4 hours, is there a difference? of all 10 degree hearts where donor heart > 40 years old, is there a difference?
All or nothing -- no phase-in period between device usage
- Phase-in allows for stronger inference. All or nothing allows for hidden time trend to have effect
No close calls with yes/no outcome -- sample size needs to be larger
Higher resolution variables are better (added sensitivity)
Could be a problem if matching algorithm is order-sensitive
Each patient matters -- avoid matching methods that would delete patients
If change procedure such that risk can be tolerated, people will take on more risk and benefit is decreased
- Example: instant brakes on a train, more speed
At same length of cooling time, was there an advantage?
Propensity score only needed when # of items to adjust for is large, but you lose ability to ask if there is a differential effect
Initial analysis using just iced hearts -- assess whether there was any time trend at all in outcome measure
- If so, outcome contaminated
- Go as far back as you can (will help a moderate amount), adjust for procurement time, model shape and slope of time trend
Proportion of DCD: 50%
Next Steps: potential VICTR voucher -- up to 90 hours of help
Change outcome to multi-level ordinal variable
This survey will collect individual demographics, center demographics, and neonatal attitudes regarding prenatal counseling and management of babies with Chronic Kidney Disease. There will be 2 study arms -- 1 without any center specific data/statistics regarding outcomes for CKD and the other with this data.
I would like to see how many participants I would need to make the outcome statistically significant.
Attitude survey
Randomization algorithm in REDCap -- tricky with blinding/blocking
Survey population: neonatologists
Expect for respondents to have more optimistic view than data suggest
Expected sample size: hoping for 300-400 respondents
- Restate sample size question: sample to estimate proportion within an acceptable margin of error -- is margin of error small enough that we can say we learned what we need to learn
- Signal could be trivial
Create randomization list longer than you would ever need, make it reproducible (define randomization seed, hide it somewhere)
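One way to sketch that advice in stdlib Python (permuted blocks with a fixed seed; block size, arm labels, and seed value are placeholders, not the project's actual scheme):

```python
import random

def block_randomization_list(n_blocks, block_size=4, arms=("A", "B"), seed=20240101):
    """Reproducible permuted-block randomization list.
    The seed makes the list reconstructible; keep it hidden from
    anyone enrolling participants to preserve allocation concealment."""
    assert block_size % len(arms) == 0
    rng = random.Random(seed)
    assignments = []
    for _ in range(n_blocks):
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)  # balanced within each block
        assignments.extend(block)
    return assignments

# Make the list far longer than you expect to need, e.g. 100 blocks = 400 slots
alloc = block_randomization_list(100)
```

Rerunning with the same seed reproduces the identical list, which is the "make it reproducible" requirement; blocking guarantees the arms stay balanced throughout enrollment.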
Framing/wording of survey questions = important!
- Get as many eyes as possible on it before launch
Schedule future clinic meeting to review survey questions
I am doing a quantitative study analyzing attitudes toward genetic testing among parents of children with inflammatory bowel disease (IBD). I will be sending a survey to parents of children with IBD that are seen in Vanderbilt's Pediatric Gastroenterology, Hepatology, and Nutrition Clinic. My aims are to assess attitudes and beliefs toward genetic testing among parents of children with IBD, assess the impact of family history on attitudes and beliefs toward genetic testing for parents' children with IBD, and assess the impact of previous genetic testing experiences on the attitudes and beliefs of parents of children with IBD. We will be doing a bivariate and multivariable analysis on this data. We need assistance on a power calculation for our study to identify how many people we would need to identify to analyze attitudes toward genetic testing in parents of children with IBD.
Attendance: Makayla Hall, Gillian Hooker, Frank Harrell, Jackson Resser, Cass Johnson
Mixed-methods study -- Parent's Perception of Genetic Testing:
How are people surveyed? What are you expecting to collect?
About 500-600 patients in patient population. Tracked in Improve Care Now database. Estimate of 100 patients that will respond and be able to be included in sample.
Subset of very young patients that have all had genetic testing.
Nonresponse rate will be an issue you have to contend with. The reason for non-response is important. Anything you can do to understand non-response, or give some kind of incentive, would be useful. Is there anything you can learn about how typical responders are vs. non-responders?
Making survey brief is also important for responsiveness as well.
Online survey -- through REDCap, message sent through My Health at Vanderbilt.
Are any questions asked in manners of degree? Yes, some are Likert scale, some open-response.
REDCap gives the option to use a slider -- that can give a better degree of variation and help break ties. These can be a great option to use. Calculating the mean of a 0-100 scale can give better info. Do instruct patients to click the scale even if they agree the answer would be at 50 -- otherwise, REDCap will just mark it as 50, and it can be difficult to assess whether the participant skipped the question vs. truly agreed the answer was at a 50.
Makayla describes a previous study that is very similar, but done on adult patients (very descriptive). Driver of her research questions, as genetic testing is actually done in pediatrics, and there's no research in how primary caregivers of these patients (parents) feel about genetic testing.
Another study describes identifying predictors of positive attitude towards testing, which would be an ideal characteristic for her Master's thesis.
Frank's major point: main concern is not power, but representativeness and how trustworthy it is. Not a hypothesis testing framework, but instead looking at confidence intervals on things like means and proportions.
For multivariable analysis, large sample sizes are required. For example, to estimate one proportion for a binary variable, you need about 96 participants for a margin of error of plus or minus 0.1. If you want a margin of error of plus or minus 0.05, you need 384 people. Multivariable is more extreme than this. To predict yes or no here, you need thousands of patients.
In smaller sample sizes like 100, you'll need a very high signal-to-noise ratio if you're doing something complicated. Non-continuous variables have much less signal-to-noise ratio. Being able to measure something as continuous or ordinal helps immensely in doing more complex analyses with smaller sample sizes.
They are interested in predicting how positively people feel about genetic testing for their multivariable model, and looking at predictors for that attitude. Those REDCap sliders come into play here -- that will help to be able to treat those predictors as continuous where possible.
Another suggestion from Frank -- descriptive analysis for how each patient factor correlates with the degree of feeling positively or negatively, and add CI's to those correlations. May or may not serve your ultimate goal, but could be a different strategy.
CI's help tell the reader the limitations of your sample size, which is great for good research.
Also, you can ask the question -- "If you had many patient characteristics, which ones are "winners" and which are "losers" when it comes to predicting response?"
Adding CI's here is also very helpful. Can help you find if there's a "smoking gun", one dominant characteristic helpful in predicting response. Check out Bootstrapping: Biostatistics for Biomedical Research (hbiostat.org)
Look for "precision" here -- we can get a margin of error (half the width of a CI) for estimating a correlation coefficient, etc. 400 pairs are needed to do this well, but again, the CI will help describe your level of certainty and provide transparency.
Or, Frank's RMS course notes / R code examples: Regression Modeling Strategies (hbiostat.org); Suggestions for R Workflow for Reproducible Data Analysis: R Workflow (hbiostat.org)
We would appreciate help in refining our proposal for analyzing the relationship between the management of Moyamoya and social determinants of health
Retrospective study, adult patients with Moyamoya since 2006 -- N = 350
Aim: medical & surgical management
Surrogate markers: cholesterol, worsening stroke
Impact from social determinants, ADI
Prelim analysis
Long list of variables team is collecting
Scope of study: causal or explanatory
- Example: inequities in opportunities for women -- looked like women were receiving fewer opportunities to be engineers, but that was because fewer applied
Have to deal with confounders with causal
Investigators are looking for somewhere in between explanatory and causal
One overall outcome: recurrent stroke after initial presentation
For patient to enter cohort, must have had prior stroke (could be old or recent)
- Date of prior stroke not reliably measured; we do know date of acute presentation at VUMC
Inclusion: VUMC patients since 2006 with Moyamoya
Exclusion: one time visit
Surveillance for finding new occurrences of stroke: routine imaging after surgery; more frequent if symptoms
Time to first recurrent stroke:
- For patients with no documented recurrent stroke, documenting last time you know their status
Patient stops being followed because they're failing -- would be fatal
Cumulative incidence -- don't have to have strokes
- To learn other relative things, will need recurrent strokes
- Rule of thumb: need min of 15 recurrent strokes to study one variable -- with estimated 30 strokes, two variables
- Additionally need to have some in each group (if you want to study sex, need some male and some female)
There are black box methods, but may not be interested as they are not interpretable
- Put hopes that low dimensionality data will have enough info to learn what you want to learn
Increased risk given ADI
- Interpret ADI as unique risk factor for stroke and rule out explanation by something else (adjust for confounder)
- Collect variables you don't want to evaluate on their own, but adjust primary variable for -- propensity score adjustment
- What does having a high ADI go along with?
Put propensity score in model, how much of stroke recurrence explained by actual ADI?
Percentiling makes sense when there is competition
- Grade on curve: beat other students in your class, not absolute knowledge
Investigators will look into ADI literature
- ADI percentile may be best metric given that you can't derive raw values (which would be better)
Biostat support = $5k, 90 hours of help
- Scope of help: publish one paper
I will be surveying sperm donor recipient parents to see what information they value in selecting their donor. I plan to present the information in a matrix survey for them to rank in importance. I will also utilize a tiered approach to determine if they interacted with genetic counselors at any point in the process and if they found that information helpful.
This study will be mixed-methods as I will also include free response portions to collect contextual data. Our questions are what softwares would be best to analyze the data after collection, what types of statistical analysis would best support any patterns seen, and how could we calculate power for this study/what sample size should we aim for?
Rather than a 5 - option Likert scale, recommend using a slider scale on REDCap with no numbers shown, but has an internal scale of 0-100. Labels on the slider will be "Not at all important" and "Highest possible importance". Note that the slider will default to 50, make sure to tell participants to move the slider to the desired location even if they wish to have it somewhere in the middle. With this data you can estimate the mean for each question and a confidence interval. At a higher level, you could also use bootstrapping to determine the order of means of the questions, which would allow you to rank the importance of each question.
Consider randomizing the order of questions so that if people don't fill out questions at the end, coverage of all questions is still achieved. Still make sure demographics are asked at the top. Need to make sure the survey can be completed quickly to ensure high participation.
For sample size calculation, would need an estimate of the standard deviation in order to estimate the sample size needed for a particular margin of error for a continuous outcome. Without this SD estimate, you could make a conservative estimate using known sample sizes for binary (yes/no) questions. A sample size of 96 will give a proportion estimate with a margin of error of + or - 0.1 (10%), and a sample size of 96 x 4 (384) will have a margin of error of 0.05. As the number of response options increases we get more resolution, less ties within the data, and therefore more power.
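The 96 and 384 figures come from inverting the normal-approximation CI half-width at the conservative worst case p = 0.5; a quick check:

```python
import math

def n_for_proportion_moe(moe, p=0.5, z=1.96):
    """Sample size so the 95% normal-approximation CI for a proportion has
    half-width (margin of error) equal to moe. p = 0.5 is the worst case."""
    return z**2 * p * (1 - p) / moe**2

n_10pct = n_for_proportion_moe(0.1)   # ~96 for a +/- 0.1 margin
n_5pct = n_for_proportion_moe(0.05)   # ~384 for a +/- 0.05 margin
```

Halving the margin of error quadruples the required sample size, since moe is squared in the denominator.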
Statistical software suggestions: SPSS, Stata
I am running a study using a new technique here, magnetic resonance elastography (MRE; collected in MRI scanner). To provide a good basis for future studies, we want to replicate a previous study AND we want to add an extension to look at sex-differences (it's basically a correlation between a single MRE measure and performance on a memory test). Because of the cost of MRI ($600/hr), it can be only a small sample size (total 50). Based on conventional power analysis, this is enough. I am requesting VICTR funding. I have been unable to satisfy the statistician and I don't fully understand the issue, so I would like to consult you! They wrote in the rejection letter that the stats section was ill-defined and "PI needs to show the margin of error in estimating r with the planned sample size, under the worst case scenario where the true correlation is zero and then justify the planned sample size after describing that margin of error." There is also concern around confounds (which is addressed). It's essentially a big data issue due to the neuroimaging component, but we have a really simple analysis plan (i.e., a correlation, because that's what the replicated study did).
Attendance: Frank Harrell, Annick Tanguay, Melissa Duff, Cass Johnson
Sex Differences in Hippocampal Viscoelasticity and Relational Memory: A Replication Study
We know relatively little about brain health in women, despite well-established sex differences in episodic memory. There are also sex differences in the hippocampus.
Magnetic Resonance Elastography shows some promise -- determine how elastic a tissue is in the brain (found to correlate with memory tasks in prior study). The present study by Annick and Melissa hopes to replicate this study and investigate potential sex differences.
Confounders will be controlled for (neurological conditions, regular menstrual cycles in women), questionnaires will be used to gather additional data.
Statistical Analyses: Partial Correlations (controlling for age and education) within each group (male, female) between viscoelasticity and SR Shape, SR Object, etc.
Feedback: Lack of a well-defined statistical section.
Frank's Comments: Statisticians can improve in how hypothesis testing is marketed / taught. Correlations are almost never 0; this is less of an existence hypothesis, and more a matter of degree.
Not "does there exist a difference", but "how big is the difference"?
We want to estimate an effect whether or not you're willing to assume there is an effect -- hypothesis testing shouldn't be used as a method for screening here, particularly with a smaller sample like the current case. A confidence interval should be calculated to help factor in sample size.
"Is there a sex difference that researchers should care about?"
Partial Correlations -- removing the effect of another variable.
What would inform readers of the research: the correlation for men, the correlation for women, CIs for both, and the difference between the two with its own uncertainty (how different the correlation is between men and women).
Resource: Biostatistics for Biomedical Research -- 8 Correlation and Nonparametric Regression (hbiostat.org) (Specifically, Chapter 8, and figure 8.5: "Margin for error in r estimating the correlation, when correlation is 0, 0.25, 0.5, 0.75")
Frank would usually recommend we do the calculation for the "worst case scenario", when your correlation is very small. If, in truth, the correlation is very small and there is a small sample size, the margin of error is 0.4 or greater. Does that result in you knowing more than you did before the study began?
If you wanted to estimate the difference in correlations, that requires 4 times whatever sample size is needed for one correlation. The nice thing about CI's is that they are very "honest"; if there's little information, that is an honest way of representing what you know and what you don't know. A better approach than an existence hypothesis, in this case.
If between subject variance is small (women appear alike in characteristics), and technical replication is very good, then you could expect precision to be a bit better than what is shown in the graph.
Annick's Question: Being explicit about what we do if we are disappointed -- what the next step would be -- would that be helpful? Frank says yes; confidence intervals will be shown regardless of the calculated correlations (for example).
Melissa's Question: What's the likelihood of a good return on VICTR's investment -- if we calculate CI's, is there a window where things look positive vs. negative? May help conserve resources. Frank points out that they might also look for whether the study will give good data and foundation for future work; that would be factored in. Clarifying what exactly the results will be used for would prove advantageous for you; it's definitely not hopeless. Rejection would happen if, for example, something is measured very crudely (whether a part of the brain is activated vs. the degree to which it was activated, for example; binary outcomes w/ a sample of 20 would result in incredibly large margins of error that would result in rejection).
plotCorrPrecision in R -- looking for a difference in correlations is a bit more complicated, and isn't necessarily covered in that chapter. Fisher's Z transformation of r will be helpful for this. Also, if you had solid evidence that your correlation is greater than 0.25, you might be able to use 0.25 as the worst-case scenario instead of 0.
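A stdlib sketch of the margin-of-error calculation via Fisher's z transformation (this approximates the quantity plotCorrPrecision displays; the 1.96 multiplier and example sample sizes are assumptions for illustration):

```python
import math

def r_margin_of_error(n, r=0.0, z=1.96):
    """Approximate 95% margin of error (half CI width) for a sample
    correlation, via Fisher's z transform; r = 0 is near the worst case."""
    zr = math.atanh(r)                 # Fisher z of the assumed correlation
    half = z / math.sqrt(n - 3)        # CI half-width on the z scale
    # back-transform the CI endpoints and halve the resulting width
    return (math.tanh(zr + half) - math.tanh(zr - half)) / 2

moe_n50 = r_margin_of_error(50)  # planned sample of 50, worst case r = 0
moe_n20 = r_margin_of_error(20)  # a smaller sample for comparison
```

With n = 50 and a true correlation near 0, the margin of error is roughly 0.28; at n = 20 it exceeds 0.4, which is the "knowing more than you did before the study" concern raised in the rejection letter.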
Frank suggests removing the classic power analysis portion of first paragraph of power analysis.
Another comment, on Annick's earlier slide on goal of individualization: this is incredibly difficult to do. To do personalized medicine on a solid foundation, you need deeply rich data.
Annick's interpretation: By having a population measure, we may have a sense of what would work best for them (female vs. male, age, etc).
Melissa's Note: Let's say there are two patients with TBI, that look similar on several characteristics, but outcomes are very different; as a clinician, we have no reliable way to describe individualized outcomes.
Group-Level Interaction Effect: maybe keep in mind that when studies are designed to compare two groups, they barely get enough to estimate clinical impact, let alone adjusting for confounders. The sample size needed to estimate an interaction effect is 4x greater than a simple effect. Under some assumptions, this can be up to 16x greater (to test the effect, get evidence for differential effect existing)
I am writing an R01 proposal examining two continuous variables (distress and fear) and a continuous outcome (brain activation). I predict that as distress symptoms increase, brain activation will decrease and this effect will be stronger in distress than fear. I can run a regression with distress and brain activation and another with fear and brain activation, but I'm not sure how to compare distress and fear without making them into dichotomous groups (which I'm trying to avoid - I want to keep all variables continuous). I'm looking for an approach where I can say that a continuous measure of distress shows significantly lower brain activation than a continuous measure of fear. I am also interested in looking at sex differences.
ABCD data - 11868 youth followed over 10 years
- 9-10 years of age at start, data collected annually
- Right now, have data for first 4-5 years
Variables of interest: sex, gender identity (youth report = skewed, parent report less so), puberty (skewed at baseline as expected, more normal as participants age), distress (depression), fear, brain activation (collected every other year)
Aim 1: look at mechanisms underlying distress & fear
- Predict distress will show deficits in positive valence (blunted reward responsiveness)
- Predict fear will show excess negative valence (greater threat responsiveness)
Investigator plan: Brain = covariates + distress, Brain = covariates + fear, compare coefficients
- Want to avoid group-based analysis
Correlation between distress & fear is high
Could include 18 items in a scale to predict brain using a shrinkage method (like ridge regression)
- Frank likes sparse principal components analysis
- With correlation of 0.9 between distress and fear, variables would be inseparable
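As a rough illustration of the shrinkage idea mentioned above, here is a minimal closed-form ridge regression sketch on simulated data with two highly correlated predictors; the correlation value, coefficients, and sample size are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
# Simulated "distress" and "fear" scores with correlation near 0.9
distress = rng.normal(size=n)
fear = 0.9 * distress + np.sqrt(1 - 0.9**2) * rng.normal(size=n)
X = np.column_stack([distress, fear])
y = -0.5 * distress - 0.4 * fear + rng.normal(size=n)

def ridge(X, y, lam):
    # Closed-form ridge solution: (X'X + lam*I)^{-1} X'y
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

beta_ols = ridge(X, y, 0.0)     # ordinary least squares (lam = 0)
beta_ridge = ridge(X, y, 10.0)  # shrunken, more stable under collinearity
```

The point is not the particular penalty value: with near-collinear predictors the OLS coefficients are unstable, and any positive penalty pulls the coefficient vector toward zero and stabilizes it, at the cost of no longer separating distress from fear.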
Compare big model to submodels, leaving out one covariate at a time
- Frank book chapter, added value and adequacy index (towards end of chapter): https://hbiostat.org/rmsc/mle
False discovery rate doesn't consider the false negative rate (Frank really doesn't like FDR)
- FDR gives people a false sense of comfort that the ones you've chosen are winners, and ignores the possibility that your losers are actually winners
- Alternative: convert to bayesian analysis (simple bayesian prior distribution for effects)
- Bayesian posterior distribution gives you evidence in all directions
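A minimal sketch of the Bayesian alternative: a mean-zero skeptical normal prior combined with a normal likelihood for an estimated effect gives a posterior that supplies evidence in all directions, e.g. P(effect > 0). The prior SD here is an arbitrary placeholder, not a recommendation:

```python
import math
from scipy import stats

def posterior_direction(est, se, prior_sd=1.0):
    # Conjugate normal-normal update with a mean-zero skeptical prior.
    # Returns posterior mean, posterior sd, and P(effect > 0).
    prior_var, like_var = prior_sd**2, se**2
    post_var = 1 / (1 / prior_var + 1 / like_var)
    post_mean = post_var * (est / like_var)  # prior mean is 0
    p_pos = stats.norm.sf(0, loc=post_mean, scale=math.sqrt(post_var))
    return post_mean, math.sqrt(post_var), p_pos
```

Unlike an FDR cutoff, the posterior probability is symmetric in what it can tell you: the same machinery quantifies evidence that an apparent "loser" is actually a winner.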
Orthogonality restriction could keep distress & fear from measuring what they need to measure
Aim 2: How distress and fear change with age and pubertal development
- Linear vs exponential vs other
More helpful to think of measurements at dates rather than yearly measurements (since measurements are not taken equally far apart)
- Fixed effect for time, time-dependent covariate in puberty status,
- Handle correlation (optimum power if specified well -- serial/AR1 usually works well) and time-response profile (spline function, e.g. restricted cubic spline)
- Frank book link: https://hbiostat.org/rmsc/long
Calculate R^2 where the model allows the variable to be linear or non-linear, and compare them
ChatGPT: combined model matrix algebra doesn't work
Specify the time-response profile and get confidence bands
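The restricted cubic spline mentioned for the time-response profile can be built directly. Below is a sketch of the truncated-power basis with linear tails, following the parameterization used in Frank's RMS book; the implementation itself is mine, in Python rather than R:

```python
import numpy as np

def rcs_basis(x, knots):
    # Restricted cubic spline basis: cubic between knots, linear in the
    # tails. k knots give k-1 columns, including the plain linear term.
    x = np.asarray(x, dtype=float)
    t = np.asarray(knots, dtype=float)
    k = len(t)
    norm = (t[-1] - t[0]) ** 2  # scale factor so columns are comparable
    cols = [x]
    for j in range(k - 2):
        term = (np.maximum(x - t[j], 0) ** 3
                - np.maximum(x - t[-2], 0) ** 3 * (t[-1] - t[j]) / (t[-1] - t[-2])
                + np.maximum(x - t[-1], 0) ** 3 * (t[-2] - t[j]) / (t[-1] - t[-2]))
        cols.append(term / norm)
    return np.column_stack(cols)
```

These columns would enter the longitudinal model as the fixed effect of time, alongside the time-dependent puberty covariate and a serial (e.g. AR1) correlation structure.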
Aim 3: Estimate sex and gender identity differences in distress and fear
- AOV not a great option given the correlation structure
Chunk test -- example in handout
Dichotomizing gender identity could make it worse -- depending on choice of cut-point
- Want to treat ordinal predictors as ordinal
If sex of person predicts trajectory, trajectory predicts sex
The project is clinical evaluation of the utility, usability, and impact of a pilot trial using ambient AI documentation vendor solutions. I would like to discuss how to statistically analyze the survey and various clinical electronic health record metrics for each vendor solution and then how best to compare amongst vendors with the different pilots.
Attendance: Jacob Franklin, Allison McCoy, Frank Harrell, Jackson Resser, Cass Johnson
Goal for Clinic: Note composition method - characters per week, per method. If possible, capturing patients transcribed per method could aid in interpretation.
Influenza, tetanus toxoid, reduced diphtheria toxoid, and acellular pertussis (Tdap) and COVID vaccines are routinely recommended during pregnancy to prevent adverse maternal and neonatal outcomes. It is well known that pregnant individuals infected with influenza or COVID are at increased risk of severe illness and adverse perinatal outcomes compared to non-pregnant individuals. Prior research has shown that global COVID-19 vaccination prevalence in pregnant women is low. Multiple factors are suggested to be associated with vaccine uptake including age, ethnicity and social living conditions.
The purpose of this study is to conduct a preliminary analysis of vaccine uptake before and after the COVID-19 pandemic at our institution and understand the determinants associated with decreased uptake.
Population: Pregnant patients who delivered at Vanderbilt with at least one pre-natal visit
Questions from investigator: sample size
Compare rates pre-pandemic to post-pandemic
Use highest resolution data: address > zip code
- population density, median family income for area
Initial step: understand who is coming into clinic. Relevant to understand change in participant characteristics over time
- trend in median family income over time
- population density over time
- trend in vaccine receipt over time
Look at trends in raw form and adjusted form
- estimate prevalence of vaccine over time, adjusted for covariates
10 years pre-pandemic sounds good, but subject-matter knowledge should guide that decision
CDC - social vulnerability index
Potential exclusions: allergy to vaccine, fetal anomalies
List variables to adjust for, factors that would alter tendency to receive vaccine
Analysis methods: Logistic regression model (probability of uptake by time trend, age, address, etc)
LR: to estimate prevalence in a single group well, need sample size of at least 400
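The "at least 400" figure comes from the worst-case binomial margin of error at p = 0.5; a quick check (the helper name is mine):

```python
import math

def n_for_proportion(margin, p=0.5, z=1.96):
    # Sample size so a 95% CI for a proportion has half-width <= margin.
    # p = 0.5 is the worst case: it maximizes p(1-p), hence n.
    return math.ceil(z**2 * p * (1 - p) / margin**2)

n = n_for_proportion(0.05)  # 385, commonly rounded up to "about 400"
```

The same formula with a +/- 0.1 margin gives just under 100 subjects, which is why halving the margin quadruples the required n.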
Think about as prospective cohort study
Potential next steps: VICTR voucher, VICTR studio
Potential second part: vaccines administered to newborns after discharge
We would like to create a cohort of children who received either an ASD diagnosis or a speech/language disorder diagnosis from our department's clinic. We would like to be able to pull some data from the EHR (diagnoses, demographics) but know that some of the audiological data will likely not be able to be pulled by your system (it's housed on a 3rd-party system that then interfaces with eSTAR).
Population: children with autism
- Look at population being discharged from clinic vs not
Big study question: how many visits did patients have, what information was used to make decision (four potential pieces of info that could be used)
- Clarification: what information was available for them to use
- Laying out things that were important to capture, make sure you can capture them accurately
Most children will have between 1-3 visits
History variables to understand current context
Potential exclusion criteria: facial cranial, certain ages, language, family/family history
Boys more likely to be diagnosed with autism earlier
- Age against something else
Goals sound more descriptive
- Estimate proportions of sample characteristics with confidence intervals
Make study questions as specific as possible (specific enough that it's possible the data may not be able to answer the question)
Combining EHR data with third party data
- Need to figure out what is extractable from third party
- Ask around dept
Sample size: to estimate prevalence of Yes's to +/- 0.05, need sample of at least 400
- Confidence intervals will be self-limiting
Get flexible time trend
- create windows of interest in overall smooth trend
- superimpose discontinuity
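A smooth trend with a superimposed discontinuity is essentially segmented (interrupted time series) regression. A minimal sketch on simulated monthly data -- the interruption point, effect sizes, and noise level are all invented for illustration, and a real analysis would use a flexible (spline) trend rather than straight lines:

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(48)      # 48 simulated months
t0 = 24                # hypothetical interruption (e.g., pandemic onset)
y = 0.6 + 0.002 * t - 0.15 * (t >= t0) + rng.normal(0, 0.02, size=48)

# Segmented regression design matrix:
# intercept, pre-period slope, level change at t0, slope change after t0
X = np.column_stack([np.ones_like(t, dtype=float),
                     t,
                     (t >= t0).astype(float),
                     np.maximum(t - t0, 0)])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
level_change = beta[2]  # estimated drop in uptake at the interruption
```

The window-of-interest idea from the notes corresponds to reading this fitted trend over specific time ranges, rather than fitting separate disconnected models per window.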
Interrupted time series analysis
With CIBS biostatisticians we have created a prediction model for cognitive impairment following the ICU using a logistic regression technique. My question is how we can adjust this based on the initial results (e.g. we have 2 outcome variables but might want to reduce to 1 outcome variable), and how we could turn this into a reasonable clinical tool/calculator. In general, looking for an open discussion on CPMs to help guide next steps
Attendees: Frank Harrell, Mark Rolfsen, Rameela Raman, Onur Orun, Wes Ely, Jackson Resser, Cass Johnson
40-60% of patients may have cognitive impairment, but having a model to help individuals understand their own risk would be new.
Logistic Regression -- using prespecified baseline and in-hospital characteristics. Outcome variable was either cognitive impairment or functional disabilities. Two models: one three-month, one twelve-month
541 and 465 patients per model, respectively
Outcome variable occurred in 50% of 3-month patients and 43% of 12-month patients
Calibration curve -- predicted probability of outcome vs. actual probability.
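A calibration curve can be roughed out by binning predicted probabilities and comparing each bin's mean prediction to its observed event rate. (Frank's tools favor a smooth, e.g. loess-based, calibration estimate; the decile-bin version below is only a quick sketch, and the function name is mine.)

```python
import numpy as np

def calibration_bins(pred, obs, n_bins=10):
    # Mean predicted probability vs. observed event rate per quantile bin.
    # A well-calibrated model gives points near the 45-degree line.
    pred, obs = np.asarray(pred, float), np.asarray(obs, float)
    edges = np.quantile(pred, np.linspace(0, 1, n_bins + 1))
    idx = np.clip(np.searchsorted(edges, pred, side="right") - 1,
                  0, n_bins - 1)
    mean_pred = np.array([pred[idx == b].mean() for b in range(n_bins)])
    obs_rate = np.array([obs[idx == b].mean() for b in range(n_bins)])
    return mean_pred, obs_rate
```

Plotting mean_pred against obs_rate with a 45-degree reference line reproduces the calibration display described in the notes.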
Clinical context -- loved one may be at high risk of impairment; inform clinical conditions or potential support options (bedside tool towards end of hospital stay).
Questions from Frank:
Two ways of getting into the system:
1) Oncologist started a database for cancer patients (all cancer patients at the hospital)
2) At every follow-up visit, patient added to database
Some women could have failed to enter the study population because the cancer became severe quickly
Want to guard against ill-defined denominator
- Problem: patients who die before entering population (not a random sample)
- Example: cats falling off buildings; cats that died the moment after the fall were excluded
Paucity of data for breast cancer in this area of Africa
Time-oriented outcome like age of onset prone to bias
Could be difference in types of breast cancer in population
Value in determining pieces in the data that don't matter and then confirming that they don't matter
- Negative controls give you more confidence in positive controls
Data exploration: make a model to predict a missing lab value
First: dig into data, build demographic tables
Multivariable analysis of the differences (logistic regression model) to predict age cohort
- Looking for unique differences
Pre-cursor analyses: degree of missingness could limit types of analyses you could run
- Cluster analysis: understand degree of missingness on the same individual
Regression analysis: using R - https://hbiostat.org/rmsc/software
Also resources available to help get data from REDCap into R
Kaplan-Meier vs logistic regression
- LR better when time is not important
In some cases, not confident whether participant died from breast cancer or another cause
Neurofibromatosis Type 1 (NF1) is a common genetic condition that affects approximately 1 in 2,500-3,000 individuals. The goal of this study is to investigate if a reported family history of NF1 influences perceived levels of stress and coping styles in adults with NF1. To do this, adults with NF1 completed a survey that includes questions about their diagnosis, their family history, the Perceived Stress Scale 10-Item Version, the Brief Coping Orientation to Problems Experienced Inventory, short response questions, and demographics.
We have completed a lot of our bivariate analyses and are working on a hierarchical multiple linear regression to identify other variables that modify people's experience of stress. During this clinic, I would like to review the analyses that I have run to ensure I am reporting things correctly. Within this, I would like to talk through the blocks that I created to make sure we are teasing out the variables correctly.
Grand question: is there a difference in stress levels and coping styles based on family history?
How do people get into cohort?
- Survey, recruited from three different sources
- Diagnosis of NF1, > 18 years old, US resident who can speak English
- Current age and age of diagnosis available
Stage-wise multiple linear regression in SPSS
- Base model (outcome is stress level):
- M1: demographics
- M2: demographics + NF1 characteristics
"stage-wise" = different models with additional covariates
F Change = overall F for corresponding model
Additional variables adding half as much explained variation (stress level hard to predict)
Spline functions to deal with non-linear relationships
- F statistic for joint influence of age and age^2
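The joint F statistic for age and age^2 is the usual nested-model F-change; a self-contained sketch of the "chunk test" idea (the helper name is mine):

```python
import numpy as np

def f_change(X_small, X_big, y):
    # Joint ("chunk") F test comparing nested least-squares models:
    # tests all coefficients added in X_big beyond X_small at once.
    def rss(X):
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        return np.sum((y - X @ beta) ** 2)
    n = len(y)
    df_num = X_big.shape[1] - X_small.shape[1]
    df_den = n - X_big.shape[1]
    F = ((rss(X_small) - rss(X_big)) / df_num) / (rss(X_big) / df_den)
    return F, df_num, df_den
```

Here X_small would hold the other covariates and X_big would add both age terms (or all spline terms), so the whole chunk is tested with its full degrees of freedom rather than term by term.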
Too many covariates to look at each individual -- result = a lot of noise
Can't look at correlation to determine which variables to analyze (double-dipping)
- Wouldn't do any statistical testing (remove p-values), report correlations to two decimal places
Can compare correlations, never p-values
Three coping subscales and have them interacting with family history in stage 3
- F test with six degrees of freedom and R^2
- Do subscales predict stress level for either family history group?
The more chunk tests you use, the more license you have to deal with things without p-value corrections like Bonferroni
Can remove p-values, report correlations as descriptive measures: do better to assume correlations are non-zero
Grouping variables into blocks is a good practice
Adjusted R^2: tells you if added variables are worth the $$
Cass suggests using "nested" terminology
I have the privilege of reporting on a pre-specified subgroup analysis of an RCT (MINT Trial, NEJM).
Briefly, MINT randomized participants with an acute MI and anemia to a liberal versus restrictive transfusion strategy. I am reporting out on a stratified analysis by type of MI (Type 1 vs Type 2 MI). If possible, I would very much like to discuss how to address the fact that the size of MI differs between Type 1 and Type 2 MI in this trial (likely confounding the interpretation of MI type on the outcome).
Attendance: Frank Harrell, Andrew DeFilippis, Jackson Resser, Cass Johnson
"Not all heart attacks are the same"
Prespecified subgroup analysis -- differ by whether the index enrollment MI was Type 1 or Type II
MINT -- 3,500 patients with heart attacks who were also anemic, randomized to a liberal or restricted transfusion strategy. Outcome is 30-day death, MI.
Protocol specifies that index hospitalization includes designation of Type 1 or II MI. Very few unknowns.
Primary result: whether all-comer MIs did better with liberal or restricted transfusion. Death / MI in Type 1 vs. Type 2, liberal vs. restricted
Troponin measurement: many different assays, but in a heart attack the troponin value can change by 10,000-fold. Sizes of MIs were categorized (somewhat arbitrarily) into 5 categories -- <1, 1 to <10, 10 to <100, 100 to <1000, greater than or equal to 1000.
Frank: Wouldn't patients who got more troponins drawn have a better chance of having the peak value found?
Fundamentally flawed design -- if we know that people have different variables at baseline, especially when they are predictors of the outcome, results become incredibly difficult to interpret.
Clinical trials do nothing to control for within-group variability. We need to know which variables can be defended as big players, not pursue controlling for every single variable.
Back to Table 6 -- MI size would be very important to relate to the outcome. Push to do an analysis looking at whether transfusion strategy would impact large vs. small MIs
Figure 2 -- concerns with confounding; propose analyses for quantifying MI type and treatment strategy. Frank thinks the size variable is likely to be more important, and would hesitate to call that confounding.
Relationship between troponin and outcome -- log ratio
Is the number of troponins reported related to actual outcome?
Of your ability to predict something, how much of it comes from variable x/y/z? Dot plot in descending order; big prognostic players that can't be learned from what is currently provided.
Regarding Figure 2; age, LVF are not included.
Scatterplot of MI size at study start vs. second MI, with two colors for treatment type, could be good to look at. Then we can include baseline characteristics (hemoglobin)
Andrew suspects that size will be a second paper; Frank thinks the best route for improving Figure 2 would be baseline size vs. outcome, stratified by treatment and type (four curves). Could also do it without stratifying by type for larger denominators / greater stability.
These would not be Kaplan-Meier curves; logistic regression models (size on X axis, yes/no at 30 days). Don't assume that the log ratio is linear (Frank Harrell and Magnus Olsen, nonparametric regression on troponin in NEJM, or spline function)
Also -- Spearman correlation coefficient between size and LVF
When a firm threshold is present for qualification into the study (hemoglobin), which is also the variable being treated to, you may need to verify that there are no boundary artifacts. People at 9.9 hurt by treatment vs. helped, for example.
This is an already completed analysis, of which the abstract is posted below. We would like to discuss potential methods to account for unmeasured confounders:
Introduction:
Discussion notes:
Association of plasma and mortality in severe TBI
Question: other methods to account for unmeasured confounding? Instrumental variable analysis (generally limited to randomized studies), E-value sensitivity analysis
Inflection point found between 6-10 units of plasma (categorized exposure at clinically relevant threshold)
- Frank: categorization approach counts all values in a category as the same. Categorizing at inflection point does NOT respect the form of the data
Survival bias (patients that die early don't receive as much plasma)
- Higher resolution data needed to address
We want to adjust for confounding of bleeding for the relationship between plasma and severe TBI
- Difficult to disentangle bleeding from plasma -- so intimately intertwined (hard to do without randomized design)
- Question that can be answered: investigate relationship between bleeding and plasma, characterize by other variables
- Could analyze quality of clinical practice, variation in how much plasma was given
- Could inform later analysis when you bring in mortality
Frank's R package (rms) to perform instrumental variable analysis
High proportion of participants who did not receive plasma at all
- Need to choose knots in spline function. Placing knots difficult when lots of zeroes
- Manual override places knots using non-zeroes
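One way to implement that manual override, assuming the spike sits exactly at zero: take knot locations from quantiles of the nonzero plasma values only (the quantile probabilities below are arbitrary placeholders):

```python
import numpy as np

def knots_excluding_zeros(x, probs=(0.05, 0.35, 0.65, 0.95)):
    # With a spike at zero, default quantile-based knots can collapse
    # onto each other; place knots at quantiles of the nonzero values
    # instead (the "manual override" from the discussion).
    x = np.asarray(x, dtype=float)
    nonzero = x[x > 0]
    return np.quantile(nonzero, probs)
```

The resulting knot vector would then be passed explicitly to the spline function rather than letting it choose default knots from the full, zero-inflated distribution.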
Retrospective data -- feedback loop
Follow-up from 1/11/24 to meet with Frank
Attendees: Frank Harrell, Alvin Jeffrey, Marianna LaNoue, Dagmawi Negesse, Jackson Resser, Cass Johnson
Summary of last week:
I am working on a master's thesis project to conduct a retrospective chart review for individuals undergoing testing for Huntington Disease at VUMC across two decades to assess whether there are differences between asymptomatic and symptomatic individuals. The question I would like to address is "How do asymptomatic and symptomatic individuals who decide to pursue genetic testing for Huntington Disease differ?" I need assistance with descriptive and comparative statistics.
Clinic attendees: Dandan Liu, Cass Johnson, Jackson Resser
Years: 2001-2022; Huntington = neurodegenerative
Population of interest: tested for Huntington's disease (by ICD), initially pulled from pathology
- Within this, those symptomatic or asymptomatic
Symp/asymp: motor symptoms at initial visit or at test
Descriptive study with subgroup comparison
Dates: initial visit date, blood draw date, results disclosure date
Criteria the neurologist used to assess symptomatic/asymptomatic
406/415 have complete symp/asymp: need to really think about how to handle missingness for symp/asymp
Age = continuous variable; if normally distributed -> two sample t-test
If non-normal -> non-parametric Wilcoxon rank sum test (preferred because it makes fewer assumptions)
Chi-square test to compare categorical variables
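A sketch of the two comparisons in Python with simulated data (the clinic would typically run these in R; the group sizes, distributions, and contingency table below are invented for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
# Hypothetical ages for symptomatic vs. asymptomatic groups
age_symp = rng.normal(50, 10, size=60)
age_asymp = rng.normal(40, 10, size=60)

# Wilcoxon rank-sum (Mann-Whitney U): fewer assumptions than the t-test
u_stat, p_wilcoxon = stats.mannwhitneyu(age_symp, age_asymp)

# Chi-square test for a categorical variable (e.g., sex by group)
table = np.array([[30, 30],
                  [20, 40]])
chi2, p_chi2, dof, expected = stats.chi2_contingency(table)
```

Alongside the p-values, the raw group differences (e.g., median ages and category proportions) would be reported, per the note above.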
Use test statistic and p-value to assess whether results are significant; report raw differences
Clinic attendees: Bryan Blette, Cass Johnson, Jackson Resser
Prior work: tested four risk formats for same underlying info (latin square randomization -- three scenarios)
SPECTACULAR
Phase 1: build your own adventure (pick which do you want to see)
Phase 2: keep chosen design from P1 on one screen, randomize pieces on the other
Can you merge bayesian design with factorial framework?
Could embed rules such that a given participant, based on their info, is more likely to be randomized a certain way
6-8 factors, about 1000 total combinations
Proposed: drop conditions after person has completed one hour of data collection
Aim: formalize framework then operationalize bayesian-adaptive design
VICTR voucher could be a good fit but Bryan thinks 90 hours might not be enough
- voucher might get desired deliverable
Idea for paper: simulations & power calculations
- Would help to assess feasibility and whether you want to drop factors