Health services research, diagnosis, and prognosis

Current Notes (2023)

2023 October 2

Cara Donohue, Hearing and Speech Sciences

Study design for a pragmatic clinical trial in HT patients.

2023 September 25

Mikaela Bradley (Sarah Stallings), Genetic Counseling Program

Neurofibromatosis Type 1 (NF1) is a common genetic condition that affects approximately 1 in 2,500-3,000 individuals. The goal of this study is to investigate if a reported family history of NF1 influences perceived levels of stress and coping styles in adults with NF1. To do this, adults with NF1 have been recruited to complete a survey that includes questions about their diagnosis, their family history, the Perceived Stress Scale 10-Item Version, the Brief Coping Orientation to Problems Experienced Inventory, short response questions, and demographics. The scores from the two validated scales will be used to evaluate how our participants perceive their stress and how they use coping styles, respectively. Scores will be compared between individuals with “inherited NF1” and “sporadic NF1” to evaluate if a reported family history influences perceived stress and coping styles of adults with NF1. During this clinic, I would like to review the types of analyses I plan to run and make sure I am setting things up correctly.

Katherine Hajdu (Jon Schoenecker and Stephanie Moore-Lotridge), Orthopedics

There is a commonly used classification system for a pediatric injury that is used to report patients' overall risk of an adverse event. We are adding a subclassification to that system to increase the specificity of identifying patients with that adverse event. We want to make sure we are calculating the specificity appropriately for this condition.
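
As a minimal sketch of the calculation in question, assuming the subclassification is treated as a binary test (positive/negative) compared against whether the adverse event actually occurred: specificity is the share of patients without the event whom the subclassification labels negative. The function names and all counts below are hypothetical placeholders, not study data.

```python
def specificity(true_neg: int, false_pos: int) -> float:
    """TN / (TN + FP): computed only among patients WITHOUT the adverse event."""
    without_event = true_neg + false_pos
    if without_event == 0:
        raise ValueError("no patients without the adverse event")
    return true_neg / without_event


def sensitivity(true_pos: int, false_neg: int) -> float:
    """TP / (TP + FN): computed only among patients WITH the adverse event."""
    with_event = true_pos + false_neg
    if with_event == 0:
        raise ValueError("no patients with the adverse event")
    return true_pos / with_event


# Hypothetical 2x2 counts (assumption for illustration only).
tp, fn = 18, 2      # patients with the adverse event
fp, tn = 30, 150    # patients without the adverse event

print(f"specificity = {specificity(tn, fp):.3f}")
print(f"sensitivity = {sensitivity(tp, fn):.3f}")
```

Reporting both quantities side by side helps show whether a gain in specificity from the subclassification comes at the cost of sensitivity.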

2023 September 11

Megan Passarelle & Thu Can (Sarah Welch), Acute Rehabilitation

We are studying co-treatment in the hospital setting. We are planning to distribute a survey to occupational therapists, certified occupational therapy assistants, physical therapists, and physical therapy assistants who work in an adult inpatient setting to understand what factors influence their decision to co-treat. Co-treatment occurs when two therapists of different disciplines (OT and PT) work together within one patient treatment session. It is a controversial topic with limited published evidence. There is little guidance on when co-treatment is most appropriate, leaving many therapists uncertain about its practice.

The questions we would like answered are:
  1. What type of analysis should we conduct?
  2. How large should our sample size be? What number would you recommend for conducting a pilot study?
  3. Do we need a statistician to analyze our data?
Clinic Notes:
  • Perspectives on co-treatment in hospitals. Aims are to see how the following factors impact the perspectives: 1. Demographics. 2. Billing practice. 3. Conditions that people find beneficial to co-treat. Has a survey with 22 questions (Likert scale).
  • Recommendations:
    • Choose a primary hypothesis and base sample size calculation on that.
    • Start with basic descriptive statistics and figures. Then think about ordinal regression model.
    • Can come back to clinic to talk about sample size, and/or apply for a VICTR voucher. If applying for a VICTR voucher, then it's recommended to apply before distributing the survey.
    • VICTR voucher is appropriate to apply for if interested; see the VICTR application website and research proposal template.

Mark Rolfsen (Wes Ely and Matt Mart), Allergy, Pulmonary and Critical Care

We are performing a multi-arm, observational, survey-based study to assess the awareness, communication practices, and patient preferences regarding Post Intensive Care Syndrome. We will have several data sets of varying but generally low complexity and are requesting biostatistics support.
Clinic Notes:
  • Survey on Post Intensive Care Syndrome (PICS) for patients discharged from ICU. Current design has four arms: 1. Patients. 2. Providers for patients in arm 1. 3. General ICU providers. 4. Patients with PICS.
  • Recommendations:
    • Will need to create an official hypothesis and get a power/sample size calculation for VICTR voucher application.
    • Can use agreement statistics to measure the discordance between patient and provider experience; there are several approaches.
    • VICTR voucher is appropriate to apply for if interested; see the VICTR application website and research proposal template.
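
One of the several agreement approaches mentioned above is Cohen's kappa, which corrects the raw agreement rate for agreement expected by chance. The sketch below assumes each patient and their provider answered the same categorical question; the function name and the paired responses are hypothetical and only illustrate the arithmetic.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who rated the same items with categorical labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: share of pairs with identical labels.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the two raters' marginal proportions, per category.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    categories = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / n**2
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return (observed - expected) / (1 - expected)


# Hypothetical paired answers to "Was PICS discussed before discharge?"
patient = ["yes"] * 20 + ["no"] * 5 + ["yes"] * 10 + ["no"] * 15
provider = ["yes"] * 20 + ["yes"] * 5 + ["no"] * 10 + ["no"] * 15

print(f"kappa = {cohens_kappa(patient, provider):.2f}")
```

For ordinal (Likert) items a weighted kappa or a rank-based measure may be more appropriate; this unweighted version treats every disagreement as equally severe.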

2023 August 28

Erica Carballo (Courtney Penn), OB/GYN

I'd like to run an ANOVA analysis I did by you for this project:

We seek to estimate the annual percentage of patients with advanced-stage epithelial ovarian cancer in the United States who are eligible for and will derive benefit from PARP inhibitor therapy based on US FDA-approved indications. We compare the rates of eligibility and expected benefit, then analyze these trends over time.

Indications have changed year to year since 2014 as trial data emerges.

Each PARPi indication specifies one of 3 treatment timings (maintenance after first-line, maintenance after recurrence, late-line treatment) and biomarker status that falls into one of 3 distinct categories (BRCAmut, homologous recombination (HR) deficient/ BRCAwt, HR proficient).

I wanted to compare the effects of these two variables. By 2018, there was at least one PARPi FDA approval across all treatment timings and homologous recombination statuses. A two-way ANOVA demonstrates significant effects of biomarker status (F(2, 45) = 10.6, p ≤ 0.001) and treatment timing (F(2, 45) = 36.7, p ≤ 0.001) on the number of patients benefiting from PARPi from 2018 to 2023. There was a significant interaction effect between biomarker status and treatment timing (F(2, 45) = 3.8, p = 0.010). Variation in the populations with FDA eligibility for PARPi therapy is a known contributing factor to this interaction, in addition to variation in efficacy (Figure 3). Subsequent ANOVA analyses without replication were performed separately, comparing biomarker status and treatment timing to year (and therefore FDA approvals for a given year); these also show significant effects of biomarker status (p = 0.001) and treatment timing (p ≤ 0.001) independently.
Clinic Notes:
  • Estimating annual percentage of patients with advanced-stage epithelial ovarian cancer in the USA eligible for PARP inhibitor therapy and expected benefit
  • Three biomarker status categories and treatment timing tested using two-way ANOVA
  • Recommendations:
    • Instead of focusing on p-values, focus on newly eligible patients.
    • Explore PubPeer for any possible criticisms against previously published articles using similar methodology

2023 August 21

Rachel Azevedo (Ashley Shoemaker), Pediatric Endocrinology

My project is a two-phase study where we will analyze patient perceptions of genetic testing in the evaluation and management of pediatric obesity to see if these perceptions influence patient outcomes. In the first phase, we will be sending surveys to ~117 patients who have already completed genetic testing to determine if there are themes in patient perceptions towards testing, followed by interviews for 10-15 of these patients. Once we have distinguished any themes in these surveys, we will proceed with our second phase, where we will actively recruit patients aged 12-19 at the VUMC weight management clinic. These children are all offered free genetic testing as part of their management, so for patients choosing to get genetic testing we will ask them to complete a pre- and post-testing survey and will follow their outcomes (including BMI, appointment show-rate, medication fill-rate) to determine if there is a correlation between their views on genetic testing and their outcomes. I plan to also apply for a separate VICTR voucher with the QRC to aid in the design of the interview protocol and analysis of surveys in the first phase. As the second phase is searching for correlations between patient perceptions and the collected data, I am requesting assistance with planning this analysis & creating a statistical model.
Clinic Notes:
  • Survey cohort study, 117 pediatric patients with genetic testing and obesity. Phase 1 is qualitative interviewing. Phase 2 is going to recruit patients from the weight management clinic to see how their understanding of genetic testing impacts outcomes. Follow for 6 months to track BMI, visit rates, medication fill rates.
  • Phase 1 population is people who have already taken genetic testing, phase 2 population is eligible for genetic testing which they can accept/refuse
  • Recommendations:
    • For phase 2, patients who refused the genetic test can serve as a reference group
    • Clearly note the primary outcome in the application; it is better to select the outcome that will be most impacted
    • Sample size calculation/justification needed for VICTR voucher, based on the primary outcome

Jason Samuels, General Surgery

Polycystic ovarian syndrome (PCOS) is one of several obesity-related diseases that results from derangements in hormone signaling pathways. Bariatric surgery has been shown to improve PCOS through weight loss and improvements in patients’ metabolic profiles. However, to date, whether a particular bariatric surgery, i.e. sleeve gastrectomy or Roux-en-Y gastric bypass, provides greater resolution of PCOS via greater weight loss is unknown. This study consists of two parts. First, a retrospective analysis will be conducted, identifying patients with PCOS using the Research Derivative. This will provide preliminary data for a society grant through SAGES that will fund the second part, a prospective observational study comparing outcomes following sleeve versus gastric bypass in patients with PCOS. Our hypothesis is that gastric bypass will achieve greater weight loss than sleeve gastrectomy, resulting in greater disease remission in PCOS.
Clinic Notes:
  • Retrospective analysis to compare different types of bariatric surgery on obesity related to polycystic ovarian syndrome (PCOS). Missing data is a major issue.
  • Follow-up post-surgery is not required; about 50% of eligible patients are seen up to a year
  • Primary outcome is weight loss in patients after bariatric surgery, secondary outcome is resolution of PCOS with measurement to be determined
  • Population is patients diagnosed with PCOS who are referred to bariatric surgery
  • Recommendations:
    • Due to missing data, using pregnancy indicator as a secondary outcome may not be optimal
    • Identify a biomarker that is reliably collected for measuring secondary outcome
    • Need to use a data source that contains the same population of interest
    • May be helpful to identify patients whose primary care is within VUMC (for more reliable outcomes related to pregnancy)

2023 August 14

Elizabeth Longino (Shiayin Yang), Otolaryngology/Head & Neck Surgery

Need statistician to help perform statistical analysis on data from our Randomized Controlled Trial investigating the use of TXA (tranexamic acid) in rhinoplasty surgery. 2 groups (TXA and control, total ~70 patients) with intraoperative data and postoperative data to analyze. We are nearly done accruing patients – data will be ready for analysis within the next month and will be presented at a national conference in late October 2023.
Clinic Notes:
  • A randomized controlled trial investigating the use of TXA (tranexamic acid) in rhinoplasty surgery, using intraoperative data and postoperative data to analyze. Still in the enrollment process, with 60+ patients now.
  • Primary outcomes: reduction in intra- and postoperative bleeding, reduction in postoperative edema. All are measured on a Likert scale.
  • Sample size calculation is conducted before data collection; recalculating the sample size may result in requiring more patients for enrollment
  • Given the time constraints, a decision is needed on whether to move forward with the statistical analysis or a sample size recalculation

Chen Bo Fang (Avni Finn), Vanderbilt Eye Institute

Identify ophthalmologic features on OCT imaging predictive of clinical outcomes such as visual acuity. Would like to examine how 3-4 quantitative variables from the OCT imaging correlate with pre-operative visual acuity (quantitative variable). There are also additional variables we would like to control for in a multivariate analysis.
Clinic Notes:
  • Project is to identify OCT features predictive of pre-/post-op visual acuity and successful outcomes in patients with macular hole repair.
  • Interested in the effect of the number of cysts and the overall cystic volume on the outcomes (pre-operative vision, minimum linear diameter, lines of visual improvement, symptom duration). Sample size is 46 patients.
  • Recommendations:
    • Unless there is a clinically meaningful cut-off for a feature, it is usually better to leave well-defined continuous variables as is
    • At least 96 patients are needed to estimate a proportion within 0.1. Given the sample size, the number of confounders will need to be reduced
    • Instead of adjusting for confounding variables separately, can consider adjusting for a propensity score that summarizes the confounders
    • For lines of visual improvement, analyze the post result adjusting for the pre value

2023 July 31

Rachel Appelbaum, Surgical Sciences

• In the field of Trauma Surgery, many essential research questions revolve around the initial resuscitation and management of unstable trauma patients. Due to their clinical status on presentation, they are often incapable of providing informed consent or identifying their legally authorized representative (LAR) [1-3].
• In 1996, the Food and Drug Administration (FDA) established Exception from Informed Consent (EFIC), a set of federal regulations implemented when conducting a clinical trial to inform best practices in an emergency [4]. These regulations include utilizing a process known as community consent, defined as consent obtained through collaboration between the research team and community members. The requirement of community consultation remains poorly defined and variably interpreted [3]. Additionally, once approved, the education of patients’ families around the concept of EFIC and the consent process are variable and researcher dependent.
• EFIC studies help researchers perform trials with significant societal benefit that would otherwise be unattainable. It is incumbent upon the researchers to prioritize respect of patient concerns and experiences [5].
• In the setting of EFIC, pre-existing interactions are not present to build the trust and rapport between the researcher and patient/caregiver that is so important in conducting meaningful research.
• Previous efforts to gain patient/caregiver insight regarding EFIC highlight the need for continued work to determine how best to partner with a patient’s family to improve understanding and the importance of EFIC.
• This work will inform a more patient-centered community consultation process as well as create a more collaborative consent process for individual patients/caregivers and researchers.
Clinic Notes:
  • Perception of patients on informed consent (EFIC).
  • Qualitative study: phase 1 (subject enrollment) would consider demographics such as gender, race, time from injury
  • Phase 2 - trial of delivery methods; phase 3 - standardized EFIC education; phase 4 - validate materials. For phases 1-3, three different groups of participants will be enrolled.
  • Recommendations:
    • Biostatistics has a collaboration plan with Surgical Sciences. Biostatistics faculty contact is Fei Ye.
    • For phase 1, descriptive analysis would be enough and sample size doesn't really matter.
    • For phases 2-4, quantitative comparison requires justifying sample size for detecting meaningful effect size and listing inclusion/exclusion criteria
    • Define outcome measure to refer to in hypotheses
    • Combining phases (to reduce the number of phases) is recommended to reduce complications with sample size calculation, IRB approval, and inclusion/exclusion criteria; it also lessens dependency on previous phases
    • Needs to clarify the outcomes of phase 2 and how to proceed to phases 3-4 depending on the outcome of phase 2.

2023 July 24

Lindsay Podraza (Maya Neeley), Pediatrics

We implemented a new curriculum within the pediatrics clerkship to improve student knowledge, confidence, and clinical skills related to a diagnosis of streptococcal pharyngitis. We have pre- and post-intervention survey data (REDCap), as well as a small control group that we want to compare. Our question is what type of test would be best to analyze knowledge questions (right vs. wrong), confidence questions (Likert scale), and clinical skills (based on a rubric, gauging the number of satisfactorily performed items). We have tried t-tests and Wilcoxon Rank Sum tests so far.
Clinic Notes:
  • Curriculum intervention. Three different domains: knowledge (percent), attitude (Likert scale), and clinical skill (percent). Have control group (post assessment) and intervention group (pre and post assessment). From the intervention group, 65 students had a pre assessment and 28 had a post assessment. Control group had 8 students.
  • Recommendations:
    • Likert scale scores are ordinal; some information might be lost if using the Wilcoxon Rank Sum test (which treats variables as categorical).
    • Pre vs. post assumes that everyone who had a pre-assessment also has a post-assessment, and usually can't withstand even a single loss in the post group. Missingness in post assessment is usually not random.
    • Question: How to best share the data that we have collected?
      • Descriptive statistics can be used to illustrate what happened to the specific students in the study, but making inference is hard with the current study design and collected data
      • Mean works well for summarizing Likert scale as well as continuous data with symmetric distribution
      • Plot raw data, scatter plot/histogram to show variation between students
      • Correlational analysis (Spearman's rank correlation) within the sample is possible, even with biased data, to examine the relationship between variables
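
As an illustration of the within-sample correlational analysis suggested above, the sketch below computes Spearman's rank correlation from scratch: rank each variable (ties share their average rank), then take the Pearson correlation of the ranks. The student scores are hypothetical, and the result describes only the students in the sample rather than supporting inference.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sxx = sum((a - mean_x) ** 2 for a in rx)
    syy = sum((b - mean_y) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / sqrt(sxx * syy)


# Hypothetical post-assessment data: Likert confidence (1-5) and rubric items passed.
confidence = [2, 3, 3, 4, 4, 5, 5, 5]
skills = [4, 5, 6, 6, 7, 7, 9, 8]
print(f"Spearman rho = {spearman(confidence, skills):.2f}")
```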

2023 July 17

Anna Pfalzer, Neurology

We have run proteomics on approximately 500 plasma samples collected from individuals with 30 different Neurodevelopmental Disorders. I have several pieces of clinical information from these individuals that I would like to include in my analysis. I have several primary research questions I would like to address, but need help articulating the correct analysis plan and procedure.
Clinic Notes:
  • Grant to collect plasma samples on children with neurodevelopmental disorders. Sample size is about 20 samples for each of the 22 disorders and 40 samples from controls (n~480). Hypotheses are looking at proteomic differences between disorders, within disorders by variant, within disorders by covariates. Also looking into similarities of disorders, i.e., which cluster together. Interested in how neural components (e.g., seizures) impact neuro-specific proteins.
  • Recommendations:
    • Track record of research when you have more features than you have patients is not the best. It has to be a "smoking-gun" for the research to work. Do literature review to see what has been done. Need to ask more general question that can live within the constraints of the sample size.
    • Start with the simplest question: even one protein with one trait still needs N=300 to get a reliable correlation coefficient.
    • Principal Component Analysis should not go above 3 components based on literature. First PC maximizes the disagreement between patients. Info would not be available for which protein maximized that difference.
    • 22 disorders may be too many to research for the given sample size (N=20 per disorder), try to group them by characteristics/biological function so that we're not limited by N=20 per disorder, or use a numeric metric/trait as a summary measure to get more power.
    • A mixed approach would be selecting specific proteins and grouping all the remaining proteins to conduct PCA.
    • For the clinical covariates, do sparse PCA first to decide which ones to include in the model.
    • Another method to discover potentially important proteins is bootstrapping the importance of variables: do the data support that you will find the same "smoking gun" over and over?
    • Possible VICTR voucher; see the VICTR application website and research proposal template.
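
The bootstrap idea in the recommendations can be sketched as follows: resample patients with replacement many times, re-rank the proteins in each resample, and count how often the same protein comes out on top. This is only an illustration under stated assumptions: the data are simulated, importance is measured by absolute Pearson correlation with a single numeric trait (a stand-in, not the clinic's prescribed metric), and all function and variable names are hypothetical.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either variable is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy / sqrt(sxx * syy)


def top_feature(rows, outcome):
    """Index of the feature with the largest absolute correlation with the outcome."""
    n_features = len(rows[0])
    scores = [abs(pearson([r[j] for r in rows], outcome)) for j in range(n_features)]
    return max(range(n_features), key=scores.__getitem__)


def bootstrap_top_feature_counts(rows, outcome, n_boot=200, seed=0):
    """Count how often each feature ranks first across bootstrap resamples of patients."""
    rng = random.Random(seed)
    n = len(rows)
    counts = [0] * len(rows[0])
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample patients with replacement
        winner = top_feature([rows[i] for i in idx], [outcome[i] for i in idx])
        counts[winner] += 1
    return counts


# Simulated example: 100 patients, 5 proteins, only protein 0 drives the trait.
sim = random.Random(42)
proteins = [[sim.gauss(0, 1) for _ in range(5)] for _ in range(100)]
trait = [2.0 * row[0] + sim.gauss(0, 1) for row in proteins]

counts = bootstrap_top_feature_counts(proteins, trait)
print("times each protein ranked first:", counts)
```

With a strong signal the same protein wins nearly every resample; when the counts are spread across many proteins, the data do not support a stable "smoking gun".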

2023 May 22

Amelia Maiga (Mayur Patel), Surgery / Acute Care Surgery

We are designing an unfunded multicenter retrospective study to look at the impact of an institutional process improvement program (trauma video review) on outcomes in trauma patients (time to hemorrhage control, inpatient mortality, etc.) at a systems level (ie, by hospital, not by patient). We are specifically looking for assistance with power calculations to identify the number of centers needed to be recruited and number of patients per center.

Clinic Notes:
  • Hypothesis: using a TVR (trauma video review) program improves systems-level outcomes for trauma patients at highest risk of death and disability, specifically in 2 populations. Aims: 1) shortens time to hemorrhage control for trauma patients presenting in shock; 2) shortens time to neurosurgical intervention for trauma patients presenting with blunt traumatic brain injury and midline shift.
  • Retrospective multicenter cohort study; exposure is presence of TVR program (institution/hospital level); primary outcome is time to hemorrhage control/neurosurgery (min); covariates at both patient and hospital level.
  • Recommendations:
    • Need to include a fair number of hospitals so that the hospital characteristics are not related to whether TVR is used. In cross-sectional, the more hospitals you get the more imbalances can cancel out but still does not remove all bias. Having a study where 20 hospitals can be their own controls (pre v post) is much more powerful than having a cross-sectional study with 40 hospitals. Pre-post design removes the bias between hospitals since the specific hospital can be its own control.
    • Since primary outcome is per patient unit, there is more room to include per-patient covariates even with relatively smaller number of hospitals (cluster).
    • If system level covariates are collinear with using video then you would not be able to assess impact of using video review. Best situation is if you first cannot predict use of video from hospital then it means that you should be able to disentangle the effects of video review.
    • Possible use of state-transition model. Finds expected length of time patients are in each state. Death is worst state, discharge home is best state. Would measure every 30 min-1 hour during hospitalization.
    • VICTR vouchers limit biostatistics support to 90 hours and a 12-month duration. Would contact the department's collaboration statistician first; otherwise the best possibility would be to look into the Learning Health System (if chosen, unlimited help). Contact is Cheryl Gatto.

2023 April 24

Aileen Wright, Biomedical Informatics

LHS-supported trial: Interruptive Versus Non-Interruptive Alerts to Recommend Statin Prescribing in Primary Care. Questions about outcome measures, statistical analysis, and power calculation.

Clinic Notes:
  • Purpose of trial is to determine if interruptive vs. non-interruptive alerts increase statin prescription. Three arms (interruptive, non-interruptive, control). Primary outcome is percent of patients prescribed a statin within 24 hours of alert firing. Secondary outcomes: 1. the percentage of patients prescribed a statin within 12 months post-intervention. 2. First postintervention LDL.
  • Recommendations:
    • Better to state outcome as presence/absence of event for a patient (e.g. statin prescription within 24 hours of alert firing).
    • For analysis, test for a difference in the 3 proportions using logistic regression; this gives an overall test of differences among the 3 groups with 2 degrees of freedom, analogous to an ANOVA for continuous outcomes. If the 3-group comparison shows evidence of a difference, then you can do pairwise comparisons freely.
    • Increase the power to 0.9 or higher. The power calculation program PS Power and Sample Size can take three different probabilities.
    • Time to event assumes equal opportunity surveillance of outcome. Time to event up to 6 months, 12 months, etc.
    • The R function power.prop.test from the stats package does power calculation for two-sample test for proportions.
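
To make the overall 2-degree-of-freedom test concrete: when study arm is the only predictor, the likelihood-ratio test from a logistic regression compares a model with one common prescribing probability against a model with a separate probability per arm, and it can be computed directly from the arm-level counts. The sketch below assumes no covariates; the function name and the counts are hypothetical, not trial data.

```python
from math import exp, log


def _xlogy(x, y):
    """x * log(y), with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * log(y)


def overall_lr_test(events, totals):
    """Likelihood-ratio test that three arms share one event probability.

    Returns (statistic, degrees_of_freedom, p_value).
    """
    if len(events) != 3 or len(totals) != 3:
        raise ValueError("this sketch handles exactly three arms (2 df)")
    pooled = sum(events) / sum(totals)
    if pooled in (0.0, 1.0):
        raise ValueError("no variation in the outcome")

    def loglik(y, n, p):
        # Binomial log-likelihood up to a constant that cancels in the ratio.
        return _xlogy(y, p) + _xlogy(n - y, 1 - p)

    ll_null = sum(loglik(y, n, pooled) for y, n in zip(events, totals))
    ll_full = sum(loglik(y, n, y / n) for y, n in zip(events, totals))
    stat = max(2 * (ll_full - ll_null), 0.0)
    # For a chi-square distribution with 2 df, P(X > x) = exp(-x / 2).
    return stat, 2, exp(-stat / 2)


# Hypothetical counts: statin prescribed within 24 hours / patients per arm.
events = [60, 45, 30]      # interruptive, non-interruptive, control
totals = [400, 400, 400]

stat, df, p_value = overall_lr_test(events, totals)
print(f"LR chi-square = {stat:.2f} on {df} df, p = {p_value:.4f}")
```

In practice a fitted logistic regression would also allow adjustment for covariates, which this count-based shortcut does not.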

2023 April 17

Rong Wang, Human and Organizational Development

My team has been working on a large Facebook dataset (1.3 million posts) to analyze how moral framing and message sentiment influence people’s reactions to posts on social media. Our dependent variables (DVs) are emotional reactions indicated by how people use the Facebook reaction buttons. For DVs, we have tried using both the raw count of emotional reactions and also the proportions of emotional reactions (e.g., number of angry buttons used divided by number of total reactions received by a post) in the measurement. And we used negative binomial regression to run the models. For both approaches of measuring DVs, our negative binomial regression results generated high incidence rate ratios. All of our IVs (moral framing and message sentiment) are measured on a scale of 0-1, so it is impossible for any of our DVs to reach a significant increase in score.

Because of that, we wonder if we could interpret our high IRR scores differently through some form of transformation. E.g., if we assume changes in sentiment scores are in units of .01, then we take the 100th root of the IRR (say 170 becomes 1.05). This way, the interpretation would be that for a .01 increase in this IV, we would expect the DV to be multiplied by 1.05. For a .02 change in the IV, the DV would be multiplied by 1.05*1.05. Another potential solution we are exploring is rescaling our IVs. For example, we can multiply our IVs by 100 and then rerun the model. Is this feasible?
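
The arithmetic behind this question can be checked with a short sketch. It assumes the usual log-linear form of a negative binomial (or Poisson) model, where IRR = exp(beta) for a one-unit change in the IV, so the IRR for a smaller increment is the one-unit IRR raised to that increment. The value 170 comes from the example above; the function name is hypothetical.

```python
from math import exp, log


def irr_for_increment(irr_per_unit, increment):
    """IRR for a change of `increment` in the IV, given the IRR for a 1-unit change.

    With IRR = exp(beta), a change of `increment` multiplies the expected count by
    exp(beta * increment) = irr_per_unit ** increment.
    """
    if irr_per_unit <= 0:
        raise ValueError("an IRR must be positive")
    return irr_per_unit ** increment


irr_full_range = 170.0                               # IRR for a 0 -> 1 change in the IV
per_001 = irr_for_increment(irr_full_range, 0.01)    # 100th root of 170
per_002 = irr_for_increment(irr_full_range, 0.02)    # same as per_001 squared

# Multiplying the IV by 100 and refitting divides beta by 100, giving the same IRR.
beta = log(irr_full_range)
rescaled = exp(beta / 100)

print(f"IRR per 0.01 increase: {per_001:.4f}")
print(f"IRR per 0.02 increase: {per_002:.4f}")
print(f"IRR after rescaling IV by 100: {rescaled:.4f}")
```

Under this assumption the two proposals coincide: taking the 100th root of the IRR and rescaling the IV by 100 describe the same fitted model in different units, so neither changes the model fit.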

Clinic Notes:
  • Data: 1.3 million posts on Facebook about vaccines and reactions. All publicly available posts from March 2020 to August 2021. Used negative binomial regression for count variables.
  • Independent variables: moral framing and message sentiment (of a Facebook post), on a scale of 0-1
  • Dependent variable: Facebook reaction button to a post (6 categories)
  • Received critique on interpretation of effect size
  • Recommendations:
    • Calendar time and time of day (local time) can be relevant.
    • Offset in Poisson regression (negative binomial regression) should be independent of the thing you're studying.
    • The independent variable should not penalize the length of the post. Could see the effect of interaction with the length of the post.
    • Use LOESS non-parametric trend smoother to visualize data points, could color-code according to dimensions (ex. volume or length of post)
    • Ordinal regression (proportional odds model) does not need as many assumptions (on distribution shape) as negative binomial regression; use around 1000 categories
    • Allow variables to be non-linear (splines). Could also explore interaction with time.
    • Would not use p-values for interpretation.
    • Do a validation study with 200 random samples on the scale.

2023 March 6

Margaret Rutherford (Byron Schneider), Physical Medicine and Rehabilitation

Study title: Outcomes Associated with Guideline-Based Conservative Care for Low Back Pain. Question: We would like assistance with some more advanced data analysis. Mentor confirmed.

Clinic Notes:
  • Study population: standard of care patients (therapy + medication). Six-week outcomes: pain score and ODI. Interested in types of prior treatment (physical therapy (Y/N), medications (several types, Y/N), injection (Y/N)).
  • Aim 1: Assess whether conservative care improves outcomes (pre/post). Aim 2: Assess whether prior treatment type affects outcomes.
  • Recommendations:
    • Collapse treatments into meaningful groups.
    • For Aim 1, could do non-parametric tests and pairwise comparison. For Aim 2, could do ANCOVA to adjust for baseline characteristics. The variable of interest would be prior treatment type.
    • Power calculation/sample size justification can be discussed with VICTR biostatisticians and should be included in the voucher proposal.
    • Possible VICTR voucher; see the VICTR application website and research proposal template.

Aileen Wright, Biomedical Informatics

This LHS study is planned as a randomized trial to compare interruptive vs non-interruptive vs. no alert to increase statin prescribing in primary care. I plan to randomize at the patient level. I would like advice about my power calculation, and whether I should target a certain number of patients in each arm or have a recruitment time period.

Clinic Notes:
  • Randomized trial on alerts (interruptive vs. non-interruptive) for statin use. Study population: statin-eligible patients not currently on a statin. Objective: to determine whether an interruptive or non-interruptive alert increases the proportion of statin-eligible patients who are on a statin. Intervention: 2 interventions (comparator arm A: interruptive BPA recommending statin, comparator arm B: non-interruptive BPA recommending statin), 1 usual care arm (control arm with no BPA displayed). Primary outcome: the percentage of patients prescribed a statin in each arm within 3 months after the implementation.
  • Recommendations:
    • Consider focus group interviews with physicians to learn their reactions to non-interruptive alerts.
    • Potentially, there is a gap between physicians prescribing statin and patients picking up the medication.
    • Primary outcome may need to be more conservative in order to accurately measure the effect of the BPA alert on the physician's decision to prescribe a statin.
    • Look at pilot study data to check the proportion of patients prescribed statin on the day of visit (and the next day) after the intervention to refine the time window for primary outcome; large proportion can be reason to shorten the time window.
    • For the secondary outcome, 18 months may be too long to attribute the effect to BPA alerts.
    • If doing two tests (interruptive vs. control, and non-interruptive vs. control), could do a Bonferroni correction and use alpha=0.05/2.
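
To see how the Bonferroni-adjusted alpha feeds into the power calculation, the sketch below uses the usual normal-approximation sample size formula for comparing two proportions (two-sided test, equal arm sizes). The prescribing rates (10% under usual care, 15% with an alert) and the 90% power target are assumptions chosen for illustration, not values from the study, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def n_per_arm(p1, p2, alpha=0.05, power=0.9):
    """Approximate patients per arm to detect p1 vs p2 with a two-sided test."""
    if not (0 < p1 < 1 and 0 < p2 < 1) or p1 == p2:
        raise ValueError("p1 and p2 must be distinct proportions in (0, 1)")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for the two-sided test
    z_beta = z.inv_cdf(power)            # quantile matching the desired power
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2)))
    return (numerator / abs(p1 - p2)) ** 2


# Hypothetical prescribing rates: usual care vs. an alert arm.
control_rate, alert_rate = 0.10, 0.15

n_unadjusted = n_per_arm(control_rate, alert_rate, alpha=0.05)
n_bonferroni = n_per_arm(control_rate, alert_rate, alpha=0.05 / 2)

print(f"per arm at alpha = 0.05:  {n_unadjusted:.0f}")
print(f"per arm at alpha = 0.025: {n_bonferroni:.0f}")
```

The stricter alpha raises the required number of patients per arm, which is worth weighing when choosing between a fixed enrollment target and a fixed recruitment period.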

2023 February 20

Giovanna Giannico, Pathology

Project 1: I have created a survey, and would like to analyze survey results and association with epidemiological variables.

Clinic Notes:
  • Survey of genitourinary pathologists, ~120 records (50% complete). Involved epi data (pathologist demographic questions) and prostate biopsy (practice specific questions). Survey was disseminated using both emails and social media.
  • Purpose of survey is to find how much uniformity there is in reporting prostate cancer across different societies and countries.
  • Recommendations:

Project 2: I would like to perform an outcome analysis of patients with prostate cancer age ≤ 45 years (young) compared to those with regular screening age (old).

Clinic Notes:
  • The recommended age for PSA screening (for detecting prostate cancer) is 50 years, so the comparison is between patients under 45 years old and patients over 45 years old. Collected data on pathological features and outcomes. Outcome: biochemical recurrence. Time 0: time of surgery.
  • Recommendations:
    • The projects may be appropriate for cancer center biostat help.
    • Good to have an analysis plan drafted in voucher proposal.

2023 January 30

Benjamin Collins (Ellen Clayton), Biomedical Informatics, Biomedical Ethics and Society

Development of a measure for patient literacy of artificial intelligence in healthcare. Planning sample size for testing and validation of scale and statistical analysis of results. Mentor confirmed.

Clinic Notes:
  • Develop education module for patients on artificial intelligence.
  • Need to be aware of bias in surveys. Surveys are not great for thinking in terms of hypotheses; instead think about how much something happens (estimation).
  • Recommendations:
    • Can get proportions/correlations and confidence intervals on them.
    • A margin of error of ±0.1 (on a scale of 0-1) takes 96 subjects; this is helpful to think about when considering sample size.
    • Randomizing which questions are asked of different patients can reduce respondent burden and help with low response rates; also consider randomizing the order of questions if they are sensitive.
    • Could do redundancy tests on cluster of questions.
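
The figure of 96 subjects for a ±0.1 margin of error can be checked with a short calculation. This sketch assumes a 95% confidence level, the worst-case proportion p = 0.5, and the normal approximation; the notes do not state these assumptions, but they reproduce the number quoted. The function names are illustrative.

```python
from statistics import NormalDist


def n_for_margin(margin, p=0.5, confidence=0.95):
    """Approximate sample size so a proportion's CI half-width equals `margin`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z ** 2 * p * (1 - p) / margin ** 2


def margin_for_n(n, p=0.5, confidence=0.95):
    """Approximate CI half-width for a proportion estimated from n subjects."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * (p * (1 - p) / n) ** 0.5


print(round(n_for_margin(0.1)))    # about 96, matching the note
print(round(margin_for_n(96), 3))  # about 0.1
```

Because the required n scales with 1/margin², halving the margin of error roughly quadruples the sample size.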

2023 January 23

Matt Christensen (Michael Ward), Pulmonary and Critical Care Medicine

We aim to develop a clinical prediction score to estimate the risk of an MRSA infection among patients diagnosed with sepsis in the ED.

1) feedback on appropriate population? We can analyze data from all ED encounters, or use data from an existing clinical trial (ACORN) which enrolled patients at VUMC with an order for an anti-pseudomonal antibiotic.

2) How to estimate sample size for building a clinical prediction tool?

3) Which method to use for building a clinical prediction score (EHR-based that can be augmented by provider input)? - principal component analysis? - sequential component selection? - Others? Mentor confirmed.

Clinic Notes:
  • Looking to develop a risk prediction score for MRSA. Aims: 1. Compare performance of existing strategies (risk factors, syndrome-specific scores, MRSA PCR) for predicting MRSA risk to current practice (provided order for anti-MRSA). 2. Derive an automated EHR-based MRSA risk score in sepsis and validate internally. Population: adults with suspected sepsis. Effective sample size: number of MRSA*3.
  • Recommendations:
    • Try to collapse variables that act similarly (variable reduction).
    • Broader population (from a non-study) vs. narrower population (from a regulated prior study): use the clinical trial data to get a model, and not assume the model is calibrated correctly for low-severity patients. Weakness: when real data has many noise predictors, obtaining intercept needs to be carefully done.
    • Wouldn't recommend assessing sensitivity/specificity.
    • Can use C index or AUC (area under the curve) to assess the discriminative power of the prediction score.
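
For a binary outcome, the C index equals the AUC: the proportion of (MRSA, non-MRSA) patient pairs in which the MRSA patient received the higher predicted risk, with ties counted as one half. A minimal sketch, using made-up scores and outcomes rather than study data:

```python
def c_index(risk_scores, outcomes):
    """Concordance probability (equals AUC for a binary outcome coded 1/0)."""
    events = [s for s, y in zip(risk_scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(risk_scores, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0   # event patient ranked higher: concordant pair
            elif e == n:
                concordant += 0.5   # tied scores count as half
    return concordant / (len(events) * len(non_events))


# Hypothetical predicted risks and observed outcomes, for illustration only.
scores = [0.9, 0.4, 0.35, 0.8, 0.2]
mrsa = [1, 1, 0, 0, 0]
print(c_index(scores, mrsa))  # 5 of 6 pairs are concordant
```

A value of 0.5 means the score discriminates no better than chance, and 1.0 means every MRSA patient was ranked above every non-MRSA patient. The pairwise loop is quadratic in sample size, so it is meant for illustration rather than large EHR datasets.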

Carson Moore (Thomas F. Scherr), Chemistry

I work in the Mobile Health for Global Health group. We are currently working on a project that maps human schistosomiasis infections and environmental factors related to the spread of this disease. We are interested in consulting with the Biostats Core to work on calculating the number of sites needed to sample and number of intermediate host snails needed to collect to provide statistically valid results, and assistance on determining the best methods to randomly sample mapped areas. Mentor confirmed.

Clinic Notes:
  • Interested in schistosomiasis infections. Would like to know the best way to sample to validate the risk map (existence of snails).
  • Recommendations:
    • Do not recommend classification method (presence/absence of snails). Consider semiparametric model that uses ranks (more abundance means higher in rank).
    • Use lift curve (as in marketing) for areas.
    • Consider geospatial models that account for correlation between areas. Use random effects in the model.
    • Stratifying regions is valid for saving resources, whereas complete randomization ensures unbiasedness.
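
The contrast between complete randomization and stratified sampling of sites can be sketched as below. The site identifiers and the strata (risk tiers taken from the map) are hypothetical; the notes do not specify how regions would be stratified.

```python
import random


def simple_random_sample(sites, n, seed=0):
    """Complete randomization: every site has the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(sites, n)


def stratified_sample(sites_by_stratum, n_per_stratum, seed=0):
    """Draw a simple random sample separately within each stratum."""
    rng = random.Random(seed)
    chosen = {}
    for stratum, sites in sites_by_stratum.items():
        k = min(n_per_stratum, len(sites))  # cannot sample more sites than exist
        chosen[stratum] = rng.sample(sites, k)
    return chosen


# Hypothetical mapped sites grouped by predicted risk tier.
strata = {
    "low": [f"L{i}" for i in range(40)],
    "medium": [f"M{i}" for i in range(25)],
    "high": [f"H{i}" for i in range(10)],
}
all_sites = [s for sites in strata.values() for s in sites]

print(simple_random_sample(all_sites, 15))
print(stratified_sample(strata, 5))
```

Sampling a fixed number per stratum guarantees coverage of the rarer high-risk areas with fewer total sites, but estimates for the whole map then need to be weighted by stratum size.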
Topic revision: r839 - 20 Sep 2023, IneSohn
