Recommendations, Analyses, and Data for Biostatistics Basic and Animal Research Clinic
Click here for older notes
Biostatistics vs. Lab Research (aka I Need 3 Patients) (YouTube)
Note: Some files related to this clinic are on ~/clinics/basicSci on the conference room computer.

2023 February 3
Cristina Harmelink, Pediatrics/Cardiology
Dataset includes multiple lymphatic valve tests on different transgenic animals. Need to discuss how to compare them and what test to use that includes all variables.
2022 September 30
Shannon Townsend (Maureen Gannon), Molecular Physiology & Biophysics
Project deals with examining gene expression by qRT-PCR in cultured cells after different treatments, singly and in combination. Our issue is figuring out the best way to graph the data, as baseline measurements between treatments differ between groups. Mentor confirmed.
2022 September 9
Selene Colon (Gautam Bhave), Medicine/Nephrology
Peroxidasin (Pxdn) generates HOBr to form sulfilimine crosslinks in collagen IV. Collagen IV is a prominent constituent of a specialized sheet-like form of extracellular matrix called basement membranes (BM). It underlies cell layers in all tissues, such as the kidney glomerular BM. Previously we determined that the loss of Pxdn and sulfilimine crosslinks in Pxdn knockout (KO) mice led to reduced sulfilimine crosslinks and BM strength. To determine if the loss of sulfilimine crosslinks affects renal vascular injury, we used the unilateral nephrectomy with angiotensin II infusion (Unx + Ang II) model of kidney injury. After 2 weeks of injury, Pxdn KO females showed an increase in the inflammatory response, renal fibrosis, and a decrease in both renal function and survival compared to all other mice. In this work, we found that the loss of Pxdn disproportionately affects females more than males. These data suggest that the loss of Pxdn and sex contribute to renal fibrosis and vascular inflammation in response to vascular mechanical injury. We would like to know how to statistically show the differences in both sex and genotype that we see. Mentor confirmed.
2022 July 22
Rei Ukita, Cardiac Surgery
Would like to receive statistical consultation on analyzing large-animal experimental data. The work is currently under manuscript review, and I received many critical comments regarding the statistical analysis method. I can provide visuals/tables/other upon request. Experimental Design: In this study, we attached a blood pump device to sheep for several hours under different configurations (i.e., where the device was anatomically attached). We increased the pump speed/flow rate over time, with a goal of achieving 1, 2, or 3 L/min of blood flow. The flow was maintained at one of these levels for 12 hours, and we collected various data points at the end of these periods (e.g., blood pressure, pump performance, etc.) before increasing pump speed/flow. The purpose of the study is to compare the different configurations at different flows and see if there is a top contender for the optimal support technology. I have been advised to use a linear mixed model for these types of large-animal experiments, as complex studies like these are prone to missing data points and a linear mixed model handles that well. I want to confirm that (1) a linear mixed model is the way to go, and (2) I am using it properly.
2022 July 8
Sara Ramirez (Sabine Fuhrmann), CDB
I have data collected from mice at different time points: 3 different conditions and 3 different time points. The main question we want to ask is whether, at the last time point, there is a 'recovery' of the damaged values. What I need advice on is whether to compare to control conditions at the 3 different time points, or compare each condition to its respective time points, to answer the question about 'recovery'. Mentor confirmed.
2022 May 27
Jeffrey Schmeckpeper (Bjorn Knollmann), Cardiology
Animal studies. Help with selection of specific statistical test for multiple group comparisons.
2020 Jan 31
Consultant: Dan Ayers
Lucy Li, Dianna Rowe, Ed Levine
 Compare 5 mRNAs between 4 (including control) genotypes.
 Recommendations: Analyzed data on single delta Ct values. Blocking on different analysis times. Looking at a reference control along with wild type, then treated. Ratios are messy.
2019 Aug 23
Consultant: Dan Ayers
Brittany Spitznagel, Pharmacology, Mentor: David Weaver
 Animal behavioral study. Effect size determination.
 Sample size and effect size

2019 Aug 23
Consultant: Dan Ayers
Paige Vinson, PI, VICB HTS Core
 Kinetic data fit to model. How to compare binding pairs.
 Two-sample t-test on fit parameters across conditions. ANOVA and Tukey's HSD.

2019 July 26
Consultant: Dan Ayers
Manisha Sharma, Dr. Susan Wente's Lab, Cell and Developmental Biology
 3 experiments: qRT PCR
 Paired t-test on the Ct value
 qPCR based on pull-down %. Use the nonparametric Wilcoxon rank-sum test if there is concern about normality of the data.
2019 Jun. 28
Consultant, Dan Ayers
Sarah Graph, MPB
 Analysis approach to islet donor analysis. WT v Mutant. Pairing on donor.
 Paired t-test at high glucose measurements.
2019 Mar. 29
Consultant Alex Zhao
Christian Egly, Clinical Pharmacology
 The investigator's primary goal is to compare the cell contraction time among multiple groups that were treated for different lengths of time.
 Recommendation: ANOVA or the Kruskal-Wallis test could be used to test the overall differences among all the groups. If pairwise group differences are to be assessed, multiple comparisons need to be taken into consideration.
Laura Winalski, Pharmacology
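As a rough illustration of the overall test recommended above, the sketch below computes the one-way ANOVA F statistic by hand. The contraction times and the three-group layout are made-up values, not clinic data. In practice a library routine such as `scipy.stats.f_oneway` (or `scipy.stats.kruskal` for the rank-based alternative) would also supply the p-value.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between-group sum of squares: group size times squared distance of group mean from grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: squared distance of each value from its own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical contraction times for three treatment durations (illustration only)
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_between, df_within = one_way_anova_f(groups)
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

A large F only says that some group differs; the pairwise follow-up comparisons are where the multiplicity adjustment mentioned above comes in.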
 The investigators are seeking help for power analysis.
 Recommendation: Based on the previous study published by the group, the power analysis will be done the same way (two groups, dichotomized outcome) in the PowerSampleSize software. A Bonferroni correction of alpha was recommended for the power analyses.
2019 Jan 25
Consultant Dan Ayers
Rebecca Weiner, Jason Russel, Laura Teal. Grad Student, Department of Pharmacology
 PBS vs MOM injected eye (mouse model; treatment randomized within animals; > 30 animals). Stimulate eye tissue post sac. Primary endpoint is the number of spikes and amplitudes in stimulated tissue over a range of 0 to 280 picoamps.
 Paired t-test with multiplicity adjustment. Better: mixed model.
2019 March 22
Consultant Dan Ayers
Ed Levine, Ophthalmology (No Show)
 Methods for examining MV data.
2019 Mar 22
Consultant Dan Ayers
Miles Bryan, Ph.D. Candidate, Department of Pediatric Neurology
 Veh, Mn, IGF, and Mn+IGF in wild-type and HD mouse cells. All possible comparisons. Recommend ANOVA and Tukey's HSD for analysis in GraphPad Prism. There are many experiments, so this controls the experiment-wise type I error rate. Studies were not powered a priori.
2018 Dec. 07
Consultant Alex Zhao
Roslin Thoppil, Cell and Developmental Biology
 The investigator’s primary goal is to determine if ectopically expressed CLASP2 can rescue MT growth between control and CLASP1 knockdown cells.
 Current data shown: the control groups between Normal and CLASP2 high groups are different.
 Recommendation: For the primary goal of comparing growth between control and KD samples, take a linear model approach that includes the primary condition (control vs. KD), the sub-condition (Normal vs. CLASP2 High), and their interaction term. The interaction term estimates how the KD-vs-control difference differs between the CLASP2 High and Normal groups.
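To make the role of the interaction term concrete: in a 2x2 linear model with both main effects and their interaction, the interaction coefficient equals a difference of differences of the four cell means. The sketch below computes that contrast directly. The growth values and the cell labels are invented for illustration; a full analysis would fit the model in statistical software to obtain standard errors and a test as well.

```python
from statistics import mean

def interaction_contrast(cells):
    """Difference of differences: (KD - control | CLASP2 high) minus (KD - control | normal)."""
    kd_effect_normal = mean(cells[("KD", "normal")]) - mean(cells[("control", "normal")])
    kd_effect_high = mean(cells[("KD", "high")]) - mean(cells[("control", "high")])
    return kd_effect_high - kd_effect_normal

# Hypothetical MT growth measurements per cell of the 2x2 design (illustration only)
cells = {
    ("control", "normal"): [10.0, 12.0],
    ("KD", "normal"): [4.0, 6.0],
    ("control", "high"): [11.0, 13.0],
    ("KD", "high"): [9.0, 11.0],
}
contrast = interaction_contrast(cells)
print(f"Interaction (difference of differences): {contrast:.1f}")
```

In this made-up example the knockdown lowers growth by 6 units in the Normal group but only by 2 in the CLASP2 High group, so the contrast of 4 is the amount of "rescue" the interaction term would estimate.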
2018 Aug. 31
Consultant Alex Zhao
Klarissa Jackson, Pharmacology/Pharmaceutical Sciences, Vanderbilt University-Lipscomb University Partnership
 The PI is aiming to examine the effect of genetic variation in drug metabolizing enzymes on the metabolism of multiple anticancer drugs in vitro. The PI is seeking help for the power analysis/sample size estimation.
 Discussed the parametric and nonparametric tests that she could potentially use.
 Discussed the potential multiple testing issue that she needs to address in her proposal.
 Discussed the different approaches she may consider for her proposal (varying sample size, varying difference, varying power).
 Discussed the opportunity to apply for VICTR support for the study.
 Recommended the PS software (http://biostat.mc.vanderbilt.edu/wiki/Main/PowerSampleSize), so that she can start planning for the proposal.
 Recommended referencing the biostatistics clinic in the proposal.
Rachelle Johnson, Dept. of Medicine/Clinical Pharmacology
 The PI is seeking help for a power analysis for a mouse study that she is proposing for a grant application. She had preliminary study/data ready to conduct the power analysis.
 Discussed the parametric and nonparametric tests that she could potentially use.
 Discussed the different approaches she may consider for her proposal (varying sample size, varying difference, varying power). If the budget is fixed, she may consider providing power/sample size under different scenarios.
 Recommended the PS software (http://biostat.mc.vanderbilt.edu/wiki/Main/PowerSampleSize) that she can start with for the proposal (due in a few days).
2018 July 13
Christin Giordano and Heba Allam (Anthony Langone), Internal Medicine
 Consultant Chris Slaughter
 PIs are interested in studying time to fractures and acute kidney injury in subjects who received a kidney transplant. Most patients are prescribed a drug, Reclast, but the timing of the prescription can occur at variable times after the transplant. The PIs are interested in whether receiving a Reclast prescription is associated with increased risk of fracture and acute kidney injury.
 Study design is a retrospective chart review
 We discussed a time-to-event survival analysis with Reclast being a time-varying covariate. We discussed the necessary data requirements for creating a REDCap database, including the capturing of key dates: date of transplant, date of Reclast prescription, and date of fracture or date of last exam without a fracture
 While the timing of prescribing Reclast varies from subject to subject, no subjects are prescribed Reclast at the time of their transplant. For the time-to-event analysis, subjects will be at risk of Reclast-related fracture starting at some time point (not yet known at the time of the meeting) following transplant. It may be necessary to start the time-to-event analysis at, say, 6 months after transplant.
 Since the analysis is complex, will be seeking additional assistance with VICTR for biostat support
2018 Jun 29
Consultant Alex Zhao
Benjamin Reisman (Mentor: Brian Bachmann), Dept. Chemistry
 PIs are building a data analysis pipeline for 'multiplexed' flow cytometry data in R. Part of the analysis workflow involves predicting the intensity of one of the markers based on the other markers using linear regression. PIs had previously used a constrained polynomial regression, but would like to switch to a faster and more flexible model using natural splines or generalized additive models. PIs are looking for advice on how to select a model, selecting covariates, and allocating degrees of freedom.
 Overfitting is less an issue given the large study sample size. However, PIs need to balance the models' flexibility and computing intensity.
 Restricted cubic splines are another option that can be tried.
 Do not use the model-fit residuals to guide the removal of samples. Instead, any sample inclusion/exclusion needs to be decided up front with valid reasons.
2018 Mar 23
Consultant Dan Ayers
Molly Altman, Dept. Molecular Physiology and Biophysics
 In vitro islet cells. 6 animals per group. Repeated measures over time. Recommend mixed model.
 Same thing for human islet experiments.
Natalya Ortolano, Dept. Cell and Developmental Biology
 Marking different labels for different samples. Differencing these and comparing markers. 3 biological replicates per group to get estimates of effect and variability to plan more definitive experiments.
 Will seek VICTR funding for experiments and CTSA biostat support.
2018 Mar 7
Amy Zeller, Mari Powell, Dept of Otolaryngology
 Seeking advice on planning a pilot study comparing SOC speech pathologist patient interviews with SOC + the addition of a peer interview. Logistic limitations of accrual of about 12 patients in 12-24 months. Recommended language for a randomized pilot study that will provide preliminary data to plan more definitive trials.
Bradley Richmond, MD, PhD., Department of Medicine, Asst. Prof.
 Seeking advice on stat budget for VANDY TIP grant. Two-sample t-tests with multiplicity control.
 Comparing ratios of markers to total cells for 4 markers (control multiplicity?) between normal and COPD patients from banked tissue in each.
 Pursue a VICTR (CTSA) stat support grant (Li Wang).
 Contact Sandra Hewston for hourly cost of a staff statistician. TBN. 10 to 20 hours (max).
2018 Feb 02
Amanda Leung
My project is looking at detecting the differences between samples using a single marker, and I am at the point where I have collected a decent amount of data and would like to understand how to detect the power of my dataset.
She has data on different mouse genotypes and is looking at neurogenic length. The measurements are made at different time points (11.5, 12, 12.5, and 13.5 days) for the genotypes to determine if they are positive or negative for a particular marker. She has a control group (wild type) to compare against. She does have each mouse's time and has placed the mice into the time points above, creating 'bins'. We discussed graphing the actual times and seeing if they 'bin' or cluster the way she has them. She is concerned with having enough power to detect a difference. We discussed that, when determining power for a sample, one can use the number of mice that is affordable or feasible for the experiment to determine what effect size is detectable.
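The last point, turning a fixed number of mice into a detectable effect size, can be sketched as follows. This is only a normal approximation to a two-sample comparison with equal group sizes. The 5% two-sided alpha, 80% power, and the group sizes are assumptions chosen for illustration, and the effect is expressed in standard-deviation units.

```python
from math import sqrt
from statistics import NormalDist

def min_detectable_effect(n_per_group, alpha=0.05, power=0.80):
    """Smallest standardized mean difference detectable with the given n (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = z.inv_cdf(power)
    return (z_alpha + z_power) * sqrt(2 / n_per_group)

# Hypothetical group sizes (illustration only)
for n in (5, 10, 20):
    print(f"n = {n:2d} per group -> detectable effect of about {min_detectable_effect(n):.2f} SD")
```

For very small groups the normal approximation is optimistic; a t-based calculation (for example in the PS software) would give a somewhat larger detectable effect.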
2017 Dec 01
Kristen Ogden, Pediatrics
"We are exploring the frequency of viral gene segment reassortment during coinfection and superinfection. The viral genome is made of ten segments (3 large, 3 medium, 4 small). I want to know how many segments (and how many of each size) we need to analyze per virus and how many individual viruses we need to analyze to determine whether reassortment in general and for specific segment sizes (lg, med, sm) is random or nonrandom. This is typically a binary problem, with segments coming from one parent or the other."
2017 Oct 20
Patricia Overcarsh, Gynecology
"The study I'm interested in doing is looking at time to identification of ureteral jets during operative cystoscopy. Performing cystoscopy after hysterectomy has become a standard practice; however, the most commonly used dye to help better visualize the ureteral jets, indigo carmine, has not been available (and is unlikely to be available in future) for some time. Instead of indigo carmine, surgeons have substituted other dyes as well as no dye at all. Our hope is to look at 3 of the commonly used techniques (1 no dye at all, 2 pyridium, 3 uroblue) with a primary outcome being time to identification of the ureteral jets. A secondary outcome would potentially be looking at cost-effectiveness of the various techniques.
I'm still in the project design stage, currently needing help with a power calculation/sample size for an RCT with 3 arms. I've started working with Dr. Beeghly-Fadiel but we had a few questions regarding the statistics."
2017 Mar 24
Consultants Dan Ayers, Run Fan
No clients
2017 Feb 24
Consultants Dan Ayers
Lance Thomas, PH.D., Research Faculty, Cell and Developmental Biology
 qRT-PCR
 Two-sample t-test, ANOVA for group comparisons of mRNA.
2017 Jan 27
Consultants Dan Ayers
Tim Boire, Ph.D. candidate, Biomedical Engineering. Craig Duval, PI
 Return consultation for sample size, etc.
 Discussed the two-group sample size estimate from PS. Model assumptions of an ordinal scale summed over tests, scale transformations.
2016 Dec 30
Consultants Alex Zhao
Reynolds, Bryson B, Institute of Imaging Science
 Comparing transformed accelerometer data resulting from two transformation strategies (VICTR application)
 The aim of the study was to evaluate the performance of an individualized transformation strategy for acceleration measurement (new method) over the current standardized transformation strategy, comparing both with the gold-standard acceleration measurement. The investigator hypothesized that the new method will be more accurate (relative to the gold standard) than the current standardized method. The Bland-Altman plot was recommended to visualize the agreement with the gold standard. Other accuracy metrics (statistics) that are generally used in the field may be calculated. The investigator may want to investigate the formula (models) for the individualized transformation. This will require some statistical modelling work that potentially involves some correlated data. Model validation may need to be done as well.
 VICTR biostatistician: the PI is seeking VICTR biostats support. Please provide an estimate for the VICTR application.
Girish S. Hiremath, MD MPH, Assistant Professor of Pediatrics, Division of Pediatric Gastroenterology, Hepatology & Nutrition
 statistical analysis of dataset related to a questionnaire based survey to understand practice patterns of management of esophageal food impaction
2016 Oct 28
Heidi Silver (VICTR Manuscript Studio)
 Dr. Silver is looking to revamp the statistical analysis of her manuscript "Higher Fat Diet Reduces Cardiovascular Risk in Obese Women: Differential Changes in Adipose Tissue Distribution, Lipoprotein Pathways, and Insulin Resistance by Race."
 Recommended Dr. Silver analyze the data with a regression model, paying special attention to
 missing data from loss to follow-up
 appropriate inference when comparing two populations
 variable selection motivated by clinical expertise instead of data-driven algorithms like backwards/forwards selection
 Recommended Dr. Silver seek a VICTR voucher for 35 hours of biostatistical support.
2016 Oct 28
Consultants: Dan Ayers
No clients.
2016 Sep 30
Consultants Alex Zhao
Lauren Woodard, Ph.D., Research Instructor, Division of Nephrology, Department of Medicine
Gwynne Davis, graduate student, Neuroscience Program
2016 Sep 23
Consultants Alex Zhao
Scott R Jafarian, Research Fellow, Clinical Pharmacology
 attending as recommended by the MSCI course instructor
Tamara Moyo
 Exp 1. Two study groups, 3 mice per group; measured mean fluorescence intensity at time 0, 5, 10, 30, 60, 180 minutes.
 Exp 2. Two study groups, 2 mice per group; three different stimuli were used in each group. Measured mean fluorescence intensity at time 0, 10, 30 minutes.
 Comments: 1. Provided the population that the data were sampled from follows a normal distribution, the t-test is valid, though the power could be a major issue. 2. Multiple testing must be taken into consideration if multiple t-tests are performed. 3. Comparing the trend might answer the study question better than comparing two groups at each time point. 4. Including more mice would definitely help, and the baseline data can be adjusted for to reduce the errors caused by the instruments/days.
2016 Aug 26
Yuanjun Guo, Ph.D. candidate, Pharmacology, Hind Lal lab.
 Exp 1. Compare expression of XX via qRT-PCR in a two-group comparison of WT vs KO mice. Two-sample t-test or Wilcoxon rank-sum test.
 Exp 2. 2 (surgery +/-) x 2 (WT vs KO) factorial comparing qRT-PCR. ANOVA. Given this design, don't need Exp 1.
 Plan for an RNA-seq study (more suited for Omics clinic). 3-4 mice per group not suitable.
2016 Jul 22
Consultants Dan Ayers, Cathy Jenkins, Heidi Chen
Michael Wang, Student, Dept of Neuroscience
 Expression levels of different markers across different tissue types and diseases. Data sent are mean values. Need patient-level data to make the comparisons of interest. Return to clinic at a later date to look.
Ian Setliff, Grad Student, Dept of Microbiology
 HIV vaccine study. Immune response in nonhuman kinase studies. Antibody profiles in the primates. Next gen sequencing of antibodies that each subject produces. Each primate has a profile.
 mRNA abundance. Several mutagens to vaccinate.
 Sample size? How many nonhuman primates needed per mutagen.
 CQS website has a calculator.
2016 May 27
Consultants Dan Ayers
Debra Jacobson, Urology
 Event rate data that can be converted to survival data with times to events for people with events and censored at 365 days for people without. Ten years of data.
 Time to first event is primary endpoint with differences over years. Log-rank test for overall differences in time-to-event. Create 9 dummy variables for 10 years and use Cox (proportional hazards) regression for a closer look at the year effects.
2016 April 22
Consultants Dan Ayers
Rachel Crouch, Grad Student, Department of Pharmacology
 I would like to compare substrate depletion rates obtained from incubating substrate with liver subcellular fractions of male or female animals. These are commercially available liver fractions, and in some cases, male and female were not purchased from the same vendor. Also, they are pooled liver fractions from several animals, however the male and female pools were not necessarily obtained from the same number of animals (e.g., 5 animals for female liver fractions and 10 animals for male liver fractions).
 Pooled samples result in sample sizes of 1 for males and females. No statistical test available.
2016 March 26
Consultants Dan Ayers, Heidi Chen
Denise Buenrostro, Grad Student, Department of Cancer Biology
 Flow cytometry data. Time course with mice. Tumor+IgAnti, Tumor+Ig Inhibitor, PBS+IgAnti, PBS+Ig Inhibitor.
 Day 10, day 14, day 32. Discussed use of two-sample t-test vs ANOVA error term and log transformation.
Oscar Ayala, Grad Student, Department of Biomedical Engineering
 Raman data, laser excitation. Measuring a change in energy. The x-axis is change in energy (wave numbers, inverse cm); the y-axis is photon counts. For bacterial identification. Can Raman tell the difference between samples of one gene knockout?
 Suggesting logistic regression to predict group membership.
2016 January 29
Consultants Dan Ayers
Johanna Schafer, Grad Student, Department of Biochemistry
 Screening test. Mechanism of resistance for a breast cancer drug. Cal51 cell line derived from patient tumor. Applying the FDA drug cancer screen to a derivative cell line resistant to PI3K inhibitor. 119 drugs in the screening library. Added 60 additional drugs. 2 clones from parental and 2 from resistant. Tested drugs on each of these. Validation. Take the "hits" defined as 2-fold IC50 change between parental and resistant response.
 Find summary statistics for each clone over all drugs.
 Use summary data in sample size calculations.
2015 October 30
Consultants Dan Ayers
Bianca Flores, Ph.D. student, Neuroscience, Mentor: Eric Delpire
 Group, Time, Group*time repeated measures ANOVA.
Giju Thomas, PostDoc, Biomedical Engineering, Mentor: Anita Mahadevan-Jansen
 Sample size calcs for animal study. Recommend pilot study for variance estimates.
2015 October 23
Consultants Dan Ayers and Heidi Chen
Zeljka Korade
 Nanostring technology to detect gene expression. Counts of probes are the outcome. No technical replicates but internal controls to test repeatability. n=6 patients. Treatment and control in the paired setting.
 Compare counts using Wilcoxon signed rank with FDR control.
 Current sample size n=6 is the issue. Significant obstacles to planning a larger study.
 Pilot study: justify sample size as what is needed to answer logistical (methodological) questions, and to summarize preliminary data for determining statistical power for in vivo models.
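One way to see why n=6 is the issue for the signed-rank-plus-FDR approach: with 6 pairs (and no ties or zero differences) the exact two-sided signed-rank p-value cannot go below 2/2^6 = 0.03125, which leaves little room once a Benjamini-Hochberg correction is applied across many probes. The sketch below implements the Benjamini-Hochberg step in plain Python. The panel of 100 probes and its p-values are invented for illustration.

```python
def min_two_sided_signed_rank_p(n_pairs):
    """Smallest exact two-sided Wilcoxon signed-rank p-value (assumes no ties or zero differences)."""
    return 2 / 2 ** n_pairs

def benjamini_hochberg(p_values, q=0.05):
    """Return booleans marking which hypotheses are rejected at false discovery rate q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    # Find the largest rank whose p-value falls under the step-up threshold rank/m * q
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected

p_min = min_two_sided_signed_rank_p(6)
# Hypothetical panel: 10 probes at the smallest attainable p-value, 90 clearly null probes
p_values = [p_min] * 10 + [0.5] * 90
n_rejected = sum(benjamini_hochberg(p_values))
print(f"smallest attainable p = {p_min}, probes rejected at FDR 5% = {n_rejected}")
```

In this made-up panel even the probes with the most extreme possible result are not declared significant, which is consistent with treating the study as a pilot.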
2015 October 23
Consultants Dan Ayers
Shellese Cannonier, Cancer Biology (Julie M. Sterling, supervisor).
 Take the average of triplicates. Compare these with a two-sample test like the t-test (if data look symmetric), the t-test on the log of the data if the data remain skewed, or the Wilcoxon rank-sum test.
2015 September 25
Consultants Matt Shotwell, Dan Ayers
Eric Hustedt, Richard Stein, Molecular Physiology
 Need method for calculating variance of distance distribution using double electron-electron resonance.
 Recommended delta method.
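For reference, the first-order delta method approximates the variance of a function g of estimated parameters as grad(g)' * Cov * grad(g). The sketch below is a generic numerical version using central finite differences. The function `ratio`, the parameter estimates, and the covariance matrix are placeholders for illustration, since the notes do not record the actual distance-distribution function.

```python
def delta_method_variance(g, estimate, cov, h=1e-6):
    """Approximate Var[g(estimate)] by grad^T * cov * grad with a central-difference gradient."""
    k = len(estimate)
    grad = []
    for i in range(k):
        up = list(estimate)
        down = list(estimate)
        up[i] += h
        down[i] -= h
        grad.append((g(up) - g(down)) / (2 * h))
    return sum(grad[i] * cov[i][j] * grad[j] for i in range(k) for j in range(k))

def ratio(theta):
    """Placeholder function of two parameters (not the clients' model)."""
    return theta[0] / theta[1]

# Hypothetical estimates and covariance matrix (illustration only)
estimate = [2.0, 4.0]
cov = [[0.04, 0.0], [0.0, 0.09]]
var_ratio = delta_method_variance(ratio, estimate, cov)
print(f"approximate variance of the ratio: {var_ratio:.6f}")
```

The approximation is only as good as the linearization of g near the estimate; for strongly nonlinear functions or large variances a bootstrap may be a useful cross-check.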
Kathy Jingquong, Ph.D., Neurology
 Needs statistical support for pharmacokinetic estimation.
 Neurology has collaboration plan. Contact Josh Chen. Matt can provide support.
Kurt Schilling, Ph.D., Biomedical Engineering
 Needs a method to randomly sample a grid of imaged voxel slices.
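A simple random sample of grid positions without replacement is one possible approach; the notes do not record what was recommended. The sketch below uses only the Python standard library. The 8 x 8 grid size, the number of sampled positions, and the seed are arbitrary assumptions.

```python
import random
from itertools import product

def sample_grid_positions(n_rows, n_cols, n_samples, seed=None):
    """Draw n_samples distinct (row, col) positions uniformly at random from an n_rows x n_cols grid."""
    rng = random.Random(seed)  # a fixed seed makes the selection reproducible
    cells = list(product(range(n_rows), range(n_cols)))
    return rng.sample(cells, n_samples)

# Hypothetical 8 x 8 grid of voxel slices, sampling 10 positions (illustration only)
picked = sample_grid_positions(8, 8, 10, seed=2015)
print(picked)
```

Recording the seed alongside the results lets the same slices be re-identified later.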
2015 September 3
Fei Ye
David Aronoff, MD, FIDSA, Director, Division of Infectious Diseases
 Associations between genotypes of some immunity-related genes (Siglec genes) in women who are in a prospective cohort study of pregnancy.
 At least one of the genetic mutations under investigation exists in about 20-30% of humans (a mutation in the Siglec 14 gene), and in the study cohort about 30% of women will deliver prematurely. If the presence of this mutation (when homozygous) increases the risk for prematurity by, say, 2-fold, then how many people need to be studied to detect this difference with a power of 80% and alpha at 0.05?
2015 Aug 28
Dan Ayers
Cheryl Gatto, Department of Biological Sciences
 Drug testing on Drosophila in a Fragile X (neurodevelopmental disorder). Data acquired in a 2x2 factorial treatment arrangement: Vehicle/Drug and Control/Mutant. Outcome is fusion events among all events. Experimental unit is the brain of a Drosophila. Recommend an exact logistic regression.
 Also interested in differences among drugs. So potential random effects from different experiments, confounded with drug. General linear mixed models.
 Recommend seeking statistical support through VICTR or short-term statistical support arranged through Dr. Harrell.
 Ran through one example for Baclofen. 4 2x2 tables and then the logistic regression.
2015 Aug 14
Guanhua Chen and Alex Zhao
Cameron Schlegel, Department of Surgery
 Question: Want to use BioVU to confirm that the novel (rare) mutation found in the whole genome sequencing study done in the lab is truly associated with diarrhea.
 Suggestion: Use Record Counter to get the sample size for disease and non-disease kids with genetic information. Based on the frequency of the mutation in the healthy (non-disease) kids, generate a power table by varying the mutation frequency in the diarrhea kids.
2015 July 24
Dan Ayers and the esteemed Heidi Chen
Juan Gnecco, Department of Pathology
 Experimental design. In vitro experiments with 4 groups. Standard ANOVA with Tukey HSD.
 Sample size examples using PS
Christine O'Brien, Department of Biomedical Engineering
 Generalized linear models with robust covariance in R.
 Optical imaging (e.g., Hgb concentration, etc.). C. Slaughter recommended GEE using RCS on time. Discussed normalization versus ANCOVA. Question was how to interpret the anova(fitted robcov object).
 Will seek sample size advice.
2015 June 19
Lara Harvey, Department of Gynecology
 Would like to discuss what statistical measure is most appropriate to determine the magnitude of an effect of BMI on FSH levels among different obesity levels.
 Want to know the association between BMI and FSH controlling for age, built a prediction model FSH = BMI + age + BMI*age
2015 May 29
Dan Ayers, Consulting
Jessica Wilson, MD Endocrinology
 Have a data set on all cause readmission for 394 sarcoma patients. 5% incidence. Use a logistic regression to model given a prespecified set of variables.
2015 May 22
Dan Ayers, Consulting
Anthony Daniels, Department of Ophthalmology
 Sample size estimation for animal protocol. Components of sample size for two-sample t-test discussed. Examples constructed using PS software.
2015 May 1
Daniel J. Miller, Department of Psychology, Psychological Sciences
 Discussion about microstimulation data to develop a test of the hypothesis that stimulating two areas in the brain from which evoked movements differ produces a blend of those movements (endpoint neuronal encoding)
 Need help understanding how to organize the data in order to build a model to explain physiological results (e.g., how the dual stimulation sites interact)
 Plan to apply for VICTR funds to get statistical help for the analysis of the data
2015 April 10
Jason J. Winnick, Department of Molecular Physiology & Biophysics
 Need help to address R01 review comment: “Statistical analyses have been addressed using animal data. Unfortunately the intraindividual and interindividual variability in T1DM is considerable. While the preliminary data in dogs of the effects of glycogen loading on counterregulatory responses (with the relatively small SDs) are impressive, humans with T1DM are very different, with significant variability in such responses, especially with antecedent hypoglycemic episodes. Hence, the power calculations (n=10 T1D), likely based on dog studies, may be insufficient for the human T1DM studies. This requires refinement and further consideration.”
MaKenzie Kalb, Shasta Rizzi, Ally Garcia (sponsored by Joe Schlesinger)
From email to clinic list:
We are in the process of creating a dynamic alarm prototype for potential use in the ICU. The underlying idea of our prototype is to measure the dB of ambient noise levels and then output an alarm noise at a prescribed number of dB below the measured ambient noise.
We are encountering some statistical questions when attempting to calibrate our microphone circuitry for the ambient dB readings. The measured voltage from the microphone is plotted vs. "actual" dB readings using a professional meter. Our goal is to find a fit to calculate dB from the measured voltages. We have been using a logarithmic fit in Excel that is alright, with an R-squared of about 0.95. But when we plot our percent error when using the logarithmic fit there is a definite quadratic pattern.
Gabrielle Rushing and Rebecca Ihrie, Cancer Biology
2015 Mar 06, Dan Ayers and Heidi Chen consulting
Tim Hill, Grad Student, Path/Immunology
 Two group experiment repeated at 4 different times. How to combine data.
 Recommended mixed models with random effect for experiment number on log scale.
 In the absence of statistical support, compare groups using the t-test or Wilcoxon rank-sum test within each experiment. Discussed type I error control using Bonferroni.
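The Bonferroni fallback amounts to multiplying each per-experiment p-value by the number of experiments (capped at 1) before comparing with alpha. A minimal sketch, with four invented p-values standing in for the four experiments:

```python
def bonferroni_adjust(p_values):
    """Multiply each p-value by the number of tests, capping the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

alpha = 0.05
# Hypothetical p-values from the 4 repeated experiments (illustration only)
p_values = [0.004, 0.020, 0.030, 0.400]
adjusted = bonferroni_adjust(p_values)
significant = [p <= alpha for p in adjusted]
for raw, adj, sig in zip(p_values, adjusted, significant):
    print(f"raw p = {raw:.3f}, adjusted p = {adj:.3f}, significant: {sig}")
```

This treats the experiments as four separate tests; the mixed model recommended above instead pools them into a single comparison.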
Ryan Doster, MD, Infectious Diseases
 Animal study to compare live births following infection at 13 days of pregnancy; animals are sac'd on day 17 to count live births.
 Any death is important, so sample size estimates comparing p0=55% to p1=35% require 96 animals per group with type I error of 5% and 80% power.
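The figure of 96 per group can be reproduced with the standard normal-approximation formula for comparing two independent proportions (two-sided test, no continuity correction). The notes do not say which method was actually used, so treat the formula choice as an assumption.

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_group_two_proportions(p0, p1, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided comparison of two proportions (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    p_bar = (p0 + p1) / 2  # pooled proportion under the null hypothesis
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_power * sqrt(p0 * (1 - p0) + p1 * (1 - p1))) ** 2
    return ceil(numerator / (p0 - p1) ** 2)

n = n_per_group_two_proportions(0.55, 0.35)
print(n)  # 96
```

Smaller differences between p0 and p1 raise the requirement quickly, since the sample size scales with the inverse square of the difference.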

2015 Mar 06
Vance L. Albaugh, M.D., Ph.D., Department of Surgery
 It is a basic science project but in a surgical population followed over time postoperatively.
 Bile acid metabolism in patients undergoing weight-loss surgery, measured preop and at 1, 6, 12, and 24 months. Measured 17 bile acid concentrations in the blood. 20~30 patients at each time point. 45 patients in total.
 Hypothesis: upregulation of bile acid is associated with loss of weight.
 Correlation between bile acid concentration and weight (loss).
 Fit a longitudinal model. The outcome is postop bile acid concentration. The independent variables are time point, baseline bile acid concentration, baseline weight, weight at each time point, and other confounders. Take into account within-subject correlation (compound symmetry or AR1 correlation structure).
Celestial Jones-Paris, Graduate student, PMI
 RT-PCR of three genes in mouse tissues pooled together (only one biological sample). Three tubes of RNA. t-test.
2015 Feb 20
Dan Ayers, Consulting
Joe Prinsen, Cancer Biology
 Experiment using mice. WT and transgenic mice. Both groups being injured. Followed for 3 months. Cardiac echo. Primarily measuring ejection fraction. Would like to be able to detect a 5% change or 10% difference between the two groups at the final measurement. Days 1, 7, and 21 are standard days, pre-sac. Mice are undergoing ligation, so there are dropouts. WT mice lose EF at a slower rate.
* Recommended/demonstrated sample size estimation using https://stattools.crab.org.
2015 Jan 30
Westley Bauer, Arts and Science
 Sample of N=12. There is instrumental error; want to use a statistical test to find outliers.
 Any statistical test is based on some distribution. Need a larger sample to test for the distribution.
 Anderson-Darling test for normality
 Nonparametric test based on ranks
Linh Dang, Psychology
 Sample size calculation. All patients have baseline measures and then take the treatment; will compare pre and post values.
 A standardized effect size is not recommended. Should specify the variance and the expected difference instead.
 Assume the most plausible correlation between pre and post and calculate the SD of the difference.
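The last two points can be sketched as follows: the SD of the pre-post difference follows from the two SDs and their assumed correlation, and a paired sample size then follows from that SD and the difference to detect. All numbers below are hypothetical, and the normal approximation used here runs slightly below a t-based calculation for small samples.

```python
import math
from statistics import NormalDist


def sd_of_difference(sd_pre, sd_post, corr):
    """SD of (post - pre) given each SD and the pre/post correlation."""
    return math.sqrt(sd_pre**2 + sd_post**2 - 2 * corr * sd_pre * sd_post)


def n_paired(delta, sd_diff, alpha=0.05, power=0.80):
    """Number of subjects for a two-sided paired comparison (normal approximation)."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return math.ceil((z * sd_diff / delta) ** 2)


# Hypothetical inputs: SD of 10 at both visits, correlation 0.5, detect a change of 5
sd_diff = sd_of_difference(10.0, 10.0, 0.5)
print(sd_diff, n_paired(5.0, sd_diff))
```

A higher assumed correlation shrinks the SD of the difference and therefore the required sample size, which is why the correlation assumption should be defensible.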
2015 Jan 9
Heidi Chen, Chris Slaughter Consulting
Eric Rellinger, research fellow in the laboratory of Dai Chung, Pediatric Surgery
 I am a general surgery resident working as a research fellow in the laboratory of Dai Chung, Pediatric Surgery. I am submitting a VICTR grant in support of a project I am pursuing looking at the role of reactive oxygen species in neuroblastoma growth (a very common peds tumor). I will be conducting both in vitro and in vivo work in mice, and would like to optimize my setup plan with your help.
 Primary outcome: tumor size. Need variability across mice and estimated effect size to calculate sample size. Suggested PS software.
 Triplicate cell lines
Begum Erdgoan, Bio Science
 Compare between groups on an outcome with a non-normal distribution. Suggested the Wilcoxon test, which gives the same p-value with or without a monotone transformation.
 There are replicates in the data; can take the average for each experiment.
 Cells will be in treatment or control groups and observed on day 1, day 2, day 3.
 Can test between groups using the Wilcoxon test for each day.
 Better to use a mixed-effects model using all the data points.
2014 Sep 24
Dan Ayers, Consulting
Nicole Higgins, Graduate Student, Biological Sciences
 Submitted a paper; responses are back. Cell-based adhesion assembly and disassembly in 2D and 3D culture. Rates of assembly and disassembly in time-lapse images. GFP tag on cell adhesion proteins. GFP intensity increases over time as adhesion increases. The measure is the slope of the fitted intensity:time curve, used to get t1/2; t1/2 is the outcome measure. H0: t1/2 Protein 1 = t1/2 Protein 2. The sampling unit is an adhesion; N~20 for each protein. Parametric vs. nonparametric: use the Wilcoxon rank sum test.
Allen Wu, Infectious Diseases, Neonatology Fellow
 Survival curves of groups of mice (~n=12 to 25 per group). Use the log-rank test to compare groups (K=8). Having looked at the data and graphs, want to test if 2 to 6 day old pups differ from >= 7 day old pups with respect to survival. Use the log-rank test to compare groups overall.
 Next hypothesis is to test whether the "break" noted in the first experiment (2 to 6 day old pups versus >= 7 day old pups) is significant. Repeat the experiment to confirm.
 Is weight gain a surrogate for illness? Let Y = AvgWT/n = group + time + group*time + time*time + time*time*group + error in an analysis of variance.
Lauren Herrera, Psychiatry. Ariel Deutsch, PI
 Dendritic spines on a neuron. They change number and shape. Depending on where the neuron is going, it has different sizes and shapes. Used the Kolmogorov-Smirnov test to compare groups. Groups are targets (k=5). Want to compare skewness (kurtosis). We are interested in a mean shift, but you want to explain this based on the skewness of the raw data. The KS test is based on the maximum difference between the CDFs; that maximum may fall late in the CDFs, given skewness in the same direction. If just looking at 5 groups, use ANOVA with Tukey's Honestly Significant Difference. Now expanding to 2 factors in a 2x5 factorial treatment arrangement. Compare using least squares means.
2014 Sep 19
Dan Ayers, Consulting
Eva Sawyer, Neuroscience Graduate Student
 In 2 animals. Resect "stars". Each star contains 11 rays, each ray comprising 600 to 800 Eimer's organs. Hypothesis: does ray 11 have smaller Eimer's organs on average compared to the other 10 rays? There are 2 sets of 11 rays in each star, symmetric around a vertical axis. Measurements are taken on one set of 11. Analysis within animals. Need to compare Eimer's organ size among the 11 rays. Might also compare ray 11 with pooled rays 1-10. Need to account for spatial correlation among the Eimer's organs (hierarchical model). Recommended contacting the department head to see if stat support is already in place for the department of neuroscience, and also contacting Chang Yu to apply for VICTR support for this analysis.
 Second measurement process. From the base of each ray to the tip, sampled 20 regions on each ray. Counted the number of organs in each region and calculated region area. The density measurement for each of the 20 regions is number of dots/area. Correlation among density estimates will need to be accounted for.
2014 May 30
Justin Zumsteg, orthopedics
 Recovery of typing function, speed, accuracy and duration. Before and after surgery on hand. Typing function was measured at preop, 1.5, 2, 3, 4, 5, 6, 8, 12 weeks postop. The measurement is #words per minute (given % accuracy reaches certain threshold). Factors associated with "return function". Time to recovery can be continuous.
 Outcomes measurements at baseline, 1.5, 3, 6, 12 weeks postop. Correlate with typing function. Can do a GLS linear regression.
SM Sahabi
2014 Mar 28
Jamaine Davis, Meharry
* Interested in association between mutations in genes and breast cancer.
* Study will look at mRNA and protein expression levels of a specific gene at different stages of breast cancer (stages 1-4, not sure about monotonic trend in tumor progression)
* Interested in assistance with sample size calculation and analysis strategy
* Need preliminary information on expected difference and variance of mRNA expression (have microarray data)
* Calculated sample size based on two-sample t-test to detect an effect size of 0.3 between lowest expression and highest expression groups: ~176 patients per group
* Client will locate this information and return to a future clinic for guidance
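The sample size in the list above can be checked approximately with the standard normal-approximation formula for a two-sample comparison at a standardized effect size. The sketch assumes a two-sided type I error of 5% and 80% power, neither of which is stated in the notes; a t-based calculation comes out slightly larger than this approximation, in line with the ~176 quoted.

```python
import math
from statistics import NormalDist


def n_per_group_effect_size(d, alpha=0.05, power=0.80):
    """Per-group n to detect standardized effect size d (normal approximation)."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return math.ceil(2 * (z / d) ** 2)


print(n_per_group_effect_size(0.3))  # prints 175
```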
2014 Mar 7
Marvin W. Kronenberg, Medicine and Radiology
 Proposal for a follow-up study of an existing study with 12 heart failure patients: several markers need to be remeasured after 6-month follow-up using imaging techniques to study a mechanism of spironolactone treatment
 The imaging technique (e.g., MRI) used to measure the markers in the previous study is very expensive, so the investigators are considering using echo to measure the same markers in the follow-up study, which is much cheaper and hence affordable
 Suggest: use the same imaging method to measure the markers, which is more precise (so provides more power) and makes the measurements more comparable for learning the long-term effect of the treatment; otherwise, the study will be prone to criticism
 CTSA application: at least 3 repeated measurements per patient and about 7 markers; for data analysis and paper publication, 40 hrs.
2014 Feb 28
Jason Tomichek, Joanna Stollings, PGY1 Pharmacy Resident
 Which variables are associated with inappropriate discontinuation of antipsychotics after resolution of ICU delirium?
 Suggest: descriptive statistics and multinomial logistic regression
 CTSA application: for paper publication, 40 hrs.
Anthony E. Archibong, PhD, Meharry Medical College
 NIH grant application.
 Contact BCC.
Older Notes