Outcome Dependent Sampling for Longitudinal Data Analysis
Publications and Software

This page provides references to papers and links to software associated with our research on longitudinal data analysis and outcome-dependent sampling for longitudinal data. All research below was at least partially supported by our grant, Outcome Dependent Sampling for Longitudinal Data: Design and Analysis (R01-HL094786), funded by the NHLBI.

People involved: Chiara Di Gravio, Shawn Garbett, Sebastien Haneuse, Patrick Heagerty, Jake Maronge, Lee McDaniel, Nate Mercaldo, Peter Mueller, Paul Rathouz, Jonathan Schildcrout, Bryan Shepherd, Ran Tao, Leila Zelnick

Published Work (with PubMed Central links when available)

  1. Mitani AA, Mercaldo ND, Haneuse S, Schildcrout JS. Survey design and analysis considerations when utilizing misclassified sampling strata. BMC Med Res Methodol. 2021 Jul 11;21(1):145. doi: 10.1186/s12874-021-01332-8. PMID: 34247586; PMCID: PMC8273975.

  2. Sauer S, Hedt-Gauthier B, Haneuse S. Optimal allocation in stratified cluster-based outcome-dependent sampling designs. Stat Med. 2021 Aug 15;40(18):4090-4107. doi: 10.1002/sim.9016. Epub 2021 Jun 2. PMID: 34076912.

  3. Sauer S, Hedt-Gauthier B, Rivera-Rodriguez C, Haneuse S. Small-sample inference for cluster-based outcome-dependent sampling schemes in resource-limited settings: Investigating low birthweight in Rwanda. Biometrics. 2021 Jan 14:10.1111/biom.13423. doi: 10.1111/biom.13423. Epub ahead of print. PMID: 33444459; PMCID: PMC8277876.

  4. Tao R, Mercaldo ND, Haneuse S, Maronge JM, Rathouz PJ, Heagerty PJ, Schildcrout JS. Two-wave two-phase outcome-dependent sampling designs, with applications to longitudinal binary data. Stat Med. 2021 Apr 15;40(8):1863-1876. doi: 10.1002/sim.8876. Epub 2021 Jan 13. PMID: 33442883; PMCID: PMC8110123.

  5. Tao R, Lotspeich SC, Amorim G, Shaw PA, Shepherd BE. Efficient semiparametric inference for two-phase studies with outcome and covariate measurement errors. Stat Med. 2021 Feb 10;40(3):725-738. doi: 10.1002/sim.8799. Epub 2020 Nov 3. PMID: 33145800; PMCID: PMC8214478.

  6. Lotspeich SC, Shepherd BE, Amorim GGC, Shaw PA, Tao R. Efficient odds ratio estimation under two-phase sampling using error-prone data from a multi-national HIV research cohort. Biometrics. 2021 Jul 2. doi: 10.1111/biom.13512. Epub ahead of print. PMID: 34213008.

  7. Rivera-Rodriguez C, Spiegelman D, Haneuse S. On the analysis of two-phase designs in cluster-correlated data settings. Stat Med. 2019 Oct 15;38(23):4611-4624. doi: 10.1002/sim.8321. Epub 2019 Jul 29. PMID: 31359448; PMCID: PMC6736737.

  8. Schildcrout JS, Haneuse S, Tao R, Zelnick LR, Schisterman EF, Garbett SP, Mercaldo ND, Rathouz PJ, Heagerty PJ. Two-Phase, Generalized Case-Control Designs for the Study of Quantitative Longitudinal Outcomes. Am J Epidemiol. 2020 Feb 28;189(2):81-90. doi: 10.1093/aje/kwz127. PMID: 31165875; PMCID: PMC7298772.

  9. Rivera-Rodriguez C, Haneuse S, Wang M, Spiegelman D. Augmented pseudo-likelihood estimation for two-phase studies. Stat Methods Med Res. 2020 Feb;29(2):344-358. doi: 10.1177/0962280219833415. Epub 2019 Mar 5. PMID: 30834815; PMCID: PMC7659466.

  10. Mercaldo ND, Brothers KB, Carrell DS, Clayton EW, Connolly JJ, Holm IA, Horowitz CR, Jarvik GP, Kitchner TE, Li R, McCarty CA, McCormick JB, McManus VD, Myers MF, Pankratz JJ, Shrubsole MJ, Smith ME, Stallings SC, Williams JL, Schildcrout JS. Enrichment sampling for a multi-site patient survey using electronic health records and census data. J Am Med Inform Assoc. 2018 Dec 27. doi: 10.1093/jamia/ocy164. [Epub ahead of print] PubMed PMID: 30590688. link

  11. Zelnick LR, Schildcrout JS, Heagerty PJ. Likelihood-based analysis of outcome-dependent sampling designs with longitudinal data. Stat Med. 2018 Jun 15;37(13):2120-2133. doi: 10.1002/sim.7633. Epub 2018 Mar 15. PubMed PMID: 29542170. link

  12. Haneuse S, Rivera-Rodriguez C. On the Analysis of Case-Control Studies in Cluster-correlated Data Settings. Epidemiology. 2018 Jan;29(1):50-57. doi: 10.1097/EDE.0000000000000763. PubMed PMID: 29068840; PubMed Central PMCID: PMC5718962. link

  13. Schildcrout JS, Schisterman EF, Aldrich MC, Rathouz PJ. Outcome-related, Auxiliary Variable Sampling Designs for Longitudinal Binary Data. Epidemiology. 2018 Jan;29(1):58-66. doi: 10.1097/EDE.0000000000000765. PubMed PMID: 29068841; PubMed Central PMCID: PMC5718926. link

  14. Schildcrout JS, Schisterman EF, Mercaldo ND, Rathouz PJ, Heagerty PJ. Extending the Case-Control Design to Longitudinal Data: Stratified Sampling Based on Repeated Binary Outcomes. Epidemiology. 2018 Jan;29(1):67-75. doi: 10.1097/EDE.0000000000000764. PubMed PMID: 29068838; PubMed Central PMCID: PMC5718932. link

    ODSCode.R, ODSCodeFunctions.R, FullCohort.RData: An example R script that conducts ODS analyses using a simulated dataset (FullCohort.RData). The code in ODSCode.R calls ODSCodeFunctions.R and loads FullCohort.RData to conduct the analyses described in the manuscript.

  15. Schildcrout JS, Shi Y, Danciu I, Bowton E, Field JR, Pulley JM, Basford MA, Gregg W, Cowan JD, Harrell FE Jr, Roden DM, Peterson JF, Denny JC. A prognostic model based on readily available clinical data enriched a pre-emptive pharmacogenetic testing program. J Clin Epidemiol. 2016 Apr;72:107-15. doi: 10.1016/j.jclinepi.2015.08.028. Epub 2015 Nov 25. PubMed PMID: 26628336; PubMed Central PMCID: PMC4779720.

  16. Schildcrout JS, Rathouz PJ, Zelnick LR, Garbett SP, Heagerty PJ. Biased sampling designs to improve research efficiency: factors influencing pulmonary function over time in children with asthma. Ann Appl Stat. 2015 Jun;9(2):731-753. PubMed PMID: 26322147; PubMed Central PMCID: PMC4551501. link

  17. Huang A, Rathouz PJ. Orthogonality of the mean and error distribution in generalized linear models. Communications in Statistics - Theory and Methods. link

  18. Schildcrout JS, Garbett SP, Heagerty PJ. Outcome vector dependent sampling with longitudinal continuous response data: stratified sampling based on summary statistics. Biometrics. 2013 Jun;69(2):405-16. doi: 10.1111/biom.12013. Epub 2013 Feb 14. PubMed PMID: 23409789; PubMed Central PMCID: PMC3880022. link

  19. McDaniel LS, Henderson NC, Rathouz PJ. Fast pure R implementation of GEE: application of the Matrix package. R Journal. 2013;5(1):181-187. link

  20. Huang A, Rathouz PJ. Proportional likelihood ratio models for mean regression. Biometrika. 2012 Mar;99(1):223-229. PubMed PMID: 24421412; PubMed Central PMCID: PMC3888642. link

  21. Schildcrout JS, Mumford SL, Chen Z, Heagerty PJ, Rathouz PJ. Outcome-dependent sampling for longitudinal binary response data based on a time-varying auxiliary variable. Stat Med. 2012 Sep 28;31(22):2441-56. doi: 10.1002/sim.4359. Epub 2011 Nov 16. PubMed PMID: 22086716; PubMed Central PMCID: PMC3432177. link

  22. Schildcrout JS, Heagerty PJ. Outcome-dependent sampling from existing cohorts with longitudinal binary response data: study planning and analysis. Biometrics. 2011 Dec;67(4):1583-93. doi: 10.1111/j.1541-0420.2011.01582.x. Epub 2011 Apr 2. PubMed PMID: 21457191; PubMed Central PMCID: PMC3134621. link

  23. Schildcrout JS, Rathouz PJ. Longitudinal studies of binary response data following case-control and stratified case-control sampling: design and analysis. Biometrics. 2010 Jun;66(2):365-73. doi: 10.1111/j.1541-0420.2009.01306.x. Epub 2009 Aug 10. PubMed PMID: 19673861; PubMed Central PMCID: PMC3051172. link

  24. Schildcrout JS, Heagerty PJ. On outcome-dependent sampling designs for longitudinal binary response data with time-varying covariates. Biostatistics. 2008 Oct;9(4):735-49. doi: 10.1093/biostatistics/kxn006. Epub 2008 Mar 27. PubMed PMID: 18372397; PubMed Central PMCID: PMC2733177. link

Submitted work

  1. Lee S. McDaniel, Jonathan S. Schildcrout, and Paul J. Rathouz. Generalized linear models under biased sampling designs: A sequential offsetted regression approach. (in revision)

Work in progress

  1. Outcome vector dependent sampling for continuous longitudinal data
  2. Observation-level outcome related sampling designs for longitudinal data

Video Seminars

  1. September 2013 (Schildcrout to HSR at VU): http://biostat.mc.vanderbilt.edu/wiki/Main/JonathanSchildcroutVideo
  2. October 2015 (Rathouz to Epi at UW Madison): http://videos.med.wisc.edu/videos/62263

Links to Software

Topic attachments

  1. FullCohort.RData (87.4 K, 16 Aug 2017 - 17:30, JonathanSchildcrout): Full cohort data that is used by ODSCode.R to run analyses.
  2. ODSCode.R (3.7 K, 16 Aug 2017 - 17:24, JonathanSchildcrout): Example to conduct an ODS analysis using a simulated dataset. This code fits the full cohort model with maximum likelihood and analyzes the ODS sample using ACML, WL, and MI.
  3. ODSCodeFunctions.R (15.7 K, 16 Aug 2017 - 17:28, JonathanSchildcrout): Functions called by ODSCode.R to run analyses.

Topic revision: r30 - 30 Jun 2022, JonathanSchildcrout

Copyright © 2013-2022 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.