Statistical Modeling for Biomedical Researchers

This page provides the data sets that are used in Dupont, W. D. (2002). Statistical Modeling for Biomedical Researchers. Cambridge, U.K.: Cambridge University Press.

Data sets

Links related to this text are

William D. Dupont
Table of Contents
Log Files and Do Files
Cambridge University Press
Stata Corporation
Purchase hardback

Complete Stata log files and do files of all data analyses from this text are also provided. The sources of these data sets are as follows:

  1. Bernard, G. R., A. P. Wheeler, et al. (1997). "The effects of ibuprofen on the physiology and survival of patients with sepsis. The Ibuprofen in Sepsis Study Group." N Engl J Med 336: 912-8.
  2. Brent, J., K. McMartin, et al. (1999). "Fomepizole for the treatment of ethylene glycol poisoning. Methylpyrazole for Toxic Alcohols Study Group." N Engl J Med 340: 832-8.
  3. Breslow, N. E. and N. E. Day (1980). Statistical Methods in Cancer Research: Vol. 1 - The Analysis of Case-Control Studies. Lyon, France, IARC Scientific Publications.
    I have posted the data from Appendix I of this text, which is from the Ille-et-Vilaine study of esophageal cancer. (See also Tuyns et al. 1977.)
  4. Dupont WD, Page DL (1985). "Risk factors for breast cancer in women with proliferative breast disease." N Engl J Med 312:146-51.
  5. Eisenhofer, G., J. W. Lenders, et al. (1999). "Plasma normetanephrine and metanephrine for detecting pheochromocytoma in von Hippel-Lindau disease and multiple endocrine neoplasia type 2." N Engl J Med 340(24): 1872-9.
  6. Framingham Heart Study (1997). The Framingham Study - 40 Year Public Use Data Set. Bethesda, MD: National Heart, Lung, and Blood Institute, NIH.
  7. Gross, C. P., G. F. Anderson, et al. (1999). "The relation between funding by the National Institutes of Health and the burden of disease." N Engl J Med 340(24): 1881-7.
  8. Lang, C. C., C. M. Stein, et al. (1995). "Attenuation of isoproterenol-mediated vasodilatation in blacks." N Engl J Med 333: 155-60.
  9. Levy, D., National Heart Lung and Blood Institute, et al. (1999). 50 years of discovery: medical milestones from the National Heart, Lung, and Blood Institute's Framingham Heart Study. Hackensack, N.J., Center for Bio-Medical Communication Inc.
    I have posted a subset of the 40 year follow-up data from the Framingham Heart Study (see also reference 6).
  10. O'Donnell HC, Rosand J, Knudsen KA, Furie KL, Segal AZ, Chiu RI, et al. (2000). "Apolipoprotein E genotype and the risk of recurrent lobar intracerebral hemorrhage." N Engl J Med 342:240-5.
  11. Parl FF, Cavener DR, Dupont WD (1989). "Genomic DNA analysis of the estrogen receptor gene in breast cancer." Breast Cancer Res Tr 14:57-64.
  12. Scholer SJ, Hickson GB, Mitchel EF, Jr., Ray WA (1997). "Persistently increased injury mortality rates in high-risk young children." Arch Pediatr Adolesc Med 151:1216-9.
  13. Tuyns, A. J., G. Pequignot, et al. (1977). "Le cancer de L'oesophage en Ille-et-Vilaine en fonction des niveau de consommation d'alcool et de tabac. Des risques qui se multiplient." Bull Cancer 64: 45-60.

Stata Data Sets

To download any of the following Stata data sets, click on the data set name, then follow instructions. The numbers at the beginning of these names indicate the chapter and subsection where the data set is first used in Dupont (2002).

| Stata Data Set | Data Source | Comma separated file |
|---|---|---|
| 1.3.2.Sepsis.dta | Bernard et al. (1997) | 1.3.2.Sepsis.csv |
| 1.4.11.Sepsis.dta | Bernard et al. (1997) | 1.4.11.Sepsis.csv |
| 10.7.ERpolymorphism.dta | Parl et al. (1989) | 10.7.ERpolymorphism.csv |
| 11.2.Isoproterenol.dta | Lang et al. (1995) | 11.2.Isoproterenol.csv |
| 11.2.Long.Isoproterenol.dta | Lang et al. (1995) | 11.2.Long.Isoproterenol.csv |
| 11.AreaUnderCurve.dta | (no ref) | 11.AreaUnderCurve.csv |
| 2.12.Poisson.dta | Brent et al. (1999) | 2.12.Poisson.csv |
| 2.18.Funding.dta | Gross et al. (1999) | 2.18.Funding.csv |
| 2.20.Framingham.dta | Levy (1999) | 2.20.Framingham.csv |
| 2.ex.vonHippelLindau.dta | Eisenhofer et al. (1999) | 2.ex.vonHippelLindau.csv |
| 3.ex.Funding.dta | Gross et al. (1999) | 3.ex.Funding.csv |
| 4.11.Sepsis.dta | Bernard et al. (1997) | 4.11.Sepsis.csv |
| 4.18.Sepsis.dta | Bernard et al. (1997) | 4.18.Sepsis.csv |
| 4.21.EsophagealCa.dta | Breslow & Day (1980) | 4.21.EsophagealCa.csv |
| 4.ex.Sepsis.dta | Bernard et al. (1997) | 4.ex.Sepsis.csv |
| 5.5.EsophagealCa.dta | Breslow & Day (1980) | 5.5.EsophagealCa.csv |
| 5.ex.InjuryDeath.dta | Scholer et al. (1997) | 5.ex.InjuryDeath.csv |
| 6.9.Hemorrhage.dta | O'Donnell et al. (2000) | 6.9.Hemorrhage.csv |
| 6.ex.Breast.dta | Dupont et al. (1985) | 6.ex.Breast.csv |
| 8.12.Framingham.dta | Levy (1999) | 8.12.Framingham.csv |
| 8.7.Framingham.dta | Levy (1999) | 8.7.Framingham.csv |
| 8.8.2.Person-Years.dta | (no ref) | 8.8.2.Person-Years.csv |
| 8.8.2.Survival.dta | (no ref) | 8.8.2.Survival.csv |
| 8.ex.InjuryDeath.dta | Scholer et al. (1997) | 8.ex.InjuryDeath.csv |
| 11.ex.Sepsis.dta | | 11.ex.Sepsis.csv |
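
The naming convention described above (leading numbers giving the chapter and subsection) can be unpacked mechanically. The following is a minimal Python sketch and is not taken from Dupont (2002): the names `DataSetName` and `parse_name` are hypothetical, and reading `ex` as a chapter's exercises is an assumption.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Pattern for names such as "1.3.2.Sepsis.dta" or "2.ex.vonHippelLindau.csv".
_NAME_RE = re.compile(
    r"^(?P<chapter>\d+)"                     # chapter number
    r"(?:\.(?P<section>ex|\d+(?:\.\d+)*))?"  # optional subsection, or "ex"
    r"\.(?P<label>[A-Za-z].*)"               # descriptive label
    r"\.(?P<ext>dta|csv|log|do)$"            # file type
)


@dataclass(frozen=True)
class DataSetName:
    chapter: int
    section: Optional[str]  # e.g. "3.2", "ex" (assumed: exercises), or None
    label: str
    extension: str


def parse_name(filename: str) -> DataSetName:
    """Split a file name from this page into chapter, subsection, label, type."""
    m = _NAME_RE.match(filename)
    if m is None:
        raise ValueError(f"unrecognized file name: {filename!r}")
    return DataSetName(int(m["chapter"]), m["section"], m["label"], m["ext"])


names = ["10.7.ERpolymorphism.dta", "2.12.Poisson.dta", "1.3.2.Sepsis.dta"]
# Sort by chapter number rather than alphabetically, so 10 follows 2.
ordered = sorted(names, key=lambda n: parse_name(n).chapter)
print(ordered)
```

Sorting on the parsed chapter number is one way to list the files in book order, since a plain alphabetical sort places chapters 10 and 11 before chapter 2.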

These data sets can also be opened directly from within Stata on any computer that is connected to the Internet. For example, a Stata use command followed by the full web address of the 1.3.2.Sepsis.dta file will open the 1.3.2.Sepsis data set directly over the web. When opening a Stata data file in this way, you must be careful to capitalize the web address and data file name correctly.
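
The same care with capitalization applies when fetching the comma-separated versions outside Stata. The following is a minimal Python sketch under stated assumptions: `BASE_URL` is a hypothetical placeholder (the real address of the data directory is not given here and must be substituted), and the function names `data_url`, `read_csv_rows`, and `fetch_rows` are illustrative only.

```python
import csv
import io
from typing import Dict, List
from urllib.parse import quote, urljoin
from urllib.request import urlopen

# Hypothetical placeholder: replace with the real address of the data directory.
BASE_URL = "http://www.example.com/data/"


def data_url(filename: str, base_url: str = BASE_URL) -> str:
    """Join the base address and file name, leaving capitalization untouched."""
    return urljoin(base_url, quote(filename))


def read_csv_rows(text: str) -> List[Dict[str, str]]:
    """Parse CSV text whose first line holds the column names."""
    return list(csv.DictReader(io.StringIO(text)))


def fetch_rows(filename: str, base_url: str = BASE_URL) -> List[Dict[str, str]]:
    """Download one comma-separated file and return its rows as dictionaries."""
    with urlopen(data_url(filename, base_url)) as response:
        return read_csv_rows(response.read().decode("utf-8"))


# The file name is passed through exactly as typed; "sepsis" would be a
# different address from "Sepsis" on a case-sensitive server.
print(data_url("1.3.2.Sepsis.csv"))
```

Calling `fetch_rows("1.3.2.Sepsis.csv")` would only succeed once `BASE_URL` points at the real directory.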

Stata Log Files and Do Files

The log files given in the text and their corresponding do files are given below. Stata released Version 8 of their software soon after the publication of this text. To obtain Version 8 editions of the text’s log and do files, click Version 8.

To download any of the following Stata log files or do files, click on the file name, then follow the directions to save the file onto your computer.

Version 7 log and do files

| Log File Name | Data Source |
|---|---|
| 1.3.2.Sepsis.log | Bernard et al. (1997) |
| 1.3.6.Sepsis.log | Bernard et al. (1997) |
| 1.4.11.Sepsis.log | Bernard et al. (1997) |
| 1.4.14.Sepsis.log | Bernard et al. (1997) |
| 10.7.ERpolymorphism.log | Parl et al. (1989) |
| 11.11.Isoproterenol.log | Lang et al. (1995) |
| 11.2.Isoproterenol.log | Lang et al. (1995) |
| 11.5.Isoproterenol.log | Lang et al. (1995) |
| 11.AreaUnderCurve.log | (no ref) |
| 2.12.Poisson.log | Brent et al. (1999) |
| 2.16.Poisson.log | Brent et al. (1999) |
| 2.18.Funding.log | Gross et al. (1999) |
| 2.20.Framingham.log | Levy (1999) |
| 3.11.1.Framingham.log | Levy (1999) |
| 4.11.Sepsis.log | Bernard et al. (1997) |
| 4.13.1.Sepsis.log | Bernard et al. (1997) |
| 4.18.Sepsis.log | Bernard et al. (1997) |
| 4.22.EsophagealCa.log | Breslow & Day (1980) |
| 5.10.EsophagealCa.log | Breslow & Day (1980) |
| 5.11.1.EsophagealCa.log | Breslow & Day (1980) |
| 5.12.EsophagealCa.log | Breslow & Day (1980) |
| 5.20.EsophagealCa.log | Breslow & Day (1980) |
| 5.32.2.Sepsis.log | Bernard et al. (1997) |
| 5.5.EsophagealCa.log | Breslow & Day (1980) |
| 5.9.EsophagealCa.log | Breslow & Day (1980) |
| 6.16.Hemorrhage.log | O'Donnell et al. (2000) |
| 6.9.Hemorrhage.log | O'Donnell et al. (2000) |
| 7.7.Framingham.log | Levy (1999) |
| 7.9.4.Framingham.log | Levy (1999) |
| 8.12.Framingham.log | Levy (1999) |
| 8.2.Framingham.log | Levy (1999) |
| 8.7.Framingham.log | Levy (1999) |
| 8.8.2.Survival_to_Person-Years.log | (no ref) |
| 8.9.Framingham.log | Levy (1999) |
| 9.3.Framingham.log | Levy (1999) |

Do files

Each log file listed above has a corresponding do file drawn from the same data source.


I would like to thank the following people for generously allowing me to use their data in my text and on this web page.

Gordon R. Bernard, M.D.
Jeffrey Brent, M.D., Ph.D.
Norman E. Breslow, Ph.D.
Graeme Eisenhofer, Ph.D.
Steven M. Greenberg, M.D., Ph.D.
Cary P. Gross, M.D.
Daniel Levy, M.D.
Fritz F. Parl, M.D., Ph.D.
Paul Sorlie, Ph.D.
Wayne A. Ray, Ph.D.
Alastair J.J. Wood, M.D.

Other data web sites

Professor Breslow has posted the Ille-et-Vilaine data set as well as other data sets from his text books at


The opinions expressed in my text are my own and do not necessarily reflect the views of the authors listed above, their employers or funding institutions. This includes the National Heart, Lung and Blood Institute, NIH, DHHS.

William D. Dupont Nashville, TN

Topic revision: r4 - 21 Jan 2022, DalePlummer
