Comprehensive Introduction to Clinical Investigation
Biostatistics
Division of Biostatistics and Epidemiology
Department of Health Evaluation Sciences
924-8712
July 2001
Instructors:
Frank Harrell PhD
fharrell@virginia.edu
James Patrie MS
jpatrie@virginia.edu
Jennifer Gibson MS
jjgibson@virginia.edu
Mark Conaway PhD
mconaway@virginia.edu
Text:
Rosner, B. Fundamentals of Biostatistics, 5th Edition.
Pacific Grove, CA, 2000.
Articles:
Series of short articles about statistical concepts by JM Bland and DG
Altman et al. appearing in British Medical Journal
(provided in course pack).
Cohn V: A perspective from the press: how to help reporters tell the
truth (sometimes). Stat in Med 2001; 20:1341-1346.
Matthews JNS, et al.: Analysis of serial measurements in
medical research. British Med J 1990; 300:230-235.
1 Section Description
This section introduces biostatistical concepts that are useful in
clinical research, including the following.
- Experimental design
- Various types of random variables
- Data distributions and descriptive statistics
- Graphical presentation of data and results
- Probability
- Data analysis for description, estimation, hypothesis testing,
and prediction
- Linear regression models
- Dealing with repeated measurements in one patient and how to
measure change
- Avoiding pitfalls in interpreting statistical analyses
There are seven 3-hour sessions, some of which are split into multiple
topics.
2 Session Format
Sessions will be a combination of lecture and discussion. Lectures
will be informal; questions and discussion are encouraged at any
time. Sessions will rely heavily on participants reading assigned
sections of the text and several short articles in advance. Copies of
slides will be made available to participants at the start of each
session.
3 Assignments and Exercises
There will be a few short assignments and quizzes to stimulate
discussion and to identify weak areas. These will not be graded.
4 Session Outlines
In what follows Rn refers to Chapter or Section n in Rosner, BAn
refers to number n in the series of occasional notes on medical
statistics by Bland and Altman, and ABn refers to an article
in which Altman is the first author. MAn refers to articles written
by Matthews and Altman.
4.1 Session One 3 July 2001
Presenter: Frank Harrell
Topics and Readings:
- General Overview (R1)
- Descriptive Statistics and Graphics (AB8, R2)
Objectives: To
- understand the role of biostatistics as a science, and
biostatistical methods as tools of scientific inquiry
- understand the meaning of description, estimation, hypothesis testing, and
prediction
- understand what is meant by random variable
- know advantages of using continuous variables and of preserving
their continuous nature in the analysis
- understand distributions of random variables
- know characteristics of distributions (central tendency, variance
(variability, spread), quantiles or percentiles)
- be able to choose graphs that are useful for depicting data
distributions
- be able to choose graphs that are useful for summarizing
results of studies
- be able to make informative tables
4.2 Session Two 5 July 2001
Presenter: Jennifer Gibson
Topics and Readings:
- Probability (R3.1-3.6)
- Estimation (R6.1-6.2,6.4-6.7.1)
Objectives: To
- understand the meaning of probability
- understand what it means to say that two events are independent
- be able to compute the probability of the union of two events
- be able to compute the probability of the intersection of two
independent events
- understand conditional probability
- know the meaning of population and a sample from that
population
- know how to estimate population quantities such as mean,
median and other quantiles, and standard deviation from sample values
- obtain an initial understanding of interval estimates and how
to construct a confidence interval for the unknown mean of a
normal-shaped population
- understand how to estimate a population probability from a
sample of events and non-events
- know a simple approximate formula for a confidence interval for
an unknown population probability
- memorize and understand the 3/n rule
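As a rough illustration of several of the objectives above, the following Python sketch computes the probability of a union, the probability of an intersection under independence, an approximate confidence interval for a probability, and the 3/n rule. Python is not part of the course materials, all function names are hypothetical, and the interval shown assumes that the "simple approximate formula" is the usual normal-approximation (Wald) interval; the readings may present a different one.

```python
import math


def union_probability(p_a, p_b, p_a_and_b):
    """P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b


def intersection_if_independent(p_a, p_b):
    """P(A and B) = P(A) * P(B); valid only when A and B are independent."""
    return p_a * p_b


def wald_interval(events, n, z=1.96):
    """Approximate 0.95 interval for a probability (assumed Wald formula):
    p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n), truncated to [0, 1]."""
    p_hat = events / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


def rule_of_three_upper(n):
    """3/n rule: with 0 events in n trials, the upper 0.95 confidence
    limit for the event probability is roughly 3/n."""
    return 3 / n


def exact_upper_zero_events(n, alpha=0.05):
    """Exact counterpart of the 3/n rule: solve (1 - p)^n = alpha for p."""
    return 1 - alpha ** (1 / n)


if __name__ == "__main__":
    print(union_probability(0.3, 0.2, 0.06))
    print(wald_interval(20, 100))
    print(rule_of_three_upper(100), exact_upper_zero_events(100))
```

With made-up numbers such as 0 events in 100 patients, the rule gives 0.03 and the exact calculation gives a value just under that, which is why 3/n works as a quick mental approximation.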
4.3 Session Three 10 July 2001
Presenter: Frank Harrell
Topics and Readings:
- Hypothesis Testing: One-sample inference
(R7(except 7.4.1,7.8,7.9.2,7.10),BA8,AB1)
- Two-sample inference (R8(except 8.6,8.7,8.9,8.11),MA25)
Objectives: To
- understand the fundamentals of hypothesis testing and
assembling evidence using classical statistics
- know the meanings of type I and II errors, P-values, and
power
- know the general structure of a t statistic
- know one basis for estimating the required sample size
- understand the construction and interpretation of a confidence
interval for an unknown mean from a normal population
- know the relationship between confidence intervals and
P-values
- know how to carry out and interpret a one-sample t-test for
paired (R8.2) or unpaired data from a normal distribution
- understand how P-values are ``backwards'' and how to avoid errors in
interpreting them
- learn how to compute and interpret confidence intervals for the
difference in two population means when the data are normal
- understand the setup for a two-sample problem
- be able to carry out a two-sample (unpaired) t-test for
normally distributed data
- be able to construct and interpret a confidence interval for
the difference in two means
- know how to compute power or the sample size to achieve a given
power for comparing two means
- know how to compute the sample size to achieve a given
precision for estimating a probability, a mean, and a difference in
two means
- understand pitfalls in interpreting P-values
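The general structure of a t statistic mentioned above (an estimate minus its hypothesized value, divided by the standard error of the estimate) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the two-sample version assumes equal variances (pooled estimate), and the sample-size function uses the common normal-approximation formula for estimating a mean to within a given margin, which may differ from the method taught in the session.

```python
import math
import statistics


def one_sample_t(values, mu0):
    """t = (sample mean - mu0) / (s / sqrt(n)), with n - 1 degrees of freedom.
    For paired data, pass the within-pair differences and mu0 = 0."""
    n = len(values)
    se = statistics.stdev(values) / math.sqrt(n)
    return (statistics.mean(values) - mu0) / se, n - 1


def two_sample_t(x, y):
    """Unpaired t statistic with a pooled variance (assumes equal variances)."""
    n1, n2 = len(x), len(y)
    pooled_var = ((n1 - 1) * statistics.variance(x)
                  + (n2 - 1) * statistics.variance(y)) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.mean(x) - statistics.mean(y)) / se, n1 + n2 - 2


def n_for_margin(sigma, margin, z=1.96):
    """Smallest n for which z * sigma / sqrt(n) <= margin
    (normal approximation, sigma assumed known)."""
    return math.ceil((z * sigma / margin) ** 2)


if __name__ == "__main__":
    # Made-up data, for illustration only
    print(one_sample_t([1, 2, 3, 4, 5], mu0=3))
    print(two_sample_t([1, 2, 3], [4, 5, 6]))
    print(n_for_margin(sigma=10, margin=2))
```

In each case the numerator is an estimate and the denominator is its standard error, so the same template covers the one-sample, paired, and two-sample tests.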
4.4 Session Four 11 July 2001
Presenter: Frank Harrell
Topics and Readings:
- Comparing two proportions (R10.1-10.2,10.5.1)
- Nonparametric methods (R9.1,9.3-9.6)
- Hypothesis testing review (R7,R8)
Objectives: To
- learn how to do an approximate test for the difference in two
proportions by hand
- learn to use approximate methods for computing sample size or
power for comparing two population probabilities
- learn the advantages of nonparametric tests for continuous
responses without assuming a distribution
- understand the nonparametric counterpart of the one-sample
t-test, the Wilcoxon signed-rank test
- understand the nonparametric counterpart of the two-sample
t-test, the Wilcoxon-Mann-Whitney two-sample rank-sum test
- review ``big picture'' concepts of hypothesis testing and
interval estimation
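The approximate test for a difference in two proportions can be written out step by step, much as one would do it by hand. The Python sketch below is an assumption-laden illustration: it uses the pooled normal-approximation z-test without a continuity correction (the text may include one), and the function name and data are hypothetical.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled z-test for H0: p1 = p2 (normal approximation, no continuity
    correction). Returns the z statistic and the two-sided P-value."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)                       # estimate of common p under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))           # equals 2 * (1 - Phi(|z|))
    return z, p_value


if __name__ == "__main__":
    # Made-up counts: 30/100 events in one group, 20/100 in the other
    z, p = two_proportion_z_test(30, 100, 20, 100)
    print(round(z, 3), round(p, 3))
```

The approximation is usually considered adequate only when the expected counts of events and non-events in each group are not small.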
4.5 Session Five 17 July 2001
Presenter: Jim Patrie
Topics and Readings:
- Regression and Correlation (R11.1-11.7,11.9-11.10)
Objectives: To
- understand in detail the simple linear regression model and how
its slope and intercept are estimated
- understand interval estimation of the slope and of a prediction
- know the assumptions made by regression
- understand multiple regression, especially interpreting
regression coefficients and what it means to adjust for the effects
of certain variables
- know what the linear correlation coefficient measures
- understand the correspondence between testing for nonzero
correlation and testing for nonzero slope in simple regression
- be able to interpret R^2, the coefficient of determination
- know the assumptions made by standard linear multiple regression
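To make the estimation of the slope and intercept concrete, here is a minimal Python sketch of ordinary least squares for simple linear regression, together with the linear correlation coefficient and R^2. The function name and data are hypothetical, and the sketch covers only the point estimates, not the interval estimates or the assumption checks discussed in the session.

```python
import math


def simple_linear_regression(x, y):
    """Least-squares fit of y = intercept + slope * x.
    Returns (slope, intercept, r, r_squared)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)        # linear (Pearson) correlation coefficient
    return slope, intercept, r, r ** 2


if __name__ == "__main__":
    # Made-up data, for illustration only
    print(simple_linear_regression([1, 2, 3, 4], [2, 4, 5, 8]))
```

Because slope = r * sqrt(syy / sxx), the slope is zero exactly when r is zero, which is the correspondence between testing for nonzero correlation and testing for nonzero slope in simple regression.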
4.6 Session Six 18 July 2001
Presenter: Frank Harrell
Topics and Readings:
- Regression Review (R11)
- Rank correlation (R11.12)
- One-way analysis of variance and the Kruskal-Wallis test
(R12.1,AB20,R12.7)
- Heterogeneity of effects (BA23,AM24,MA25,MA26,R12.6)
- Analysis of covariance (R12.5.3)
- Multiple significance tests (BA10)
Objectives: To
- further understand the most important issues related to
regression analysis, and hazards of multiple regression
- know how to estimate the sample size needed to estimate a
correlation coefficient to a certain precision
- know the advantages of the nonparametric counterpart to the linear
correlation coefficient and test
- understand principles involved in comparing k groups using
analysis of variance
- know a method for pairwise comparisons of means
- understand how the Kruskal-Wallis test generalizes the Wilcoxon
test from 2 to k samples
- understand advantages of the Kruskal-Wallis test over
parametric analysis of variance
- know when a two-way ANOVA is appropriate
- be introduced to methods for assessing differential treatment
effects
- know the purpose of analysis of covariance
- be introduced to methods (such as Bonferroni) for keeping the
probability of a false positive result at an acceptable level when
many hypotheses are tested
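The motivation for a correction such as Bonferroni can be seen with a small calculation. The Python sketch below uses hypothetical function names and assumes the tests are independent when computing the chance of at least one false positive; the Bonferroni threshold itself does not require independence.

```python
def familywise_error_rate(alpha, k):
    """Chance of at least one false positive among k independent tests,
    each carried out at level alpha, when every null hypothesis is true."""
    return 1 - (1 - alpha) ** k


def bonferroni_threshold(alpha, k):
    """Per-test significance level that keeps the overall false positive
    probability at or below alpha across k tests."""
    return alpha / k


if __name__ == "__main__":
    print(familywise_error_rate(0.05, 10))   # about 0.40
    print(bonferroni_threshold(0.05, 10))    # 0.005
```

With ten tests at the 0.05 level, the chance of at least one false positive is roughly 0.40; testing each at 0.005 instead keeps the overall chance below 0.05.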
4.7 Session Seven 19 July 2001
Presenters: Mark Conaway and Frank Harrell
Topics, Readings, and Presenter:
- Measuring change (Harrell) (TBD)
- Repeated Measurements (Conaway) (BA1,BA12,BA13,Matthews et al.)
- Experimental Design (Conaway)
Objectives: To
- know problems with percent change
- understand one basis for choosing a measure of change
- understand some of the most common experimental designs used in
experiments to compare therapies
- be introduced to factorial designs and their advantages and
disadvantages
- know why multiple measurements from the same patient cannot be
analyzed as if they were measurements from separate patients
- be introduced to simple methods for analyzing such serial data
Footnote 1: ``Absence of evidence'' paper