Department of Biostatistics Seminar/Workshop Series
Probability Machines
James D. Malley, PhD
Center for Information Technology, National Institutes of Health, Bethesda, MD
Wednesday, May 18, 1:30-2:30pm, MRBIII Conference Room 1220
Many statistical learning machines can provide an optimal classification for binary outcomes. Personalized medicine, however, requires probabilities, because risk must be estimated from an individual patient's characteristics. This talk shows that statistical learning machines that are consistent for the nonparametric regression problem are also consistent for the probability estimation problem. Such machines will be called probability machines.
Probability machines discussed include classification and regression random forests and two nearest-neighbor machines, all of which can use any collection of predictors with arbitrary statistical structure. Two simulated and two real data sets illustrate how these machines estimate probabilities for an individual.
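
The central idea of the abstract can be illustrated with a small sketch. For a 0/1 outcome Y, the regression function E[Y | X = x] equals P(Y = 1 | X = x), so a machine that estimates the regression function consistently also estimates the probability. The Python below is a minimal illustration under stated assumptions, not the speaker's implementation: the simulated data, the linear true probability, and the names `true_prob`, `simulate`, and `knn_probability` are all hypothetical. It uses a plain k-nearest-neighbor average, which is one simple kind of nearest-neighbor machine.

```python
import random


def true_prob(x):
    """Assumed true P(Y = 1 | X = x) for the simulation (hypothetical)."""
    return 0.2 + 0.6 * x


def simulate(n, seed=0):
    """Draw n pairs (x, y) with x uniform on [0, 1] and y Bernoulli(true_prob(x))."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.random()
        y = 1 if rng.random() < true_prob(x) else 0
        data.append((x, y))
    return data


def knn_probability(data, x0, k):
    """Regress the 0/1 outcome on x: average y over the k nearest neighbors of x0.

    Because y is 0 or 1, this average is an estimate of P(Y = 1 | X = x0).
    """
    nearest = sorted(data, key=lambda xy: abs(xy[0] - x0))[:k]
    return sum(y for _, y in nearest) / k


data = simulate(5000)
for x0 in (0.1, 0.5, 0.9):
    estimate = knn_probability(data, x0, k=200)
    print(f"x0={x0}: estimated {estimate:.3f}, true {true_prob(x0):.3f}")
```

The estimate is a proportion of 0/1 outcomes, so it always lies between 0 and 1. In this sketch, k controls the usual trade-off: a larger k gives a smoother but potentially more biased estimate, a smaller k a noisier one.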