Department of Biostatistics Seminar/Workshop Series

Statistical Methods to Account for Data Errors Discovered from an Audit

Bryan Shepherd, PhD

Assistant Professor, Department of Biostatistics
VUMC School of Medicine

Wednesday, July 15, 2009, 1:30-2:30pm, MRBIII Conference Room 1220

Intended Audience: Persons interested in applied statistics, statistical theory, epidemiology, health services research, clinical trials methodology, statistical computing, statistical graphics, R users or potential users

A data coordinating team performed on-site audits and discovered discrepancies between the data sent to the coordinating center and the data recorded at the sites. We present statistical methods for incorporating audit results into analyses. The setting can be treated as a measurement error problem in which the distribution of errors is a mixture with a point mass at 0. If the error rate is non-zero, then naive estimates of the association between two continuous variables will be biased, even if the mean discrepancy between the reported and correct values of a predictor is 0. We consider scenarios with 1) errors in the predictor, 2) errors in the outcome, and 3) possibly correlated errors in the predictor and outcome. We show how to incorporate the error rate and magnitude, estimated from a random subset (the audited data), to compute unbiased estimates of association and proper confidence intervals. We study the finite-sample properties of our estimators using simulations and illustrate our methods with data from 5152 HIV-infected patients in Latin America, of whom 184 had their data audited.
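
The bias described in the abstract can be illustrated with a small simulation. The sketch below is not the speaker's method. It covers only scenario 1 (errors in the predictor) and applies a simple method-of-moments correction, assuming the errors have mean zero and are independent of the true predictor and the outcome. All names and parameter values (`simulate`, `error_rate`, `error_sd`, the sample sizes) are hypothetical and chosen only for illustration.

```python
import random


def variance(values):
    """Sample variance with the usual n - 1 denominator."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def slope(xs, ys):
    """Ordinary least squares slope of ys regressed on xs."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def simulate(n=20000, n_audit=2000, beta=1.0,
             error_rate=0.2, error_sd=2.0, seed=1):
    """Simulate predictor errors with a point mass at 0 and correct the slope.

    Illustrative assumptions (not from the talk): mean-zero normal errors,
    independent of the true predictor and of the outcome.
    """
    rng = random.Random(seed)
    x_true = [rng.gauss(0, 1) for _ in range(n)]
    y = [beta * x + rng.gauss(0, 1) for x in x_true]

    # Mixture error: most records are correct (error exactly 0),
    # a fraction error_rate receive a mean-zero normal error.
    x_reported = [
        x + rng.gauss(0, error_sd) if rng.random() < error_rate else x
        for x in x_true
    ]

    # The audit reveals the true value for a random subset of records.
    audited = rng.sample(range(n), n_audit)
    discrepancies = [x_reported[i] - x_true[i] for i in audited]

    # Error rate and error variance, estimated from the audited records only.
    error_rate_hat = sum(d != 0 for d in discrepancies) / n_audit
    error_var_hat = sum(d * d for d in discrepancies) / n_audit

    # Naive slope uses the reported predictor and is attenuated toward 0.
    naive = slope(x_reported, y)

    # Moment correction: rescale by var(reported) / (var(reported) - var(error)).
    var_reported = variance(x_reported)
    corrected = naive * var_reported / (var_reported - error_var_hat)

    return {
        "naive_slope": naive,
        "corrected_slope": corrected,
        "error_rate_hat": error_rate_hat,
    }


if __name__ == "__main__":
    print(simulate())
```

Under these assumptions the naive slope shrinks toward 0 by roughly the factor var(X) / (var(X) + p * sigma^2), where p is the error rate and sigma^2 is the variance of a non-zero error, so the bias appears even though the errors average to 0. The sketch does not produce confidence intervals; a proper interval would also need to reflect the uncertainty in the audit-based estimates.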