Department of Biostatistics Seminar/Workshop Series
Estimating Re-identification Risk in Biomedical Research Databases
Bradley Malin, Ph.D.
Assistant Professor of Biomedical Informatics, Director of the Health Information Privacy Laboratory, VUMC
Wednesday, December 17, 1:45-2:55pm, MRBIII Conference Room 1220
Intended Audience: Persons interested in applied statistics, statistical theory, epidemiology, health services research, clinical trials methodology, statistical computing, statistical graphics, R users or potential users
For years, medical researchers have been directed to de-identify patients’ health records and biological data before such information is shared beyond the collecting institution. This policy is reinforced by Institutional Review Boards, as well as by state and federal regulations, such as the Privacy Rule of the Health Insurance Portability and Accountability Act. De-identified data appears to be protected; however, the decreasing cost and increasing adoption of information and networking technologies have created a complex landscape that has eroded the protections afforded by such policies. Consequently, our research has shown that de-identification offers few guarantees of protection. In this talk, I will review various automated approaches we have developed to link patients’ identities to seemingly anonymous biomedical data, often using nothing more than publicly available information. I will also explore why all hope is not lost: how we can integrate policy with statistical and computational formalisms to measure the risks associated with sharing data under various policies, and how to provably protect patients’ records from privacy-invading attacks without impeding worthwhile biomedical research. This talk will draw upon real, emerging biomedical research infrastructures, such as de-identified repositories of electronic medical and genomic records at the National Institutes of Health.