Department of Biostatistics Seminar/Workshop Series
Sleep Hypnograms, Insurance Claims, & Hand Movement After Stroke: Big Data and Potentially Many Weak, Predictive Signals
Bruce Swihart, PhD
Postdoctoral Fellow in Biostatistics, Johns Hopkins Bloomberg School of Public Health
The increasing ease and decreasing cost of collecting complex, structured data have ushered in the Big Data era. Three big data sets are discussed in terms of their size and modeling approaches, with a primary focus on sleep hypnogram data. First, 5,598 sleep hypnograms from the Sleep Heart Health Study are analyzed in the context of sleep-disordered breathing and fragmented sleep, using scalable GEE log-linear models as well as multi-state survival models. The information lost in moving from 5-state hypnograms to 3-state hypnograms is explored. In the second data set, Insurance Claims, the task is to predict Days in Hospital for the following year, as posed by the $3 million Heritage Health Prize challenge. The third data set, Hand Movements After Stroke (Stroke Kinematics), consists of highly structured and complex data whose features strongly classify stroke status.