Department of Biostatistics Seminar/Workshop Series

Augmented Weighted Support Vector Machines for Missing Covariates

Thomas G. Stewart, PhD Biostatistics

University of North Carolina, Chapel Hill

Support vector machines (SVMs) are a popular tool for a wide variety of classification tasks. A key feature of SVMs is the flexibility to generate both linear and non-linear decision rules. This feature is particularly helpful when the relationship between the outcome and the predictor variables is complex, as is frequently the case in biomedical studies. A practical challenge for SVMs, as with many other classification methods, is the common real-world issue of missing covariates. Currently, many researchers and users of SVMs rely on complete-case or imputation solutions, which may introduce bias and reduce classification accuracy. Other approaches are limited to specific missing-data scenarios or by computational cost. In this presentation, I discuss an EM-motivated solution to the incomplete covariate problem for SVMs. In this method, the hinge loss for observations with missing covariates is replaced with its quasi-expectation conditional on the observed data and postulated model parameters. Simulations show that the proposed method often yields classification rules with higher accuracy than existing methods. I apply the approach to analyze data from HCV-TARGET, a longitudinal study of Hepatitis C patients.
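To make the idea of replacing the hinge loss with a conditional expectation concrete, here is a minimal Python sketch. It is not the speaker's implementation: the abstract does not define the quasi-expectation, so this sketch assumes a plain Monte Carlo average, a linear decision rule, and independent normal distributions for the missing covariates given the observed data. The names `expected_hinge`, `cond_mean`, and `cond_sd` are hypothetical.

```python
import numpy as np


def hinge(y, score):
    """Standard SVM hinge loss for a label y in {-1, +1}."""
    return np.maximum(0.0, 1.0 - y * score)


def expected_hinge(x, y, w, b, cond_mean, cond_sd, n_draws=2000, seed=0):
    """Approximate the expected hinge loss for one observation.

    x         : 1-D array of covariates, with np.nan marking missing entries
    w, b      : weights and intercept of a linear rule (assumed for simplicity)
    cond_mean : assumed conditional mean(s) of the missing entries
    cond_sd   : assumed conditional standard deviation(s) of the missing entries
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    missing = np.isnan(x)

    # Fully observed: the ordinary hinge loss applies unchanged.
    if not missing.any():
        return float(hinge(y, x @ w + b))

    # Assumption: missing entries are independent normals given the observed data.
    draws = np.tile(x, (n_draws, 1))
    draws[:, missing] = rng.normal(
        cond_mean, cond_sd, size=(n_draws, int(missing.sum()))
    )

    # Average the hinge loss over the filled-in copies of the observation.
    return float(hinge(y, draws @ w + b).mean())
```

For example, `expected_hinge([1.0, np.nan], 1, [0.5, -0.25], 0.0, cond_mean=2.0, cond_sd=1.0)` averages the loss over plausible values of the second covariate instead of plugging in a single imputed value. In a full procedure, a term of this kind would stand in for the ordinary hinge loss of each incomplete observation inside the SVM objective.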
Topic revision: r1 - 18 Aug 2015, AshleeBartley
 
