### Department of Biostatistics Seminar/Workshop Series

# Statistical methods for analysis of graph-constrained genomic data

## Caiyan Li, PhD Candidate

### Department of Biostatistics and Epidemiology

University of Pennsylvania School of Medicine

### Wednesday, June 17, 1:30-2:30pm, MRBIII Conference Room 1220

### Intended Audience: Persons interested in applied statistics, statistical theory, epidemiology, health services research, clinical trials methodology, statistical computing, or statistical graphics; R users or potential users

Graphs and networks are common ways of depicting information. In biology, many different processes are represented by graphs, such as regulatory networks, metabolic pathways and protein-protein interaction networks. This kind of a priori information, accumulated over many years of biomedical research, is a useful supplement to standard numerical genomic data such as microarray gene expression data. Incorporating the information encoded by known biological pathways into the analysis of numerical data raises interesting statistical challenges. In this talk, I will present two approaches for graph-constrained analysis of genomic data. The first approach is based on a GRAph-Constrained Estimation (GRACE) procedure for regression analysis to identify the sub-networks that are predictive of a certain clinical outcome. I will present the formulation of the problem and theoretical results on model selection consistency and estimation errors. Simulation studies indicate that the methods are effective in identifying genes and modules that are related to disease, and that they have smaller prediction errors than commonly used procedures that ignore pathway structure information. The second approach is based on a discrete hidden Markov random field (MRF) model for identifying genes and sub-networks whose transcriptional activities are perturbed by, or activated in response to, experimental conditions. I will demonstrate the application of the two proposed methods using a microarray gene expression study of human brain aging.