Statistical Methods for Integrative Analysis of Genomics and Proteomics Data

Wang, Pei

Abstract

Tumors are complex biological systems. No single molecular approach fully elucidates tumor behavior, so analysis is needed at multiple levels encompassing genomics and proteomics. Different types of data from numerous sources are therefore now collected at a genome-wide scale, including DNA copy number alterations, mRNA expression, protein expression measurements, and many others. However, the full extent of biomedical information in these studies cannot be realized without effective statistical and computational methods. Thus, the long-term goal of this research is to develop innovative methods that jointly model these different types of data to help uncover the large-scale organization of interacting genes and proteins. To tackle this challenge, this proposal begins in Aim 1 by developing new statistical and computational methods for identifying DNA/RNA/protein interactions. We propose to use tools developed for graphical models and to study conditional dependencies among genes/proteins with various conditional correlations.
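To illustrate what a conditional dependency means in this setting, the following is a minimal sketch and not the proposal's method. It assumes a hypothetical three-variable chain (copy number drives mRNA, which drives protein) with made-up effect sizes, and compares the marginal correlation between copy number and protein with their partial correlation given mRNA. All function names and simulated values are illustrative assumptions.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation of x and y after conditioning on a single variable z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def simulate_chain(n=5000, seed=0):
    """Toy chain: copy number -> mRNA -> protein (effect sizes are hypothetical)."""
    rng = random.Random(seed)
    dna = [rng.gauss(0, 1) for _ in range(n)]
    rna = [0.8 * d + rng.gauss(0, 1) for d in dna]
    protein = [0.8 * r + rng.gauss(0, 1) for r in rna]
    return dna, rna, protein


dna, rna, protein = simulate_chain()
marginal = pearson(dna, protein)
conditional = partial_corr(dna, protein, rna)
print(f"marginal corr(DNA, protein)      = {marginal:.3f}")
print(f"partial corr(DNA, protein | RNA) = {conditional:.3f}")
```

In this toy chain the marginal correlation is clearly positive, while the partial correlation is close to zero, because copy number affects protein only through mRNA. A graphical model would record this as a missing edge between copy number and protein.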
Aim 2 proposes novel approaches to integrate the interaction network with disease phenotypes to improve biomarker identification and clinical outcome prediction. We will derive modules of genes/proteins that are associated with disease initiation/progression, and use boosting procedures to incorporate the module information into the predictive models. In both aims, sparse regression techniques together with proper smooth regularization will be used to handle the high dimensionality and to account for local correlation. The proposal uses two breast cancer studies as motivating examples, but the tools developed here can be generalized to other diseases. Success of this research will result in substantially improved statistical methods for large-scale integration studies, and thus help to increase mechanistic understanding of the contribution of genomic/proteomic alterations to tumor growth and progression, as well as facilitate the development of more effective molecular diagnostic and prognostic tests. Data from the two breast cancer studies will be used together with extensive simulation experiments to test and refine the methodology for real-world application.
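The abstract does not specify the penalty behind "sparse regression with smooth regularization", so the sketch below shows only one possible reading: an L1 penalty for sparsity plus a quadratic penalty on differences between adjacent coefficients, fitted by coordinate descent. The function names, penalty weights, and simulated data are all assumptions made for illustration.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_sparse_smooth(X, y, lam_sparse, lam_smooth, n_sweeps=100):
    """Coordinate descent for the assumed objective
    (1/2n)||y - Xb||^2 + lam_sparse * sum|b_j|
        + (lam_smooth/2) * sum (b_j - b_{j-1})^2.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # always equals y - X @ beta
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            neighbors = [beta[k] for k in (j - 1, j + 1) if 0 <= k < p]
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n
            rho += col_sq[j] * beta[j]
            new = soft_threshold(rho + lam_smooth * sum(neighbors), lam_sparse)
            new /= col_sq[j] + lam_smooth * len(neighbors)
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new
    return beta


# Hypothetical data: 20 ordered features, only a contiguous block is active.
rng = random.Random(0)
n, p = 200, 20
true_beta = [1.0 if 5 <= j <= 8 else 0.0 for j in range(p)]
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [sum(x * b for x, b in zip(row, true_beta)) + rng.gauss(0, 0.5) for row in X]

beta_hat = fit_sparse_smooth(X, y, lam_sparse=0.1, lam_smooth=0.1)
print([round(b, 2) for b in beta_hat])
```

The L1 term drives coefficients of irrelevant features toward zero, while the smoothness term encourages neighboring features (for example, adjacent genomic loci) to receive similar coefficients. Other smooth penalties would fit the abstract's description equally well.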

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM082802-05
Application #: 8281458
Study Section: Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer: Marcus, Stephen

Project Start: 2008-07-15
Project End: 2013-12-30
Budget Start: 2012-05-01
Budget End: 2013-12-30
Support Year: 5
Fiscal Year: 2012
Total Cost: $270,511
Indirect Cost: $97,436

Institution

Name: Fred Hutchinson Cancer Research Center
DUNS #: 078200995

City: Seattle
State: WA
Country: United States
Zip Code: 98109

Related projects


NIH 2012 R01 GM	Statistical Methods for Integrative Analysis of Genomics and Proteomics Data	Wang, Pei / Fred Hutchinson Cancer Research Center	$270,511
NIH 2011 R01 GM	Statistical Methods for Integrative Analysis of Genomics and Proteomics Data	Wang, Pei / Fred Hutchinson Cancer Research Center	$270,725
NIH 2010 R01 GM	Statistical Methods for Integrative Analysis of Genomics and Proteomics Data	Wang, Pei / Fred Hutchinson Cancer Research Center	$273,085
NIH 2009 R01 GM	Statistical Methods for Integrative Analysis of Genomics and Proteomics Data	Wang, Pei / Fred Hutchinson Cancer Research Center	$283,189
NIH 2008 R01 GM	Statistical Methods for Integrative Analysis of Genomics and Proteomics Data	Wang, Pei / Fred Hutchinson Cancer Research Center	$299,987

Publications

Fu, Rong; Wang, Pei; Ma, Weiping et al. (2017) A statistical method for detecting differentially expressed SNVs based on next-generation RNA-seq data. Biometrics 73:42-51

Zhou, Yan; Wang, Pei; Wang, Xianlong et al. (2017) Sparse multivariate factor analysis regression models and its applications to integrative genomics analysis. Genet Epidemiol 41:70-80

Petralia, Francesca; Song, Won-Min; Tu, Zhidong et al. (2016) New Method for Joint Network Analysis Reveals Common and Different Coexpression Patterns among Genes and Proteins in Breast Cancer. J Proteome Res 15:743-54

Wang, Xianlong; Qin, Li; Zhang, Hexin et al. (2015) A regularized multivariate regression approach for eQTL analysis. Stat Biosci 7:129-146

Danaher, P; Paul, D; Wang, P (2015) Covariance-based analyses of biological pathways. Biometrika 102:533-544

Petralia, Francesca; Wang, Pei; Yang, Jialiang et al. (2015) Integrative random forest for gene regulatory network inference. Bioinformatics 31:i197-205

Teixeira, Leonardo K; Wang, Xianlong; Li, Yongjiang et al. (2015) Cyclin E deregulation promotes loss of specific genomic regions. Curr Biol 25:1327-33

Danaher, Patrick; Wang, Pei; Witten, Daniela M (2014) The joint graphical lasso for inverse covariance estimation across multiple classes. J R Stat Soc Series B Stat Methodol 76:373-397

Hu, Jie Kate; Wang, Xianlong; Wang, Pei (2014) Testing gene-gene interactions in genome wide association studies. Genet Epidemiol 38:123-34

Cheng, Jie; Levina, Elizaveta; Wang, Pei et al. (2014) A sparse Ising model with covariates. Biometrics 70:943-53

Showing the most recent 10 out of 28 publications
