Statistical Methods for Genomic and Proteomic Data

Pan, Wei

Abstract

We propose to develop, evaluate and compare biologically motivated statistical methods for analyzing and interpreting microarray data, including a heart failure dataset. Some of the proposed methods will incorporate or be applied to multiple types of genomic or proteomic data. The overarching theme is that, to increase statistical power for new discovery and to maximize the use of existing knowledge and data, we propose integrating gene networks and multiple types of high-throughput data, such as gene expression data, DNA-protein binding data, DNA sequences and SNP data, with novel analysis methods and applications. Specifically, we propose 1) further developing and evaluating a network-based statistical analysis method for genomic discovery, with applications to several real datasets; 2) developing analysis strategies to integrate gene networks and gene functional annotations for genomic discovery, such as detecting differentially expressed genes based on expression data and identifying the binding target genes of a single transcription factor based on DNA-protein binding (i.e., ChIP-chip) data; 3) developing analysis strategies to integrate gene networks and multiple types of genomic and proteomic data, such as gene expression data, DNA-protein binding data, and DNA sequences; 4) integrating gene networks into regression analysis for variable selection and parameter smoothing, with applications to inferring expression quantitative trait loci (eQTL) by regressing expression data on genotype data; and 5) developing software for free public use.
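As a rough illustration of what "integrating gene networks into regression analysis" for parameter smoothing (aim 4) could look like, the sketch below fits a least-squares regression with a graph-Laplacian penalty, which pulls the coefficients of network-linked predictors toward each other. This is a generic, well-known technique swapped in for illustration, not the specific method proposed in this project; the function names, the toy network, the simulated data, and the penalty weight `lam` are all assumptions. It covers only the smoothing part, not variable selection.

```python
import numpy as np

def graph_laplacian(n_nodes, edges):
    """Unnormalized Laplacian L = D - A of an undirected network."""
    A = np.zeros((n_nodes, n_nodes))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return np.diag(A.sum(axis=1)) - A

def network_penalized_fit(X, y, L, lam=1.0, ridge=1e-8):
    """Minimize ||y - X b||^2 + lam * b' L b + ridge * ||b||^2.

    b' L b equals the sum of (b_i - b_j)^2 over network edges, so a
    larger lam smooths coefficients across linked predictors. The tiny
    ridge term only keeps the linear system well conditioned.
    """
    p = X.shape[1]
    lhs = X.T @ X + lam * L + ridge * np.eye(p)
    return np.linalg.solve(lhs, X.T @ y)

# Simulated toy data (hypothetical): 6 predictors in two 3-node chains.
rng = np.random.default_rng(0)
n, p = 50, 6
X = rng.normal(size=(n, p))
beta_true = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])
y = X @ beta_true + rng.normal(scale=0.5, size=n)

edges = [(0, 1), (1, 2), (3, 4), (4, 5)]
L = graph_laplacian(p, edges)

beta_ols = network_penalized_fit(X, y, L, lam=0.0)   # no network penalty
beta_net = network_penalized_fit(X, y, L, lam=10.0)  # network-smoothed
```

Comparing `beta_net @ L @ beta_net` with `beta_ols @ L @ beta_ols` shows how much the penalty reduced coefficient differences along network edges; in practice `lam` would be chosen by cross-validation or a similar criterion.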

Public Health Relevance

This proposed research is expected not only to advance statistical methodology and theory for complex data with complicated dependency structures, but also to contribute valuable analysis tools to the elucidation of molecular mechanisms underlying diseases.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Heart, Lung, and Blood Institute (NHLBI)
Type: Research Project (R01)
Project #: 5R01HL065462-09
Application #: 8073077
Study Section: Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer: Wolz, Michael

Project Start: 2000-07-01
Project End: 2013-05-31
Budget Start: 2011-06-01
Budget End: 2012-05-31
Support Year: 9
Fiscal Year: 2011
Total Cost: $179,369
Indirect Cost

Institution

Name: University of Minnesota Twin Cities
Department: Biostatistics & Other Math Sci
Type: Schools of Public Health
DUNS #: 555917996

City: Minneapolis
State: MN
Country: United States
Zip Code: 55455

Related projects


NIH 2012 R01 HL	Statistical Methods for Genomic and Proteomic Data	Pan, Wei / University of Minnesota Twin Cities	$177,296
NIH 2011 R01 HL	Statistical Methods for Genomic and Proteomic Data	Pan, Wei / University of Minnesota Twin Cities	$179,369
NIH 2010 R01 HL	Statistical Methods for Genomic and Proteomic Data	Pan, Wei / University of Minnesota Twin Cities	$179,642
NIH 2009 R01 HL	Statistical Methods for Genomic and Proteomic Data	Pan, Wei / University of Minnesota Twin Cities	$204,907
NIH 2007 R01 HL	Statistical Methods for Genomic and Proteomic Data	Pan, Wei / University of Minnesota Twin Cities	$141,753
NIH 2006 R01 HL	Statistical Methods for Genomic and Proteomic Data	Pan, Wei / University of Minnesota Twin Cities	$145,987
NIH 2005 R01 HL	Statistical Methods for Genomic and Proteomic Data	Pan, Wei / University of Minnesota Twin Cities	$174,500
NIH 2004 R01 HL	Model Building - Marginal Regression with Dependent Data	Pan, Wei / University of Minnesota Twin Cities	$107,057
NIH 2003 R01 HL	Model Building - Marginal Regression with Dependent Data	Pan, Wei / University of Minnesota Twin Cities	$107,223
NIH 2002 R01 HL	Model Building - Marginal Regression with Dependent Data	Pan, Wei / University of Minnesota Twin Cities	$107,344

Publications

Austin, Erin; Pan, Wei; Shen, Xiaotong (2014) Does the inclusion of rare variants improve risk prediction? BMC Proc 8:S94

Kim, Junghi; Wozniak, Jeffrey R; Mueller, Bryon A et al. (2014) Comparison of statistical tests for group differences in brain functional networks. Neuroimage 101:681-94

Zhu, Yunzhang; Shen, Xiaotong; Pan, Wei (2014) Structural pursuit over multiple undirected graphs. J Am Stat Assoc 109:1683-1696

Xu, Zhiyuan; Shen, Xiaotong; Pan, Wei et al. (2014) Longitudinal analysis is more powerful than cross-sectional analysis in detecting genetic association with neuroimaging phenotypes. PLoS One 9:e102312

Zhang, Yiwei; Xu, Zhiyuan; Shen, Xiaotong et al. (2014) Testing for association with multiple traits in generalized estimation equations, with application to neuroimaging data. Neuroimage 96:309-25

Ho, Yen-Yi; Baechler, Emily C; Ortmann, Ward et al. (2014) Using gene expression to improve the power of genome-wide association analysis. Hum Hered 78:94-103

Pan, Wei; Kim, Junghi; Zhang, Yiwei et al. (2014) A powerful and adaptive association test for rare variants. Genetics 197:1081-95

Zhang, Yiwei; Pan, Wei (2014) Adjusting for population stratification and relatedness with sequencing data. BMC Proc 8:S42

Zhang, Yiwei; Guan, Weihua; Pan, Wei (2013) Adjustment for population stratification via principal components in association analysis of rare variants. Genet Epidemiol 37:99-109

Shen, Xiaotong; Pan, Wei; Zhu, Yunzhang et al. (2013) On constrained and regularized high-dimensional regression. Ann Inst Stat Math 65:807-832

Showing the most recent 10 out of 69 publications
