We propose to develop, evaluate, and compare statistical methods for analyzing and interpreting microarray data, including a heart failure dataset collected in the co-Principal Investigator's lab. Some of the proposed methods will incorporate, or be applied to, other types of genomic or proteomic data.
In Aim A.1, we consider detecting differential gene expression. A weighted permutation scheme is proposed to improve permutation-based inference procedures, and the resulting methods will be compared with several recently proposed parametric and semi-parametric methods. We also propose incorporating existing biological data into the statistical methods.
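As a minimal illustration of the kind of permutation-based inference Aim A.1 builds on, the sketch below computes a standard (unweighted) two-sample permutation p-value for one gene at a time. It is not the weighted permutation scheme proposed above, whose details are not given in this abstract. The function name `permutation_pvalue`, the toy expression values, the gene labels, and the number of permutations are all assumptions made for illustration.

```python
import random
from statistics import mean


def permutation_pvalue(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in mean expression.

    Hypothetical helper: a plain, unweighted permutation test.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign samples to the two groups
        stat = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if stat >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy data: log-expression of two genes under two conditions.
genes = {
    "gene_shifted": ([5.1, 5.3, 4.9, 5.2, 5.0, 5.4],
                     [7.0, 7.2, 6.8, 7.1, 6.9, 7.3]),
    "gene_null": ([5.1, 5.3, 4.9, 5.2, 5.0, 5.4],
                  [5.3, 5.0, 5.2, 4.9, 5.5, 5.1]),
}
pvalues = {name: permutation_pvalue(a, b) for name, (a, b) in genes.items()}
```

In a real microarray analysis the test would be run across thousands of genes, so the resulting p-values would also need a multiple-testing adjustment, which this sketch omits.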
In Aim A.2, we study a clustering-based classification (CBC) method for gene function prediction using microarray data. CBC will be compared with other state-of-the-art supervised machine learning algorithms, such as support vector machines and random forests. Other sources of biological data, such as protein-protein interaction data, will be incorporated into the proposed method.
In Aim A.3, we consider sample classification and prediction based on gene expression profiles within a general framework called penalized partial least squares (PPLS). PPLS will be compared with other supervised machine learning algorithms. We will also extend PPLS to combine microarray data from multiple studies. We plan to implement the proposed statistical methods in R and make the software publicly and freely available.

Agency
National Institutes of Health (NIH)
Institute
National Heart, Lung, and Blood Institute (NHLBI)
Type
Research Project (R01)
Project #
5R01HL065462-05
Application #
7056185
Study Section
Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer
Wolz, Michael
Project Start
2002-01-01
Project End
2008-05-31
Budget Start
2006-06-01
Budget End
2007-05-31
Support Year
5
Fiscal Year
2006
Total Cost
$145,987
Indirect Cost
Name
University of Minnesota Twin Cities
Department
Type
Schools of Public Health
DUNS #
555917996
City
Minneapolis
State
MN
Country
United States
Zip Code
55455
Xu, Zhiyuan; Shen, Xiaotong; Pan, Wei et al. (2014) Longitudinal analysis is more powerful than cross-sectional analysis in detecting genetic association with neuroimaging phenotypes. PLoS One 9:e102312
Zhang, Yiwei; Xu, Zhiyuan; Shen, Xiaotong et al. (2014) Testing for association with multiple traits in generalized estimation equations, with application to neuroimaging data. Neuroimage 96:309-25
Ho, Yen-Yi; Baechler, Emily C; Ortmann, Ward et al. (2014) Using gene expression to improve the power of genome-wide association analysis. Hum Hered 78:94-103
Pan, Wei; Kim, Junghi; Zhang, Yiwei et al. (2014) A powerful and adaptive association test for rare variants. Genetics 197:1081-95
Zhang, Yiwei; Pan, Wei (2014) Adjusting for population stratification and relatedness with sequencing data. BMC Proc 8:S42
Austin, Erin; Pan, Wei; Shen, Xiaotong (2014) Does the inclusion of rare variants improve risk prediction? BMC Proc 8:S94
Kim, Junghi; Wozniak, Jeffrey R; Mueller, Bryon A et al. (2014) Comparison of statistical tests for group differences in brain functional networks. Neuroimage 101:681-94
Zhu, Yunzhang; Shen, Xiaotong; Pan, Wei (2014) Structural pursuit over multiple undirected graphs. J Am Stat Assoc 109:1683-1696
Austin, Erin; Pan, Wei; Shen, Xiaotong (2013) Penalized Regression and Risk Prediction in Genome-Wide Association Studies. Stat Anal Data Min 6:
Zhang, Yiwei; Shen, Xiaotong; Pan, Wei (2013) Adjusting for population stratification in a fine scale with principal components and sequencing data. Genet Epidemiol 37:787-801

Showing the most recent 10 out of 69 publications