We propose to develop, evaluate and compare biologically motivated statistical methods for analyzing and interpreting microarray data, including a heart failure dataset. Some of the proposed methods will incorporate, or be applied to, multiple types of genomic or proteomic data. The overarching theme is that, to increase statistical power for new discovery and to make the fullest use of existing knowledge and data, we will integrate gene networks and multiple types of high-throughput data, such as gene expression data, DNA-protein binding data, DNA sequences and SNP data, with novel analysis methods and applications. Specifically, we propose 1) further developing and evaluating a network-based statistical analysis method for genomic discovery, with applications to several real datasets; 2) developing analysis strategies to integrate gene networks and gene functional annotations for genomic discovery, such as detecting differentially expressed genes based on expression data and identifying binding target genes of a single transcription factor based on DNA-protein binding (i.e. ChIP-chip) data; 3) developing analysis strategies to integrate gene networks and multiple types of genomic and proteomic data, such as gene expression data, DNA-protein binding data and DNA sequences; 4) integrating gene networks into regression analysis for variable selection and parameter smoothing, with applications to inferring expression quantitative trait loci (eQTL) by regressing expression data on genotype data; and 5) developing software for free public use.
This proposed research is expected not only to advance statistical methodology and theory for complex data with complicated dependency structures, but also to contribute valuable analysis tools to the elucidation of molecular mechanisms underlying diseases.
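To make the fourth aim concrete, the following is a minimal sketch of one common way a gene network can smooth regression coefficients: a graph Laplacian penalty added to least squares, so that coefficients of genes linked in the network are pulled toward each other. This is an illustrative assumption, not necessarily the method proposed here. The function names, the toy data, and the small `ridge` stabilizer are all hypothetical.

```python
# Sketch: least squares with a graph Laplacian penalty (assumed formulation).
# Minimizes ||y - X b||^2 + lam * sum over edges (i, j) of (b_i - b_j)^2.
# All names here are hypothetical and chosen for illustration only.

def graph_laplacian(n_nodes, edges):
    """Return the unweighted graph Laplacian L = D - A as a list of lists."""
    lap = [[0.0] * n_nodes for _ in range(n_nodes)]
    for i, j in edges:
        lap[i][i] += 1.0
        lap[j][j] += 1.0
        lap[i][j] -= 1.0
        lap[j][i] -= 1.0
    return lap


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is numerically singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def network_penalized_fit(X, y, edges, lam, ridge=1e-8):
    """Solve (X'X + lam * L + ridge * I) b = X'y for the coefficients b.

    The tiny ridge term is an assumption added only to keep the system
    solvable when X'X + lam * L is singular.
    """
    n, p = len(X), len(X[0])
    lap = graph_laplacian(p, edges)
    lhs = [
        [
            sum(X[k][i] * X[k][j] for k in range(n))
            + lam * lap[i][j]
            + (ridge if i == j else 0.0)
            for j in range(p)
        ]
        for i in range(p)
    ]
    rhs = [sum(X[k][i] * y[k] for k in range(n)) for i in range(p)]
    return solve_linear(lhs, rhs)


if __name__ == "__main__":
    # Toy data: three predictors, with predictors 0 and 1 linked in the network.
    X = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [0, 1, 1], [1, 0, 1]]
    y = [2, 1, 0, 3, 1, 2]
    print(network_penalized_fit(X, y, edges=[(0, 1)], lam=0.0))
    print(network_penalized_fit(X, y, edges=[(0, 1)], lam=100.0))
```

Setting `lam` to zero gives an essentially ordinary least squares fit, while larger values shrink the difference between coefficients of linked predictors. Variable selection, which the aim also mentions, would need a sparsity-inducing penalty that this sketch does not include.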
Xu, Zhiyuan; Shen, Xiaotong; Pan, Wei et al. (2014) Longitudinal analysis is more powerful than cross-sectional analysis in detecting genetic association with neuroimaging phenotypes. PLoS One 9:e102312
Zhang, Yiwei; Xu, Zhiyuan; Shen, Xiaotong et al. (2014) Testing for association with multiple traits in generalized estimation equations, with application to neuroimaging data. Neuroimage 96:309-25
Ho, Yen-Yi; Baechler, Emily C; Ortmann, Ward et al. (2014) Using gene expression to improve the power of genome-wide association analysis. Hum Hered 78:94-103
Pan, Wei; Kim, Junghi; Zhang, Yiwei et al. (2014) A powerful and adaptive association test for rare variants. Genetics 197:1081-95
Zhang, Yiwei; Pan, Wei (2014) Adjusting for population stratification and relatedness with sequencing data. BMC Proc 8:S42
Austin, Erin; Pan, Wei; Shen, Xiaotong (2014) Does the inclusion of rare variants improve risk prediction? BMC Proc 8:S94
Kim, Junghi; Wozniak, Jeffrey R; Mueller, Bryon A et al. (2014) Comparison of statistical tests for group differences in brain functional networks. Neuroimage 101:681-94
Zhu, Yunzhang; Shen, Xiaotong; Pan, Wei (2014) Structural pursuit over multiple undirected graphs. J Am Stat Assoc 109:1683-1696
Zhang, Yiwei; Guan, Weihua; Pan, Wei (2013) Adjustment for population stratification via principal components in association analysis of rare variants. Genet Epidemiol 37:99-109
Shen, Xiaotong; Pan, Wei; Zhu, Yunzhang et al. (2013) On constrained and regularized high-dimensional regression. Ann Inst Stat Math 65:807-832
Showing the most recent 10 out of 69 publications