In this renewal of the MERIT award application, we propose to continue developing advanced statistical and computational methods for the analysis of correlated and high-dimensional data, which arise frequently in health science research, especially cancer research. Correlated data are often observed in observational studies and clinical trials, such as longitudinal studies and familial studies. High-dimensional data have emerged rapidly in recent years with the advance of high-throughput 'omics technologies, e.g., in Genome-Wide Association Studies (GWAS) and genome-wide epigenetic (DNA methylation) studies. Massive next-generation sequencing data will soon be available. There is an urgent need to develop advanced statistical and computational methods for analyzing such high-throughput 'omics data in observational studies and clinical trials. We propose to develop statistical and computational methods for the analysis of (1) genome-wide association studies; (2) sequencing data for studying rare variant effects; (3) genome-wide DNA methylation studies; and (4) gene-gene and gene-environment interactions. We will develop methods for both case-control studies and cohort studies, such as longitudinal studies and survival studies. We will study the theoretical properties of the proposed methods and evaluate their finite-sample performance using simulation studies. We will develop efficient numerical algorithms and user-friendly statistical software, and disseminate these tools to health sciences researchers. In collaboration with biomedical investigators, we will apply the proposed models and methods to data from several genome-wide epidemiological studies in cancer and other chronic diseases.
Development of new statistical methods for analysis of correlated and high-dimensional data will provide powerful analytic tools to advance 'omics research in observational studies and clinical trials and to help understand the roles of genes, gene products, and the environment in causing human diseases.
Chen, Jun; Behnam, Ehsan; Huang, Jinyan et al. (2017) Fast and robust adjustment of cell mixtures in epigenome-wide association studies with SmartSVA. BMC Genomics 18:413
Barnett, Ian; Mukherjee, Rajarshi; Lin, Xihong (2017) The Generalized Higher Criticism for Testing SNP-Set Effects in Genetic Association Studies. J Am Stat Assoc 112:64-76
Sofer, Tamar; Schifano, Elizabeth D; Christiani, David C et al. (2017) Weighted pseudolikelihood for SNP set analysis with multiple secondary outcomes in case-control genetic association studies. Biometrics 73:1210-1220
Chen, Jun; Just, Allan C; Schwartz, Joel et al. (2016) CpGFilter: model-based CpG probe filtering with replicates for epigenome-wide association studies. Bioinformatics 32:469-71
Chen, Han; Wang, Chaolong; Conomos, Matthew P et al. (2016) Control for Population Structure and Relatedness for Binary Traits in Genetic Association Studies via Logistic Mixed Models. Am J Hum Genet 98:653-66
Lin, Xinyi; Lee, Seunggeun; Wu, Michael C et al. (2016) Test for rare variants by environment interactions in sequencing association studies. Biometrics 72:156-64
Yung, Godwin; Lin, Xihong (2016) Validity of using ad hoc methods to analyze secondary traits in case-control association studies. Genet Epidemiol 40:732-743
Seow, Wei Jie; Kile, Molly L; Baccarelli, Andrea A et al. (2014) Epigenome-wide DNA methylation changes with development of arsenic-induced skin lesions in Bangladesh: a case-control follow-up study. Environ Mol Mutagen 55:449-56
Wang, Chaolong; Zhan, Xiaowei; Bragg-Gresham, Jennifer et al. (2014) Ancestry estimation and control of population stratification for sequence-based association studies. Nat Genet 46:409-15
Lee, Seunggeun; Abecasis, Gonçalo R; Boehnke, Michael et al. (2014) Rare-variant association analysis: study designs and statistical tests. Am J Hum Genet 95:5-23