Over the last decade, scientists have identified many thousands of disease/trait susceptibility loci, with more to be discovered. However, the biological mechanisms by which these variants affect gene function and downstream biological processes remain unclear. A promising path forward is to study the effects of genetic variation on cellular/molecular phenotypes, such as the transcriptome, proteome, and epigenome (i.e., "omics" phenotypes). Additionally, the analysis of the joint associations of a genetic variant with complex trait(s) and omics phenotypes has the potential to elucidate mechanisms underlying known associations or to reveal novel relationships between genetic variants and complex traits.
Our first aim is to develop methods to integrate QTL association summary statistics from multiple studies, tissue types, and cell types with overlapping or independent samples to identify the omics QTLs and multi-omics QTLs with coordinated effects (and potentially different effect sizes) on multiple omics phenotypes in different conditions. Moreover, most existing omics QTL analyses focus on cis-associations, because the study of trans-associations is underpowered after adjusting for multiple testing. In our second aim, we will propose novel methods to detect a particular yet quite prevalent type of trans-association: the type mediated by a cis-gene transcript. Unlike the trans-associations with extreme effects, which are often tissue-specific, the trans-associations mediated by cis-gene expression often have effects shared among functionally related tissue types. As such, our proposed mediation methods will borrow information across tissue types to improve power. An ultimate goal is to further utilize (cis- and trans-) QTLs in disease/trait mapping and to better understand their disease/trait relevance. In the third aim, by harnessing gene-specific patterns of how eQTL effects are shared across different tissue types, we will develop methods that improve on existing approaches for transcriptome-wide association studies. We will propose models predicting gene expression levels in multiple tissue types and further associate genotype-predicted expression levels in disease-relevant tissue types with complex diseases/traits using existing GWAS data. We will analyze breast cancer, schizophrenia, and height as the focal traits of the three aims, respectively, by integrating data from the Genotype-Tissue Expression Project (GTEx), the Clinical Proteomic Tumor Analysis Consortium (CPTAC), the UK Biobank, and summary statistics from large-scale genome-wide association study consortia. The proposed methods can be applied to other related diseases and traits.
Our work will identify new gene candidates associated with complex traits, as well as provide new hypotheses, tools, and data resources that will accelerate future research efforts to understand the susceptibility mechanisms of human diseases.
By developing tailored integrative genomics methods and tools, in this project we will assess the effects of genetic variants (in particular disease/trait-related variants) on nearby and distal (cis- and trans-) gene activity, genome structure, and protein abundance in many human tissues and cell contexts. We will develop novel methods and tools, improving upon existing methods for transcriptome-wide association studies, to map gene-level trait-associations for complex traits using genotype-predicted gene expression levels in disease-relevant tissues. Our goal is to develop new statistical methods and computational tools in order to characterize the effects of genetic regulation on different molecular phenotypes in different cell contexts, to integrate and re-capitalize on existing large data resources, and to create a more comprehensive understanding of how genetic variation impacts biological processes related to complex diseases.