Genotyping and emerging sequencing technologies have enabled comprehensive interrogation of genetic variation across the human genome, making it possible to map genetic variants that influence phenotypes of interest. Nevertheless, genome-wide association studies (GWAS) and next-generation sequencing (NGS) projects have uncovered only a limited number of trait-influencing loci. While large increases in sample size will improve power to detect such variation, the ascertainment and sequencing/genotyping of such samples are costly and inefficient. Therefore, it is desirable to increase power to detect such variants without requiring additional sample collection. We propose novel methods for improved gene mapping of common and rare susceptibility variants that move beyond the standard strategies typically applied to GWAS and NGS studies of complex traits. The first topic we consider is pleiotropic, or cross-phenotype, effects of genetic variants. Empirical studies have suggested that pleiotropy is widespread throughout the genome and that leveraging this additional information for gene mapping yields a more powerful analysis than one that ignores it.
In Aim 1, we propose novel statistical methods for genetic analysis of high-dimensional phenotype data using an innovative kernel distance-covariance (KDC) framework that allows for an arbitrary number of phenotypes (continuous and/or categorical) as well as an arbitrary number of genotypes (permitting gene-based testing of both rare and common variants). We will use the KDC framework to implement tests of pleiotropy as well as tests of mediation. The second topic we consider is the mapping of rare susceptibility variants using affected pedigrees, which provide many attractive features for rare-variant testing that case-control studies lack.
In Aim 2, we propose a series of powerful statistical methods for rare-variant association testing in affected pedigrees that are based on a framework (recently published in AJHG) for rare-variant association testing in affected sibships. The existing framework compares the rare-variant burden in a region carried by an affected sib pair with the number of regions that the pair shares identical by descent. We have shown that the method is more powerful than case-control association testing for a fixed sample size and is also robust to population stratification. In this proposal, we will extend the framework to handle affected pedigrees of arbitrary size and structure (rather than just affected sib pairs). We will also devise a powerful two-stage screening and validation strategy for rare-variant mapping: the first stage compares familial cases in the pedigrees to external controls, and the second stage follows up the most interesting findings using an independent test based on our identity-by-descent sharing statistic among the affected relatives used in the first stage. We will apply the methods in Aims 1-2 to relevant data from genetic studies of complex traits in which we are directly involved. We will also implement the methods in public, user-friendly software (Aim 3).
The goal of this project is to develop a set of statistical approaches to investigate two important topics in gene-mapping studies of complex human traits. First, we will develop techniques for identifying genes that have pleiotropic effects on phenotypes of possibly high dimension, and for assessing whether such genes affect those phenotypes directly or indirectly through other possible factors. Second, we will develop tools to facilitate identification of rare polymorphic variation that increases risk for complex disease, using data from affected pedigrees of arbitrary size and structure. We will evaluate these methods using simulated data and illustrate their value by applying them to genetic projects of complex traits in which we are actively involved. Application of the proposed methods to these datasets should improve our understanding of the genetic origins of various complex traits.