Although numerous common variants have been identified for human complex traits in the past few years, a large proportion of the heritability of these traits remains unexplained. Next-generation sequencing is currently being employed to uncover the full spectrum of genetic variation, with a particular focus on identifying low-frequency variants (e.g., minor allele frequency (MAF) between 1% and 5%) and rare variants (e.g., MAF < 1%) associated with complex traits. However, identifying associated rare variants is extremely challenging because of their low frequency and allelic heterogeneity. It is therefore crucial to develop effective study designs and efficient analytical and computational tools to address these difficulties. Although case-control studies have been used extensively for association studies of common variants, family designs provide an effective alternative for rare variant analysis because causal rare variants are enriched in pedigrees. In addition, family studies are robust in the presence of population stratification, a property that is essential because methods routinely used for common variants may fail to correct for population stratification of rare variants. For family sequencing studies, a critical step is to infer the underlying genotypes from sequence data; inaccurate genotype calls can lead to Mendelian inconsistencies and to loss of power in association studies. To address these challenges, in this application we propose to develop a comprehensive suite of statistical and computational methods for genotype/haplotype inference from family sequencing data and for rare variant association analysis in families. Using these methods, we will carry out simulations to investigate the cost efficiency of various family designs, in comparison with case-control studies, for improving the power to detect rare variant associations. We will also apply our methods to sequence data from our Amish family study and to datasets from our collaborators on multiple complex traits. User-friendly and well-documented software packages will be released for public use.

Public Health Relevance

Large-scale sequencing studies are being widely carried out to identify less common and rare variants associated with complex diseases and disease-related traits. However, this strategy is challenging, and little is known about effective approaches for discovering rare variants from sequencing data and for identifying disease-associated rare variants. In this application, we aim to develop a comprehensive suite of statistical methods and computational tools for variant calling and rare variant association studies from sequencing data in families, and to apply our methods to studies of multiple complex diseases.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project (R01)
Project #: 5R01HG006857-02
Application #: 8668121
Study Section: Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer: Brooks, Lisa
Project Start: 2013-06-01
Project End: 2017-05-31
Budget Start: 2014-06-01
Budget End: 2015-05-31
Support Year: 2
Fiscal Year: 2014
Total Cost: $364,527
Indirect Cost: $125,455

Name: Vanderbilt University Medical Center
Department: Physiology
Type: Schools of Medicine
DUNS #: 004413456
City: Nashville
State: TN
Country: United States
Zip Code: 37212

Yan, Qi; Chen, Rui; Sutcliffe, James S et al. (2016) The impact of genotype calling errors on family-based studies. Sci Rep 6:28323
Liu, Yongzhuang; Liu, Jian; Lu, Jianguo et al. (2016) Joint detection of copy number variations in parent-offspring trios. Bioinformatics 32:1130-7
Chang, Lun-Ching; Li, Bingshan; Fang, Zhou et al. (2016) A computational method for genotype calling in family-based sequencing data. BMC Bioinformatics 17:37
Yan, Qi; Weeks, Daniel E; Celedón, Juan C et al. (2015) Associating Multivariate Quantitative Phenotypes with Genetic Variants in Family Samples with a Novel Kernel Machine Regression Method. Genetics 201:1329-39
Wei, Qiang; Zhan, Xiaowei; Zhong, Xue et al. (2015) A Bayesian framework for de novo mutation calling in parents-offspring trios. Bioinformatics 31:1375-81
Chen, Rui; Wei, Qiang; Zhan, Xiaowei et al. (2015) A haplotype-based framework for group-wise transmission/disequilibrium tests for rare variant association analysis. Bioinformatics 31:1452-9
Li, Bingshan; Wei, Qiang; Zhan, Xiaowei et al. (2015) Leveraging Identity-by-Descent for Accurate Genotype Inference in Family Sequencing Data. PLoS Genet 11:e1005271
Li, Bingshan; Liu, Dajiang J; Leal, Suzanne M (2013) Identifying rare variants associated with complex traits via sequencing. Curr Protoc Hum Genet Chapter 1:Unit 1.26