Although numerous common variants have been identified for human complex traits in the past few years, a large proportion of the heritability of these traits remains unexplained. Next-generation sequencing is currently being employed to uncover the full spectrum of genetic variation, with a particular focus on identifying low-frequency variants (e.g., minor allele frequency (MAF) of 1-5%) and rare variants (e.g., MAF < 1%) associated with complex traits. However, identifying associated rare variants is extremely challenging because of their low frequency and allelic heterogeneity. Therefore, it is crucial to develop effective study designs and efficient analytical and computational tools to address these difficulties. Although case-control studies have been used extensively for association studies of common variants, family designs provide an effective alternative for rare variant analysis because causal rare variants are enriched in pedigrees. In addition, family studies are robust to population stratification, an essential property because methods routinely used for common variants may fail to correct for population stratification of rare variants. For family sequencing studies, a critical step is to infer the underlying genotypes from sequence data; inaccurate genotype calls can lead to Mendelian inconsistencies and loss of power in association studies. To address these challenges, we propose in this application to develop a comprehensive suite of statistical and computational methods for genotype/haplotype inference from family sequencing data and for rare variant association analysis in families. Using these methods, we will carry out simulations to investigate the cost efficiency of various family designs in comparison with case-control studies for improved power to detect rare variant associations. We will also apply our methods to sequence data from our Amish family study and to datasets from our collaborators on multiple complex traits. User-friendly and well-documented software packages will be released for public use.
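
To illustrate the Mendelian inconsistencies mentioned in the abstract, the following minimal sketch checks whether a called child genotype at a biallelic autosomal site is compatible with the called parental genotypes. The 0/1/2 alternate-allele-count coding and the function names are assumptions made for this illustration and are not taken from the proposed methods; de novo mutations and sex chromosomes are ignored.

```python
from itertools import product

# Alleles a parent can transmit, keyed by that parent's genotype.
# Genotypes are coded as the number of alternate alleles (0, 1, or 2);
# this coding is an assumption made for illustration.
TRANSMITTABLE = {0: (0,), 1: (0, 1), 2: (1,)}


def is_mendelian_consistent(father, mother, child):
    """Return True if the child genotype can arise from one allele per parent."""
    for genotype in (father, mother, child):
        if genotype not in TRANSMITTABLE:
            raise ValueError(f"genotype must be 0, 1, or 2, got {genotype!r}")
    # The child's alternate-allele count must equal the sum of one
    # transmittable allele from the father and one from the mother.
    return any(
        paternal + maternal == child
        for paternal, maternal in product(
            TRANSMITTABLE[father], TRANSMITTABLE[mother]
        )
    )


def count_inconsistencies(trios):
    """Count (father, mother, child) genotype triples that violate Mendelian inheritance."""
    return sum(
        not is_mendelian_consistent(father, mother, child)
        for father, mother, child in trios
    )
```

For example, two homozygous-reference parents (0, 0) with a heterozygous child (1) are flagged as inconsistent under this coding, which in sequence data more often indicates a genotype calling error than a true de novo mutation.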

Public Health Relevance

Large-scale sequencing studies are being widely carried out to identify less common and rare variants associated with complex diseases and disease-related traits. However, this strategy is challenging, and little is known about effective approaches for discovering rare variants from sequencing data and identifying disease-associated rare variants. In this application, we aim to develop a comprehensive suite of statistical methods and computational tools for variant calling and rare variant association studies from sequencing data in families, and to apply our methods to studies of multiple complex diseases.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project (R01)
Project #: 1R01HG006857-01A1
Application #: 8504179
Study Section: Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer: Brooks, Lisa
Project Start: 2013-06-01
Project End: 2017-05-31
Budget Start: 2013-06-01
Budget End: 2014-05-31
Support Year: 1
Fiscal Year: 2013
Total Cost: $380,000
Indirect Cost: $136,227

Name: Vanderbilt University Medical Center
Department: Physiology
Type: Schools of Medicine
DUNS #: 004413456
City: Nashville
State: TN
Country: United States
Zip Code: 37212

Li, Bingshan; Liu, Dajiang J; Leal, Suzanne M (2013) Identifying rare variants associated with complex traits via sequencing. Curr Protoc Hum Genet Chapter 1:Unit 1.26