Although numerous common variants have been identified for human complex traits in the past few years, a large proportion of the heritability of these traits remains unexplained. Next-generation sequencing is currently being employed to uncover the full spectrum of genetic variation, with a particular focus on identifying low-frequency variants (e.g. minor allele frequency (MAF) between 1% and 5%) and rare variants (e.g. MAF<1%) associated with complex traits. However, identification of associated rare variants is extremely challenging due to their low frequency and allelic heterogeneity. Therefore, it is crucial to develop effective designs and efficient analytical and computational tools to address these difficulties. Although case-control studies have been used extensively for association studies of common variants, family designs provide an effective alternative for rare variant analysis due to the enrichment of causal rare variants in pedigrees. In addition, family studies are robust in the presence of population stratification, a property that is essential since routinely used methods for common variants may fail to correct for population stratification of rare variants. For family sequencing studies, a critical step is to infer the underlying genotypes from sequence data, since inaccurate genotype calls can lead to Mendelian inconsistencies and loss of power in association studies. To address these challenges, in this application we propose to develop a comprehensive suite of statistical and computational methods for genotype/haplotype inference from family sequencing data and for rare variant association analysis in families. Using these methods, we will carry out simulations to investigate the cost efficiency of various family designs in comparison with case-control studies for improved power to detect rare variant associations. We will also apply our methods to sequence data from our Amish family study and to datasets from our collaborators on multiple complex traits. User-friendly and well-documented software packages will be released for public use.

Public Health Relevance

Large-scale sequencing studies are being widely carried out to identify less common and rare variants associated with complex diseases and disease-related traits. However, this strategy is challenging, and little is known about effective approaches to discover rare variants from sequencing and to identify disease-associated rare variants. In this application, we aim to develop a comprehensive suite of statistical methods and computational tools for variant calling and rare variant association studies from sequencing data in families and to apply our methods to studies of multiple complex diseases.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Research Project (R01)
Study Section
Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer
Brooks, Lisa
Vanderbilt University Medical Center
Schools of Medicine
United States
Li, Bingshan; Liu, Dajiang J; Leal, Suzanne M (2013) Identifying rare variants associated with complex traits via sequencing. Curr Protoc Hum Genet Chapter 1:Unit 1.26