Although numerous common variants have been identified for human complex traits in the past few years, a large proportion of the heritability of these traits remains unexplained. Next-generation sequencing is currently being employed to uncover the full spectrum of genetic variation, with a particular focus on identifying low-frequency variants (e.g., minor allele frequency (MAF) of 1-5%) and rare variants (e.g., MAF < 1%) associated with complex traits. However, identifying associated rare variants is extremely challenging because of their low frequency and allelic heterogeneity. Therefore, it is crucial to develop effective designs and efficient analytical and computational tools to address these difficulties. Although case-control studies have been used extensively for association studies of common variants, family designs provide an effective alternative for rare variant analysis because causal rare variants are enriched in pedigrees. In addition, family studies are robust to population stratification, a property that is essential because methods routinely used for common variants may fail to correct for population stratification of rare variants. For family sequencing studies, a critical step is to infer the underlying genotypes from sequence data; inaccurate genotype calls can lead to Mendelian inconsistencies and loss of power in association studies. To address these challenges, in this application we propose to develop a comprehensive suite of statistical and computational methods for genotype/haplotype inference from family sequencing data and for rare variant association analysis in families. Using these methods, we will carry out simulations to investigate the cost efficiency of various family designs, in comparison with case-control studies, for improving the power to detect rare variant associations. We will also apply our methods to sequence data from our Amish family study and to datasets from our collaborators on multiple complex traits.
User-friendly and well-documented software packages will be released for public use.

Public Health Relevance

Large-scale sequencing studies are being widely carried out to identify less common and rare variants associated with complex diseases and disease-related traits. However, this strategy is challenging, and little is known about effective approaches for discovering rare variants from sequencing data and for identifying disease-associated rare variants. In this application, we aim to develop a comprehensive suite of statistical methods and computational tools for variant calling and rare variant association studies from sequencing data in families, and to apply our methods to studies of multiple complex diseases.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Research Project (R01)
Study Section
Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer
Brooks, Lisa
Vanderbilt University Medical Center
Schools of Medicine
United States
Chen, Rui; Davis, Lea K; Guter, Stephen et al. (2017) Leveraging blood serotonin as an endophenotype to identify de novo and rare variants involved in autism. Mol Autism 8:14
Wei, Q; Ye, Z; Zhong, X et al. (2017) Multiregion whole-exome sequencing of matched primary and metastatic tumors revealed genomic heterogeneity and suggested polyclonal seeding in colorectal cancer metastasis. Ann Oncol 28:2135-2141
Chang, Lun-Ching; Li, Bingshan; Fang, Zhou et al. (2016) A computational method for genotype calling in family-based sequencing data. BMC Bioinformatics 17:37
Zhan, Xiaowei; Hu, Youna; Li, Bingshan et al. (2016) RVTESTS: an efficient and comprehensive tool for rare variant association analysis using sequence data. Bioinformatics 32:1423-6
Liu, Yongzhuang; Liu, Jian; Lu, Jianguo et al. (2016) Joint detection of copy number variations in parent-offspring trios. Bioinformatics 32:1130-7
Yan, Qi; Chen, Rui; Sutcliffe, James S et al. (2016) The impact of genotype calling errors on family-based studies. Sci Rep 6:28323
Wei, Qiang; Zhan, Xiaowei; Zhong, Xue et al. (2015) A Bayesian framework for de novo mutation calling in parents-offspring trios. Bioinformatics 31:1375-81
Yan, Qi; Weeks, Daniel E; Celedón, Juan C et al. (2015) Associating Multivariate Quantitative Phenotypes with Genetic Variants in Family Samples with a Novel Kernel Machine Regression Method. Genetics 201:1329-39
Li, Bingshan; Wei, Qiang; Zhan, Xiaowei et al. (2015) Leveraging Identity-by-Descent for Accurate Genotype Inference in Family Sequencing Data. PLoS Genet 11:e1005271
Chen, Rui; Wei, Qiang; Zhan, Xiaowei et al. (2015) A haplotype-based framework for group-wise transmission/disequilibrium tests for rare variant association analysis. Bioinformatics 31:1452-9

Showing the most recent 10 out of 12 publications