There is increasing evidence that genome-wide association studies (GWAS) are a powerful approach to identifying genes involved in common human diseases. GWAS may use either population-based designs or traditional family-based designs. One of the biggest advantages of family-based GWAS is robustness to population stratification, which can inflate the false positive rate. However, this robustness can come at the cost of reduced power. Thus, more powerful association tests that remain robust to population stratification are needed for GWAS under family-based designs. In addition, simulation studies as well as studies of the genetic architectures of several common diseases suggest that causal variants include both common and rare variants. New technologies allow for sequencing of parts of the genome (or, in the future, the whole genome) of large groups of individuals. Statistical methods developed to detect associations of common variants may not be optimal for detecting associations of rare variants, so there is a great need for powerful statistical methods to detect rare variants in family-based sequence data. This proposed project will explore novel statistical methods and feasible algorithms to map complex disease genes for family-based GWAS. The first specific aim of this project is to develop a more powerful single-marker two-stage joint analysis for family-based GWAS that is robust to population stratification. The second specific aim is to develop a new association test that can detect rare variants under family-based designs. The third specific aim is to compare the proposed methods with existing methods through extensive simulation studies and to apply the proposed methods to selected family-based GWAS data sets. The last specific aim of this project is to develop computer software for the newly developed methods and release the software to the scientific community at no charge.
The development of these novel statistical methods and the user-friendly tools generated from this project will aid researchers in the genomic localization of genes that contribute to complex genetic traits. New, sound statistical methods will greatly benefit the scientific community.

Public Health Relevance

Although most published genome-wide association studies used population-based designs, family-based designs have played an important role in identifying disease-associated genes. Family-based designs offer advantages in terms of quality control and robustness to population stratification. However, statistical methods to analyze genome-wide association studies under family-based designs have not received as much attention as methods for population-based designs. This project will develop novel and powerful statistical methods to analyze data from family-based genome-wide association studies. The newly developed statistical methods will benefit all investigators conducting genome-wide family-based association studies, enhance the use of family-based designs, and accelerate the discovery of genes responsible for complex diseases.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Small Research Grants (R03)
Study Section
Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer
Ramos, Erin
Michigan Technological University
Biostatistics & Other Math Sci
Schools of Arts and Sciences
United States
Sha, Qiuying; Zhang, Shuanglin (2015) Test of rare variant association based on affected sib-pairs. Eur J Hum Genet 23:229-37
Sha, Qiuying; Zhang, Shuanglin (2014) A novel test for testing the optimally weighted combination of rare and common variants based on data of parents and affected children. Genet Epidemiol 38:135-43
Sha, Qiuying; Zhang, Shuanglin (2014) A rare variant association test based on combinations of single-variant tests. Genet Epidemiol 38:494-501