The Human Genome Project and follow-on projects such as the International HapMap, 1000 Genomes, and ENCODE Projects are providing powerful resources for the identification of genes that predispose to human diseases. Along with these resources have come increasingly efficient technologies for genotyping and DNA sequencing. These resources and technologies will be critical as we continue to seek to unravel the complex etiologic basis of common human diseases. In this proposal, I address a set of statistical problems that arise in human disease gene mapping. I describe how my colleagues and I will address these problems through analytic methods, computer simulation, and application to interesting complex disease genetics data, and how we will generalize these solutions through the production, distribution, and support of efficient computer software. Specifically, we will:
1. identify the range of genetic models consistent with available linkage and association information to aid in the efficient design of large-scale resequencing studies;
2. develop efficient multi-stage designs for large resequencing and follow-up association studies, with a particular focus on the optimal combination of sequencing, genotyping, and genotype imputation;
3. identify the most probable set of causal variants among those tested in GWAS or resequencing studies;
4. develop methods for efficient association fine mapping of known causal loci and detection of additional causal loci given GWA and/or resequencing data on multiple ancestry groups; and
5. continue to develop, test, distribute, and support computer software based on the methods that arise from the other aims of this project, and update, distribute, and support our current software, including SIMLINK, RHMAP, RELPAIR, SIBMED, LocusZoom, Snipper, and Spotter.
In addition, we will continue to be opportunistic in identifying and addressing important statistical problems that are related to the goals of this project.
Under separate funding, we will apply the resulting methods to the analysis of data from genetic studies of type 2 diabetes and related quantitative traits.

Public Health Relevance

Studies to localize and identify genetic variants that predispose to human diseases have the potential to inform breakthrough strategies to develop new drugs, to develop genetic tests to stratify risk, and to enable more targeted approaches to prevention and treatment in the population. Efficient statistical and computational methods are critical for the success of such studies.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Research Project (R01)
Study Section
Special Emphasis Panel (ZRG1-GGG-M (91))
Program Officer
Brooks, Lisa
University of Michigan Ann Arbor
Biostatistics & Other Math Sci
Schools of Public Health
Ann Arbor
United States
Wojcik, Genevieve L; Fuchsberger, Christian; Taliun, Daniel et al. (2018) Imputation-Aware Tag SNP Selection To Improve Power for Large-Scale, Multi-ethnic Association Studies. G3 (Bethesda) 8:3255-3267
Reppell, M; Zöllner, S (2018) An efficient algorithm for generating the internal branches of a Kingman coalescent. Theor Popul Biol 122:57-66
Jiang, Yu; Chen, Sai; McGuire, Daniel et al. (2018) Proper conditional analysis in the presence of missing data: Application to large scale meta-analysis of tobacco use phenotypes. PLoS Genet 14:e1007452
Dutta, Diptavo; Scott, Laura; Boehnke, Michael et al. (2018) Multi-SKAT: General framework to test for rare-variant association with multiple phenotypes. Genet Epidemiol
Ray, Debashree; Boehnke, Michael (2018) Methods for meta-analysis of multiple traits using GWAS summary statistics. Genet Epidemiol 42:134-145
Scott, Robert A; Scott, Laura J; Mägi, Reedik et al. (2017) An Expanded Genome-Wide Association Study of Type 2 Diabetes in Europeans. Diabetes 66:2888-2902
Chiu, Chi-Yang; Jung, Jeesun; Chen, Wei et al. (2017) Meta-analysis of quantitative pleiotropic traits for next-generation sequencing with multivariate functional linear models. Eur J Hum Genet 25:350-359
Chiu, Chi-Yang; Jung, Jeesun; Wang, Yifan et al. (2017) A comparison study of multivariate fixed models and Gene Association with Multiple Traits (GAMuT) for next-generation sequencing. Genet Epidemiol 41:18-34
Taliun, Daniel; Chothani, Sonia P; Schönherr, Sebastian et al. (2017) LASER server: ancestry tracing with genotypes or sequence reads. Bioinformatics 33:2056-2058
McCarthy, Shane; Das, Sayantan; Kretzschmar, Warren et al. (2016) A reference panel of 64,976 haplotypes for genotype imputation. Nat Genet 48:1279-1283

Showing the most recent 10 out of 67 publications