This project is concerned with developing new statistical methodology for population genetic data. Attention will be focused on three main areas concerned with dependencies among sets of alleles: the characterization of population structure, the characterization of association patterns within and between genetic markers and along haplotypes, and the characterization of relatedness and inbreeding for individuals. Theory will be developed at least in part in response to the needs of current large-scale SNP surveys for humans and in anticipation of whole-genome sequence data sets. The work is proposed by a group of investigators in the Department of Biostatistics at the University of Washington. They propose to continue collaboration with W.G. Hill at the University of Edinburgh and P.M. Visscher at the Queensland Institute of Medical Research. This extended group has interacted successfully over the previous award period, as evidenced by a set of 40 publications. The Beagle approach of S.R. and B.L. Browning will be applied to the detection of tracts of identity by descent. The resulting measures of relationship will be used to refine tests for marker-disease association and to estimate heritability of complex human traits. The population-specific measures of population structure described by B.S. Weir and W.G. Hill will be applied to recently published whole-genome SNP data sets and whole-genome sequence data sets. Improved methods of drawing inferences about these quantities will be sought. Measures of identity by descent and of population structure have the potential to identify regions of the human genome that have been subject to natural selection, and these analyses will be conducted with attention to the large variation and skewness imposed by the evolutionary process. The work of C.C. Laurie and B.S. Weir on detecting chromosomal features, such as inversions, by examining correlations of individual SNPs with principal components derived from large sets of SNPs will be extended. The partial regression approach introduced for QTL mapping will be applied to this problem. Measures of linkage disequilibrium that do not depend on genotypic phase were introduced and have been used previously by these investigators. They will now be extended to the situation of disequilibrium between pairs of loci when several SNPs are typed for each gene. Association mapping continues to be of considerable interest to human geneticists, and the problem of accounting for (even low-level) relatedness will be addressed. Excluding individuals who have at least one relative in a case-control study, for example, can lead to a loss of power. Previous work of Y. Choi and B.S. Weir that modified simple allelic association tests will be extended to the more appropriate logistic regression methods.

Public Health Relevance

As population genetic datasets grow, there is both the need and the opportunity to quantify the dependencies among alleles within and between individuals, or within and between populations. Individual-level dependencies address inbreeding and relatedness and can lead to estimates of heritability of complex human traits. Relatedness estimates can be used to modify tests of association between genetic markers and human diseases. Allelic dependencies at the population level provide characterization of population structure and can be used to infer the presence of natural selection in the history of the populations. Work is proposed to strengthen ways of estimating allelic dependencies, with attention being paid to the variation imposed by the evolutionary process as well as the variation from sampling individuals from current populations.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer
Eckstrand, Irene A
University of Washington
Biostatistics & Other Math Sci
Schools of Public Health
United States
Xue, Angli; Wu, Yang; Zhu, Zhihong et al. (2018) Genome-wide association analyses identify 143 risk variants and putative regulatory mechanisms for type 2 diabetes. Nat Commun 9:2941
Graffelman, Jan; Weir, Bruce S (2018) Multi-allelic exact tests for Hardy-Weinberg equilibrium that account for gender. Mol Ecol Resour 18:461-473
Goudet, Jérôme; Kay, Tomas; Weir, Bruce S (2018) How to estimate kinship. Mol Ecol 27:4121-4135
Qi, Ting; Wu, Yang; Zeng, Jian et al. (2018) Identifying gene targets for brain-related traits using transcriptomic and methylomic data from blood. Nat Commun 9:2282
Graffelman, Jan; Weir, Bruce S (2018) On the testing of Hardy-Weinberg proportions and equality of allele frequencies in males and females at biallelic genetic markers. Genet Epidemiol 42:34-48
Yengo, Loic; Zhu, Zhihong; Wray, Naomi R et al. (2018) Reply to Kardos et al.: Estimation of inbreeding depression from SNP data. Proc Natl Acad Sci U S A 115:E2494-E2495
Zheng, Xiuwen; Gogarten, Stephanie M; Lawrence, Michael et al. (2017) SeqArray-a storage-efficient high-performance data format for WGS variant calls. Bioinformatics 33:2251-2257
Galván-Femenía, Iván; Graffelman, Jan; Barceló-I-Vidal, Carles (2017) Graphics for relatedness research. Mol Ecol Resour 17:1271-1282
Puig, X; Ginebra, J; Graffelman, J (2017) A Bayesian test for Hardy-Weinberg equilibrium of biallelic X-chromosomal markers. Heredity (Edinb) 119:226-236
Graffelman, Jan; Jain, Deepti; Weir, Bruce (2017) A genome-wide study of Hardy-Weinberg equilibrium with next generation sequence data. Hum Genet 136:727-741

Showing the most recent 10 out of 75 publications