Statistical gene mapping in pedigrees advances our knowledge of the genetic architecture of complex diseases (e.g., diabetes, Alzheimer's disease, asthma) and quantitative traits (e.g., bone density, blood pressure, milk production) in human and animal populations. Unfortunately, most observed pedigree data are incomplete. Efficiently inferring haplotype configurations and calculating identity-by-descent (IBD) probabilities from observed genotype data (especially dense single nucleotide polymorphism (SNP) markers) in complex pedigrees with large numbers of linked loci and missing marker data is a critical component of statistical gene mapping and remains challenging. The broad, long-term objectives of the proposed work are to develop efficient statistical and computational methods for haplotyping and gene mapping in large pedigrees with missing and phase-unknown marker data.
The specific aims are to 1) extend our conditional enumeration haplotyping method, which currently works only with complete pedigree data, to pedigrees with missing marker data, and then improve the method so that it can handle linkage disequilibrium (LD) between markers; 2) develop a computationally efficient method for estimating IBD probabilities in large pedigrees with large numbers of linked loci and missing marker data, develop a fine mapping method that models LD between dense (SNP) markers, and evaluate the performance of the IBD probability estimation method in terms of quantitative trait loci (QTL) mapping accuracy in linkage analysis and fine mapping; 3) apply the proposed methods to linkage analysis and fine mapping of two large, real human pedigree data sets (a 1623-person Hutterite pedigree and a 1412-person Amish pedigree); and 4) develop computer software to implement aims 1 and 2. Our approach to these aims is based on computing the conditional probabilities of the possible ordered genotypes at phase-unknown markers and calculating the likelihoods of haplotype configurations. By setting one threshold value for the conditional probabilities of ordered genotypes at phase-unknown markers and a second threshold value for the conditional probabilities of haplotype configurations, the proposed haplotyping method identifies the subset of haplotype configurations with the highest likelihoods for a pedigree. IBD probabilities are estimated from this subset of haplotype configurations and are then used as input to variance-components-based QTL mapping methods in large pedigrees. The methodologies developed in this research will enhance our ability to map QTL and complex disease genes in human and animal populations.
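The two-threshold idea described above can be illustrated with a minimal Python sketch. It is not the proposed method itself: the function names, the input layout (one list of candidate ordered genotypes with conditional probabilities per phase-unknown marker), and the hypothetical `ibd_indicator` callback are all assumptions made for illustration. In particular, the sketch scores a configuration by the product of its per-marker probabilities, which ignores linkage and LD; a real implementation would compute pedigree likelihoods instead.

```python
from itertools import product


def enumerate_configurations(candidates, genotype_threshold, config_threshold):
    """Return (configuration, normalized probability) pairs, best first.

    candidates: one list per phase-unknown marker, each holding
    (ordered_genotype, conditional_probability) pairs (assumed layout).
    """
    # Threshold 1: drop ordered genotypes with low conditional probability.
    pruned = [
        [(g, p) for g, p in marker if p >= genotype_threshold]
        for marker in candidates
    ]

    # Enumerate the remaining combinations. ASSUMPTION: the weight is a
    # simple product of per-marker probabilities, standing in for the
    # likelihood of the haplotype configuration.
    configs = []
    for combo in product(*pruned):
        weight = 1.0
        for _, p in combo:
            weight *= p
        configs.append((tuple(g for g, _ in combo), weight))

    total = sum(w for _, w in configs)
    if total == 0:
        return []

    # Threshold 2: keep configurations whose normalized probability is high.
    kept = [(c, w / total) for c, w in configs if w / total >= config_threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept


def estimate_ibd(kept, ibd_indicator):
    """Average a per-configuration IBD indicator over the retained subset.

    ibd_indicator is a hypothetical callback returning 1.0 when a
    configuration implies IBD sharing at the locus of interest, else 0.0.
    """
    total = sum(w for _, w in kept)
    if total == 0:
        raise ValueError("no haplotype configurations were retained")
    return sum(w * ibd_indicator(c) for c, w in kept) / total


# Toy input: two phase-unknown markers for one individual (made-up values).
toy_candidates = [
    [("A|a", 0.9), ("a|A", 0.1)],
    [("B|b", 0.6), ("b|B", 0.4)],
]
toy_kept = enumerate_configurations(toy_candidates, 0.2, 0.1)
toy_ibd = estimate_ibd(toy_kept, lambda c: 1.0 if c == ("A|a", "B|b") else 0.0)
```

Raising either threshold shrinks the retained subset and the enumeration cost, at the price of discarding probability mass; the renormalization in `estimate_ibd` is one simple way to account for the discarded configurations.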