Genetic linkage maps can be constructed by pedigree methods, which study individuals whose relationships are known and so allow us to infer where in the genome crossovers have occurred. Their limitation is that a pedigree contains only a limited number of individuals: when we try to map markers that lie close together in the genome, we may not observe any crossovers between them, even in a rather large pedigree. Linkage disequilibrium mapping instead uses individuals sampled from a large population. These individuals are connected by a pedigree that extends much deeper in time, and thus contains many more individuals and many more opportunities for crossovers to occur. The difficulty is that we do not know this pedigree and must use the genetic data to estimate it. The random trees of ancestry of gene copies in a large population are called coalescents. The widely used statistical method of maximum likelihood can be applied to linkage disequilibrium mapping by summing the likelihood over all the possible coalescent trees that could explain the data. The number of these trees is vast, but it has proved possible to approximate coalescent likelihoods successfully using random sampling methods.
We have developed such a sampling method, a Metropolis-Hastings sampler, for the case of recombining loci, and we propose to adapt it to linkage disequilibrium mapping. One problem that must be solved is how to make use of data that consist of diploid genotypes. We propose to handle this with an additional stage of random sampling, so as to sum over all the ways that the diploid genotypes could be resolved into haplotypes. We also need to correct for the ascertainment bias introduced when a disease allele is preferentially sampled, with less attention paid to the normal allele. We propose to do this by treating the disease alleles as if they were a separate population that exchanges genetic material with the normal alleles by crossing-over and mutation. We have existing methods for coalescent likelihoods in geographically structured populations, and techniques from these can be used to accomplish this. For some genetic markers, such as single nucleotide polymorphisms (SNPs), a further ascertainment problem arises because sites that show no polymorphism are not scored. We propose to use a simple correction to the likelihood to cope with this. We will make available computer programs in C++ to compute the likelihoods and will distribute them free over the Internet as source code, documentation, and executables.
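
The idea of summing over the ways a diploid genotype can be resolved into haplotypes can be illustrated with a small sketch. It is written in Python for brevity and is not the project's C++ code. Every name in it (`resolutions`, `sampled_likelihood`, `TOY_FREQS`, `toy_pair_likelihood`) is hypothetical. The toy haplotype frequencies stand in for the coalescent likelihood, and resolutions are drawn uniformly at random, which is a simplification and not necessarily how the proposed sampler would choose them.

```python
import itertools
import random


def het_sites(genotype):
    """Indices of heterozygous sites in a genotype given as (allele, allele) pairs."""
    return [i for i, (a, b) in enumerate(genotype) if a != b]


def phase(genotype, flipped):
    """Haplotype pair obtained by swapping the two alleles at the sites in `flipped`."""
    h1 = tuple(b if i in flipped else a for i, (a, b) in enumerate(genotype))
    h2 = tuple(a if i in flipped else b for i, (a, b) in enumerate(genotype))
    return h1, h2


def resolutions(genotype):
    """Yield each unordered haplotype pair consistent with the genotype exactly once.

    The phase of the first heterozygous site is held fixed so that (h1, h2) and
    (h2, h1) are not counted twice; k heterozygous sites give 2**(k-1) pairs.
    """
    free = het_sites(genotype)[1:]
    for flips in itertools.product((False, True), repeat=len(free)):
        yield phase(genotype, {s for s, f in zip(free, flips) if f})


def exact_likelihood(genotype, pair_likelihood):
    """Sum the pair likelihood over every haplotype resolution."""
    return sum(pair_likelihood(h1, h2) for h1, h2 in resolutions(genotype))


def sampled_likelihood(genotype, pair_likelihood, n_samples, rng):
    """Monte Carlo estimate of the same sum from uniformly sampled resolutions."""
    free = het_sites(genotype)[1:]
    total = 0.0
    for _ in range(n_samples):
        flipped = {s for s in free if rng.random() < 0.5}
        total += pair_likelihood(*phase(genotype, flipped))
    # (number of resolutions) * (mean likelihood of a uniformly drawn resolution)
    return (2 ** len(free)) * total / n_samples


# Assumption: made-up haplotype frequencies replacing the coalescent likelihood.
TOY_FREQS = {(0, 1, 0): 0.4, (1, 1, 1): 0.3, (0, 1, 1): 0.2, (1, 1, 0): 0.1}


def toy_pair_likelihood(h1, h2):
    """Probability of an unordered haplotype pair under random mating (toy model)."""
    weight = 1 if h1 == h2 else 2
    return weight * TOY_FREQS.get(h1, 0.0) * TOY_FREQS.get(h2, 0.0)


genotype = [(0, 1), (1, 1), (0, 1)]  # three sites, two of them heterozygous
exact = exact_likelihood(genotype, toy_pair_likelihood)
estimate = sampled_likelihood(genotype, toy_pair_likelihood, 20000, random.Random(0))
print(exact, estimate)
```

With many heterozygous sites the exact sum has too many terms to enumerate, which is why a sampled estimate of this kind becomes attractive.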

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project (R01)
Project #: 5R01HG001989-02
Application #: 6138900
Study Section: Special Emphasis Panel (ZRG2-GNM (02))
Program Officer: Brooks, Lisa
Project Start: 1999-01-01
Project End: 2001-12-31
Budget Start: 2000-01-01
Budget End: 2000-12-31
Support Year: 2
Fiscal Year: 2000
Total Cost: $171,625
Indirect Cost:

Name: University of Washington
Department: Genetics
Type: Schools of Arts and Sciences
DUNS #: 605799469
City: Seattle
State: WA
Country: United States
Zip Code: 98195