Analyzing genetic effects on disease conferred by clusters of related, highly polymorphic genes poses a substantial problem: once an association has been observed, the true disease locus is difficult to identify. Understanding and accounting for the molecular genetic properties that operate across such a locus is essential for accurate interpretation of both disease association data and functional interactions between the molecules encoded in the major histocompatibility complex (MHC). Completion of the full MHC sequence has provided an invaluable tool for defining some of these properties. Using the 3.8 Mb sequence encompassing the MHC, we identified 443 microsatellite repeat sequences, each with a total length of 20 bp or greater. Of these, 249 proved polymorphic in a rapid screen that measured amplicon size differences in pooled DNA from 36 individuals. The class of repeat, exact nucleotide position, relationship to known genes, PCR conditions, and D6S numbers for the 249 polymorphic microsatellites were compiled and published as a resource for the MHC community. Comparing patterns of linkage disequilibrium (LD) between pairs of markers with the recombination fraction for the segment separating each pair can provide information about selective pressure to maintain linkage of specific combinations of alleles. Measurements of LD between pairs of HLA class I and II genes, particularly in the Centre d'Etude du Polymorphisme Humain (CEPH) families, have emerged over the past several years, revealing significant associations between loci separated by more than 1 Mb. Previous studies of recombination using segregation analysis have suggested that, overall, recombination across the MHC is lower than expected, although family material provides severely limited power for generating such estimates.
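As a rough illustration of the pairwise LD measures referred to above, the following sketch computes D, D' and r² from phased haplotype counts. It assumes two biallelic loci with alleles labeled A/a and B/b, which is a simplification (the microsatellites and HLA genes discussed here are multi-allelic), and the function name and counts are hypothetical rather than taken from the project.

```python
def pairwise_ld(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D_prime, r2) from phased haplotype counts at two
    biallelic loci. Illustrative only: real MHC markers are multi-allelic."""
    n = n_AB + n_Ab + n_aB + n_ab
    if n == 0:
        raise ValueError("at least one haplotype is required")

    p_AB = n_AB / n                 # frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / n         # frequency of allele A at locus 1
    p_B = (n_AB + n_aB) / n         # frequency of allele B at locus 2

    # D: observed haplotype frequency minus the expectation under independence
    d = p_AB - p_A * p_B

    # D' scales D by its maximum attainable magnitude given the allele frequencies
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = d / d_max if d_max > 0 else 0.0

    # r2 is the squared correlation between alleles at the two loci
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Made-up counts: A always travels with B, so LD is complete.
print(pairwise_ld(50, 0, 0, 50))
```

Values of D' or r² that stay high across a segment with appreciable recombination are the kind of pattern the text suggests may reflect selection to keep allele combinations together.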
Detailed mapping of recombination by sperm typing has been performed within a 200 kb segment of the class II region, around the genes DNA and DMB, showing a strong correlation between recombination fraction and LD values for pairs of markers over this short segment. To generate a reliable estimate of the frequency and distribution of recombination events across the entire MHC, we used single-sperm typing of the microsatellite markers described above. Genotyping of 20,031 single sperm from 12 individuals identified and fine-mapped 325 recombinant chromosomes, illustrating several key principles of recombination in this region: 1) the rate and distribution of recombination events can differ significantly between individuals; 2) intense recombination hotspots occur roughly every 1 Mb, but are not evenly spaced; 3) warmspots of recombination are probably scattered throughout the complex, since low levels of recombination occur fairly evenly across 100 kb segments between intense hotspots; and 4) specific sequence motifs associate significantly with the distribution of recombination. These data can now be used to select markers appropriately in genetic analyses such as disease association studies. Further, by considering LD values in light of recombination intensity, the data will enhance our ability to assess putative selection processes in the MHC. HLA allelic distributions across populations have been investigated through the International Histocompatibility Workshops. This organized transcontinental cooperation among HLA typing laboratories provides a unique and invaluable resource of HLA data that might not otherwise be available to the community. Historically, however, the data have been limited in typing resolution, in consistency of typing protocols across laboratories, and in the number of samples typed for some populations.
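The sperm-typing counts reported above translate into a pooled recombination fraction as sketched below. This is a minimal illustration, not the project's analysis: it pools all 12 donors even though rates are reported to differ between individuals, and the Wilson score interval is one reasonable choice of binomial confidence interval, assumed here rather than taken from the source.

```python
from math import sqrt


def recombination_fraction(n_recombinant, n_typed, z=1.96):
    """Return (fraction, lower, upper), where the bounds are a Wilson
    score interval (z=1.96 gives roughly 95% coverage)."""
    if n_typed <= 0 or not 0 <= n_recombinant <= n_typed:
        raise ValueError("counts must satisfy 0 <= n_recombinant <= n_typed, n_typed > 0")

    p = n_recombinant / n_typed
    z2 = z * z
    denom = 1 + z2 / n_typed
    centre = (p + z2 / (2 * n_typed)) / denom
    half = z * sqrt(p * (1 - p) / n_typed + z2 / (4 * n_typed ** 2)) / denom
    return p, centre - half, centre + half


# Counts from the text: 325 recombinants among 20,031 typed sperm.
theta, lower, upper = recombination_fraction(325, 20031)
print(f"pooled recombination fraction = {theta:.4f} "
      f"(95% CI {lower:.4f} to {upper:.4f})")
```

Applying the same function to each donor's counts separately would show how much the individual rates diverge from the pooled value.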
We have had the opportunity to genotype both the HLA class I and killer immunoglobulin-like receptor (KIR) loci in sets of samples (ranging in size from 23 to 129) from 29 worldwide populations. KIR and HLA typing has been performed on the majority of the 1612 samples sent to us, and we plan to complete typing shortly. Once genotyping is complete, we will determine KIR gene frequencies and HLA class I allele frequencies and estimate haplotypes within the individual populations. We have also received 320 DNA samples representing distinct tribes from rural southern Cameroon that will be included in our population analysis of these genes. Considering these data in terms of the geographic location of the various populations may provide clues to selection processes that have acted differentially across the populations.
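The frequency estimates planned above could be computed along the following lines. This sketch makes assumptions the abstract does not state: KIR typing is treated as presence/absence of each gene, so the gene frequency is derived from the carrier frequency under Hardy-Weinberg equilibrium, and HLA allele frequencies are obtained by direct counting. Function names and sample data are hypothetical.

```python
from collections import Counter
from math import sqrt


def kir_gene_frequency(n_carriers, n_typed):
    """Estimate gene frequency from presence/absence typing, assuming
    Hardy-Weinberg equilibrium: f = 1 - sqrt(1 - carrier frequency)."""
    if n_typed <= 0 or not 0 <= n_carriers <= n_typed:
        raise ValueError("counts must satisfy 0 <= n_carriers <= n_typed, n_typed > 0")
    carrier_frequency = n_carriers / n_typed
    return 1.0 - sqrt(1.0 - carrier_frequency)


def allele_frequencies(genotypes):
    """Direct-count allele frequencies from (allele1, allele2) genotypes."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: count / total for allele, count in counts.items()}


# Made-up example: 75 of 100 individuals carry a given KIR gene.
print(kir_gene_frequency(75, 100))

# Made-up HLA-A genotypes for two individuals.
print(allele_frequencies([("A*02", "A*02"), ("A*02", "A*24")]))
```

With population samples as small as 23 individuals, estimates of this kind carry wide sampling error, which is worth keeping in mind when comparing populations.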

National Institutes of Health (NIH)
Division of Basic Sciences - NCI (NCI)
Intramural Research (Z01)
Basic Sciences
United States