A major goal in human genetics is to ascertain the relationship between DNA sequence variation and phenotypic variation. The conventional, and still current, approach is to link phenotypic variation, such as a disease, to a map of anonymous genetic markers. Subsequently, fine-structure genetic mapping, haplotype analysis and sequencing of genes in the target interval are used to identify the relevant sequence change(s), following the classical positional cloning paradigm. However, not a single complex disease gene has been identified by positional cloning. On the other hand, with the contemplated sequencing of a reference human genome and identification of all human genes, studies of complex genetic disorders are expected to be more efficient if one systematically searches all human genes for functional variants by association and linkage disequilibrium studies. The relative merits of the two approaches depend on the relationship between functional variants in genes and neutral variation at linked genomic regions; they also depend on the nature and extent of sequence polymorphisms, and of linkage disequilibrium, in genes, intergenic regions and flanking sites. This proposal is a pilot study, and a collaborative effort among human geneticists, large-scale sequencers and population geneticists, which aims to elucidate the nature and pattern of sequence variation in the human genome. Specifically, the relationship between polymorphisms in human genes and linked variation in haplotypes, and the degree of linkage disequilibrium across large genomic segments, will be investigated. By studying multiple regions of the X chromosome in human samples and in some primate samples, we wish to ascertain the nature and frequency of sequence variation across large genomic segments. In addition, we shall determine the degree of linkage disequilibrium in these segments to assess the density of polymorphisms needed to allow association mapping for complex human diseases and traits.
To guide these studies, the following questions are posed: I. What is the frequency of genomic variation in contiguous sequences? II. What is the extent of linkage disequilibrium in contiguous sequences?
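As an illustration only, the two questions can be phrased as standard population-genetic summary statistics: nucleotide diversity (the mean number of pairwise differences per site) for question I, and the pairwise disequilibrium measures D, |D'| and r^2 for question II. The sketch below is not the proposal's analysis pipeline. It assumes phased haplotypes supplied as equal-length aligned strings and biallelic sites, and the function names and the toy `sample` data are hypothetical.

```python
from itertools import combinations


def segregating_sites(haplotypes):
    """Return the column indices at which the aligned haplotypes differ."""
    length = len(haplotypes[0])
    return [i for i in range(length) if len({h[i] for h in haplotypes}) > 1]


def nucleotide_diversity(haplotypes):
    """Mean number of pairwise differences per site (question I)."""
    pairs = list(combinations(haplotypes, 2))
    diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return diffs / (len(pairs) * len(haplotypes[0]))


def pairwise_ld(haplotypes, i, j):
    """Return (D, |D'|, r^2) for two biallelic sites i and j (question II)."""
    n = len(haplotypes)
    # Arbitrarily take the alleles of the first haplotype as the reference pair.
    a, b = haplotypes[0][i], haplotypes[0][j]
    p_a = sum(h[i] == a for h in haplotypes) / n
    p_b = sum(h[j] == b for h in haplotypes) / n
    p_ab = sum(h[i] == a and h[j] == b for h in haplotypes) / n
    d = p_ab - p_a * p_b
    # D_max depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


if __name__ == "__main__":
    # Toy data (hypothetical): four phased haplotypes over four sites.
    sample = ["ACGT", "ACGT", "TCGA", "TCGA"]
    sites = segregating_sites(sample)
    print("segregating sites:", sites)
    print("nucleotide diversity:", nucleotide_diversity(sample))
    for i, j in combinations(sites, 2):
        print("LD between sites", i, "and", j, ":", pairwise_ld(sample, i, j))
```

Plotting r^2 or |D'| against the physical distance between site pairs is one common way to summarize how quickly disequilibrium decays, which bears on the marker density needed for association mapping.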