With the recent completion of the human genome sequence, the sequencing of other vertebrate genomes has taken center stage. The sequencing of widely studied model organisms (e.g., mouse, rat, and zebrafish) is providing a valuable resource for future experimentation and important insights into vertebrate biology. Less clear is the relative value of other candidate genomes being considered for systematic sequencing, especially with regard to their potential contribution to the annotation and interpretation of the human genome sequence. To investigate such issues, we are generating large blocks of orthologous sequence from multiple vertebrates for detailed comparative analyses. Specifically, the same targeted genomic regions from multiple vertebrate species are being isolated in large-insert clones and then sequenced. Efficient methods for designing orthologous hybridization probes and isolating bacterial artificial chromosome (BAC) clones from the different species have been developed and implemented. Following characterization by several mapping methods, tiling paths of BACs are selected and systematically sequenced. In total, >250 Mb of comparative sequence data is being generated each year, in conjunction with the Physical Mapping Section of the NHGRI Genome Technology Branch. The establishment of this comparative sequence resource is facilitating the development of new computational tools for multi-species sequence comparisons, providing insight into the appropriate degree of sequence finishing that should be pursued when sequencing other vertebrate species, and revealing the benefits of sequencing species at a range of evolutionary distances from human. These efforts are focused extensively on the now-launched ENCODE project, which aims to identify all functional elements in a targeted 1% of the human genome. Indeed, the great majority of NISC sequencing over the past year has been dedicated to the ENCODE project.
In addition to its inter-species sequence comparisons, NISC is broadening its portfolio by implementing a second major sequence-production pipeline, one designed for performing intra-species (specifically, inter-human) sequence comparisons for medical research projects. In establishing and utilizing a PCR-based sequencing pipeline, NISC is once again capitalizing on its unique circumstance of being embedded within the broader NIH Intramural Program, with its outstanding clinical research infrastructure. Among the currently planned NISC medical sequencing projects is a large effort that directly interfaces with well-established clinical researchers at other NIH Institutes and utilizes the NIH Clinical Center to study the molecular basis for common human diseases, with an emphasis on the detection and study of rare disease-associated variants. Together, the two pipelines of the NISC Comparative Sequencing Program should continue to produce data at the cutting edge of genomics research, exploring how large-scale DNA sequencing can be used to characterize the human genome and to understand the genetic basis for human health and disease.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Intramural Research (Z01)
Project #: 1Z01HG000196-07
Application #: 7594318
Study Section:
Project Start:
Project End:
Budget Start:
Budget End:
Support Year: 7
Fiscal Year: 2007
Total Cost: $7,457,180
Indirect Cost:

Name: National Human Genome Research Institute
Department:
Type:
DUNS #:
City:
State:
Country: United States
Zip Code:

ENCODE Project Consortium (2007) Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature 447:799-816
Nikolaev, Sergey; Montoya-Burgos, Juan I; Margulies, Elliott H et al. (2007) Early history of mammals is elucidated with the ENCODE multiple species sequencing data. PLoS Genet 3:e2
Margulies, Elliott H; Cooper, Gregory M; Asimenos, George et al. (2007) Analyses of deep mammalian sequence alignments and constraint predictions for 1% of the human genome. Genome Res 17:760-74
Zhang, Wei; Bouffard, Gerard G; Wallace, Susan S et al. (2007) Estimation of DNA sequence context-dependent mutation rates using primate genomic sequences. J Mol Evol 65:207-14
Hurle, Belen; Swanson, Willie; NISC Comparative Sequencing Program et al. (2007) Comparative sequence analyses reveal rapid and divergent evolutionary changes of the WFDC locus in the primate lineage. Genome Res 17:276-86
Cretekos, Chris J; Deng, Jian-Min; Green, Eric D et al. (2007) Isolation, genomic structure and developmental expression of Fgf8 in the short-tailed fruit bat, Carollia perspicillata. Int J Dev Biol 51:333-8
Keebaugh, Alaine C; Sullivan, Robert T; NISC Comparative Sequencing Program et al. (2007) Gene duplication and inactivation in the HPRT gene family. Genomics 89:134-42
Morin, Ryan D; Chang, Elbert; Petrescu, Anca et al. (2006) Sequencing and analysis of 10,967 full-length cDNA clones from Xenopus laevis and Xenopus tropicalis reveals post-tetraploidization transcriptome remodeling. Genome Res 16:796-803
Margulies, Elliott H; Chen, Christina W; Green, Eric D (2006) Differences between pair-wise and multi-sequence alignment methods affect vertebrate genome comparisons. Trends Genet 22:187-93
She, Xinwei; Liu, Ge; Ventura, Mario et al. (2006) A preliminary comparative analysis of primate segmental duplications shows elevated substitution rates and a great-ape expansion of intrachromosomal duplications. Genome Res 16:576-83

Showing the most recent 10 out of 38 publications