As we enter the era of 'personalized genomics', there is enormous capacity for data generation by current technologies, and even greater potential with new methods. The Baylor College of Medicine-Human Genome Sequencing Center (BCM-HGSC) will lead this next phase with technical innovations and high-throughput data production. These efforts will continue a history of outstanding production efficiencies, premier data quality, major achievements in cost reduction, a stellar track record in creating and integrating technologies, and experience in driving community engagement. A full menu of project options will be offered to the NHGRI; however, the BCM-HGSC is primarily motivated to advance human genetics. We will therefore prioritize sequencing of primates and other close human relatives, and use novel sequencing platforms for the identification of inherited 'functional' mutations in human populations. The primate sequencing will benefit from all the advances in sequencing and assembly strategy, such as new software and BAC pooling, to ensure high-quality and complete reconstruction of the genomes. By conservative projections, >200 Gb of raw sequence data will be generated in the first year of this proposal. A Functional Mutation Discovery (FMD) project will identify the majority of putative functional mutations in all 20,000 human genes, in 1,800 people. The FMD project can proceed at this scale (nearly one billion 'haplicons') because the technical challenges are ideally suited to take advantage of new sequencing technologies and platforms, particularly the platform offered by 454 Life Sciences. By focusing on samples drawn from diverse populations, this project will populate a database for construction of libraries of genotyping probes that will be used by the wider community of individual investigators attempting disease gene discovery. The HGSC's unique position in the Texas Medical Center ensures that the mutation discovery will be geared to disease gene studies.
Genome sequencing targets will also include insects and metagenomes, cancer cells, specific diseases, and wide screens to discover genetic variants in humans and other species. In addition, 960 Mb of targeted finishing and long-segment 'genome refinement' will be undertaken. Each project will emphasize 'complete packages', so that cDNAs and SNPs will be identified and research communities will be engaged. The program will rely heavily on the Genboree tool for integrated genomics and will use its unique architecture to enable proper management of sensitive DNA sequence data from patient sources.