While the Human Genome Project was completed in April of 2003, work on the final product of that project, the reference human genome sequence, has continued. The reference genome has proved to be a critical resource for the entire biomedical research community, but it still suffers from gaps, tiling path errors, and regions represented by uncommon alleles. Numerous genomic landscape studies (e.g., in cancer) have been made possible by the current state of the reference and have contributed to the understanding of the biology of diseases such as AML, lung adenocarcinoma, and breast cancer, to name a few. It has become apparent that analysis of an individual's whole genome sequence will give health care providers the ability to help diagnose disease and determine treatment options. In order to continue to advance medical science and ultimately improve the effectiveness of healthcare, the reference human genome sequence must be improved. In addition, genome sequence data from many individuals must be added to the reference, so that it better reflects the genetic diversity of the human population and serves as an effective resource for all population groups. The long-term objective of the proposed project is to make the reference human genome sequence more accurate and more useful for a wider range of applications.
Specific aims include: the identification and resolution of all issues (misassemblies, sequence errors, and gaps), the addition of allelic diversity, the implementation of new software and analysis tools, and the development and deployment of community training materials.
The specific aims of the project will be achieved by aggressively identifying and cataloging errors and omissions in the current sequence, and by using a wide array of state-of-the-art methods, technologies, and resources to resolve them. All data and results will be immediately released to the research community via NCBI.

Public Health Relevance

The reference human genome sequence is a powerful tool for research. However, portions of the sequence remain incorrect or missing. We will remedy these problems and will also add ethnically diverse genome sequence data from several individuals, so that the reference genome sequence will be more useful for a broader range of experimental and clinical applications. In addition, we will provide community training.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Biotechnology Resource Cooperative Agreements (U41)
Project #: 5U41HG007635-03
Application #: 9102245
Study Section: Special Emphasis Panel (ZHG1-HGR-M (J2))
Program Officer: Felsenfeld, Adam
Project Start: 2014-09-15
Project End: 2017-06-30
Budget Start: 2016-07-01
Budget End: 2017-06-30
Support Year: 3
Fiscal Year: 2016
Total Cost: $2,778,951
Indirect Cost: $371,108

Name: Washington University
Department: Genetics
Type: Schools of Medicine
DUNS #: 068552207
City: Saint Louis
State: MO
Country: United States
Zip Code: 63130

Publications

Kronenberg, Zev N; Fiddes, Ian T; Gordon, David et al. (2018) High-resolution comparative analysis of great ape genomes. Science 360:
Fiddes, Ian T; Armstrong, Joel; Diekhans, Mark et al. (2018) Comparative Annotation Toolkit (CAT)-simultaneous clade and personal genome annotation. Genome Res 28:1029-1038
Cantsilieris, Stuart; Nelson, Bradley J; Huddleston, John et al. (2018) Recurrent structural variation, clustered sites of selection, and disease risk for the complement factor H (CFH) gene family. Proc Natl Acad Sci U S A 115:E4433-E4442
Schneider, Valerie A; Graves-Lindsay, Tina; Howe, Kerstin et al. (2017) Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly. Genome Res 27:849-864
Huddleston, John; Chaisson, Mark J P; Steinberg, Karyn Meltz et al. (2017) Discovery and genotyping of structural variation from long-read haploid genome sequence data. Genome Res 27:677-685
Dennis, Megan Y; Harshman, Lana; Nelson, Bradley J et al. (2017) The evolution and population diversity of human-specific segmental duplications. Nat Ecol Evol 1:69
Gordon, David; Huddleston, John; Chaisson, Mark J P et al. (2016) Long-read sequence assembly of the gorilla genome. Science 352:aae0344
Shi, Lingling; Guo, Yunfei; Dong, Chengliang et al. (2016) Long-read sequencing and de novo assembly of a Chinese genome. Nat Commun 7:12065
Mohajeri, Kiana; Cantsilieris, Stuart; Huddleston, John et al. (2016) Interchromosomal core duplicons drive both evolutionary instability and disease susceptibility of the Chromosome 8p23.1 region. Genome Res 26:1453-1467
Chaisson, Mark J P; Huddleston, John; Dennis, Megan Y et al. (2015) Resolving the complexity of the human genome using single-molecule sequencing. Nature 517:608-11

Showing the most recent 10 out of 13 publications