While the Human Genome Project was completed in April 2003, work on its final product, the reference human genome sequence, has continued. The reference genome has proved to be a critical resource for the entire biomedical research community, but it still suffers from gaps, tiling path errors, and regions represented by uncommon alleles. Numerous genomic landscape studies (e.g., of cancer) have been made possible by the current state of the reference and have contributed to the understanding of the biology of diseases such as AML, lung adenocarcinoma, and breast cancer. It has become apparent that analysis of an individual's whole genome sequence will give health care providers the ability to help diagnose disease and determine treatment options. In order to continue to advance medical science and ultimately improve the effectiveness of health care, the reference human genome sequence must be improved. In addition, genome sequence data from many individuals must be added to the reference so that it better reflects the genetic diversity of the human population and serves as an effective resource for all population groups. The long-term objective of the proposed project is to make the reference human genome sequence more accurate and more useful for a wider range of applications.
Specific aims include: the identification and resolution of all issues (misassemblies, sequence errors, and gaps), the addition of allelic diversity, the implementation of new software and analysis tools, and the development and deployment of community training materials.
The specific aims of the project will be achieved by aggressively identifying and cataloging errors and omissions in the current sequence, and by using a wide array of state-of-the-art methods, technologies, and resources to resolve them. All data and results will be immediately released to the research community via NCBI.
The reference human genome sequence is a powerful tool for research. However, portions of the sequence remain incorrect or missing. We will remedy these problems and add ethnically diverse genome sequence data from several individuals so that the reference genome sequence will be useful for a broader range of experimental and clinical applications. We will also provide community training.
Kronenberg, Zev N; Fiddes, Ian T; Gordon, David et al. (2018) High-resolution comparative analysis of great ape genomes. Science 360:
Fiddes, Ian T; Armstrong, Joel; Diekhans, Mark et al. (2018) Comparative Annotation Toolkit (CAT)-simultaneous clade and personal genome annotation. Genome Res 28:1029-1038
Cantsilieris, Stuart; Nelson, Bradley J; Huddleston, John et al. (2018) Recurrent structural variation, clustered sites of selection, and disease risk for the complement factor H (CFH) gene family. Proc Natl Acad Sci U S A 115:E4433-E4442
Schneider, Valerie A; Graves-Lindsay, Tina; Howe, Kerstin et al. (2017) Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly. Genome Res 27:849-864
Huddleston, John; Chaisson, Mark J P; Steinberg, Karyn Meltz et al. (2017) Discovery and genotyping of structural variation from long-read haploid genome sequence data. Genome Res 27:677-685
Dennis, Megan Y; Harshman, Lana; Nelson, Bradley J et al. (2017) The evolution and population diversity of human-specific segmental duplications. Nat Ecol Evol 1:69
Gordon, David; Huddleston, John; Chaisson, Mark J P et al. (2016) Long-read sequence assembly of the gorilla genome. Science 352:aae0344
Shi, Lingling; Guo, Yunfei; Dong, Chengliang et al. (2016) Long-read sequencing and de novo assembly of a Chinese genome. Nat Commun 7:12065
Mohajeri, Kiana; Cantsilieris, Stuart; Huddleston, John et al. (2016) Interchromosomal core duplicons drive both evolutionary instability and disease susceptibility of the Chromosome 8p23.1 region. Genome Res 26:1453-1467
Chaisson, Mark J P; Huddleston, John; Dennis, Megan Y et al. (2015) Resolving the complexity of the human genome using single-molecule sequencing. Nature 517:608-11