(Overall: Human Genome Reference Center) The human reference genome is the foundational resource upon which the framework of modern human genetics and genomics has been constructed. It is the analytical substrate for nearly all human genomics applications, including read alignment, variant detection, variant interpretation, functional annotation, population genetics, and epigenomic analysis. In a more basic sense, the reference genome also serves as a coordinate system for systematically reporting and comparing results across studies, and for cataloging the important genetic elements and variants that exist in humans. As genomic methods continue to march into the clinical realm, the reference genome will become increasingly important for genetic screening and precision medicine. Yet there is a growing sense that the current reference genome has become obsolete. The primary limitation is that the reference does not adequately represent genomic diversity in the human population, and this leads to reference biases that adversely affect the accuracy of genetic analyses. To solve this, it is necessary to build a reference pan-human genome (i.e., a pan-genome) that represents the full complement of common variants, haplotypes and functional elements that exist in our collective genomes. To accomplish this goal, we propose to form the WashU-UCSC-EBI Human Genome Reference Center. Starting with the genome assemblies generated by the data production center, we will create a high-quality map of sequence alignments and variants, and use the genome graph methods that we have pioneered to build a pan-genome resource that naturally represents genetic diversity. We will annotate the pan-genome for genes and other elements, and share this resource broadly and openly for public use. Working with the community, we will foster a new ecosystem of genome analysis tools that work with this new reference.
We will maintain and gradually improve the reference by soliciting user feedback and establishing scalable bioinformatic methods and targeted sequencing protocols for resolving errors and improving specific genomic regions. We further propose to form a logistical coordination center that efficiently organizes communication and collaborative activities at the level of the entire consortium, ensuring that all program components are working hand-in-hand. Finally, and perhaps most importantly from the standpoint of user adoption, we have devised an integrated pan-genome transition plan that involves broad community engagement via outreach and education at the level of tool developers and end users. Taken together, these efforts will create a new human genome reference, software ecosystem, and expert user base to support the next generation of human genetics and clinical practice.

Public Health Relevance

The human reference genome is a scientific data resource that is intended to be a standardized representation of our species' collective genome. It is crucial for human biomedical research because it is used by virtually all studies that incorporate genetic information. This project aims to update and improve the human reference genome by vastly expanding the number of individuals and human populations that are represented.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Biotechnology Resource Cooperative Agreements (U41)
Project #
5U41HG010972-02
Application #
10020425
Study Section
Special Emphasis Panel (ZHG1)
Program Officer
Felsenfeld, Adam
Project Start
2019-09-18
Project End
2024-07-31
Budget Start
2020-08-01
Budget End
2021-07-31
Support Year
2
Fiscal Year
2020
Total Cost
Indirect Cost
Name
Washington University
Department
Genetics
Type
Schools of Medicine
DUNS #
068552207
City
Saint Louis
State
MO
Country
United States
Zip Code
63130