We propose to create the world's premier database of genetic variants relevant to clinical care, the Clinically Relevant Genetic Variants Resource (CRVR). We will provide transparent data synthesis and consensus opinion on the clinical utility of a given genetic variant across a spectrum of genetic lesions, including single nucleotide changes, small indels, and structural variants. We will integrate with ClinVar, PharmGKB, and OMIM and draw upon NHGRI initiatives including the Genome Sequencing and Analysis and Mendelian Disorders Sequencing Centers, and the Clinical Sequencing Exploratory Research Centers. We will work closely with other CRVR sites and NHGRI-funded initiatives to improve deposition of data from clinical laboratories. Our database will be built through three Aims.
Aim 1 will engage and energize the clinical genomics community around CRVR efforts. We will partner with the other CRVR and U41 investigators, who will focus on engaging professional societies, clinical testing laboratories, and the broader clinical genomics community, to ensure that the CRVR resource meets anticipated community needs. This includes assembling Disease-Specific and Mutation Type Working Groups (DSWGs and MTWGs), composed of expert clinical geneticists and molecular diagnosticians, to establish metrics for the initial classification of variants and to integrate guidelines from professional organizations.
Aim 2 will involve creation of a CRVR Core Database (CoreDB) through expert review of the existing literature, locus databases, and NHGRI initiatives. We will disseminate consensus findings on clinically relevant genetic variants and the clinical implications of these variants, with supporting evidence and documentation of the consensus process. Information will be aggregated into the CoreDB using standard ontologies and advanced methodologies for handling heterogeneous data. The consensus of expert review will be disseminated to the appropriate clinical and research communities through a user-friendly web portal (vetted by the Genetic Counseling Working Group), web services for data mining, and consensus clinical guidelines. The results will be organized by gene, variant, disease, pathway, and literature. Supporting evidence will also be curated and disseminated, and the resource will be updated continuously as new information accumulates.
Aim 3 will involve deployment of machine-learning algorithms for semi-automatic identification of putative Clinically Relevant Variants (CRVs). We will undertake data mining of the clinical and epidemiological genetics literature and existing databases to identify putative clinically important variants. This will involve mining data from ClinVar, OMIM, CSER, and the Mendelian centers aggregated in Aim 2. The Working Groups formed in Aim 1 will establish criteria and oversee curators vetting variants. We will develop and optimize disease- and gene-specific machine learning algorithms to facilitate rapid classification of variants based on data provided by genetic testing services via ClinVar. We will integrate population-genetic data inferred from at least 25 reference populations from the 1000 Genomes Project and other large endeavors into our machine learning approaches so as to infer the global relevance of CRVs discovered here.
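As a rough illustration of how population-genetic data might feed into variant triage, the sketch below computes one simple feature: the highest allele frequency of a variant across reference populations. It is a plain rule-based stand-in, not the machine-learning approach proposed in Aim 3. The `Variant` record, the function names, the population labels, the example frequencies, and the 5% cutoff are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical cutoff (an assumption, not taken from the proposal): a variant
# more common than this in any population is not queued for review.
MAX_PLAUSIBLE_FREQ = 0.05


@dataclass
class Variant:
    gene: str
    name: str
    # Allele frequency per reference population, keyed by population label.
    pop_freqs: Dict[str, float]


def max_population_frequency(variant: Variant) -> float:
    """Return the highest allele frequency seen in any reference population."""
    return max(variant.pop_freqs.values(), default=0.0)


def needs_expert_review(variant: Variant, cutoff: float = MAX_PLAUSIBLE_FREQ) -> bool:
    """Flag a variant for curator review when it is rare in every population."""
    return max_population_frequency(variant) < cutoff


# Made-up example records; genes, names, and frequencies are placeholders.
candidates: List[Variant] = [
    Variant("GENE_A", "var1", {"POP1": 0.0002, "POP2": 0.0}),
    Variant("GENE_B", "var2", {"POP1": 0.01, "POP2": 0.12}),
]
review_queue = [v.name for v in candidates if needs_expert_review(v)]
print(review_queue)
```

In a learned classifier, a feature like this would be one input among many rather than a hard filter. Taking the maximum across populations, instead of a pooled average, keeps a variant that is common in a single population from appearing rare overall.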

Public Health Relevance

We propose to create a unified, public, and freely available database of genetic alterations relevant to clinical care. Our ultimate goal is to empower clinicians, genetic counselors, and patients to make informed decisions based on DNA testing. Because much of the information required for such decisions is scattered among public and private databases, we propose combining the medical literature, expert summary of millions of de-identified genetic tests, and results from current and past NIH-funded genetic studies into a single unified database.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Research Project--Cooperative Agreements (U01)
Study Section
Special Emphasis Panel (ZHG1-HGR-M (M2))
Program Officer
Ramos, Erin
Stanford University
Schools of Medicine
United States
Shringarpure, Suyash S; Bustamante, Carlos D; Lange, Kenneth et al. (2016) Efficient analysis of large datasets and sex bias with ADMIXTURE. BMC Bioinformatics 17:218
Ritter, Deborah I; Roychowdhury, Sameek; Roy, Angshumoy et al. (2016) Somatic cancer variant curation and harmonization through consensus minimum variant level data. Genome Med 8:117
Homburger, Julian R; Green, Eric M; Caleshu, Colleen et al. (2016) Multidimensional structure-function relationships in human β-cardiac myosin from population-scale genetic variation. Proc Natl Acad Sci U S A 113:6701-6
Hunter, Jessica Ezzell; Irving, Stephanie A; Biesecker, Leslie G et al. (2016) A standardized, evidence-based protocol to assess clinical actionability of genetic disorders associated with genomic variation. Genet Med 18:1258-1268
Bagley, Steven C; Sirota, Marina; Chen, Richard et al. (2016) Constraints on Biological Mechanism from Disease Comorbidity Using Electronic Medical Records and Database of Genetic Variants. PLoS Comput Biol 12:e1004885
Saliba, Jason; Zabriskie, Ryan; Ghosh, Rajarshi et al. (2016) Pharmacogenetic characterization of naturally occurring germline NT5C1A variants to chemotherapeutic nucleoside analogs. Pharmacogenet Genomics 26:271-9
Overby, C L; Heale, B; Aronson, S et al. (2016) Providing Access to Genomic Variant Knowledge in a Healthcare Setting: A Vision for the ClinGen Electronic Health Records Workgroup. Clin Pharmacol Ther 99:157-60
Rehm, Heidi L; Berg, Jonathan S; Brooks, Lisa D et al. (2015) ClinGen--the Clinical Genome Resource. N Engl J Med 372:2235-42
Costa, Helio A; Leitner, Michael G; Sos, Martin L et al. (2015) Discovery and functional characterization of a neomorphic PTEN mutation. Proc Natl Acad Sci U S A 112:13976-81
Kirkpatrick, Brianne E; Riggs, Erin Rooney; Azzariti, Danielle R et al. (2015) GenomeConnect: matchmaking between patients, clinical laboratories, and researchers to improve genomic knowledge. Hum Mutat 36:974-8