se GWAS data acquired from scientific publications, and to give the results structure, in order to summarize research findings for a broad scientific community. The Catalog is used by a growing community of biologists and bioinformaticians worldwide. Over the next five years, the Catalog will continue to provide the most thoroughly curated resource for human variation data, by engaging journals in data recruitment and by allowing co-submission and data transfer from other resources such as dbGaP and the EGA. To underpin the Catalog's relevance, a multi-stranded approach combining data generation, infrastructure development and liaison with the Catalog's user community will be adopted.
The first Aim for the next five years is to continue to deliver the Catalog as a community resource with high-quality content. The curation system will evolve from manual curation towards identification of data for automated extraction, review of submitted metadata, support for author deposition, and the development of supporting QC processes.
In Aim 2, the scope of the Catalog will be broadened to include new GWAS study designs, additional associated data, and emerging technologies. The Catalog's eligibility criteria will ensure alignment with current research and the needs of the user community, but will be monitored and re-evaluated as needed. Building on previous pilots, the focus of Aim 2 will be on the inclusion of targeted array data and other genotyping methods, such as sequencing or imputation from family members.
In Aim 3, the Catalog will be delivered as a scalable and sustainable resource for the future, which will allow for an extended scope of data. The development and promotion of standard formats for GWAS study design and results will be critical to ensure an efficient process for incorporating data into the Catalog. Authors will be encouraged to submit all SNP-trait associations, irrespective of p-value: this will vastly expand the depth of data available and the utility of the Catalog. The manual curation system will be redeveloped, with process automation to increase curator efficiency. Curation resources will be allocated to prioritize studies with the highest utility, thereby expediting the publication of these data in the Catalog. Finally, the Catalog's resources, interfaces, and data access will be improved for all researchers by enhancing data representation, search functionality, data visualization and integration with data from other relevant resources. User needs will be identified through surveys and combined with feedback from other communication routes; existing data curation processes will then be modified to improve data representation, visualization, access and versatility. The continuation of the Catalog, as the main resource for data published on diseases with complex genetic traits, is of crucial importance to the biomedical research community, as it offers a more efficient and effective way to understand, prevent, or cure diseases such as cardiovascular conditions, cancer and diabetes.

Public Health Relevance

GWAS renewal "Establishing the GWAS Catalog as a resource for large-scale association studies". The NHGRI-EBI GWAS Catalog summarizes research results on human genetic variation from the scientific literature to provide quick access to the latest findings on complex diseases, such as heart disease, bipolar disorder or schizophrenia. These results are then used to inform future research and experiments to determine disease mechanisms, pathways and potential drug targets. The Catalog provides fundamental knowledge and is used by research and drug development communities for the benefit of human health.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Biotechnology Resource Cooperative Agreements (U41)
Project #: 5U41HG007823-05
Application #: 9565630
Study Section: Special Emphasis Panel (ZHG1)
Program Officer: Wiley, Kenneth L
Project Start: 2014-09-01
Project End: 2022-06-30
Budget Start: 2018-07-01
Budget End: 2019-06-30
Support Year: 5
Fiscal Year: 2018
Total Cost:
Indirect Cost:
Name: European Molecular Biology Laboratory
Department:
Type:
DUNS #: 321691735
City: Heidelberg
State:
Country: Germany
Zip Code: 69117
Morales, Joannella; Welter, Danielle; Bowler, Emily H et al. (2018) A standardized framework for representation of ancestry data in genomics studies, with application to the NHGRI-EBI GWAS Catalog. Genome Biol 19:21
Newman, Victoria; Moore, Benjamin; Sparrow, Helen et al. (2018) The Ensembl Genome Browser: Strategies for Accessing Eukaryotic Genome Data. Methods Mol Biol 1757:115-139
MacArthur, Jacqueline; Bowler, Emily; Cerezo, Maria et al. (2017) The new NHGRI-EBI Catalog of published genome-wide association studies (GWAS Catalog). Nucleic Acids Res 45:D896-D901
Ruffier, Magali; Kähäri, Andreas; Komorowska, Monika et al. (2017) Ensembl core software resources: storage and programmatic access for DNA sequence and genome annotation. Database (Oxford) 2017:
Yates, Andrew; Akanni, Wasiu; Amode, M Ridwan et al. (2016) Ensembl 2016. Nucleic Acids Res 44:D710-6
Cunningham, Fiona; Moore, Barry; Ruiz-Schultz, Nicole et al. (2015) Improving the Sequence Ontology terminology for genomic variant annotation. J Biomed Semantics 6:32
Welter, Danielle; MacArthur, Jacqueline; Morales, Joannella et al. (2014) The NHGRI GWAS Catalog, a curated resource of SNP-trait associations. Nucleic Acids Res 42:D1001-6