This proposal describes plans to maintain and extend the UCSC Genome Browser at https://genome.ucsc.edu/ and its related web pages, databases, and computer programs. The web-based Genome Browser tool has been used by hundreds of thousands of biomedical researchers who seek to understand the human genome and those of other animals, particularly vertebrates and model organisms. The browser aggregates the research results of hundreds of biomedical labs (including a wide range of biochemical assays, genetic studies, curational efforts, sequencing projects, computer analyses, and text-mining of scientific literature) into a series of tracks aligned to the underlying DNA sequence. The genome provides a natural integration framework for these diverse data sources, which the browser exploits to its fullest at a variety of display scales ranging from the single base to individual genes, entire chromosomes, and ultimately to the genome as a whole. The Genome Browser is implemented using robust, fast, high-quality software capable of handling over one million hits per day. This web software provides a window into an exceptionally detailed and well-documented database that can be queried computationally as well as browsed graphically. The database is loaded with a suite of programs, developed both at UCSC and elsewhere, capable of distilling huge genomics data sets into high-quality annotations of the genome. Significant engineering effort is invested to ensure the quality of the software and data sets, including those developed by external contributors.
We aim to extend the software and database in significant ways. We plan to develop displays for personal, family, and tumor genomes that will help researchers separate significant mutations from the background of other genetic variation. We will make it possible to view together, in a single window, regions that are separate in the linear structure of the chromosome but in close proximity in the higher-order three-dimensional structure, or that are related by biochemical pathways, homology, or other relationships. We plan to develop a version of the browser optimized for mobile devices such as smartphones and tablets. We will improve the search capabilities we offer, particularly for large remote datasets. We will continue to import genome assemblies for species of biomedical interest, and integrate a broad range of useful new genomic data from the scientific community, including the ENCODE project. We will collaborate in developing and deploying data exchange standards such as file formats, APIs, and controlled vocabularies that help other groups leverage our resource and extend the reach of the browser to any dataset in compliance with the APIs. We will support this work with effective scientific, project, and personnel management; a plan for broadly disseminating the software tools, libraries, source code, and data; and well-established training and outreach mechanisms.

Public Health Relevance

The UCSC Genome Browser is a web-based tool that helps biomedical scientists understand the human genome and the genomes of many other species. It integrates the research work of scientific labs worldwide into a series of annotation tracks aligned to the DNA sequence from the Human Genome Project and related genome-sequencing efforts. Through the use of the browser, scientists and doctors can better understand the functions of genomic regions and the consequences of DNA variations observed in individuals.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Biotechnology Resource Cooperative Agreements (U41)
Project #
2U41HG002371-18
Application #
9357892
Study Section
National Human Genome Research Institute Initial Review Group (GNOM)
Program Officer
Di Francesco, Valentina
Project Start
2001-07-12
Project End
2022-06-30
Budget Start
2017-09-04
Budget End
2018-06-30
Support Year
18
Fiscal Year
2017
Total Cost
Indirect Cost
Name
University of California Santa Cruz
Department
Engineering (All Types)
Type
Biomed Engr/Col Engr/Engr Sta
DUNS #
125084723
City
Santa Cruz
State
CA
Country
United States
Zip Code
95064
GTEx Consortium (2018) Erratum: Genetic effects on gene expression across human tissues. Nature 553:530
Dyke, Stephanie O M; Linden, Mikael; Lappalainen, Ilkka et al. (2018) Registered access: authorizing data access. Eur J Hum Genet 26:1721-1731
Howard, Jonathan M; Lin, Hai; Wallace, Andrew J et al. (2018) HNRNPA1 promotes recognition of splice site decoys by U2AF2 in vivo. Genome Res 28:689-698
Brozovic, Matija; Dantec, Christelle; Dardaillon, Justine et al. (2018) ANISEED 2017: extending the integrated ascidian database to the exploration and evolutionary comparison of genome-scale datasets. Nucleic Acids Res 46:D718-D725
Casper, Jonathan; Zweig, Ann S; Villarreal, Chris et al. (2018) The UCSC Genome Browser database: 2018 update. Nucleic Acids Res 46:D762-D769
Canver, Matthew C; Haeussler, Maximilian; Bauer, Daniel E et al. (2018) Integrated design, execution, and analysis of arrayed and pooled CRISPR genome-editing experiments. Nat Protoc 13:946-986
GTEx Consortium; Laboratory, Data Analysis & Coordinating Center (LDACC)—Analysis Working Group; Statistical Methods groups—Analysis Working Group et al. (2017) Genetic effects on gene expression across human tissues. Nature 550:204-213
Saha, Ashis; Kim, Yungil; Gewirtz, Ariel D H et al. (2017) Co-expression networks reveal the tissue-specific regulation of transcription and splicing. Genome Res 27:1843-1858
Tyner, Cath; Barber, Galt P; Casper, Jonathan et al. (2017) The UCSC Genome Browser database: 2017 update. Nucleic Acids Res 45:D626-D634
Vivian, John; Rao, Arjun Arkal; Nothaft, Frank Austin et al. (2017) Toil enables reproducible, open source, big biomedical data analyses. Nat Biotechnol 35:314-316

Showing the most recent 10 out of 41 publications