A Turnkey System for High-throughput Variant Discovery and Interpretation

Ding, Li; Dooling, David

Abstract

High-throughput sequencing (HTS) platforms are revolutionizing genomics and health research. The incredible throughput of new sequencing instruments has enabled sequencing of genomes, exomes, methylomes, and transcriptomes in both research and clinical settings. As the cost of DNA sequencing has plummeted, two important trends have become apparent. First, the cost of analysis, in terms of computing resources and personnel, will soon surpass the cost of data generation. This will increase the pressing demand for analytical algorithms that run faster, with fewer CPU/memory resources, while processing ever-growing data sets. Second, the advent of HTS technologies has put low-cost, high-throughput sequencing into the hands of small research labs and clinical investigators, groups that are not accustomed to dealing with this type and scale of data. These developments will undoubtedly yield an unprecedented number of new discoveries, clinical insights, and medical breakthroughs in the coming years, provided the outstanding issues of HTS data analysis (short read lengths, inherent errors, and sheer number of sequence reads) can be conclusively resolved. Until now, most HTS has taken place in large genome centers with teams of bioinformaticians and substantial computing infrastructures. There is an urgent need to make their analysis tools and next-generation pipelines available to the wider research community as packages that are easy to install and use. We have spent several years developing a computational framework and innovative tools for HTS data analysis, with a particular focus on the discovery and interpretation of genetic variants. Our goal in this proposal is to make these tools available to the wider community, both individually and as part of a complete informatics solution from alignment to detection to interpretation.
The solution we describe is flexible and powerful enough to be adopted by experienced laboratories, while at the same time providing high quality, push-button analysis of sequence data for those with little bioinformatics expertise. The framework will run in the cloud or on a single CPU, enabling researchers, educators, and clinicians to speed the transition from sequencing technology adoption to biological knowledge and clinical application.

Public Health Relevance

The promise of personalized medicine will only be realized when each individual's genetic code can be read and analyzed in the clinical setting. Unfortunately, the associated technologies will generate massive amounts of data that are difficult to analyze and interpret. The software described in this proposal will enable widespread and easy analysis and interpretation of genetic data, accelerating the overall understanding of genetic information and its application to human health.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project--Cooperative Agreements (U01)
Project #: 1U01HG006517-01
Application #: 8237076
Study Section: Special Emphasis Panel (ZHG1-HGR-M (O3))
Program Officer: Sofia, Heidi J

Project Start: 2012-02-01
Project End: 2015-12-31
Budget Start: 2012-02-01
Budget End: 2012-12-31
Support Year: 1
Fiscal Year: 2012
Total Cost: $805,000
Indirect Cost: $148,077

Institution

Name: Washington University
Department: Genetics
Type: Schools of Medicine
DUNS #: 068552207

City: Saint Louis
State: MO
Country: United States
Zip Code: 63130

Related projects


NIH 2015 U01 HG	A Turnkey System for High-throughput Variant Discovery and Interpretation Ding, Li / Washington University	$719,935
NIH 2014 U01 HG	A Turnkey System for High-throughput Variant Discovery and Interpretation Ding, Li / Washington University	$755,095
NIH 2013 U01 HG	A Turnkey System for High-throughput Variant Discovery and Interpretation Ding, Li / Washington University	$751,869
NIH 2012 U01 HG	A Turnkey System for High-throughput Variant Discovery and Interpretation Ding, Li; Dooling, David J. / Washington University	$805,000

Publications

Ellrott, Kyle; Bailey, Matthew H; Saksena, Gordon et al. (2018) Scalable Open Science Approach for Mutation Calling of Tumor Exomes Using Multiple Genomic Pipelines. Cell Syst 6:271-281.e7

Cao, Yanan; Zhou, Weiwei; Li, Lin et al. (2018) Pan-cancer analysis of somatic mutations across 21 neuroendocrine tumor types. Cell Res 28:601-604

Sengupta, Sohini; Sun, Sam Q; Huang, Kuan-Lin et al. (2018) Integrative omics analyses broaden treatment targets in human cancer. Genome Med 10:60

Huang, Kuan-Lin; Li, Shunqiang; Mertins, Philipp et al. (2017) Proteogenomic integration reveals therapeutic targets in breast cancer xenografts. Nat Commun 8:14864

Mashl, R Jay; Scott, Adam D; Huang, Kuan-Lin et al. (2017) GenomeVIP: a cloud platform for genomic variant discovery and interpretation. Genome Res 27:1450-1459

Wyczalkowski, Matthew A; Wylie, Kristine M; Cao, Song et al. (2017) BreakPoint Surveyor: a pipeline for structural variant visualization. Bioinformatics 33:3121-3122

Jones, K B; Barrott, J J; Xie, M et al. (2016) The impact of chromosomal translocation locus and fusion oncogene coding sequence in synovial sarcomagenesis. Oncogene 35:5021-32

Niu, Beifang; Scott, Adam D; Sengupta, Sohini et al. (2016) Protein-structure-guided discovery of functional mutations across 19 cancer types. Nat Genet 48:827-37

Ye, Kai; Wang, Jiayin; Jayasinghe, Reyka et al. (2016) Systematic discovery of complex insertions and deletions in human cancers. Nat Med 22:97-104

Manda, K R; Tripathi, P; Hsi, A C et al. (2016) NFATc1 promotes prostate tumorigenesis and overcomes PTEN loss-induced senescence. Oncogene 35:3282-92

Showing the most recent 10 out of 30 publications
