Robust Software Tools for Variant Identification and Functional Assessment

Marth, Gabor; Rocha Abecasis, Goncalo

Abstract

Rapid advances in genome sequencing techniques and throughput are providing scientists with increasingly detailed views of individual genomes, furthering our understanding of genetic variation in a wide array of organisms, of human population history and of the biology of Mendelian disorders and complex traits. Experiments that were until recently restricted to very large genome centers, such as the resequencing of human genomes, can now be carried out by a wide range of investigators. While these technological advances will enable many new discoveries in human and model organism genetics, they also pose formidable computational challenges. RFA-HG-10-018, entitled "Informatics Tools for High-Throughput Sequence Data Analysis", is intended to fund further development of existing software to ensure that any biological or biomedical research laboratory can benefit from advances in sequencing technologies. We have developed specialized, state-of-the-art tools for the processing and analysis of next generation sequence data. Our tools encompass many key steps in sequence data analysis, ranging from quality control, to read mapping, to the identification, genotyping and annotation of many classes of sequence variation, to downstream association analyses that seek to connect identified variants with organismal phenotypes. These tools have been used to support analysis of several large, challenging datasets including not only data from the 1000 Genomes Project but also >1000 whole genomes and >2500 exomes sequenced in medical sequencing projects. Here, we propose to develop these tools into easy-to-use, portable, well-documented packages and complete pipelines that facilitate biomedical research in a wide variety of settings. A key component of the proposal is the deployment of these tools in the Galaxy cloud, where they will be accessible to investigators without direct access to a local high-throughput computing and data storage facility.

Public Health Relevance

We are developing computer software to discover and interpret genetic differences between individual human genomes from DNA sequencing data. We are starting with existing computer programs and turning them into stable software packages that can be readily used by any biological laboratory. These methods will enhance the study of human genetic variability and the understanding of heritable human diseases.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project--Cooperative Agreements (U01)
Project #: 5U01HG006513-03
Application #: 8602845
Study Section: Special Emphasis Panel (ZHG1-HGR-M (O3))
Program Officer: Sofia, Heidi J

Project Start: 2012-02-01
Project End: 2015-12-31
Budget Start: 2014-01-01
Budget End: 2014-12-31
Support Year: 3
Fiscal Year: 2014
Total Cost: $865,480
Indirect Cost: $150,996

Institution

Name: Boston College
Department: Biology
Type: Schools of Arts and Sciences
DUNS #: 045896339

City: Chestnut Hill
State: MA
Country: United States
Zip Code: 02467

Related projects


NIH 2015 U01 HG	Robust Software Tools for Variant Identification and Functional Assessment Marth, Gabor T.; Abecasis, Goncalo / University of Utah	$941,051
NIH 2014 U01 HG	Robust Software Tools for Variant Identification and Functional Assessment Marth, Gabor T.; Rocha Abecasis, Goncalo / Boston College	$865,480
NIH 2014 U01 HG	Robust Software Tools for Variant Identification and Functional Assessment Marth, Gabor T.; Abecasis, Goncalo / University of Utah	$673,266
NIH 2014 U01 HG	Robust Software Tools for Variant Identification and Functional Assessment Marth, Gabor T.; Abecasis, Goncalo / University of Utah	$76,646
NIH 2013 U01 HG	Robust Software Tools for Variant Identification and Functional Assessment Marth, Gabor T.; Rocha Abecasis, Goncalo / Boston College	$969,494
NIH 2013 U01 HG	Robust Software Tools for Variant Identification and Functional Assessment Marth, Gabor T.; Rocha Abecasis, Goncalo / Boston College	$119,957
NIH 2013 U01 HG	Robust Software Tools for Variant Identification and Functional Assessment Marth, Gabor T.; Rocha Abecasis, Goncalo / Boston College	$94,570
NIH 2012 U01 HG	Robust Software Tools for Variant Identification and Functional Assessment Marth, Gabor T.; Rocha Abecasis, Goncalo / Boston College	$1,010,000

Publications

Chiang, Charleston W K; Marcus, Joseph H; Sidore, Carlo et al. (2018) Genomic history of the Sardinian population. Nat Genet 50:1426-1434

Than, Hein; Qiao, Yi; Huang, Xiaomeng et al. (2018) Ongoing clonal evolution in chronic myelomonocytic leukemia on hypomethylating agents: a computational perspective. Leukemia 32:2049-2054

Ostrander, Betsy E P; Butterfield, Russell J; Pedersen, Brent S et al. (2018) Whole-genome analysis for effective clinical diagnosis and gene discovery in early infantile epileptic encephalopathy. NPJ Genom Med 3:22

Ward, Alistair; Karren, Mary A; Di Sera, Tonya et al. (2017) Rapid clinical diagnostic variant investigation of genomic patient sequencing data with iobio web tools. J Clin Transl Sci 1:381-386

van den Berg, Marten E; Warren, Helen R; Cabrera, Claudia P et al. (2017) Discovery of novel heart rate-associated loci using the Exome Chip. Hum Mol Genet 26:2346-2363

Steri, Maristella; Orrù, Valeria; Idda, M Laura et al. (2017) Overexpression of the Cytokine BAFF and Autoimmunity Risk. N Engl J Med 376:1615-1626

Khorashad, J S; Tantravahi, S K; Yan, D et al. (2016) Rapid conversion of chronic myeloid leukemia to chronic myelomonocytic leukemia in a patient on imatinib therapy. Leukemia 30:2275-2279

Danjou, Fabrice; Zoledziewska, Magdalena; Sidore, Carlo et al. (2015) Genome-wide association analyses based on whole-genome sequencing in Sardinia provide insights into regulation of hemoglobin levels. Nat Genet 47:1264-71

Flickinger, Matthew; Jun, Goo; Abecasis, Gonçalo R et al. (2015) Correcting for Sample Contamination in Genotype Calling of DNA Sequence Data. Am J Hum Genet 97:284-90

Lo, Yancy; Kang, Hyun M; Nelson, Matthew R et al. (2015) Comparing variant calling algorithms for target-exon sequencing in a large sample. BMC Bioinformatics 16:75

Showing the most recent 10 out of 62 publications
