Whole-genome shotgun sequencing strategy and assembly

Jaffe, David

Abstract

Genome sequence, basic to biomedical research, is efficaciously produced by whole-genome shotgun (WGS) sequencing. Although WGS sequencing is a major NIH activity, we lack answers to fundamental questions about sequencing strategy and assembly of WGS data. Our work and the community's have focused on assembly of particular data sets and development of assembly algorithms. This grant focuses on mathematical underpinnings and rigorous analysis of genome sequencing and assembly, to improve our assembly tools and approaches. We will develop general methodology for optimally choosing specific sequencing strategies for new and varied organisms, fully exploiting data from emerging technologies. So that assembly is also optimal, we will develop algorithms that exploit the data's exact information content, retaining intrinsic ambiguity, and allowing assembly of genomes beyond current capabilities. We will develop strict internal consistency tests, guaranteeing accuracy and completeness of assembly units. A new assembly quality markup tool will label assembly regions from finished to inconsistent, by their inferred accuracy. This will guide finishing work (improving efficiency) and clearly describe reliability of particular assembly regions to end-users. In short, the work will produce better quality genome sequence at lower cost, marked to show reliability, thereby increasing utility for downstream analysis and laboratory experimentation.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project (R01)
Project #: 5R01HG003474-04
Application #: 7475279
Study Section: Special Emphasis Panel (ZRG1-BDMA (01))
Program Officer: Felsenfeld, Adam

Project Start: 2005-09-26
Project End: 2009-06-30
Budget Start: 2008-08-01
Budget End: 2009-06-30
Support Year: 4
Fiscal Year: 2008
Total Cost: $700,327
Indirect Cost

Institution

Name: Massachusetts Institute of Technology
Department
Type: Organized Research Units
DUNS #: 001425594

City: Cambridge
State: MA
Country: United States
Zip Code: 02139

Related projects


NIH 2013 R01 HG	Whole-genome shotgun sequencing strategy and assembly	Jaffe, David B. / Broad Institute, Inc.	$725,799
NIH 2012 R01 HG	Whole-genome shotgun sequencing strategy and assembly	Jaffe, David B. / Broad Institute, Inc.	$759,999
NIH 2011 R01 HG	Whole-genome shotgun sequencing strategy and assembly	Jaffe, David B. / Broad Institute, Inc.	$859,485
NIH 2009 R01 HG	Whole-genome shotgun sequencing strategy and assembly	Jaffe, David B. / Broad Institute, Inc.	$772,252
NIH 2008 R01 HG	Whole-genome shotgun sequencing strategy and assembly	Jaffe, David B. / Massachusetts Institute of Technology	$700,327
NIH 2008 R01 HG	Whole-genome shotgun sequencing strategy and assembly	Jaffe, David B. / Broad Institute, Inc.	$59,158
NIH 2007 R01 HG	Whole-genome shotgun sequencing strategy and assembly	Jaffe, David B. / Massachusetts Institute of Technology	$751,656
NIH 2006 R01 HG	Whole-genome shotgun sequencing strategy and assembly	Jaffe, David B. / Massachusetts Institute of Technology	$751,549
NIH 2005 R01 HG	Whole-genome shotgun sequencing strategy and assembly	Jaffe, David B. / Massachusetts Institute of Technology	$726,285

Publications

Weisenfeld, Neil I; Yin, Shuangye; Sharpe, Ted et al. (2014) Comprehensive variation discovery in single human genomes. Nat Genet 46:1350-5

Ross, Michael G; Russ, Carsten; Costello, Maura et al. (2013) Characterizing and measuring bias in sequence data. Genome Biol 14:R51

Goldberg, Jonathan M; Griggs, Allison D; Smith, Janet L et al. (2013) Kinannote, a computer program to identify and classify members of the eukaryotic protein kinase superfamily. Bioinformatics 29:2387-94

Amemiya, Chris T; Alföldi, Jessica; Lee, Alison P et al. (2013) The African coelacanth genome provides insights into tetrapod evolution. Nature 496:311-6

Ribeiro, Filipe J; Przybylski, Dariusz; Yin, Shuangye et al. (2012) Finished bacterial genomes from shotgun sequence data. Genome Res 22:2270-7

Williams, Louise J S; Tabbaa, Diana G; Li, Na et al. (2012) Paired-end sequencing of Fosmid libraries by Illumina. Genome Res 22:2241-9

Calvo, Sarah E; Compton, Alison G; Hershman, Steven G et al. (2012) Molecular diagnosis of infantile mitochondrial disease with targeted next-generation sequencing. Sci Transl Med 4:118ra10

Jones, Felicity C; Grabherr, Manfred G; Chan, Yingguang Frank et al. (2012) The genomic basis of adaptive evolution in threespine sticklebacks. Nature 484:55-61

Gnerre, Sante; MacCallum, Iain; Przybylski, Dariusz et al. (2011) High-quality draft assemblies of mammalian genomes from massively parallel sequence data. Proc Natl Acad Sci U S A 108:1513-8

Earl, Dent; Bradnam, Keith; St John, John et al. (2011) Assemblathon 1: a competitive assessment of de novo short read assembly methods. Genome Res 21:2224-41

Showing the most recent 10 out of 14 publications
