Advances in computational gene finding

Korf, Ian

Abstract

The technological achievements of the past 20 years have made sequencing a genome a relatively simple task. Decoding this information, however, has proved to be much more difficult, and is one of the great challenges for this century. Advancing our understanding of how genes are structured and regulated will eventually lead to novel therapeutics for combating cancer and other diseases, to cheaper and more nutritious food, to less wasteful materials and energy sources, and to a greater understanding of ourselves. Among the most enduring and important products of the genome era will be the catalogs of genes for each organism. Producing these catalogs is a difficult task even under the best of circumstances. The pace of genome sequencing continues to increase, and these new genomes represent a wealth of information if we can understand them. This proposal seeks to improve our knowledge of genomes by advancing the state of the art in computational gene finding. Our algorithms leverage new and previously untapped sources of information, and are expected to improve our ability to find both novel genes and genes with known homologs. Our specific plans include (a) automating the training of gene prediction programs for any genome, (b) developing the first algorithm that merges a generalized hidden Markov model for gene structure with a profile hidden Markov model for protein family structure, (c) creating the first gene finder that incorporates information about DNA duplex stability under superhelical stresses, (d) building new algorithms that take advantage of high-throughput transcript profiling technologies such as whole genome expression arrays and massively parallel sequencing methods, and (e) providing web-based applications and support via the Internet.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project (R01)
Project #: 5R01HG004348-05
Application #: 8105504
Study Section: Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer: Bonazzi, Vivien

Project Start: 2007-09-01
Project End: 2013-06-30
Budget Start: 2011-07-01
Budget End: 2013-06-30
Support Year: 5
Fiscal Year: 2011
Total Cost: $313,824

Institution

Name: University of California Davis
Department: Biochemistry
Type: Schools of Medicine
DUNS #: 047120084

City: Davis
State: CA
Country: United States
Zip Code: 95618

Related projects

NIH 2011 R01 HG	Advances in computational gene finding Korf, Ian F. / University of California Davis	$313,824
NIH 2011 R01 HG	Advances in computational gene finding Korf, Ian F. / University of California Davis	$171,839
NIH 2010 R01 HG	Advances in computational gene finding Korf, Ian F. / University of California Davis	$307,762
NIH 2009 R01 HG	Advances in computational gene finding Korf, Ian F. / University of California Davis	$301,816
NIH 2008 R01 HG	Advances in computational gene finding Korf, Ian F. / University of California Davis	$293,026
NIH 2007 R01 HG	Advances in computational gene finding Korf, Ian F. / University of California Davis	$290,000

Publications

Georges, Arthur; Li, Qiye; Lian, Jinmin et al. (2015) High-coverage sequencing and annotated assembly of the genome of the Australian dragon lizard Pogona vitticeps. Gigascience 4:45

Lott, Paul C; Korf, Ian (2014) StochHMM: a flexible hidden Markov model tool and C++ library. Bioinformatics 30:1625-6

Bradnam, Keith R; Fass, Joseph N; Alexandrov, Anton et al. (2013) Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species. Gigascience 2:10

Zhabinskaya, Dina; Benham, Craig J (2012) Theoretical analysis of competing conformational transitions in superhelical DNA. PLoS Comput Biol 8:e1002484

Ginno, Paul A; Lott, Paul L; Christensen, Holly C et al. (2012) R-loop formation is a distinctive characteristic of unmethylated human CpG island promoters. Mol Cell 45:814-25

Parra, G; Bradnam, K; Rose, Alan B et al. (2011) Comparative and functional analysis of intron-mediated enhancement signals reveals conserved features among plants. Nucleic Acids Res 39:5328-37

Zhabinskaya, Dina; Benham, Craig J (2011) Theoretical analysis of the stress induced B-Z transition in superhelical DNA. PLoS Comput Biol 7:e1001051

Blahnik, Kimberly R; Dou, Lei; Echipare, Lorigail et al. (2011) Characterization of the contradictory chromatin signatures at the 3' exons of zinc finger genes. PLoS One 6:e17121

Parra, Genis; Bradnam, Keith; Ning, Zemin et al. (2009) Assessing the gene space in draft genomes. Nucleic Acids Res 37:289-97
