Techniques are proposed for identifying full-length cDNA clones based on single-pass 5' EST data. The purpose of this classification is to select clones that are good candidates for full-insert sequencing and to find significant alternative transcripts that have not yet been described. The project is driven by a collaborative effort between the laboratories of Thomas Casavant and Bento Soares. The Soares lab is well known for its capability in producing high-quality cDNA libraries enriched for full-length mRNA transcripts. The Casavant lab has significant experience in managing and analyzing large amounts of EST data and in full-insert sequence assembly. The project will first further develop a pipeline to handle the specialized analysis unique to 5' ESTs from full-length-enriched libraries. The pipeline will use primarily homology-based methods to identify ESTs that should be selected for full-insert sequencing and assembly. Software will also be developed to identify ESTs that are candidates for full-insert sequencing but lack supporting evidence from homology to known genes. This has the potential to find interesting transcripts from previously uncharacterized genes and proteins. Finally, approaches that use existing genome-based prediction tools will be explored for their utility in correctly assigning clones using a combination of EST and genomic sequence data. The results from each method will be evaluated for effectiveness in selecting sequence-confirmed clones.
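As an illustration of the homology-based selection step described above, the following is a minimal sketch, not the project's actual pipeline. It assumes each 5' EST has already been aligned to a known protein on the forward strand (for example by a translated search), and that the alignment is summarized by 1-based start coordinates and an E-value. The `Hit` record, the function name `is_full_length_candidate`, and both thresholds are hypothetical choices made for this example. The idea is that a clone is a plausible full-length candidate when its 5' EST aligns near the N-terminus of the homolog and carries enough upstream sequence to contain the start codon.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One alignment of a 5' EST (query) to a known protein (subject).

    Hypothetical record for illustration; coordinates are 1-based and the
    alignment is assumed to be on the forward strand of the EST.
    """
    est_id: str
    query_start: int    # first aligned nucleotide in the EST
    subject_start: int  # first aligned residue in the known protein
    evalue: float


def is_full_length_candidate(hit, max_evalue=1e-10, max_subject_start=10):
    """Return True if the EST plausibly covers the homolog's start codon.

    Both thresholds are assumptions for this sketch, not published values.
    """
    # Require a significant match to the known protein.
    if hit.evalue > max_evalue:
        return False
    # Require the alignment to begin near the protein's N-terminus.
    if hit.subject_start > max_subject_start:
        return False
    # Nucleotides needed upstream of the alignment to reach the first codon.
    upstream_needed = 3 * (hit.subject_start - 1)
    # The EST must extend at least that far 5' of the aligned region.
    return hit.query_start - upstream_needed >= 1


if __name__ == "__main__":
    hits = [
        Hit("est_a", query_start=61, subject_start=1, evalue=1e-50),
        Hit("est_b", query_start=5, subject_start=120, evalue=1e-50),
    ]
    for h in hits:
        print(h.est_id, is_full_length_candidate(h))
```

ESTs that fail this test are not necessarily truncated; those without any significant homolog would need the non-homology-based methods that the abstract also proposes.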

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Postdoctoral Individual National Research Service Award (F32)
Project #
1F32HG002881-01A1
Application #
6738388
Study Section
Special Emphasis Panel (ZRG1-F08 (20))
Program Officer
Graham, Bettie
Project Start
2003-12-12
Project End
2006-12-11
Budget Start
2003-12-12
Budget End
2004-12-11
Support Year
1
Fiscal Year
2003
Total Cost
$41,608
Indirect Cost
Name
University of Iowa
Department
Type
Organized Research Units
DUNS #
062761671
City
Iowa City
State
IA
Country
United States
Zip Code
52242
Kalari, Krishna R; Casavant, Melanie; Bair, Thomas B et al. (2006) First exons and introns--a survey of GC content and gene structure in the human genome. In Silico Biol 6:237-42
Bonaldo, Maria F; Bair, Thomas B; Scheetz, Todd E et al. (2004) 1274 full-open reading frames of transcripts expressed in the developing mouse nervous system. Genome Res 14:2053-63