Because of the sheer volume of raw sequence data now available, fast computational methods are needed to identify potentially interesting functional genomic elements. Large quantities of data on conserved features from related genomes are also accumulating; these likewise need fast computational classification methods, which could be used to select a few prime candidates for experimental functional verification. We propose a probabilistic model to computationally classify conserved regions from related genomes using a combined approach that incorporates structural (intrinsic) information together with comparative (extrinsic) information. Our method contains three models to account for three types of structural alignment: RNA, coding, or "something else". Because the algorithm is probabilistic, the models can be compared naturally, which allows us to classify conserved features into one of the three functional classes. We are mostly interested in using this algorithm as a screen for novel RNA gene identification, and ultimately will perform experimental verification of putative novel RNA genes predicted with this algorithm in model organisms such as C. elegans and E. coli.
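
The model comparison described above can be illustrated with a minimal sketch. It is not the proposed algorithm: it assumes that each of the three models has already assigned a log-likelihood to a given alignment, and it simply converts those scores into posterior class probabilities with Bayes' rule. The function name `classify`, the class labels, and the uniform default prior are all hypothetical choices made for illustration.

```python
import math

# Hypothetical labels for the three model classes named in the abstract.
CLASSES = ("RNA", "coding", "other")


def classify(log_likelihoods, priors=None):
    """Pick the most probable class for one alignment.

    log_likelihoods: dict mapping each class to log P(alignment | model).
    priors: dict mapping each class to P(model); uniform if omitted
            (an assumption of this sketch, not of the original method).
    Returns (best_class, posteriors).
    """
    if priors is None:
        priors = {c: 1.0 / len(CLASSES) for c in CLASSES}

    # Unnormalised log posterior: log P(data | model) + log P(model).
    joint = {c: log_likelihoods[c] + math.log(priors[c]) for c in CLASSES}

    # Normalise with the log-sum-exp trick to avoid underflow.
    m = max(joint.values())
    log_norm = m + math.log(sum(math.exp(v - m) for v in joint.values()))
    posteriors = {c: math.exp(joint[c] - log_norm) for c in CLASSES}

    best = max(posteriors, key=posteriors.get)
    return best, posteriors
```

Calling `classify({"RNA": -100.0, "coding": -105.0, "other": -110.0})` would select the class whose model explains the alignment best; because all three scores share one probabilistic scale, they can be compared directly, which is the property the abstract relies on.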

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Research Scientist Development Award - Research & Training (K01)
Project #
1K01HG002305-01
Application #
6321572
Study Section
Ethical, Legal, Social Implications Review Committee (GNOM)
Program Officer
Good, Peter J
Project Start
2001-05-01
Project End
2004-04-30
Budget Start
2001-05-01
Budget End
2002-04-30
Support Year
1
Fiscal Year
2001
Total Cost
$92,365
Indirect Cost
Name
Washington University
Department
Genetics
Type
Schools of Medicine
DUNS #
062761671
City
Saint Louis
State
MO
Country
United States
Zip Code
63130