The long-term objective of this research is to provide a software environment for representing and analyzing a range of problems in computational biology. The simulation tool will allow biologists to design models of protein families, compute multiple alignments, simulate protein interactions, and analyze DNA or RNA sequences, among other tasks. The methodology uses recent progress in machine learning algorithms to extract pertinent information from biological data. The interest of this approach stems from the richness and availability of sequence databases and from the lack of complete theories covering all the underlying biological mechanisms. Specifically, we use learning systems such as neural networks and hidden Markov models (HMMs) to parse DNA sequences and to construct models of protein families of clear medical interest. These families include immunoglobulins, kinases (involved in the regulation of basic cellular processes), G-protein-coupled receptors (involved in the transduction of signals carried by hormones and neurotransmitters), growth factors, and several retroviral proteins (such as HIV membrane proteins). These models provide new solutions to several computational problems, including multiple sequence alignment, motif detection, database searches, protein classification, and genome parsing. Such software tools have immediate commercial applications in biological laboratories and the biotechnology industry.
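
To make the idea of parsing a DNA sequence with an HMM concrete, the following is a minimal sketch of Viterbi decoding over a toy two-state model. It is not the project's software: the state names (`AT_rich`, `GC_rich`), all probabilities, and the function name `viterbi` are assumptions chosen only for illustration, and the abstract does not specify any particular architecture or parameters.

```python
import math

# Hypothetical two-state model; every number here is illustrative only.
STATES = ("AT_rich", "GC_rich")
START = {"AT_rich": 0.5, "GC_rich": 0.5}
TRANS = {
    "AT_rich": {"AT_rich": 0.9, "GC_rich": 0.1},
    "GC_rich": {"AT_rich": 0.1, "GC_rich": 0.9},
}
EMIT = {
    "AT_rich": {"A": 0.35, "C": 0.15, "G": 0.15, "T": 0.35},
    "GC_rich": {"A": 0.15, "C": 0.35, "G": 0.35, "T": 0.15},
}


def viterbi(seq):
    """Return the most probable state path for seq under the toy model."""
    if not seq:
        return []
    # score[s] = best log-probability of any path that ends in state s
    score = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    back = []  # one dict per later position: state -> best predecessor
    for symbol in seq[1:]:
        prev, score, pointers = score, {}, {}
        for s in STATES:
            # Pick the predecessor that maximizes the path score into s.
            best = max(STATES, key=lambda p: prev[p] + math.log(TRANS[p][s]))
            score[s] = (prev[best] + math.log(TRANS[best][s])
                        + math.log(EMIT[s][symbol]))
            pointers[s] = best
        back.append(pointers)
    # Trace back from the best final state to recover the full path.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]
```

With these illustrative parameters, `viterbi("ATATATATGCGCGCGC")` labels the first eight bases `AT_rich` and the last eight `GC_rich`. Log-probabilities are used so that long sequences do not underflow; a model of a protein family would use many more states and parameters learned from data rather than fixed by hand.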

Agency
National Institutes of Health (NIH)
Institute
National Library of Medicine (NLM)
Type
Small Business Innovation Research Grants (SBIR) - Phase I (R43)
Project #
1R43LM005780-01
Application #
2238174
Study Section
Special Emphasis Panel (ZRG7-SSS-2 (02))
Program Officer
Ye, Jane
Project Start
1994-09-20
Project End
1995-03-19
Budget Start
1994-09-20
Budget End
1995-03-19
Support Year
1
Fiscal Year
1994
Name
Net-ID, Inc.
City
San Francisco
State
CA
Country
United States
Zip Code
94107
Baldi, P; Brunak, S; Chauvin, Y et al. (1996) Naturally occurring nucleosome positioning signals in human exons and introns. J Mol Biol 263:503-10
Baldi, P; Chauvin, Y (1996) Hybrid modeling, HMM/NN architectures, and protein applications. Neural Comput 8:1541-65
Pedersen, A G; Baldi, P; Brunak, S et al. (1996) Characterization of prokaryotic and eukaryotic promoters using hidden Markov models. Proc Int Conf Intell Syst Mol Biol 4:182-91
Baldi, P; Brunak, S; Chauvin, Y et al. (1995) Periodic sequence patterns in human exons. Proc Int Conf Intell Syst Mol Biol 3:30-8
Baldi, P; Chauvin, Y (1995) Protein modeling with hybrid Hidden Markov Model/neural network architectures. Proc Int Conf Intell Syst Mol Biol 3:39-47