The long-term objective of this research is to provide a software environment for representing and analyzing a range of problems in computational biology. The tool will allow biologists to design models of protein families, compute multiple alignments, simulate protein interactions, and analyze DNA or RNA sequences, among other tasks. The methodology uses recent progress in machine learning algorithms to extract pertinent information from biological data. The approach is motivated by the richness and availability of sequence databases and by the lack of complete theories covering all the underlying biological mechanisms. Specifically, we use learning systems such as neural networks and hidden Markov models (HMMs) to parse DNA sequences and to construct models of protein families of clear medical interest. These families include immunoglobulins, kinases (involved in the regulation of basic cellular processes), G-protein-coupled receptors (involved in the transduction of signals carried by hormones and neurotransmitters), growth factors, and several retroviral proteins (such as HIV membrane proteins). These models provide new solutions to several computational problems, including multiple sequence alignment, motif detection, database searches, protein classification, and genome parsing. Such software tools have direct commercial applications in biological laboratories and the biotechnology industry.
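As a rough illustration of what parsing a DNA sequence with an HMM can mean, the sketch below runs Viterbi decoding on a toy two-state model. The state names, probabilities, and the `viterbi` helper are all assumptions made for this example; they are not taken from the project, whose models are learned from data and are far richer.

```python
import math

def viterbi(sequence, states, log_start, log_trans, log_emit):
    """Return the most probable hidden-state path for `sequence` (log space)."""
    if not sequence:
        return []
    # best[i][s]: log-probability of the best path that ends in state s at position i
    best = [{s: log_start[s] + log_emit[s][sequence[0]] for s in states}]
    back = []  # back[i][s]: best predecessor of state s at position i + 1
    for symbol in sequence[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][symbol]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state to recover the full path.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path

def logs(table):
    """Convert a dict of probabilities to log-probabilities."""
    return {key: math.log(value) for key, value in table.items()}

# Hypothetical toy model: two compositional regimes, with hand-picked numbers.
STATES = ["GC_rich", "AT_rich"]
LOG_START = logs({"GC_rich": 0.5, "AT_rich": 0.5})
LOG_TRANS = {
    "GC_rich": logs({"GC_rich": 0.9, "AT_rich": 0.1}),
    "AT_rich": logs({"GC_rich": 0.1, "AT_rich": 0.9}),
}
LOG_EMIT = {
    "GC_rich": logs({"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1}),
    "AT_rich": logs({"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4}),
}

if __name__ == "__main__":
    for base, state in zip("GGCGCGATATTA", viterbi("GGCGCGATATTA", STATES, LOG_START, LOG_TRANS, LOG_EMIT)):
        print(base, state)
```

In this toy setting the decoder labels each base with the regime that best explains it, given the cost of switching states. Models for exons, introns, or protein families follow the same decoding principle with many more states and with parameters estimated from sequence databases.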
Baldi, P; Brunak, S; Chauvin, Y et al. (1996) Naturally occurring nucleosome positioning signals in human exons and introns. J Mol Biol 263:503-10
Baldi, P; Chauvin, Y (1996) Hybrid modeling, HMM/NN architectures, and protein applications. Neural Comput 8:1541-65
Pedersen, A G; Baldi, P; Brunak, S et al. (1996) Characterization of prokaryotic and eukaryotic promoters using hidden Markov models. Proc Int Conf Intell Syst Mol Biol 4:182-91
Baldi, P; Brunak, S; Chauvin, Y et al. (1995) Periodic sequence patterns in human exons. Proc Int Conf Intell Syst Mol Biol 3:30-8
Baldi, P; Chauvin, Y (1995) Protein modeling with hybrid Hidden Markov Model/neural network architectures. Proc Int Conf Intell Syst Mol Biol 3:39-47