The principal aim of the course is to significantly enhance the productivity of experimental molecular biologists by training them in the computational techniques needed to extract the maximum amount of information from macromolecular sequences. Such sequences are a major source of data in molecular biology and genomics and are available to experimentalists from their own experimental work, from sequence databases, from structure databases, and from repositories for the complete genomes of individual species. The core techniques in which we will train experimentalists are collectively known as multiple sequence analysis and are a particularly powerful adjunct to the experimental technique of site-directed mutagenesis. We supplement these core techniques with methods for integrating the results of these analyses with known structures of proteins and nucleic acids.

The computational techniques are organized around methods for both global and local multiple sequence alignment and for identifying informative patterns in families of sequences, either based on these alignments or identified by analysis of unaligned sequences. Recent presentations of the weeklong course emphasized advanced techniques for discovering informative patterns in groups of sequences that have not been aligned. Also stressed were how to use such patterns to identify additional macromolecular sequences related to those under investigation, how to relate those patterns to structural motifs and their functional correlates, and how the information discovered can be used to guide laboratory experiments. Future editions of the course will also include information on the computational complexities of analyzing complete genomes and microarray data.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Continuing Education Training Grants (T15)
Project #
5T15HG000015-12
Application #
6665243
Study Section
Ethical, Legal, Social Implications Review Committee (GNOM)
Program Officer
Good, Peter J
Project Start
1991-04-03
Project End
2004-07-31
Budget Start
2003-08-06
Budget End
2004-07-31
Support Year
12
Fiscal Year
2003
Total Cost
$62,261
Indirect Cost
Name
Carnegie-Mellon University
Department
Biostatistics & Other Math Sci
Type
Schools of Arts and Sciences
DUNS #
052184116
City
Pittsburgh
State
PA
Country
United States
Zip Code
15213
Nicholas Jr, Hugh B; Ropelewski, Alexander J; Deerfield 2nd, David W (2002) Strategies for multiple sequence alignment. Biotechniques 32:572-4, 576, 578 passim
Elsik, C G; Williams, C G (2001) Families of clustered microsatellites in a conifer genome. Mol Genet Genomics 265:535-42
Pogue-Geile, K L; Greenberger, J S (2000) Effect of the irradiated microenvironment on the expression and retrotransposition of intracisternal type A particles in hematopoietic cells. Exp Hematol 28:680-9
Elsik, C G; Williams, C G (2000) Retroelements contribute to the excess low-copy-number DNA in pine. Mol Gen Genet 264:47-55
Nicholas Jr, H B; Deerfield 2nd, D W; Ropelewski, A J (2000) Strategies for searching sequence databases. Biotechniques 28:1174-8, 1180, 1182 passim
Dingwall, A; Garman, J D; Shapiro, L (1992) Organization and ordered expression of Caulobacter genes encoding flagellar basal body rod and ring proteins. J Mol Biol 228:1147-62