This project aims to acquire a 220-CPU Opteron distributed-memory computing cluster. The cluster will advance genome analysis, structural proteomics, and macromolecular imaging by linking the massive and exponentially growing data in each of these fields to the molecular basis of biological function. The Structural and Computational Biology and Molecular Biophysics (SCBMB) Program, which will use the cluster, encompasses six institutions: Baylor College of Medicine, Rice University, the University of Houston, the University of Texas MD Anderson, the University of Texas Health Sciences Center-Houston, and the University of Texas Medical Branch-Galveston.

The infrastructure will service the following research activities: Electron Cryo-Microscopic Reconstruction of Biological Assemblies at the National Center for Macromolecular Imaging (NCMI) (mainly Electron Cryomicroscopy and Image Reconstruction); Characterization of Functional Surfaces for Structural Proteomics (involving the Evolutionary Trace (ET) Method and High-Throughput Identification and Geometric Matching of Functional Surfaces); Comparative Genomics (including Parallel Comparison of DNA Sequences via Positional Hashing (Pash) and Comparative Sequence Assembly and Mapping); and Micro-RNA (miRNA), which regulate gene expression by binding complementary target mRNA.

Each research project has high computational demands that will be met by the proposed parallel environment. Deriving meaningful inferences from raw biological data requires intensive computation over large data sets. For example, to reconstruct 3-D images of macromolecular machines, the labs need to transform and auto-correlate gigabytes of voxels from electron cryomicroscope data; to identify functional sites in protein structures, the lab requires all-against-all comparisons of thousands of protein structures; and to identify mammalian genes and detect novel micro-RNAs, the labs require cross-comparisons of billions of base pairs of DNA sequence.
These applications share a common feature: repeated but relatively independent computations that can be split among many CPUs, with only a modest need for communication through a common file server. The requested cluster is well suited to workloads with these characteristics.
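As a rough illustration of this workload shape, the following minimal sketch splits an all-against-all comparison into independent chunks and merges the partial results. It is not the project's software: the function names (`kmers`, `similarity`, `split_pairs`, `run_chunk`, `all_against_all`) and the toy 3-mer similarity score are hypothetical, and a local thread pool stands in for cluster nodes that would exchange results through a shared file server.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def kmers(seq, k=3):
    """Return the set of length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similarity(a, b):
    """Toy stand-in for an expensive pairwise comparison (Jaccard index of 3-mers)."""
    ka, kb = kmers(a), kmers(b)
    union = ka | kb
    return len(ka & kb) / len(union) if union else 0.0


def split_pairs(n_items, n_workers):
    """Deal every unordered index pair (i, j) round-robin into n_workers chunks."""
    pairs = list(combinations(range(n_items), 2))
    return [pairs[w::n_workers] for w in range(n_workers)]


def run_chunk(items, chunk):
    """Process one chunk independently; no communication with other chunks."""
    return {(i, j): similarity(items[i], items[j]) for i, j in chunk}


def all_against_all(items, n_workers=4):
    """Run all chunks and merge their partial results.

    Assumption: the thread pool is only a local stand-in. On a cluster, each
    chunk would be a separate job writing its output to the shared file server.
    """
    chunks = split_pairs(len(items), n_workers)
    results = {}
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for partial in pool.map(lambda c: run_chunk(items, c), chunks):
            results.update(partial)
    return results


if __name__ == "__main__":
    sequences = ["ACGTACGT", "ACGTTTTT", "GGGGCCCC"]
    for pair, score in sorted(all_against_all(sequences, n_workers=2).items()):
        print(pair, round(score, 3))
```

Because each chunk depends only on its own pairs, the only coordination needed is the final merge, which is why modest communication through a common file server can suffice for this kind of job.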

Broader Impact: The infrastructure enhances the educational experience at the participating institutions. Students in the SCBMB graduate program (encompassing six institutions) will use the system, significantly enhancing their research opportunities. Baylor has outreach programs for undergraduates and for high school students. Workshops, software tool development, and technology transfer will serve to disseminate the results.

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Network Systems (CNS)
Type: Standard Grant (Standard)
Application #: 0420984
Program Officer: Rita V. Rodriguez
Project Start:
Project End:
Budget Start: 2004-09-01
Budget End: 2007-08-31
Support Year:
Fiscal Year: 2004
Total Cost: $300,000
Indirect Cost:
Name: Baylor College of Medicine
Department:
Type:
DUNS #:
City: Houston
State: TX
Country: United States
Zip Code: 77030