One of the fundamental goals of modern molecular medicine is to understand how the structure of biological macromolecules produces their function. The National Library of Medicine has, as one of its primary missions, the task of supporting technologies to represent, manage and manipulate information about biological structure. In the last two decades, a wealth of information has been accumulated about the structures and functions of hundreds of important molecules. The best views of molecular structure come from the high-resolution structures that are widely available through the Brookhaven Protein Data Bank. The structure entries in this database typically contain links to the primary literature. These links are not crucial, however, because the structures are very well defined, and in many ways self-validating. For the majority of biological molecules, however, high-resolution structures are not available. Instead, our understanding of their structure comes from multiple experimental, theoretical and statistical data sources that appear in the literature and provide important fragments of information. It is therefore critical that structural coordinate entries be tightly associated with relevant structural data (whether or not these data have been used to compute the structure or are consistent with it). The hypothesis of this work is that integrated information resources that contain both structural coordinates and the relevant available experimental data can be used to support (1) interactive evaluation of the consistency between structures and data, and (2) computation of new three-dimensional models that are maximally consistent with the available data.
To test this hypothesis, we propose to build a system called RiboWeb. The system will focus on the structure of the 30S ribosomal subunit in procaryotes. This critical cellular component initiates the translation of mRNA into protein. It is the site of action of numerous antibiotics, and a detailed understanding of its structure would shed light on its critical function. RiboWeb will be composed of (1) a standardized representation of the primary data relevant to the structure of the 30S subunit, (2) links to the Medline references reporting these data and the special-purpose databases containing ribosomal sequences and secondary structures, (3) a database of the previously proposed 30S structures, and (4) a software component that not only can test for compatibility and consistency between the primary data and the structural models, but also can compute new models based on user-specified interpretations of the primary data. Building upon our recent work in producing preliminary models of the 30S subunit, we propose to make this resource available to our collaborators in the field of ribosomal structural biology on the internet, and to test it by creating new models of the 30S subunit that better integrate the existing body of structural data. At the end of the grant period, RiboWeb will be a prototype for new structural information resources that tightly link coordinates with experimental (and other) data sources.
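
The abstract does not specify how RiboWeb represents data or scores agreement, so the following is only a minimal sketch of what a consistency test between primary data and a structural model could look like. It assumes, purely for illustration, that each piece of experimental data can be interpreted as an upper bound on the distance between two named components, and that a model is a mapping from component names to three-dimensional coordinates. The names `DistanceConstraint`, `violations` and `consistency_score`, the coordinates, and the distance bounds are all hypothetical and are not taken from RiboWeb.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class DistanceConstraint:
    """Hypothetical reading of one experimental observation as a distance bound."""
    a: str               # name of the first component
    b: str               # name of the second component
    max_distance: float  # upper bound on the separation, in angstroms
    source: str          # free-text pointer to where the observation was reported


def violations(model, constraints, tolerance=0.0):
    """Return (constraint, observed_distance) pairs that the model violates.

    Constraints that mention a component absent from the model are skipped,
    because they cannot be tested against that model.
    """
    found = []
    for c in constraints:
        if c.a not in model or c.b not in model:
            continue
        observed = math.dist(model[c.a], model[c.b])
        if observed > c.max_distance + tolerance:
            found.append((c, observed))
    return found


def consistency_score(model, constraints, tolerance=0.0):
    """Fraction of testable constraints satisfied, or None if none are testable."""
    testable = [c for c in constraints if c.a in model and c.b in model]
    if not testable:
        return None
    bad = violations(model, testable, tolerance)
    return 1.0 - len(bad) / len(testable)


# Toy data: the coordinates and bounds below are invented for the example.
model = {"S4": (0.0, 0.0, 0.0), "S5": (30.0, 0.0, 0.0), "S8": (0.0, 40.0, 0.0)}
constraints = [
    DistanceConstraint("S4", "S5", 35.0, "hypothetical observation 1"),
    DistanceConstraint("S4", "S8", 25.0, "hypothetical observation 2"),
    DistanceConstraint("S5", "S8", 60.0, "hypothetical observation 3"),
    DistanceConstraint("S4", "S20", 20.0, "hypothetical observation 4"),
]

for constraint, observed in violations(model, constraints):
    print(f"{constraint.a}-{constraint.b}: {observed:.1f} A exceeds "
          f"{constraint.max_distance:.1f} A ({constraint.source})")
print("consistency score:", consistency_score(model, constraints))
```

Keeping the `source` field on every constraint mirrors the abstract's emphasis on linking coordinates to the literature that reports the data: a flagged violation can be traced back to the observation that produced it. Computing new models, as opposed to checking existing ones, would require an optimization step that this sketch does not attempt.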

National Institutes of Health (NIH)
National Library of Medicine (NLM)
Research Project (R01)
Study Section: Biomedical Library and Informatics Review Committee (BLR)
Program Officer: Florance, Valerie
Stanford University
Internal Medicine/Medicine
Schools of Medicine
United States
Chang, Jeffrey T; Altman, Russ B (2004) Extracting and characterizing gene-drug relationships from the literature. Pharmacogenetics 14:577-86
Chang, Jeffrey T; Schutze, Hinrich; Altman, Russ B (2004) GAPSCORE: finding gene and protein names one word at a time. Bioinformatics 20:216-25
Mooney, Sean D; Altman, Russ B (2003) MutDB: annotating human variation with functionally relevant data. Bioinformatics 19:1858-60
Mooney, Sean D; Klein, Teri E; Altman, Russ B et al. (2003) A functional analysis of disease-associated mutations in the androgen receptor gene. Nucleic Acids Res 31:e42
Raychaudhuri, Soumya; Altman, Russ B (2003) A literature-based method for assessing the functional coherence of a gene group. Bioinformatics 19:396-401
Troyanskaya, Olga G; Dolinski, Kara; Owen, Art B et al. (2003) A Bayesian framework for combining heterogeneous data sources for gene function prediction (in Saccharomyces cerevisiae). Proc Natl Acad Sci U S A 100:8348-53
Troyanskaya, Olga G; Garber, Mitchell E; Brown, Patrick O et al. (2002) Nonparametric methods for identifying differentially expressed genes in microarray data. Bioinformatics 18:1454-61
Raychaudhuri, Soumya; Chang, Jeffrey T; Sutphin, Patrick D et al. (2002) Associating genes with gene ontology codes using a maximum entropy analysis of biomedical literature. Genome Res 12:203-14
Waugh, Allison; Gendron, Patrick; Altman, Russ et al. (2002) RNAML: a standard syntax for exchanging RNA information. RNA 8:707-17
Chang, Jeffrey T; Schutze, Hinrich; Altman, Russ B (2002) Creating an online dictionary of abbreviations from MEDLINE. J Am Med Inform Assoc 9:612-20

Showing the most recent 10 out of 16 publications