Human studies, encompassing interventional and observational studies, are among the most central and valuable activities in biomedical research. Human studies are the means of translating biomedical discoveries into clinical interventions, and of understanding how existing clinical interventions can be optimized for improved health outcomes. Because human studies are expensive, logistically complex, and labor intensive, it is vital that these studies be performed and used efficiently and effectively. Resources should be directed to scientifically promising understudied areas and away from unnecessarily duplicative research. Study results should be available for data mining, synthesis, re-analysis, and reuse. The "human study-ome," to coin a new term, should be standardized, computable, and national. However, current reuse of human studies data from different sources is prohibitively difficult. A major reason is the absence of a consensus human studies ontology that covers such entities as eligibility rules, outcome measures, and study design in the scientific depth and modeling rigor needed to enable data mining and discovery of integrated human studies data. In particular, an ontology of human studies design is critical for integrating raw patient-level observations because, without the context of the study design, the raw observations are not interpretable. Existing information models and data standards like BRIDG and CDISC SDTM are primarily operational or administrative in focus, and are insufficient for supporting search and inference of human studies data for scientific purposes. Our long-term goal is a federated, national, scientifically detailed database of the design and results of all human studies (the human study-ome) that is anchored by a well-formed common ontology bound to standard terminologies and information models.
This grant proposal will define detailed scientific use cases for this database, formalize the Ontology of Clinical Research (OCRe), and use OCRe to integrate human studies descriptions from the University of California San Francisco (UCSF), Mayo Clinic, and Washington University St. Louis (WU) over the caGrid federated query architecture. We will offer graphical user interfaces for scientists to ask detailed questions of study designs across the three institutions. Other institutions will have a pathway through i2b2, caBIG, or any grid-based platform to federate their human studies data through caGrid as well. Successful federation of human studies design data will lay the groundwork for future federation of participant-level data. We will also build and sustain an OCRe developer and user community with outreach to OBO Foundry, the National Center for Biomedical Ontology, and the BRIDG community. Thus, this project will demonstrate the value of, and provide the tools, technology, and best practices to achieve, a national computable scientific database of human studies. Such a database will be an incomparably rich resource for data mining, inferencing, and reuse of human studies data for clinical and translational research and discovery.

Public Health Relevance

People volunteer for clinical trials hoping to contribute to public knowledge about what works and what doesn't work to fight disease and to keep us healthy. The information from clinical trials is extremely valuable and scientists should be able to share that information widely so that as many scientists as possible can learn as much as possible from the trials. This project builds a system that allows scientists to share their clinical trial data and to use powerful computing technology to make new discoveries from the shared data.

National Institutes of Health (NIH)
National Center for Research Resources (NCRR)
Research Project (R01)
Study Section
Special Emphasis Panel (ZRG1-BST-G (51))
Program Officer
Brazhnik, Olga
University of California San Francisco
Internal Medicine/Medicine
Schools of Medicine
San Francisco
United States
Sim, Ida; Tu, Samson W; Carini, Simona et al. (2014) The Ontology of Clinical Research (OCRe): an informatics foundation for the science of clinical research. J Biomed Inform 52:78-91
Brinkley, James F; Detwiler, Landon T; Structural Informatics Group (2012) A query integrator and manager for the query web. J Biomed Inform 45:975-91
Boland, Mary Regina; Tu, Samson W; Carini, Simona et al. (2012) EliXR-TIME: A Temporal Knowledge Representation for Clinical Research Eligibility Criteria. AMIA Jt Summits Transl Sci Proc 2012:71-80
Sim, Ida; Carini, Simona; Tu, Samson W et al. (2012) Ontology-based federated data access to human studies information. AMIA Annu Symp Proc 2012:856-65
Wynden, Rob; Weiner, Mark G; Sim, Ida et al. (2010) Ontology mapping and data discovery for the translational investigator. AMIA Jt Summits Transl Sci Proc 2010:66-70
Sim, Ida; Carini, Simona; Tu, Samson et al. (2010) The human studies database project: federating human studies design data using the ontology of clinical research. AMIA Jt Summits Transl Sci Proc 2010:51-5
Sim, Ida; Chute, Christopher G; Lehmann, Harold et al. (2009) Keeping raw data in context. Science 323:713
Carini, Simona; Pollock, Brad H; Lehmann, Harold P et al. (2009) Development and evaluation of a study design typology for human research. AMIA Annu Symp Proc 2009:81-5