Biorepositories are critical to enabling modern molecular-based research that will drive the development of a new generation of targeted diagnostics and therapies as well as personalized medicine to improve clinical outcomes for patients. The use of data and biospecimen resources collected during research are constrained by the informed consent that research participants give to research teams, research protocol documents, and the constraints imposed on the research by the IRB itself. Currently there is a lack of a common model of consent that limits how easily data (and research specimens) from multiple research projects, or multiple institutions can be combined for large-scale retrospective studies. Manually examining and reconciling potentially millions of informed consent forms from different biobanks becomes an expensive and possibly irreconcilable problem. The application of suitable metadata in support of the complex set of regulatory, legal, privacy and security requirement processes and information flows involved in regulated research is a field in an early phases of development. The complicated legal and technical requirements involved in the processes challenge our ability to effectively build information systems that support sharing of research data, specimens and other research artifacts at scale. Additionally, much of the regulatory processes involved in research are still based on paper-based workflows. We posit that by developing suitable machine-based metadata representation of regulatory processes focusing on informed consents and the associated documents would enhance the ability of regulatory bodies such as Institutional Review Boards to 1) review proposals in a more streamlined fashion, 2) have the potential to provide a more comprehensive understanding of risk along multiple axes, and 3) provide a formal and computable basis for data sharing and information release policies. More specifically we will focus on three specific aims: 1) Develop standard-conforming metadata representations of informed consent; 2) Develop NLP-based automatic annotation tool for informed consent documents; and 3) Evaluate the metadata-ontology-based approach for semantically representing the domain and demonstrate its capacity for answering competency questions. Our proposed approach is novel for the following reasons: 1) it provides the first metadata ontology, to the best of our knowledge, to represent the informed consent space while considering the US common rules; 2) it combines natural language processing technologies with ontology-based approaches for semantic annotation as well as ontology enrichment; and 3) it engages stakeholders in the ontology development and evaluation process and uses competency questions to verify the coverage of the ontology.

Public Health Relevance

The application of suitable metadata in support of the complex set of regulatory, legal, privacy and security requirement processes and information flows involved in regulated research is a field in an early phases of development. The complicated legal and technical requirements involved in the processes challenge our ability to effectively build information systems that support sharing of research data, specimens and other research artifacts at scale. In response to RFA-CA-15-017, this proposed project will develop suitable machine-based metadata representation and automatic annotation of regulatory processes focusing on informed consents and the associated documents. The proposed approach will presumably enhance the ability of regulatory bodies such as Institutional Review Boards to (a) review proposals in a more streamlined fashion, (b) have the potential to provide a more comprehensive understanding of risk along multiple axes, and (c) provide a formal and computable basis for data sharing and information release policies.

Agency
National Institute of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Research Project--Cooperative Agreements (U01)
Project #
1U01HG009454-01
Application #
9161167
Study Section
Special Emphasis Panel (ZRG1-BST-U (50)R)
Program Officer
Sofia, Heidi J
Project Start
2016-09-28
Project End
2019-07-31
Budget Start
2016-09-28
Budget End
2017-07-31
Support Year
1
Fiscal Year
2016
Total Cost
$471,848
Indirect Cost
$122,099
Name
University of Texas Health Science Center Houston
Department
Biology
Type
Schools of Medicine
DUNS #
800771594
City
Houston
State
TX
Country
United States
Zip Code
77225
Amith, Muhammad; He, Zhe; Bian, Jiang et al. (2018) Assessing the practice of biomedical ontology evaluation: Gaps and opportunities. J Biomed Inform 80:1-13