Domain-specific common data elements (CDEs) are emerging as an effective approach to standards-based clinical research data storage and retrieval and have been broadly adopted. For example, the National Cancer Institute (NCI) created the Cancer Data Standards Repository (caDSR) based on the ISO/IEC 11179 standard for metadata repositories. However, the cancer clinical research community faces significant challenges related to scalability, governance, and data quality in CDE modeling. In particular, the lack of robust, principled, and automated quality assurance (QA) algorithms contributes to CDE content errors that can have a significant negative impact on downstream uses of CDEs. Our overall goal is to build a novel QA framework that overcomes methodological and computational challenges in error detection in CDE modeling, recognition of duplicate or similar CDEs, and CDE usability, thereby producing high-quality CDEs for cancer clinical research studies. Our proposed approach is to design, develop, and evaluate an integrative platform known as caCDE-QA that implements a suite of QA tools to audit experimental cancer study CDEs represented in a semantic web framework, and to deploy a QA web portal with standard semantic services for community collaboration.
Our specific aims are: (1) To develop a suite of QA tools for validation and harmonization of cancer study CDEs. (2) To apply the QA tools to audit experimental cancer study CDEs represented in a semantic web framework. We will also evaluate the performance of the QA tools in terms of efficiency, accuracy, and usability by comparing them with the baseline tools that exist in the NCI caDSR and Clinical Information Modeling Initiative (CIMI) communities. (3) To deploy and evaluate a QA web portal for collaborative CDE review and harmonization. We will coordinate community-based efforts to solicit requirements for cancer study CDE discovery and harmonization and to foster a specification of the common data element services (CDES) standard. We will disseminate and test the newly developed QA methods and tools in collaboration with the Clinical Data Interchange Standards Consortium (CDISC) and CIMI communities. This project will contribute novel QA methods and tools for validation and semantic harmonization of cancer study CDEs. Its significance lies in enabling efficient CDE modeling and producing high-quality, reusable CDEs, which are critical for facilitating cancer clinical research data sharing and accelerating the systematic capture of clinical outcomes.

Public Health Relevance

This project will build a novel quality assurance (QA) framework that overcomes methodological and computational challenges in error detection in the modeling of common data elements (CDEs), recognition of duplicate or similar CDEs, and CDE usability, thereby producing high-quality CDEs for cancer clinical research studies. The ultimate goal is to enable standardized sharing of cancer clinical research data and the systematic capture of clinical outcomes.

Agency: National Institutes of Health (NIH)
Institute: National Cancer Institute (NCI)
Type: Research Project--Cooperative Agreements (U01)
Project #: 1U01CA180940-01A1
Application #: 8765818
Study Section: Special Emphasis Panel (ZCA1-SRLB-4 (M1))
Program Officer: Vydelingum, Nadarajen A
Project Start: 2014-08-15
Project End: 2017-07-31
Budget Start: 2014-08-15
Budget End: 2015-07-31
Support Year: 1
Fiscal Year: 2014
Total Cost: $355,061
Indirect Cost: $130,609
Name: Mayo Clinic, Rochester
Department:
Type:
DUNS #: 006471700
City: Rochester
State: MN
Country: United States
Zip Code: 55905
Sharma, Deepak Kumar; Peterson, Kevin Jerrold; Hong, Na et al. (2018) The D2Refine Platform for the Standardization of Clinical Research Study Data Dictionaries: Usability Study. JMIR Hum Factors 5:e10205
Chen, Henry W; Du, Jingcheng; Song, Hsing-Yi et al. (2018) Representation of Time-Relevant Common Data Elements in the Cancer Data Standards Repository: Statistical Evaluation of an Ontological Approach. JMIR Med Inform 6:e7
Hong, Na; Prodduturi, Naresh; Wang, Chen et al. (2017) Shiny FHIR: An Integrated Framework Leveraging Shiny R and HL7 FHIR to Empower Standards-Based Clinical Data Applications. Stud Health Technol Inform 245:868-872
Li, Zheng; Hong, Na; Robertson, Melissa et al. (2017) Preoperative red cell distribution width and neutrophil-to-lymphocyte ratio predict survival in patients with epithelial ovarian cancer. Sci Rep 7:43001
Solbrig, Harold R; Prud'hommeaux, Eric; Grieve, Grahame et al. (2017) Modeling and validating HL7 FHIR profiles using semantic web Shape Expressions (ShEx). J Biomed Inform 67:90-100
Jiang, Guoqian; Kiefer, Richard; Prud'hommeaux, Eric et al. (2017) Building Interoperable FHIR-Based Vocabulary Mapping Services: A Case Study of OHDSI Vocabularies and Mappings. Stud Health Technol Inform 245:1327
Sharma, Deepak K; Solbrig, Harold R; Tao, Cui et al. (2017) Building a semantic web-based metadata repository for facilitating detailed clinical modeling in cancer genome studies. J Biomed Semantics 8:19
Sharma, Deepak K; Solbrig, Harold R; Prud'hommeaux, Eric et al. (2017) D2Refine: A Platform for Clinical Research Study Data Element Harmonization and Standardization. AMIA Jt Summits Transl Sci Proc 2017:259-267
Jiang, Guoqian; Kiefer, Richard C; Sharma, Deepak K et al. (2017) A Consensus-Based Approach for Harmonizing the OHDSI Common Data Model with HL7 FHIR. Stud Health Technol Inform 245:887-891
Peterson, Kevin J; Jiang, Guoqian; Brue, Scott M et al. (2017) Mining Hierarchies and Similarity Clusters from Value Set Repositories. AMIA Annu Symp Proc 2017:1372-1381

Showing the most recent 10 out of 21 publications