Domain-specific common data elements (CDEs) are emerging as an effective approach to standards-based clinical research data storage and retrieval and have been broadly adopted. For example, the National Cancer Institute (NCI) created the Cancer Data Standards Repository (caDSR) based on the ISO/IEC 11179 standard for metadata repositories. However, the cancer clinical research community faces significant challenges related to scalability, governance, and data quality in CDE modeling. In particular, the lack of robust, principled, and automated quality assurance (QA) algorithms contributes to CDE content errors that can have a significant negative impact on downstream CDE uses. Our overall goal is to build a novel QA framework to overcome methodological and computational challenges in detecting errors in CDE modeling, recognizing duplicate or similar CDEs, and improving CDE usability, thereby producing high-quality CDEs for cancer clinical research studies. Our proposed approach is to design, develop, and evaluate an integrative platform, caCDE-QA, that implements a suite of QA tools to audit experimental cancer study CDEs represented in a semantic web framework and deploys a QA web portal with standard semantic services for community collaboration.
Our specific aims are: (1) To develop a suite of QA tools for validation and harmonization of cancer study CDEs. (2) To apply the QA tools to audit experimental cancer study CDEs represented in a semantic web framework. We will also evaluate the efficiency, accuracy, and usability of the QA tools by comparing them with the baseline tools that exist in the NCI caDSR and Clinical Information Modeling Initiative (CIMI) communities. (3) To deploy and evaluate a QA web portal for collaborative CDE review and harmonization. We will coordinate community-based efforts to solicit requirements for cancer study CDE discovery and harmonization and to foster a specification of the common data element services (CDES) standard. We will disseminate and test the newly developed QA methods and tools in collaboration with the Clinical Data Interchange Standards Consortium (CDISC) and CIMI communities. This project will contribute novel QA methods and tools for validation and semantic harmonization of cancer study CDEs. Its significance is that it will enable efficient CDE modeling and produce high-quality, reusable CDEs, which are critical for facilitating the sharing of cancer clinical research data and accelerating the systematic capture of clinical outcomes.

Public Health Relevance

This project will build a novel quality assurance (QA) framework to overcome methodological and computational challenges in detecting errors in the modeling of common data elements (CDEs), recognizing duplicate or similar CDEs, and improving CDE usability, thereby producing high-quality CDEs for cancer clinical research studies. The ultimate goal is to enable standardized sharing of cancer clinical research data and systematic capture of clinical outcomes.

Agency: National Institutes of Health (NIH)
Institute: National Cancer Institute (NCI)
Type: Research Project--Cooperative Agreements (U01)
Project #: 1U01CA180940-01A1
Application #: 8765818
Study Section: Special Emphasis Panel (ZCA1-SRLB-4 (M1))
Program Officer: Vydelingum, Nadarajen A
Project Start: 2014-08-15
Project End: 2017-07-31
Budget Start: 2014-08-15
Budget End: 2015-07-31
Support Year: 1
Fiscal Year: 2014
Total Cost: $355,061
Indirect Cost: $130,609
Name: Mayo Clinic, Rochester
Department:
Type:
DUNS #: 006471700
City: Rochester
State: MN
Country: United States
Zip Code: 55905
Hoxha, Julia; Jiang, Guoqian; Weng, Chunhua (2016) Automated learning of domain taxonomies from text using background knowledge. J Biomed Inform 63:295-306
Hong, Na; Pathak, Jyotishman; Chute, Christopher G et al. (2016) Developing a modular architecture for creation of rule-based clinical diagnostic criteria. BioData Min 9:33
Zimmermann, Michael T; Jiang, Guoqian; Wang, Chen (2016) Single-sample expression-based chemo-sensitivity score improves survival associations independently from genomic mutations for ovarian cancer Patients. AMIA Jt Summits Transl Sci Proc 2016:94-100
Hong, Na; Li, Dingcheng; Yu, Yue et al. (2016) A computational framework for converting textual clinical diagnostic criteria into the quality data model. J Biomed Inform 63:11-21
Jiang, Guoqian; Solbrig, Harold R; Pathak, Jyotishman et al. (2015) Developing a Standards-Based Information Model for Representing Computable Diagnostic Criteria: A Feasibility Study of the NQF Quality Data Model. Stud Health Technol Inform 216:1097
Jiang, Guoqian; Solbrig, Harold R; Prud'hommeaux, Eric et al. (2015) Quality Assurance of Cancer Study Common Data Elements Using A Post-Coordination Approach. AMIA Annu Symp Proc 2015:659-68
Jiang, Guoqian; Solbrig, Harold R; Kiefer, Richard et al. (2015) A Standards-based Semantic Metadata Repository to Support EHR-driven Phenotype Authoring and Execution. Stud Health Technol Inform 216:1098
Priya, Sambhawa; Jiang, Guoqian; Dasari, Surendra et al. (2015) A Semantic Web-based System for Mining Genetic Mutations in Cancer Clinical Trials. AMIA Jt Summits Transl Sci Proc 2015:142-6
Jiang, Guoqian; Liu, Hongfang; Solbrig, Harold R et al. (2015) Mining severe drug-drug interaction adverse events using Semantic Web technologies: a case study. BioData Min 8:12
Wang, Liwei; Jiang, Guoqian; Li, Dingcheng et al. (2014) Standardizing adverse drug event reporting data. J Biomed Semantics 5:36

Showing the most recent 10 out of 12 publications