Cancer Data Standards Repository (caDSR)

CBIIT provides the cancer Data Standards Repository (caDSR) to support the development and deployment of Common Data Elements (CDEs), electronic case report forms, and data models in cancer research. The caDSR comprises a suite of tools for creating, sharing, and deploying CDEs, including a public CDE Browser that lets users search for data elements, create forms, and download CDEs. caCORE provides web services for accessing caDSR content programmatically. The caDSR, based on open source standards, is freely available for use by other government agencies and for download and use by interested parties.

The content of the repository consists of metadata (data about data), of which the Data Element, the smallest unit of data, is the pinnacle. The metadata registry is based on an international information technology standard, ISO/IEC 11179, which is used to register the descriptive information needed to render cancer research data reusable and interoperable. The metadata of a Data Element is broken down into conceptual entities called Administered Items, the Data Element itself being one type of Administered Item. Each type of Administered Item is reusable across one or many Data Elements and plays a critical role in defining the conceptual characteristics (semantics, defined by EVS) and representation (syntax, defined by caDSR) of data. The goal of the metadata repository is to support both human and computer interpretation and to provide computable interoperability across user domains. The caDSR is used by clinical and research applications sponsored by NCI and in the caBIG community.

The caDSR supports CBIIT's mission to enable advanced biomedical informatics computing based on common data elements and common semantics across both clinical and basic scientific research domains. The caDSR is part of the NCI caCORE Foundational Infrastructure, along with caBIO and EVS.

The caDSR toolset comprises:
UML Modeling Tool
CDE Browser
Administration Tool
Sentinel Tool and Sentinel API
Curation Tool
APIs for accessing caDSR content
Form Builder
CDE Comparison Matrix

Common Data Element (CDE) Development, Use and Harmonization

Common Data Elements (CDEs) provide the cornerstone for semantic interoperability of data. A CDE is a computer object created using a standardized set of syntax and semantics, which allows interoperability among various systems. These standardized characteristics are defined in the metadata describing the data element concept and a valid value domain for an appropriate response. All subcomponents of the data element, plus the data element itself, are reusable across one or many data elements, playing a critical role in defining the characteristics necessary for interoperability. Each CDE is created using metadata standards housed in the caDSR and terminology defined by EVS.

Training sessions are currently available to teach caDSR users and metadata consumers how CDEs fit into the caCORE Infrastructure and how to use the caDSR Tools to search for, retrieve, analyze, and curate CDEs.

A harmonization team of NCI stakeholders and contractor staff is developing a process for harmonization, curation, and governance of CDEs. The mission of CDE harmonization is to have a unique, well-formed CDE registered in the caDSR for each item of research data to be collected.

caFramework

caFramework is a caCORE application development framework that enables software developers and end users with minimal programming expertise to create web applications that leverage caCORE components, including caCORE SDK, EVS, and caDSR. These applications will retrieve stored information models and data standards and leverage caDSR metadata to collect, validate, and persist data in back-end repositories. Applications developed using caFramework will also include application-level and system-level security and create semantically interoperable data.
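
As a rough illustration of what "leveraging caDSR metadata to validate data" could look like, the sketch below checks submitted form values against the value domain of a data element. It is not caFramework or caDSR code: the class names, the validate function, and the sample "gender" element with its permissible values are all hypothetical and chosen only to mirror the concept-plus-value-domain structure of a CDE described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValueDomain:
    """Hypothetical stand-in for a CDE's value domain (the allowed responses)."""
    name: str
    permissible_values: frozenset


@dataclass(frozen=True)
class DataElement:
    """Hypothetical stand-in for a CDE: a concept paired with a value domain."""
    concept: str
    value_domain: ValueDomain


def validate(form_data, elements):
    """Return a dict mapping each invalid field to an error message."""
    errors = {}
    for field, element in elements.items():
        value = form_data.get(field)
        if value is None:
            errors[field] = "missing value"
        elif value not in element.value_domain.permissible_values:
            errors[field] = f"{value!r} is not a permissible value"
    return errors


# Illustrative data only; not taken from the caDSR.
GENDER = DataElement(
    concept="Person Gender",
    value_domain=ValueDomain("Gender Code", frozenset({"Female", "Male", "Unknown"})),
)
form_elements = {"gender": GENDER}

print(validate({"gender": "Female"}, form_elements))  # no errors expected
print(validate({"gender": "F"}, form_elements))       # "F" is outside the value domain
```

Because the element definitions are plain data, the same validate function could in principle serve any form whose fields are bound to registered data elements, which is the kind of reuse the CDE approach aims for.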

The end users of this toolkit are both software engineers and domain or subject matter experts with a basic knowledge of the caCORE infrastructure, including an understanding of domain models, the caDSR, and EVS. These users will be able to generate caBIG Silver-level compatible data services that are caGRID enabled.

The project has two objectives:

Design, deliver, and support a rapid application development (RAD) tool for forms rendering and data collection. The RAD tool will incorporate existing caCORE technologies and components, including caDSR tools, caDSR metadata, EVS, CSM, HL7 SDK, caAdaptor, caGRID, and caCORE SDK.
Perform knowledge acquisition and use case analysis for additional modules of caFramework to wrap around caCORE SDK that will further enhance the adoption and deployment of caCORE-like systems.

caFramework is part of the caCORE infrastructure and of the formal caCORE product releases. It forms the primary face of the caCORE infrastructure to end users and enables systems created by independent sites and researchers to be interoperable.

caCORE Software Development Kit (SDK)

The caCORE Software Development Kit is a set of open source software tools, standards, and documentation. These tools help intermediate-level Java programmers use or extend the capabilities of caCORE to create caCORE-like systems. A caCORE-like system is one designed with a Model Driven Architecture, an n-tier architecture, controlled vocabularies, and registered metadata. Such a system is semantically integrated in that all exposed API elements have runtime-accessible metadata that defines the meaning of the elements using controlled terminology. The caCORE SDK consists of the following tools to implement this formal software engineering methodology:

Semantic Integration Workbench: The Semantic Integration Workbench (SIW) supplies a set of user interfaces to assist in creating a semantically interoperable system.
The first step is to take a UML model and search for concepts in the NCI Thesaurus that map to its classes and attributes. A set of mapped concepts is returned as a report that is reviewed by experts until each attribute is described by a set of codes that refer to terms in a controlled vocabulary. The SIW then allows the developer to review and annotate the UML model with the concept codes and controlled vocabulary terms for later loading into a metadata repository.
Code Generator: The code generator takes the UML model and creates a functional Java software system using Java JET technology. This software system runs in an appropriate web services container such as Apache Tomcat.
UML Loader: The UML loader takes the annotated XMI file and loads the system's metadata into the caDSR. The UML loader is not being distributed with this version of the SDK; however, CBIIT will load annotated XMI files for users who have created systems via the caCORE SDK.

The caCORE SDK is designed to allow a moderately experienced Java programmer to create a Silver-compatible software system.
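
The concept-mapping step attributed to the Semantic Integration Workbench above can be pictured with a small sketch. This is not the SIW's actual algorithm, which the text does not describe. It assumes a simple approach: split each UML class or attribute name into words, look each word up in a vocabulary, and report the words that still need expert review. The vocabulary, its codes (placeholders, not NCI Thesaurus codes), the sample model, and all function names are hypothetical.

```python
import re

# Made-up vocabulary: these codes are placeholders, not NCI Thesaurus codes.
VOCABULARY = {
    "gene": "X0001",
    "name": "X0002",
    "chromosome": "X0003",
}


def split_identifier(identifier):
    """Split a camelCase or PascalCase UML name into lowercase words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", identifier)
    return [part.lower() for part in parts]


def map_names(names, vocabulary):
    """Return a report mapping each name to a list of (word, code or None) pairs."""
    return {
        name: [(word, vocabulary.get(word)) for word in split_identifier(name)]
        for name in names
    }


def map_model(model, vocabulary):
    """Map every class and every 'Class.attribute' name in a model to candidate codes."""
    report = map_names(model.keys(), vocabulary)
    for class_name, attributes in model.items():
        for attribute, pairs in map_names(attributes, vocabulary).items():
            report[f"{class_name}.{attribute}"] = pairs
    return report


def unmapped(report):
    """List the words with no code, i.e. those that still need expert review."""
    return sorted({word for pairs in report.values() for word, code in pairs if code is None})


# Illustrative model: one class with two attributes.
model = {"Gene": ["geneName", "chromosomeLocation"]}
report = map_model(model, VOCABULARY)

for name, pairs in report.items():
    print(name, pairs)
print("needs review:", unmapped(report))
```

In this toy run, "location" has no entry in the vocabulary, so it is the one word left for review. That mirrors the iterative review described above, which continues until every attribute is described by a set of codes.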