We will address Key Capabilities (KC) #4 "Cloud-Agnostic Architecture and Frameworks", #6 "Research Ethics, Privacy, Security", and #7 "Indexing and Searching". A cloud-agnostic architecture (KC#4) that supports proper access to indexed data (KC#7), software, and workflows, and that complies with institutional and Commons-wide policies and best practices for privacy protection and security (KC#6), must be tested in a real, diverse compute environment that includes features such as a search engine, secure storage and compute, and access controls, and it must be implemented in more than one commercial cloud environment. Without such an implementation, it is difficult to assess the architecture's practicality and to test its features. Our three KCs are tightly connected, yet they are designed so that their components can be plugged into different systems. Our overall vision is that other awardees focusing on FAIR Guidelines/Metrics (KC#1) will need a cross-cloud architecture that allows them to implement interfaces to capture and report FAIR-ness statistics related to digital objects with Global Unique Identifiers (KC#2), to make them searchable via several digital object search engines according to the indexing strategies and tools we are proposing here (KC#7), and to make them available across cloud environments via the flexible architecture we are also proposing (KC#4). This architecture will accommodate various workspaces for computation (KC#5), developed by other specialized teams, and will rely on open standard APIs, also organized by other teams (KC#3). Not all data sets can be made available to everyone. TOPMed data sets, for example, are composed of data from various studies, each of which has its own consent forms and conditions for access (e.g., available only for genetic studies on cardiovascular disease).
Additionally, we expect researchers to bring their own data into the workspaces for computation, and these data may need to remain under access controls for additional reasons (e.g., intellectual property, publication embargoes). A system to manage individuals' permissions per dataset (consent, data use agreements), integrated with cloud providers' Identity and Access Management infrastructure, will be needed. We propose these capabilities (KC#6), which will be implemented on the overall cloud-agnostic architecture and framework (KC#4). Data will be searchable and deliverable in compliance with permissions encoded in metadata (KC#7), and the architecture will address the needs of various use cases developed by us and other teams (KC#8). Our team of investigators and developers from six institutions has complementary expertise and resources that will be leveraged to produce viable products in 180 days. Our investigators have ongoing collaborations with other applicants. Our proposed architecture, policies, and indexing/searching processes and tools will be developed in close consultation with awardees for KCs #1, 2, 3, 5, and 8 (e.g., Dr. Subramanian's team). Furthermore, should the proposed KCs overlap with others, such as Dr. Craven's, we will collaborate to ensure our projects are interoperable, non-duplicative, and leverage each other's strengths. We have a track record of participating in and leading consortia in different programs (e.g., PCORnet, All of Us℠ Research Program), and we have several research projects that can be leveraged so we can "hit the ground running."
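
The idea of making data searchable in compliance with permissions encoded in metadata (KC#6 and KC#7) can be illustrated with a minimal Python sketch. It is not the proposed implementation: the record layout, the field names (`permitted_uses`, `approved_uses`), the purpose labels, and the matching rule are all assumptions made for illustration, and a real system would delegate identity checks to each cloud provider's Identity and Access Management service.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DatasetRecord:
    """Index entry for one dataset; all field names are hypothetical."""
    dataset_id: str
    description: str
    # Research purposes allowed by the study's consent forms.
    # An empty set means the dataset carries no use restriction.
    permitted_uses: frozenset = field(default_factory=frozenset)


@dataclass(frozen=True)
class User:
    """A researcher and the purposes they are approved for (hypothetical)."""
    user_id: str
    # Purposes approved for this user, e.g. through a data use agreement.
    approved_uses: frozenset = field(default_factory=frozenset)


def can_access(user, record):
    """Return True if the record is unrestricted or shares a purpose with the user."""
    if not record.permitted_uses:
        return True
    return bool(record.permitted_uses & user.approved_uses)


def search(index, user, term):
    """Return IDs of records that match the term and that the user may access."""
    term = term.lower()
    return [
        r.dataset_id
        for r in index
        if term in r.description.lower() and can_access(user, r)
    ]


# Illustrative data only; these are not real datasets or users.
index = [
    DatasetRecord("ds-open", "Public reference genome annotations"),
    DatasetRecord("ds-cvd", "Whole genome sequences, heart cohort",
                  frozenset({"cardiovascular-genetics"})),
]
cardio = User("u1", frozenset({"cardiovascular-genetics"}))
other = User("u2", frozenset({"oncology"}))

print(search(index, cardio, "genome"))  # ['ds-open', 'ds-cvd']
print(search(index, other, "genome"))   # ['ds-open']
```

In this sketch the permission check is applied inside the search itself, so a restricted dataset never appears in results for a user who lacks a matching approved purpose. Whether restricted datasets should instead be discoverable but not deliverable is a policy choice the sketch does not settle.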

Agency: National Institutes of Health (NIH)
Institute: Office of The Director, National Institutes of Health (OD)
Project #: 3OT3OD025462-01S1
Application #: 9672007
Study Section: Data Coordination, Mapping, and Modeling (DCMM)
Program Officer: Kutkat, Lora
Project Start: 2017-09-30
Project End: 2018-11-30
Budget Start: 2017-09-30
Budget End: 2018-11-30
Support Year: 1
Fiscal Year: 2018
Total Cost:
Indirect Cost:
Name: University of California, San Diego
Department: Internal Medicine/Medicine
Type: Schools of Medicine
DUNS #: 804355790
City: La Jolla
State: CA
Country: United States
Zip Code: 92093