A patient's genetic variant must be contextualized against a population-based reference and a detailed phenotype to assess its pathogenicity and impact on prognosis, drawing on the care trajectories and outcomes of other patients with the variant or with similar variants of a particular gene. However, CTSA researchers do not have ready access to a definitive, representative reference dataset that links the genome to diagnosis, clinical progression, therapeutic response, and precision-adjusted laboratory reference ranges, with the appropriate consents to recontact patients if needed. In preliminary work, three of the leading children's hospitals in the CTSA program formed the Genomics Research and Innovation Network (GRIN), leveraging a combined, ethnically diverse population with unparalleled representation across the pediatric disease spectrum. GRIN sites broadly consent patients into compatible biobanking protocols. The next logical step is a truly federated, CTSA-wide biobanking initiative, with the informatics to support a Genomics Information Commons (GIC). With phenotype data produced as a byproduct of care, we develop the GIC technology, regulatory, and policy backbone, recognizing both the heterogeneity of IT systems across CTSA hospitals and the local control imperatives of a successful federated network. First, adhering to well-established common data models, each site exposes data to investigators through the secure PIC-SURE meta application programming interface (API), fostering incorporation of multiple heterogeneous clinical, omics, and environmental datasets. We demonstrate the self-scaling nature of the GIC as two additional CTSAs join in a modular fashion. Second, we develop two portals for researchers: (A) a Prep-to-research portal, in which investigators can execute genotype, phenotype, or combined genotype/phenotype queries and receive aggregate results in real time; and (B) a Study portal, in which, with proper approvals, patient-level data are readily transferred to a cloud-hosted environment with data science tools (Jupyter Notebooks, RStudio), SMART on FHIR apps and resources, and API access to external data sources (e.g., gnomAD, NHANES). Third, we develop a GIC toolkit with policies for broadly consented biobank enrollment, investigator access, material transfer, and collaboration, enabling new sites to participate and/or self-organize into collaboration networks. Finally, we leverage the GIC to build, and make publicly available, a knowledge resource of genetically adjusted, precision laboratory reference ranges across demographically diverse populations.

Public Health Relevance

We establish a CTSA-wide, federated Genomics Information Commons (GIC), underpinned by coordinated, local, broadly consented biobanking initiatives. The goal is rapid identification and analysis of representative cohorts, made possible by a federated IT infrastructure with advanced phenotype and genotype query capability that is available to all participating CTSA investigators. The GIC is self-scaling, so that CTSCs can modularly join, contribute, and benefit.

National Institutes of Health (NIH)
National Center for Advancing Translational Sciences (NCATS)
Research Project--Cooperative Agreements (U01)
Study Section
Special Emphasis Panel (ZTR1)
Program Officer
Gannot, Gallya
Boston Children's Hospital
United States