The main goal of the Data Organizing Core (DOC) of the Illuminating the Druggable Genome Knowledge Management Center (IDG KMC) is to evaluate, organize and rank all prospective disease-linked proteins for four protein superfamilies: G-protein-coupled receptors (GPCRs), nuclear receptors (NRs), ion channels (ICs) and kinases. As the main knowledge repository, the DOC will develop the "Target Central" Resource Database (TCRD) by combining data extracted from multiple sources, linking disease, pathway, protein, chemical, gene, bioactivity, drug discovery and clinical information elements from databases, literature, patents, drug labels and other documents. TCRD will serve as the central source for the IDG Query Platform, which is being developed by the KMC's User Interface Portal (UIP) core. The DOC will develop tools for algorithmic processing and prediction, which will improve disease-protein associations supported by human curation. Four External Target Panels will curate emerging associations and rank appropriate proteins. The DOC will stratify proteins into four classes (Tclin - clinical; Tchem - manipulated by chemicals; Tmacro - manipulated by macromolecules; and Tdark - the genomic "dark matter"), supported by tissue and cellular localization data for proteins (TTL) and diseases. Oprea at UNM will lead the DOC, supported by team leaders Brunak and Jensen (Center for Protein Research, Denmark), Overington (European Bioinformatics Institute) and Schurer (University of Miami).
Specific Aims: 1. Develop tools for the automated extraction and processing of data to be deposited into TCRD; 2. Develop tools for the semi-automated extraction of data on pathways, diseases and associated ontologies, which will support TTL stratification; 3. Develop tools for expert curation of literature and patent data, approved drug labels and clinical trials; 4. Develop analytics, modeling and visualization tools for disease-based target prioritization. Preliminary stratification (e.g., Tclin 22%, Tdark 30%) of disease-protein associations was performed for each protein superfamily using automated tools. Within 12 months, the TCRD-based IDG Query Platform will be operational, improving target prioritization for the research community at large and for the IDG Consortium as they explore the "dark matter" of GPCRs, NRs, ICs and kinases.

Public Health Relevance

The Data Organizing Core will combine unrelated informational elements from biology, chemistry and clinical sciences and distil them into knowledge that associates diseases with proteins, in order to rank proteins for druggability using facts, inferences and predictions. The results, captured in the Target Central repository, will help IDG Consortium members and other scientists focus on the less studied, dark area of the genome.

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Specialized Center--Cooperative Agreements (U54)
Study Section
Special Emphasis Panel (ZRG1-BST-M (50))
Program Officer
Zenklusen, Jean C
University of New Mexico Health Sciences Center
United States
Oprea, Tudor I; Bologa, Cristian G; Brunak, Søren et al. (2018) Unexplored therapeutic opportunities in the human genome. Nat Rev Drug Discov 17:377
Collins, Kyla A L; Stuhlmiller, Timothy J; Zawistowski, Jon S et al. (2018) Proteomic analysis defines kinase taxonomies specific for subtypes of breast cancer. Oncotarget 9:15480-15497
Oprea, Tudor I; Bologa, Cristian G; Brunak, Søren et al. (2018) Unexplored therapeutic opportunities in the human genome. Nat Rev Drug Discov 17:317-332
Sinha, Swati; Eisenhaber, Birgit; Jensen, Lars Juhl et al. (2018) Darkness in the Human Gene and Protein Function Space: Widely Modest or Absent Illumination by the Life Science Literature and the Trend for Fewer Protein Function Discoveries Since 2000. Proteomics 18:e1800093
Stathias, Vasileios; Jermakowicz, Anna M; Maloof, Marie E et al. (2018) Drug and disease signature integration identifies synergistic combinations in glioblastoma. Nat Commun 9:5315
Lin, Yu; Mehta, Saurabh; Küçük-McGinty, Hande et al. (2017) Drug target ontology to classify and integrate drug discovery data. J Biomed Semantics 8:50
Cannon, Daniel C; Yang, Jeremy J; Mathias, Stephen L et al. (2017) TIN-X: target importance and novelty explorer. Bioinformatics 33:2601-2603
Ursu, Oleg; Holmes, Jayme; Knockel, Jeffrey et al. (2017) DrugCentral: online drug compendium. Nucleic Acids Res 45:D932-D939
Nguyen, Dac-Trung; Mathias, Stephen; Bologa, Cristian et al. (2017) Pharos: Collating protein information to shed light on the druggable genome. Nucleic Acids Res 45:D995-D1002
Nelson, Stuart J; Oprea, Tudor I; Ursu, Oleg et al. (2017) Formalizing drug indications on the road to therapeutic intent. J Am Med Inform Assoc 24:1169-1172

Showing the most recent 10 out of 20 publications