The Collective Intelligence, Knowledge Infrastructure, and High Performance Program, which operates within the High Performance Computing and Informatics Office (HPCIO), Division of Computational Bioscience of CIT, is collaborating with NIH investigators to build a critical mass in collective intelligence. The effort is envisioned to encompass a number of related disciplines in biomedical research, including semantic interoperability, knowledge engineering, computational linguistics, text and data mining, natural language processing, machine learning, and visualization. The program is intended to foster advances in critical domains at NIH, including biomedical and clinical informatics, translational research, proteomics research, genomics, systems biology, and portfolio analysis. In 2009, collaborations in support of these goals included the following.

- The human salivary protein catalog has been made available online on a community-based Web portal developed by HPCIO, in collaboration with NIDCR, to enable scientists to add their own research data, share results, and discover new knowledge. This is a major step towards the discovery and use of saliva biomarkers to diagnose oral and systemic diseases.
- HPCIO has developed a context-sensitive text-mining methodology for identifying High-Risk, High-Reward (HRHR) research from NIH Summary Statements. The method, which uses natural language processing to parse text and classify documents, has been successful in a retrospective analysis of summary statements from the most recent five years. This work is being conducted in collaboration with the Division of Program Coordination, Planning, and Strategic Initiatives (DPCPSI), NIH Office of the Director (OD).
- HPCIO is developing a corpus of annotated NIH medical records for use in developing methods of document de-identification. The goal is to create a gold standard against which de-identification algorithms for use at NIH can be measured. This work, in collaboration with the Clinical Center, will enhance the availability of medical records stored within the Biomedical Translational Research Information System.
- HPCIO is working with the caBIG Clinical Trial Management System Workspace (in collaboration with NCI) to develop a Protocol Lifecycle Tracking (PLT) tool. By providing real-time protocol status information on all relevant trials to clinicians and researchers, the tool allows bottlenecks and latencies in protocol management to be identified and corrected by those responsible for the conduct of a trial and for the overall success of a clinical trial program.
- As a component of the Molecular Libraries Roadmap initiative, the Common Assay Reporting System (CARS) allows investigators and program directors to track the status of assay projects and related information at each screening center within the Molecular Libraries Program Center Network (MLPCN). The system also provides a means of collecting, sharing, and retrieving bioassay information among the centers and the program office at NIH.
- HPCIO is collaborating with NCI to develop deep knowledge bases representing NCI's scientific portfolio. The effort will explore several representation paradigms, which store not only scientific concepts but also the relationships between them, and evaluate their effectiveness at tasks including document categorization, clustering, and visualization. A similar collaboration has recently been initiated with DPCPSI/OD.

Agency
National Institutes of Health (NIH)
Institute
Center for Information Technology (CIT)
Type
Scientific Computing Intramural Research (ZIH)
Project #
1ZIHCT000200-21
Application #
8149742
Study Section
Project Start
Project End
Budget Start
Budget End
Support Year
21
Fiscal Year
2010
Total Cost
$2,823,000
Indirect Cost
Name
Center for Information Technology
Department
Type
DUNS #
City
State
Country
Zip Code
Schmitz, Roland; Wright, George W; Huang, Da Wei et al. (2018) Genetics and Pathogenesis of Diffuse Large B-Cell Lymphoma. N Engl J Med 378:1396-1407
Martins, Andrew J; Narayanan, Manikandan; Prüstel, Thorsten et al. (2017) Environment Tunes Propagation of Cell-to-Cell Variation in the Human Macrophage Gene Network. Cell Syst 4:379-392.e12
Wilcox, Amber N; Silverman, Debra T; Friesen, Melissa C et al. (2016) Smoking status, usual adult occupation, and risk of recurrent urothelial bladder carcinoma: data from The Cancer Genome Atlas (TCGA) Project. Cancer Causes Control 27:1429-1435
Liang, Ma; Raley, Castle; Zheng, Xin et al. (2016) Distinguishing highly similar gene isoforms with a clustering-based bioinformatics analysis of PacBio single-molecule long reads. BioData Min 9:13
Lau, William W; Tsang, John S (2016) Humoral Fingerprinting of Immune Responses: 'Super-Resolution', High-Dimensional Serology. Trends Immunol 37:167-169
Lau, William W; Sparks, Rachel; OMiCC Jamboree Working Group et al. (2016) Meta-analysis of crowdsourced data compendia suggests pan-disease transcriptional signatures of autoimmunity. F1000Res 5:2884
Sparks, Rachel; Lau, William W; Tsang, John S (2016) Expanding the Immunology Toolbox: Embracing Public-Data Reuse and Crowdsourcing. Immunity 45:1191-1204
Russ, Daniel E; Ho, Kwan-Yuet; Colt, Joanne S et al. (2016) Computer-based coding of free-text job descriptions to efficiently identify occupations in epidemiological studies. Occup Environ Med 73:417-24
Maudsley, Stuart; Martin, Bronwen; Gesty-Palmer, Diane et al. (2015) Delineation of a conserved arrestin-biased signaling repertoire in vivo. Mol Pharmacol 87:706-17
Russ, Daniel E; Ho, Kwan-Yuet; Longo, Nancy S (2015) HTJoinSolver: Human immunoglobulin VDJ partitioning using approximate dynamic programming constrained by conserved motifs. BMC Bioinformatics 16:170

Showing the most recent 10 out of 14 publications