The Division of Computational Bioscience (DCB) Scientific Computing Facility (SCF) collaborates with DCB staff and their NIH investigators to build and provide a diverse but robust scientific computing system. This system provides the infrastructure needed by the collaborations while complying with all security, confidentiality and integrity requirements, especially those concerned with storing PII and other sensitive data. Regular, smooth software and hardware refreshes allow collaborators to work uninterrupted. Expert staff provide centralised support and advice to enable the best use of resources in support of the Division's Intramural Research Program.
The Facility comprises low-availability equipment rooms containing a wide range of hardware for scientific computing, with approximately 200 continuously running servers that provide compute nodes and data storage. There are separate racks for machines used for storing and processing sensitive information, such as personal health information and grant information, with appropriate extra layers of protection. DCB also rents space from the Division of Computer System Services (DCSS) of the Center for Information Technology (CIT) for high-availability production systems, such as those supporting the Salivary Proteome Wiki. A common, federated environment across all computational systems enables DCB staff to quickly build the data analysis tools they need throughout the lifetime of a collaboration. Common file systems, unified authentication, system management, and other system-coordinating features are fully integrated into an infrastructure containing a variety of supporting software. Commodity hardware, free software and commercial software are integrated to provide efficient and sustainable solutions.
Staff perform the many common security, maintenance, backup and reporting tasks that are required for the day-to-day operation of scientific computational systems on behalf of DCB. In particular, staff manage the federally mandated Certification and Accreditation process for DCB's Computational Facility, which would otherwise be a significant burden to DCB developers and investigators as well as their collaborators. When a collaborative project is concluded, staff work with the NIH Institutes and Centers (ICs) to relocate systems to space within the ICs or the CIT Data Center as appropriate. Facility staff are also refining customised versions of various Linux distributions to comply with NIH computer configuration policies, including the policies on user account life-cycle, user account passwords, and Incident Response Team (IRT) scanning. This "Lab Linux" configuration leverages the existing IT infrastructure at NIH, thereby making it easy for NIH laboratory staff to use and maintain the system.
Many prominent investigator panels and science leaders have reported a lack of skilled professional computational experts to help with running similar resources across NIH. Laboratories tend to rely on the incidental expertise found among their often temporary staff. Given the large amount of data processing required for projects at NIH, the Facility provides the necessary expertise to DCB. Moreover, because of the success of DCB's own Facility, DCB is receiving an increasing number of requests from across the ICs to jump-start and support laboratory-based scientific computing infrastructures directly.
In 2014, supported collaborations include:
- The Human Salivary Protein Catalog, with NIDCR, supporting the Salivary Proteome community.
- Molecular Libraries Program (MLP), part of the NIH Common Fund, to develop the Common Assay Reporting System (CARS).
- Molecular Structure Determination (NIDDK, NIDCR, NCI, NHLBI).
- The Center for Molecular Modeling's MMIGNET programme supporting NIH as a whole.
- Microarray Database System (mAdb) with NCI and NIAID.
- The Genetic Association Database archive of human genetic association studies of complex diseases and disorders, in collaboration with NIA.
- Undiagnosed Diseases Program Portal (ORDR, NHGRI, CC), enabling experts from across the world to help NIH diagnose rare diseases.
- Portfolio analysis and portfolio visualization resources supporting many groups throughout NIH, including OD, NCI and NIGMS.
- A collaborative effort with NHLBI to design and execute a program that computes billions of SNP-gene expression association tests in order to find expression single nucleotide polymorphisms (eSNPs). So far this work is producing a 20-fold speedup over previous efforts.
- A joint project with NIDDK to apply random forest learning machines to identify and refine transcription start sites in the fruit fly genome, based on cap analysis gene expression (CAGE) data.
- A joint project with NCI and an NCI-funded consortium, implementing an analysis pipeline and providing the necessary bioinformatics tools for transcriptome analysis and biological interpretation of RNA-seq/Exon capture data from next generation sequencing.
- A joint project with NHLBI and other ICs to provide a nexus of computational system administration support and to set policies and standards for NIH.

Agency
National Institutes of Health (NIH)
Institute
Center for Information Technology (CIT)
Type
Scientific Computing Intramural Research (ZIH)
Project #
1ZIHCT000274-03
Application #
8941590
Support Year
3
Fiscal Year
2014
Name
Computer Research and Technology