A crucial component of the recent major advances in genomic research has been the uniting of advances in biology with those in computing, informatics and networking. As technologies have advanced to allow high-throughput, genomics-scale data collection, the technological burden has shifted increasingly to analysis and informatics. This project was established to ensure that the necessary computational tools and resources are available to the NIH community. An integrated system for the storage, management, analysis and viewing of microArray data has been developed to support the NCI Advanced Technology Center microArray facility. The mAdb (microArray database) system provides secure data management for gathering, storing, and managing experimental information and expression array data. A variety of web-accessible tools have been implemented to support the multiple analytical approaches needed to decipher array data in a more meaningful way. Important to the mAdb system design is compatibility with any platform (Unix, Windows or Macintosh) capable of running an Internet browser. We have taken an evolutionary development approach to designing and implementing the mAdb system, which provides for continuous evaluation, improvement, flexibility and quick turnaround. In addition, tools that mine UniGene for tissue-specific gene sets and that allow comparison of various microArray gene sets have been made available to the community. A natural extension of mAdb has been the inclusion of additional data resources. These include supporting information from various data sources (e.g. Gene Ontology, GenBank, Entrez Gene, UniGene, BioSystems Pathways, Biocarta Pathways, COSMIC, 1000 Genomes and GeneCards) to enable drilling down into the rapidly expanding biological knowledgebase.
To ensure effective use of the informational resource developed to support microArray analysis, ongoing user training and support are provided through CIT facilities for this collaborative effort. While development of new and improved analysis tools continues, the mAdb system is in routine service, supporting over 1500 NIH researchers and collaborators and containing over 100,000 microArray experiments. A critical design element of the mAdb system was scalability, to allow expansion to support other ICDs. The design allows us to support separate web servers serving different user communities from a single code base. The mAdb system has been set up on separate web servers to support users of the NIAID microArray core facility and the Lymphoma Leukemia Molecular Profiling Project (LLMPP)/Strategic Partnering to Evaluate Cancer Signature (SPECS) consortium. The LLMPP/SPECS project is using microArrays and other high-throughput whole-genome technologies to define the molecular profiles of all types of human lymphoid malignancies. One primary goal of this project is to redefine the classification of human lymphoid malignancies in molecular terms. A second major goal is to define molecular correlates of clinical parameters that can be used in prognosis and in the selection of appropriate therapy for these patients. As members of the international LLMPP/SPECS consortium, we provide the informatics development and support critical to the success of this project. A database and tools have been implemented to facilitate integrating and analyzing clinical parameters with genomic/genetic data from high-throughput technologies. Data for 3,000 clinical cases have been uploaded into the system. An analysis pipeline was developed and implemented for processing next-generation sequencing data generated from RNAseq libraries.
Variation results derived from more than 500 lymphoma, pancreatic and prostate RNA samples have been stored in a database, classified and integrated with relevant external annotations.

Support Year: 18
Fiscal Year: 2013
Total Cost: $936,720
Name: Center for Information Technology
Xiao, Wenming; Tran, Bao; Staudt, Louis M et al. (2013) High-throughput RNA sequencing in B-cell lymphomas. Methods Mol Biol 971:295-312
Lenz, G; Wright, G; Dave, S S et al. (2008) Stromal gene signatures in large-B-cell lymphomas. N Engl J Med 359:2313-23