A crucial component of the recent major advances in genomic research has been the uniting of advances in biology with those in computing, informatics and networking. As technologies have advanced to allow high-throughput, genomics-scale data collection, the technological burden has shifted increasingly to analysis and informatics. This project was established to ensure that the necessary computational tools and resources are available to the NIH community. An integrated system for the storage, management, analysis and viewing of microArray data has been developed to support the NCI Advanced Technology Center microArray facility. The mAdb (microArray database) system provides secure data management for gathering, storing, and managing experimental information and expression array data. A variety of web-accessible tools have been implemented to support the multiple analytical approaches needed to interpret array data in a more meaningful way. An important element of the mAdb system design is compatibility with any platform (Unix, Windows or Macintosh) capable of running an Internet browser. We have taken an evolutionary developmental approach to designing and implementing the mAdb system, which provides for continuous evaluation, improvement, flexibility and quick turnaround. In addition, tools that mine UniGene for tissue-specific gene sets and tools that allow comparison of various microArray gene sets have been made available to the community. A natural extension of mAdb has been the inclusion of additional data resources. These include supporting information from various data sources (e.g. Gene Ontology, GenBank, Entrez Gene, UniGene, BioSystems Pathways, Biocarta Pathways, COSMIC, 1000 Genomes and GeneCards) to enable drilling down into the rapidly expanding biological knowledgebase.
To make effective use of the informational resource developed to support microArray analysis, ongoing user training and support are provided through CIT facilities for this collaborative effort. While development of new and improved analysis tools continues, the mAdb system is in routine service, supporting over 1500 NIH researchers and collaborators and containing over 100,000 microArray experiments. A critical design element of the mAdb system was scalability, allowing expansion to support other ICDs. The design allows us to support separate web servers serving different user communities from a single code base. The mAdb system has been set up on separate web servers to support users of the NIAID microArray core facility and the Lymphoma Leukemia Molecular Profiling Project (LLMPP)/Strategic Partnering to Evaluate Cancer Signature (SPECS) consortium. The LLMPP/SPECS project is using microArrays and other high-throughput whole-genome technologies to define the molecular profiles of all types of human lymphoid malignancies. One primary goal of this project is to redefine the classification of human lymphoid malignancies in molecular terms. A second major goal is to define molecular correlates of clinical parameters that can be used in prognosis and in the selection of appropriate therapy for these patients. As members of the international LLMPP/SPECS consortium, we provide the informatics development and support critical to the success of this project. A database and tools have been implemented to facilitate integrating and analyzing clinical parameters with genomic/genetic data from high-throughput technologies. Data for 3,000 clinical cases have been uploaded into the system. An analysis pipeline was developed and implemented for processing next-generation sequence data generated from RNAseq libraries.
Variation results derived from more than 500 lymphoma, pancreatic and prostate RNA samples have been stored in a database, classified and integrated with relevant external annotations.

Center for Information Technology
Hoskins, Jason W; Jia, Jinping; Flandez, Marta et al. (2014) Transcriptome analysis of pancreatic cancer reveals a tumor suppressor function for HNF1A. Carcinogenesis 35:2670-8
Jia, Jinping; Parikh, Hemang; Xiao, Wenming et al. (2013) An integrated transcriptome and epigenome analysis identifies a novel candidate gene for pancreatic cancer. BMC Med Genomics 6:33
Xiao, Wenming; Tran, Bao; Staudt, Louis M et al. (2013) High-throughput RNA sequencing in B-cell lymphomas. Methods Mol Biol 971:295-312
Schmitz, Roland; Young, Ryan M; Ceribelli, Michele et al. (2012) Burkitt lymphoma pathogenesis and therapeutic targets from structural and functional genomics. Nature 490:116-20
Yang, Yibin; Shaffer 3rd, Arthur L; Emre, N C Tolga et al. (2012) Exploiting synthetic lethality for the therapy of ABC diffuse large B cell lymphoma. Cancer Cell 21:723-37
Snow, Andrew L; Xiao, Wenming; Stinson, Jeffrey R et al. (2012) Congenital B cell lymphocytosis explained by novel germline CARD11 mutations. J Exp Med 209:2247-61
Ngo, Vu N; Young, Ryan M; Schmitz, Roland et al. (2011) Oncogenically active MYD88 mutations in human lymphoma. Nature 470:115-9
Hartmann, Elena M; Campo, Elias; Wright, George et al. (2010) Pathway discovery in mantle cell lymphoma by integrated analysis of high-resolution gene expression and copy number profiling. Blood 116:953-61
Davis, R Eric; Ngo, Vu N; Lenz, Georg et al. (2010) Chronic active B-cell-receptor signalling in diffuse large B-cell lymphoma. Nature 463:88-92
Rui, Lixin; Emre, N C Tolga; Kruhlak, Michael J et al. (2010) Cooperative epigenetic modulation by cancer amplicon genes. Cancer Cell 18:590-605