When the microarray was first invented a decade ago, it was hailed as "an array of hope" in Nature Genetics and received considerable attention in biomedicine. It has since been called "an array of problems" in Nature Review. An inherent problem with microarray gene expression data is that structural information is missing, which limits its value for biological discovery. Overcoming the poor reproducibility and accuracy of microarray imaging requires a fundamental shift toward paradigms that incorporate complementary, multiscale structural imaging information into microarray imaging. Fortunately, recent progress in high-resolution biomolecular imaging probe development, coupled with advanced image analysis, makes integrative and systematic studies of cellular systems possible. A cell can be labeled using multiscale and multimodality imaging, providing both structural and functional information. With multiscale imaging spreadsheets now available, there is an overwhelming need within the life sciences community to manage this information effectively, to analyze it comprehensively, and to apply the resulting knowledge to understanding the genetic system of a cell. However, the management and mining of this large-scale imaging information are limited by today's computational approaches and knowledge-sharing infrastructure. These problems represent a major impediment to progress in the emerging area of biomolecular image informatics. Therefore, the goal of this project is to develop a unique genomic image management and mining system that allows geneticists to search, correlate, and integrate this multiscale, multimodality imaging information in an easily operable fashion and thereby enables new biological discovery.
In particular, this system will fill a void left by current image database systems such as the Open Microscopy Environment (OME), notably the lack of analytic tools for integrative data analysis. To realize this goal, we are bringing together a strong interdisciplinary team of imaging engineers, geneticists, and industrial imaging scientists. Building on our diverse and complementary expertise, we can provide innovative, interdisciplinary approaches that combine recent progress in image processing, imaging database design, and machine learning with the development of high-resolution, high-throughput molecular imaging probes in genomics. We will pursue the following specific aims. First, we will develop a suite of algorithms for content extraction and information retrieval from high-resolution fluorescence in situ hybridization (FISH) images. This visual system will effectively manage imaging phenotype information and facilitate knowledge discovery, such as identifying visually similar subtypes. Second, we will correlate quantitative traits extracted from FISH imaging with genomic structural rearrangements and gene expression patterns. Finally, we will develop a data integration approach to fuse disparate information from multimodality imaging databases for improved characterization of biological systems.

Public Health Relevance

The anticipated outcome of the project includes a publicly accessible imaging database analysis system that facilitates multiscale genomic image information integration and knowledge mining. The proposed approach challenges the current paradigm by integrating high-resolution structural imaging with functional information, which promises to overcome the poor accuracy and reproducibility that plague microarray imaging. Given the ubiquitous use of microarray imaging in biomedicine, the project is expected to have a great impact on the biomedical community.

National Institutes of Health (NIH)
National Library of Medicine (NLM)
Exploratory/Developmental Grants (R21)
Study Section: Biomedical Library and Informatics Review Committee (BLR)
Program Officer: Ye, Jane
Tulane University
Biomedical Engineering
Schools of Arts and Sciences
New Orleans
United States
Lin, Dongdong; Calhoun, Vince D; Wang, Yu-Ping (2014) Correspondence between fMRI and SNP data by group sparse canonical correlation analysis. Med Image Anal 18:891-902
Duan, Junbo; Deng, Hong-Wen; Wang, Yu-Ping (2014) Common copy number variation detection from multiple sequenced samples. IEEE Trans Biomed Eng 61:928-37
Zhang, Lei; Pei, Yu-Fang; Fu, Xiaoying et al. (2014) FISH: fast and accurate diploid genotype imputation via segmental hidden Markov model. Bioinformatics 30:3142
Pei, Yu-Fang; Zhang, Lei; Papasian, Christopher J et al. (2014) On individual genome-wide association studies and their meta-analysis. Hum Genet 133:265-79
Cao, Hongbao; Duan, Junbo; Lin, Dongdong et al. (2014) Sparse representation based biomarker selection for schizophrenia with integrated analysis of fMRI and SNPs. Neuroimage 102 Pt 1:220-8
Duan, Junbo; Zhang, Ji-Gang; Deng, Hong-Wen et al. (2013) CNV-TV: a robust method to discover copy number variation from short sequencing reads. BMC Bioinformatics 14:150
Lin, Dongdong; Zhang, Jigang; Li, Jingyao et al. (2013) Group sparse canonical correlation analysis for genomic data integration. BMC Bioinformatics 14:245
Cao, Hongbao; Duan, Junbo; Lin, Dongdong et al. (2013) Integrating fMRI and SNP data for biomarker identification for schizophrenia with a sparse representation based variable selection method. BMC Med Genomics 6 Suppl 3:S2
Cao, Hongbao; Lei, Shufeng; Deng, Hong-Wen et al. (2012) Identification of genes for complex diseases using integrated analysis of multiple types of genomic data. PLoS One 7:e42755
Tang, Wenlong; Cao, Hongbao; Zhang, Ji-Gang et al. (2012) Subtyping of Glioma by Combining Gene Expression and CNVs Data Based on a Compressive Sensing Approach. Adv Genet Eng 1:101