The long-term goal of the proposed work is to develop and validate novel and effective computational tools for decomposing composite molecular signatures in microarray and imaging studies: independent component analysis (ICA) and iterative selection-normalization (ISN). The technology development is driven by the hypotheses that (1) accurate separation of the gene expression profiles of mixed cell populations (e.g., malignant/stromal cells), and/or (2) accurate normalization of gene expression profiles across multiple phenotypes, will improve the sensitivity and specificity of molecular signature measurement and of early detection and diagnosis of disease. ICA, a recently developed technique, is a promising computational tool for separating hidden sources from mixed signals in cases where many classical methods fail completely. ISN is the first attempt at a closed-loop approach that integrates cross-phenotype normalization with cluster analysis and gene selection, and is therefore intended to be robust to complex data variations. Currently, no method exists to accurately separate or normalize composite molecular signatures and, to the best of our knowledge, no one has attempted to use ICA or ISN for this purpose. As a result, related data processing is limited by the inability to quantitatively separate and normalize mixed cross-phenotype signals.
The R21 phase will focus on: (1) developing and testing the ICA-based blind source separation (BSS) method, using well-established cell-line microarray experiments, to extract the gene expression profile of targeted cells, defined by differentially expressed genes, from observed mixtures; (2) developing and testing the ISN-based cross-phenotype normalization (CPN) method to simultaneously identify constantly expressed genes and normalize gene expression profiles by linear regression.
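For orientation, blind source separation by ICA is usually posed with a linear mixing model. The notation below is an assumption made for illustration (the symbols $\mathbf{x}$, $\mathbf{A}$, $\mathbf{s}$ and $\mathbf{W}$ do not appear in the original text), and it is the standard textbook formulation rather than the specific ICA-BSS method proposed here:

```latex
\mathbf{x} = \mathbf{A}\,\mathbf{s},
\qquad
\hat{\mathbf{s}} = \mathbf{W}\,\mathbf{x},
\qquad
\mathbf{W} \approx \mathbf{A}^{-1}
```

Under this reading, $\mathbf{x}$ holds the observed mixed expression profiles, $\mathbf{s}$ the unknown profiles of the individual cell populations, and $\mathbf{A}$ the unknown mixing proportions. ICA estimates $\mathbf{W}$ so that the components of $\hat{\mathbf{s}}$ are as statistically independent as possible; the sources are recovered only up to permutation and scaling.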
The R33 phase will focus on: (1) applying and testing the performance of the ICA-BSS method, using gene expression datasets derived by microarray from multiple real biopsy specimens of solid tumors; (2) developing and testing a population-based ISN-CPN algorithm, using multi-phenotype samples, to simultaneously identify constantly expressed genes and normalize gene expression profiles by applicable nonlinear regressions; (3) applying and testing the ICA-BSS method, using in vivo molecular imaging datasets, to separate specific and nonspecific binding with improved signal-to-noise ratio.
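The closed-loop idea behind ISN-CPN with linear regression (the R21 variant) can be illustrated with a minimal sketch. It is not the proposed algorithm: the function names `fit_line` and `isn_normalize`, the residual threshold `k`, the median-based residual scale, and the toy data are all assumptions made for illustration, and the cluster-analysis step mentioned in the text is omitted.

```python
from statistics import mean, median


def fit_line(xs, ys):
    """Ordinary least squares fit of ys ~ intercept + slope * xs."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def isn_normalize(ref, other, k=3.0, max_iter=50):
    """Alternate gene selection and linear normalization until stable.

    ref, other: expression values of the same genes in two phenotypes.
    k: hypothetical cutoff, in units of the median absolute residual.
    Returns (normalized profile, indices of selected genes, (a, b)).
    """
    selected = list(range(len(ref)))  # start by trusting every gene
    a, b = 0.0, 1.0
    for _ in range(max_iter):
        # Normalization step: fit the line on the currently selected genes.
        a, b = fit_line([ref[i] for i in selected],
                        [other[i] for i in selected])
        # Selection step: keep genes that lie close to the fitted line.
        resid = [other[i] - (a + b * ref[i]) for i in range(len(ref))]
        scale = median(abs(resid[i]) for i in selected) or 1e-12
        new_selected = [i for i, r in enumerate(resid) if abs(r) <= k * scale]
        if len(new_selected) < 2 or new_selected == selected:
            break  # too few genes left to fit, or the selection is stable
        selected = new_selected
    # Map the second profile onto the scale of the reference profile.
    normalized = [(y - a) / b for y in other]
    return normalized, selected, (a, b)


# Toy data: 12 genes, two of which (indices 3 and 8) change with phenotype.
ref = [float(x) for x in range(1, 13)]
noise = [0.02, -0.01, 0.01, -0.02, 0.0, 0.01,
         -0.01, 0.02, -0.02, 0.01, 0.0, -0.01]
other = [2.0 + 1.5 * x + e for x, e in zip(ref, noise)]
other[3] += 6.0
other[8] -= 5.0

normalized, selected, (a, b) = isn_normalize(ref, other)
print(selected, round(a, 2), round(b, 2))
```

In this toy case the first fit is biased by the two differentially expressed genes, the selection step rejects them, and the refit on the remaining genes gives the normalization line. The population-based R33 variant would replace `fit_line` with a nonlinear regression.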