Gene expression measurement using cDNA and oligonucleotide arrays is exploding in popularity, yet many technical problems continue to face users. One of the more fascinating problems results from the large, sometimes overwhelming volume of data generated in these experiments. Image capture, processing, interpretation and quantification remain important fundamental issues. Quality control and statistical design of experiments must be adequately addressed if fruitful results are to be obtained. Numerous statistical, image processing and bioinformatics problems confront users of these technologies. Because arrays can be constructed to contain thousands of spots, automated analysis of the resulting images is required. The technology itself must be improved to couple it with new tissue sampling technologies such as laser capture microdissection (LCM). Accordingly, this project seeks to address problems in this area at the statistical, numerical, computational, and informatics levels.
Progress in FY2000: Working with laboratories in NCI, NICHD, NIA, NIDDK, NINDS and NIDCR, we have developed application software for the analysis of array images from major commercial sources as well as from custom arrays. The program PSCAN was developed to facilitate the image-processing steps of the analysis and produces optimal estimates of spot intensities. The program is written in MATLAB; the code is being made publicly available, and a Web distribution site has been established. Numerous improvements to the image processing steps have been achieved, including improved spot detection, location and quantification algorithms; improved user interfaces; linkage with web-based information; and improved data storage formats. Our analysis method relies on a number of data visualization tools and allows users to identify significantly over- or under-expressed genes in a comparative study.
Importantly, these techniques also allow users to identify experimental artifacts, outliers and other data anomalies that are present in a large percentage of hybridization studies, such as non-constant background hybridization, image defects, dropouts, printing artifacts and spot bleeds. We have generalized the program into the program F-SCAN for the analysis of two-label arrays using fluorescent labels such as Cy3 and Cy5. New algorithms for spot detection, shape determination, and robust methods for signal and background estimation have been developed and extensively tested. These algorithms compare favorably with algorithms used in leading commercial software, and are being trained to reject common artifacts in fluorescently labeled images. In one collaboration with NIA, our methods were applied to early screening studies using commercial arrays; clones containing interesting genes were then selected, and custom arrays were manufactured and used in a second series of studies. A manuscript describing this work is in preparation. We have also developed a method for mapping over- and under-expressed genes onto each gene's location within the human genome. We are now investigating commercially available data mining and visualization software applicable to gene expression studies. Despite the current high cost of most such products, they may become suitable for use at NIH under an enterprise-wide cost-sharing mechanism, and may speed discovery of gene function using large-scale gene expression studies coupled with newly available human genome sequence data.
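
As a rough illustration of what robust signal and background estimation on a two-label array can look like, the Python sketch below computes a background-corrected spot intensity per channel and a log2 Cy5/Cy3 ratio. It is not the PSCAN or F-SCAN algorithm, whose details are not given here. The function names, the choice of the median as the robust estimator, the intensity floor of 1.0 and the two-fold cutoff are all assumptions made for illustration.

```python
import math
from statistics import median


def robust_spot_intensity(spot_pixels, background_pixels):
    """Median spot signal minus median local background, floored at zero.

    The median is an assumed stand-in for a robust estimator: a few
    saturated or defective pixels barely move it.
    """
    return max(median(spot_pixels) - median(background_pixels), 0.0)


def log2_ratio(cy5, cy3, floor=1.0):
    """log2(Cy5/Cy3); both channels are floored (assumed value) to avoid log(0)."""
    return math.log2(max(cy5, floor) / max(cy3, floor))


def classify(ratio, cutoff=1.0):
    """Label a gene by a hypothetical two-fold (|log2 ratio| >= 1) cutoff."""
    if ratio >= cutoff:
        return "over"
    if ratio <= -cutoff:
        return "under"
    return "unchanged"


# Made-up pixel values for one spot; 4000 mimics a bright image defect.
cy5_signal = robust_spot_intensity([820, 790, 805, 4000, 810],
                                   [100, 110, 105, 95, 100])
cy3_signal = robust_spot_intensity([270, 280, 275, 265, 285],
                                   [100, 98, 102, 100, 97])
label = classify(log2_ratio(cy5_signal, cy3_signal))
```

In this sketch the single defective pixel does not shift the median-based Cy5 estimate, which is the property a robust estimator is meant to provide. A fixed fold-change cutoff is only the simplest possible rule and says nothing about statistical significance.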

Agency: National Institutes of Health (NIH)
Institute: Center for Information Technology (CIT)
Type: Intramural Research (Z01)
Project #: 1Z01CT000266-03
Application #: 6431909
Study Section: (MSCL)
Project Start:
Project End:
Budget Start:
Budget End:
Support Year: 3
Fiscal Year: 2000
Total Cost:
Indirect Cost:
Name: Computer Research and Technology
Department:
Type:
DUNS #:
City:
State:
Country: United States
Zip Code:
Deans, Katherine J; Minneci, Peter C; Chen, Hao et al. (2009) Impact of animal strain on gene expression in a rat model of acute cardiac rejection. BMC Genomics 10:280
Raghavachari, Nalini; Xu, Xiuli; Munson, Peter J et al. (2009) Characterization of whole blood gene expression profiles as a sequel to globin mRNA reduction in patients with sickle cell disease. PLoS One 4:e6484
Greenwell-Wild, Teresa; Vazquez, Nancy; Jin, Wenwen et al. (2009) Interleukin-27 inhibition of HIV-1 involves an intermediate induction of type I interferon. Blood 114:1864-74
Nares, Salvador; Moutsopoulos, Niki M; Angelov, Nikola et al. (2009) Rapid myeloid cell transcriptional and proteomic responses to periodontopathogenic Porphyromonas gingivalis. Am J Pathol 174:1400-14
Raat, Nicolaas J H; Noguchi, Audrey C; Liu, Virginia B et al. (2009) Dietary nitrate and nitrite modulate blood and organ nitrite and the cellular ischemic stress response. Free Radic Biol Med 47:510-7
Woszczek, Grzegorz; Chen, Li-Yuan; Nagineni, Sahrudaya et al. (2008) Leukotriene D(4) induces gene expression in human monocytes through cysteinyl leukotriene type I receptor. J Allergy Clin Immunol 121:215-221.e1
Hernandez-Novoa, Beatriz; Bishop, Lisa; Logun, Carolea et al. (2008) Immune responses to Pneumocystis murina are robust in healthy mice but largely absent in CD40 ligand-deficient mice. J Leukoc Biol 84:420-30
Coppey, Mathieu; Boettiger, Alistair N; Berezhkovskii, Alexander M et al. (2008) Nuclear trapping shapes the terminal gradient in the Drosophila embryo. Curr Biol 18:915-9
Raghavachari, Nalini; Xu, Xiuli; Harris, Amy et al. (2007) Amplified expression profiling of platelet transcriptome reveals changes in arginine metabolic pathways in patients with sickle cell disease. Circulation 115:1551-62
Elshal, Mohamed F; Khan, Sameena S; Raghavachari, Nalini et al. (2007) A unique population of effector memory lymphocytes identified by CD146 having a distinct immunophenotypic and genomic profile. BMC Immunol 8:29
