In recent years, we have focused on the OME analysis system and on developing robust, general image-analysis methodology, culminating in our pattern-recognition tool, WND-CHARM. We have validated this pattern-recognition approach to biological image analysis using diverse imaging modalities, ranging from fluorescence microscopy to X-rays of human knees. We have also validated a range of applications, from scoring image-based assays to diagnosis of disease to prediction of future disease risk. The specific applications of this approach are covered in reports AG000674-07 and AG000685-04. A major effort in the previous year has been to expand the functionality of WND-CHARM into quantitative image-comparison assays as well as spatial pattern analysis. A major effort was also undertaken to rewrite the WND-CHARM code base to make it more modular, better organized and easier to use.

WND-CHARM is a generalized pattern-recognition algorithm that can be used to analyze any type of image. Unlike most approaches to image processing in current use, this method relies on training a machine classifier to automatically recognize differences between training image classes (i.e. controls), rather than relying on an a-priori model of what is being imaged. This approach has been demonstrated to be effective at discerning differences even when they cannot be easily perceived manually.

The output of a trained machine classifier is qualitative: for a given test image, it reports the class of training images that the test image is most similar to. In a scientific setting, it is often not sufficient to know which class an image belongs to; one also needs to know how similar it is to each of the training classes. An example is a quantitative imaging assay where the set of training image classes comprises a standard curve, and the classifier's task is to arrive at a continuous score by interpolating between the defined classes. This type of classification can be called an "ordered-class problem".
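As a rough illustration of the ordered-class idea, the sketch below interpolates a continuous score as a similarity-weighted average of the numeric class values. This is only one plausible scheme under stated assumptions, not necessarily the calculation that wndchrm performs; the function name `interpolate_score` and the example values are hypothetical.

```python
from typing import Sequence


def interpolate_score(class_values: Sequence[float],
                      similarities: Sequence[float]) -> float:
    """Return a continuous score for one test image.

    class_values: numeric interpretation of each training class name
                  (for example, the points of a standard curve).
    similarities: non-negative similarity of the test image to each class
                  (hypothetical classifier output, in the same order).
    """
    if len(class_values) != len(similarities):
        raise ValueError("class_values and similarities must have equal length")
    total = sum(similarities)
    if total <= 0:
        raise ValueError("similarities must sum to a positive value")
    # Weighted average: classes the image resembles more pull the score harder.
    weighted = sum(v * s for v, s in zip(class_values, similarities))
    return weighted / total


# A test image that mostly resembles the middle class of a 0/10/20 standard curve.
print(interpolate_score([0, 10, 20], [0.25, 0.5, 0.25]))  # 10.0
```

With this scheme an image that is equally similar to two adjacent classes receives a score halfway between them, which is the behavior a standard-curve assay needs.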
There also exists a set of problems where the classes do not have an inherent order and, instead of an interpolated continuous score, the desired output is a measure of the similarity between classes. A familiar visualization of classes that have varying degrees of similarity to each other is a dendrogram, for example a phylogenetic tree representing evolutionary distance. This type of classification can be called a "class-similarity problem". The current implementation of WND-CHARM addresses both of these quantitative imaging problems automatically. In addition to reporting the qualitative class assignment, it reports a continuous value if the class names can be interpreted numerically, and it computes pair-wise similarities between all of the classes. If a dendrogram visualization package (PHYLIP) is installed on the system, it automatically generates a dendrogram based on the pair-wise class-distance matrix. This type of visualization has proven useful as an independent validation for ordered-class problems, since a well-ordered set of classes will produce a linear or elongated dendrogram without major branch points.

The program that implements the WND-CHARM algorithm, called wndchrm, has been made publicly available on Google Code (http://wnd-charm.googlecode.com/). A major release of the code (version 1.31), covering the areas discussed above, has also been made available on the project's site. This version represents a first pass at reorganizing the code base, making it more self-consistent and reliable without major architectural changes. It also represents a substantial effort in validation, testing and resolution of bugs. The site provides an interface for reporting bugs and requesting new features, and we have made extensive use of this facility within our own group. Although the site has only been active since February 2011, it is visited an average of 30 times per week, and the software has been downloaded over 100 times.
The visits represent 59 countries, though nearly half of the visits are from the US.

Whole-image analysis has proven very useful, but it is not always possible to compare whole images to each other. Examples of relatively homogeneous images, for which whole-image comparison is appropriate, are those of cultured cells, or of tissues like muscle, liver, and certain types of tumors. Our work on human knee X-rays was the first application where a certain degree of pre-processing was necessary to make images of different subjects comparable to each other. In this case, we simply found the center of the knee joint in each image and extracted a region of fixed radius around this center for all patients. A much more complicated alignment problem exists in images with complex anatomy. Possibly the most extreme example of this is stained sections of brain tissue. A solution to the alignment problem would allow the use of generalized pattern recognition to address morphological differences in an anatomical context. For example, what areas of the brain correlate with cognitive decline or age? What is the degree of overlap between these areas?

Spatially-resolved pattern analysis places an extreme burden on the performance of our software. Instead of an entire image being considered at once, or split into a small number of tiles on a grid, each image must be sampled thousands or millions of times to achieve spatial resolution. To make this type of application practical, the computational strategy used in the software must be reconsidered. Previously, all of the 3,000 low-level image features were calculated for each image sample, even though most of them were later found to be irrelevant to the classification problem because they lacked discrimination power. The major change in strategy to enable spatially-resolved pattern recognition is to eliminate unnecessary calculations. This requires an on-demand computing strategy for image features, which is a major architectural goal of the wndchrm rewrite.
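The on-demand strategy can be sketched as a small lazy-evaluation wrapper: a feature is computed only when a classifier asks for it, and the result is cached for reuse. This is a minimal sketch under stated assumptions, not the wndchrm architecture; the class `LazyFeatures`, the registry `EXTRACTORS`, and the three toy features are hypothetical stand-ins for the real low-level features.

```python
from typing import Callable, Dict, List, Sequence

Pixels = Sequence[Sequence[float]]
Extractor = Callable[[Pixels], float]


def mean_intensity(pixels: Pixels) -> float:
    values = [v for row in pixels for v in row]
    return sum(values) / len(values)


def max_intensity(pixels: Pixels) -> float:
    return max(v for row in pixels for v in row)


def intensity_range(pixels: Pixels) -> float:
    values = [v for row in pixels for v in row]
    return max(values) - min(values)


# Hypothetical registry standing in for the full set of low-level features.
EXTRACTORS: Dict[str, Extractor] = {
    "mean_intensity": mean_intensity,
    "max_intensity": max_intensity,
    "intensity_range": intensity_range,
}


class LazyFeatures:
    """Compute features for one image sample only when they are requested."""

    def __init__(self, pixels: Pixels, extractors: Dict[str, Extractor]) -> None:
        self._pixels = pixels
        self._extractors = extractors
        self._cache: Dict[str, float] = {}
        self.computed: List[str] = []  # names actually evaluated, in order

    def get(self, name: str) -> float:
        if name not in self._cache:
            # First request: run the extractor and remember the result.
            self._cache[name] = self._extractors[name](self._pixels)
            self.computed.append(name)
        return self._cache[name]

    def vector(self, names: Sequence[str]) -> List[float]:
        """Build a feature vector from only the features the classifier uses."""
        return [self.get(name) for name in names]


sample = [[0.0, 2.0], [4.0, 6.0]]
features = LazyFeatures(sample, EXTRACTORS)
print(features.vector(["mean_intensity"]))  # [3.0]
print(features.computed)                    # ['mean_intensity']
```

When a trained classifier retains only a small subset of features, the work per image sample in this scheme scales with that subset rather than with the size of the full registry.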
Incremental improvements in performance are also expected from making more extensive use of optimized libraries, which also helps to reduce the amount of code that our group has to maintain. Lastly, a better-managed mechanism for maintaining the results of image-feature computation is necessary when the volume is increased by several orders of magnitude. We are adopting SQLite for this task, which is a common solution to high-volume/low-complexity database needs. Currently, these various efforts are maintained in separate branches in our Google Code software repository, and we will merge them as they become more mature.

The majority of our recent software-development efforts have been dedicated to the WND-CHARM analysis tool. With the addition of quantitative and spatially-resolved pattern analysis, it represents a substantial portion of what is possible with image analysis without a-priori models. Meanwhile, the OMERO project has matured under the guidance of Jason Swedlow, and is now a good, stable and usable implementation of the image and meta-data management concepts within OME. In the coming year, we will begin a substantial effort to integrate the ideas from developing WND-CHARM into the original concepts developed for OME. Together with Jason Swedlow, we have applied for and received funding from the Wellcome Trust to bring these two projects together.