Recent work in automated image classification has demonstrated that in several cases, a supervised machine learning approach can equal or even surpass image classification by human experts. Although it may seem dubious that machines can surpass a human's pattern recognition skills, modern imaging systems far exceed the human eye in spatial and spectral resolution as well as dynamic range, and can provide a machine classifier with a far more precise set of data than the eye can provide the human brain. Thus, even though the human brain is a superior classifier, it is presented with inferior data. The combination of modern digital microscopy with state-of-the-art machine pattern recognition is being used to quantitatively study morphological phenomena in cells and tissues.

Automated image classification can be divided into two approaches: model-based and model-free. In traditional model-based systems, a model of what is being imaged is manually constructed and used as the basis for classification or for reporting quantitative information. Model-free systems make no assumptions about the underlying model and attempt to build a classifier based only on information in the images and a manually pre-classified training set. Model-based systems have the advantage that it is known precisely what the machine is looking at, and it is possible to guide the machine to look at what is judged to be important while ignoring artifacts and noise. The chief disadvantage is that what is judged to be important is subject to bias, and is an attempt to anthropomorphize the machine so that it sees in much the same way that a human does, something that is not always appropriate or possible. The second disadvantage is that these machine vision approaches are highly specific to what is being imaged and can seldom be generalized to classify or measure something completely different.
The chief advantage of model-based systems is that they offer a direct and intuitive way to gather quantitative information on the image content.

Model-free classification treats all images equivalently, and performs exactly the same operations whether building a classifier for grades of melanoma, sub-cellular organelles, pollen grains, etc. Our classifiers reduce each image to a vector of "signatures". Each signature is a numeric value produced by an algorithm sensitive to a specific type of image content: various textures, intensity statistics, distribution of objects, etc. A large collection of signatures (>1200) ensures that there is a sufficient variety of sensors available for many kinds of images. Because it is impractical to use this high-dimensional space for classification directly, we use an iterative process that eliminates all signatures with weak classification power. The algorithm usually converges to one or two dimensions more than the number of classes in the training set. The product of this training is a Naive Bayesian network capable of classifying images that were not part of the original training set. One of the strengths of using Bayesian networks is that the result of classifying an image is a probability distribution over all of the classes in the training set. This probability distribution provides a quality of fit, which is an important metric for rejecting unclassifiable images, and provides a means for measuring image similarity.

We have made substantial progress in extending our classifiers to produce quantitative measures of image similarity rather than being restricted to "either-or" classification. We have validated that our computed image similarity correlates with progress along an independently characterized morphological process. A natural test case for this is age-related morphological change.
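As a rough illustration of the signature pruning and probabilistic classification described above, the following Python sketch ranks signatures with a Fisher-style score, keeps the strongest ones, and fits a Gaussian Naive Bayes classifier that returns a probability for every training class. The Fisher score, the one-pass ranking (a simplified stand-in for the iterative elimination), the Gaussian likelihood, the posterior-weighted `interpolated_value`, and all function names and toy numbers are assumptions made for this sketch, not the project's actual implementation.

```python
import math

def fisher_scores(X, y):
    """Per-signature score: variance of the class means divided by the
    mean within-class variance (higher means more discriminative)."""
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        col = {c: [x[j] for x, lab in zip(X, y) if lab == c] for c in classes}
        means = {c: sum(v) / len(v) for c, v in col.items()}
        grand = sum(means.values()) / len(classes)
        between = sum((m - grand) ** 2 for m in means.values()) / len(classes)
        within = sum(
            sum((v - means[c]) ** 2 for v in col[c]) / len(col[c]) for c in classes
        ) / len(classes)
        scores.append(between / (within + 1e-12))
    return scores

def select_signatures(X, y, n_keep):
    """Keep the indices of the n_keep highest-scoring signatures."""
    scores = fisher_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:n_keep])

def train_naive_bayes(X, y, keep):
    """Fit a per-class Gaussian (mean, variance) for each kept signature."""
    model = {}
    for c in sorted(set(y)):
        rows = [[x[j] for j in keep] for x, lab in zip(X, y) if lab == c]
        n = len(rows)
        mu = [sum(col) / n for col in zip(*rows)]
        var = [sum((v - m) ** 2 for v in col) / n + 1e-6
               for col, m in zip(zip(*rows), mu)]
        model[c] = (math.log(n / len(y)), mu, var)
    return model

def class_probabilities(model, x, keep):
    """Posterior probability of every training class for one image."""
    log_post = {}
    for c, (log_prior, mu, var) in model.items():
        ll = log_prior
        for j, m, v in zip(keep, mu, var):
            ll += -0.5 * math.log(2 * math.pi * v) - (x[j] - m) ** 2 / (2 * v)
        log_post[c] = ll
    top = max(log_post.values())  # subtract the max for numerical stability
    unnorm = {c: math.exp(v - top) for c, v in log_post.items()}
    total = sum(unnorm.values())
    return {c: u / total for c, u in unnorm.items()}

def interpolated_value(probs):
    """Posterior-weighted mean of numeric class labels (e.g. age in days)."""
    return sum(c * p for c, p in probs.items())

# Toy data (hypothetical): three signatures per image, classes are ages in days.
X = [[1.0, 5.0, 0.2], [1.2, 3.0, 0.1], [0.9, 4.0, 0.3],
     [3.0, 4.5, 0.9], [3.2, 3.5, 1.1], [2.9, 5.5, 1.0]]
y = [2, 2, 2, 10, 10, 10]

keep = select_signatures(X, y, n_keep=2)      # drops the uninformative signature
model = train_naive_bayes(X, y, keep)
probs_young = class_probabilities(model, [1.1, 4.0, 0.2], keep)
probs_mid = class_probabilities(model, [2.0, 4.0, 0.6], keep)
print(keep, probs_young, interpolated_value(probs_mid))
```

With numeric class labels such as age in days, the posterior-weighted mean is one possible way to turn the class probability distribution into a continuous score that falls between the trained classes.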
We were able to demonstrate that our calculated morphological age obtained from images of two different tissues in the worm correlates well with the known age of the worm. Additionally, we were able to demonstrate that the algorithms can correctly interpolate age groups that were never used in training. Preliminary results indicate that image distance can be used in standard clustering algorithms to define new classes of morphology not defined during training.

We have tested the generality of our classification algorithms on 10 different imaging problems thus far and were able to classify images reliably in all cases. We are currently applying this technique to three problems in biology that have previously not been accessible to automated unbiased analysis: effects of aging on tissue morphology, high-content screening assays, and automated medical diagnostics.

To study the effects of aging on tissue, Josiah Johnston is collaborating with Catherine Wolkow of the LNS (NIA-IRP) to conduct a longitudinal study of aging in the worm C. elegans. The central question of this collaboration is: in a population of genetically identical worms living in close proximity, what is it that makes the individual worms age at different rates and die at different times? We have recently shown that the morphology of the pharynx as assayed by our pattern recognition techniques correlates well with the known age of the worm, and is modulated in expected ways in mutants with decreased pharynx activity as well as in those with developmental abnormalities in this organ. We have collected longitudinal data by imaging worm pharynxes non-invasively at mid-life, and measuring pharynx pumping rate at several points along the worm lifespan. The time of death is recorded for each worm and tied to the pumping rates as well as the pharynx image at mid-life. The worm population is divided into three groups based on the observed lifespan, and the groups are used to train classifiers.
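The grouping step described above can be sketched in Python as follows. This is only an illustration: the source does not say how the three groups are delimited, so the equal-sized split by lifespan rank, the group names, the function name `lifespan_groups`, and the lifespan values are all assumptions.

```python
def lifespan_groups(lifespans, labels=("short", "medium", "long")):
    """Label each worm by the rank of its observed lifespan, splitting the
    population into equal-sized groups (an assumed rule, see lead-in)."""
    n = len(lifespans)
    order = sorted(range(n), key=lambda i: lifespans[i])
    groups = [None] * n
    for rank, i in enumerate(order):
        groups[i] = labels[rank * len(labels) // n]
    return groups

# Hypothetical observed lifespans in days, one per worm.
lifespans = [12, 21, 17, 9, 25, 15]
groups = lifespan_groups(lifespans)

# Each worm's mid-life pharynx image would then be paired with its group
# label to form the classifier training set.
for worm_id, (days, group) in enumerate(zip(lifespans, groups)):
    print(worm_id, days, group)
```

Because the labels come from the time of death while the images come from mid-life, a classifier trained on these pairs is being asked to predict lifespan from earlier morphology.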
Preliminary results indicate that our algorithms are able to distinguish short-lived and medium-lived worms using images of their pharynx. If these preliminary results are confirmed with additional data, we will use this classifier to separate a population of worms at mid-life into groups based on predicted lifespan. Transcription profiling experiments will then be used to find genes whose expression pattern modulates lifespan relative to a genetically identical cohort.

The effects of aging and diet on mouse tissue morphology are being studied by Tomasz Macura in collaboration with Kevin Becker, using tissue arrays from calorically restricted and normally fed mice prepared for the AGEMAP project. We plan to use our classifiers to detect differences in morphology in different tissues as a consequence of aging and diet, as well as to measure aging rates of different tissues and the influence of diet on these rates. Currently we have collected a full set of liver data, and are analyzing it for morphological differences due to diet at four different ages.

With Dr. Mark Eckley, we are establishing a high-density, high-throughput RNAi screening platform using cells grown on microscope slides printed with double-stranded RNA. Our classifiers are able to detect cells accumulating at different stages of the cell cycle, as well as various abnormal nuclear morphologies. We plan to use our algorithms to determine the set of morphologies attainable by nuclei, mitochondria, and cytoskeleton as a result of gene knock-down, and use these groups to begin classifying genes by phenotypic outcome.

With Nikita Orlov, we have begun a collaborative project with Dr. Elaine Jaffe of NCI to classify types of human lymphoma. Preliminary results indicate that our algorithms can match the average performance of human pathologists. We are now refining our training set to consist of diagnostic features in the sample rather than using a random collection of images for training.

National Institutes of Health (NIH)
National Institute on Aging (NIA)
Intramural Research (Z01)
United States
Shamir, Lior; Ling, Shari; Rahimi, Salim et al. (2009) Biometric identification using knee X-rays. Int J Biom 1:365-370
Shamir, Lior; Ling, Shari M; Scott Jr, William W et al. (2009) Knee x-ray image analysis method for automated detection of osteoarthritis. IEEE Trans Biomed Eng 56:407-15
Orlov, Nikita; Shamir, Lior; Macura, Tomasz et al. (2008) WND-CHARM: Multi-purpose image classification using compound image transforms. Pattern Recognit Lett 29:1684-1693
Shamir, Lior; Orlov, Nikita; Eckley, D Mark et al. (2008) Wndchrm - an open source utility for biological image analysis. Source Code Biol Med 3:13
Shamir, Lior; Orlov, Nikita; Eckley, D Mark et al. (2008) IICBU 2008: a proposed benchmark suite for biological image analysis. Med Biol Eng Comput 46:943-7
Chow, David K; Glenn, Charles F; Johnston, Josiah L et al. (2006) Sarcopenia in the Caenorhabditis elegans pharynx correlates with muscle contraction rate over lifespan. Exp Gerontol 41:252-60
Yoshikawa, Toshiyuki; Piao, Yulan; Zhong, Jinhui et al. (2006) High-throughput screen for genes predominantly expressed in the ICM of mouse blastocysts by whole mount in situ hybridization. Gene Expr Patterns 6:213-24
Glenn, Charles F; Chow, David K; David, Lawrence et al. (2004) Behavioral deficits during early stages of aging in Caenorhabditis elegans result from locomotory deficits possibly linked to muscle frailty. J Gerontol A Biol Sci Med Sci 59:1251-60