This is a proposal to study structural pattern analysis of human G-banded chromosomes by computer. A database of approximately 7,000 digitized images of band-density profiles is available; each chromosome type is represented by about the same number of samples. A mapping of the density profiles into finite strings of symbols will be developed to cast the problem as string/sequence structure analysis. A specific mapping to be evaluated carefully is based on "difference symbol" strings; this mapping is simple to compute and should facilitate automatic machine learning (inference) of pattern structure from training samples. For each chromosome type, a set of training strings will be used to infer a Markov chain as a statistical/structural model. The inference algorithm will use dynamic programming with a relative frequency cost function to compute optimal string alignments sequentially, as a search for recurrent substring patterns called "landmark substrings". The inferred Markov networks will themselves be analyzed for ensemble properties of the training data and will be used in classification experiments with both the training data and separate test data.
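As a rough illustration of the pipeline described above, the following Python sketch maps a density profile to a difference-symbol string, fits one Markov chain per chromosome type, and classifies a new string by maximum log-likelihood. It is a simplified stand-in, not the proposed method: the three-symbol alphabet, the `threshold` parameter, the add-one smoothing, and all function names are assumptions made for this example, and the chain is estimated from plain transition counts rather than by dynamic-programming alignment and landmark-substring search.

```python
import math

ALPHABET = "+-="   # assumed alphabet: rise, fall, roughly flat
START = "^"        # hypothetical start-of-string state


def difference_symbols(profile, threshold=1):
    """Encode successive differences of a density profile as symbols."""
    symbols = []
    for prev, curr in zip(profile, profile[1:]):
        delta = curr - prev
        if delta > threshold:
            symbols.append("+")
        elif delta < -threshold:
            symbols.append("-")
        else:
            symbols.append("=")
    return "".join(symbols)


def fit_markov_chain(strings):
    """Estimate first-order log transition probabilities from training strings.

    Counts start at 1 (add-one smoothing, an assumption of this sketch) so
    that unseen transitions keep a nonzero probability.
    """
    counts = {s: {t: 1 for t in ALPHABET} for s in START + ALPHABET}
    for string in strings:
        prev = START
        for sym in string:
            counts[prev][sym] += 1
            prev = sym
    model = {}
    for state, row in counts.items():
        total = sum(row.values())
        model[state] = {t: math.log(c / total) for t, c in row.items()}
    return model


def log_likelihood(model, string):
    """Log-probability of a string under a fitted chain."""
    prev, total = START, 0.0
    for sym in string:
        total += model[prev][sym]
        prev = sym
    return total


def classify(models, string):
    """Return the label whose chain assigns the string the highest likelihood."""
    return max(models, key=lambda label: log_likelihood(models[label], string))


# Toy training data (invented for illustration, not from the database).
models = {
    "A": fit_markov_chain(["++=+", "+++="]),
    "B": fit_markov_chain(["--=-", "---="]),
}
print(difference_symbols([0, 3, 3, 1]))  # +=-
print(classify(models, "++="))           # A
```

In this sketch, classification reduces to comparing per-type log-likelihoods; the inferred Markov networks of the proposal would carry more structure than a first-order transition table.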
Granum, E; Thomason, M G (1990) Automatically inferred Markov network models for classification of chromosomal band pattern structures. Cytometry 11:26-39