This project will develop and test methods by which existing domain knowledge can be combined with large amounts of training data to yield high-performance classification systems. The successful development of such methods would address fundamental questions in machine learning and solve critical application problems in science and engineering. In particular, the methods will be developed and tested in the domain of ecosystem modelling, where they may be able to provide more reliable estimates of the effect of increased atmospheric carbon dioxide (CO2) on the geographical distribution of plant ecosystems (biomes). Two approaches will be investigated: (a) a theory-centered approach, in which a partial theory of plant physiology is refined so that it can predict biome distribution, and (b) a rule-centered approach, in which a standard machine learning algorithm is applied to discover biome distribution rules. The rule-centered approach will use the partial theory of plant physiology to suggest relevant features and to rationalize the discovered rules. Both approaches will entail the development of new learning methods, including (a) constraint reasoning techniques for the interpretation of data and the refinement of the domain theory, (b) constructive induction techniques for defining candidate features, and (c) rationalization techniques for ensuring the consistency of discovered rules with the domain theory.