The focus of the proposed research is the design and implementation of autonomous agents: computer programs that sustain interaction with a dynamic, incompletely predictable environment. Programming such agents is very difficult, largely because the programmer rarely knows the correct behavior for the agent in advance. Typically, the programmer only discovers during debugging how the agent's sensors and effectors actually interact with the environment. The agent itself, rather than the programmer, should therefore learn this from experience. This kind of learning, called reinforcement learning, will be investigated. In particular, two shortcomings of current reinforcement learning methods will be studied: they require the input and output spaces to be completely enumerated (and have polynomial time complexity), and they cannot use prior information about the world, information which ought to be able to speed and deepen the learning process.
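The enumeration requirement mentioned above can be made concrete with a minimal sketch of tabular Q-learning. The toy chain environment and all hyperparameters here are illustrative assumptions, not part of the proposal; the point is that the Q-table must be allocated over every (state, action) pair before learning begins, which is exactly what rules out large or unenumerable input and output spaces.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the goal
ACTIONS = [-1, +1]    # move left or right along the chain

def step(state, action):
    """Toy chain dynamics: reward 1 only on reaching the goal state."""
    nxt = max(0, min(N_STATES - 1, state + action))
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward, nxt == N_STATES - 1

def q_learn(episodes=500, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    rng = random.Random(seed)
    # The limitation in question: the entire state-action space must be
    # enumerated up front to build the table.
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Epsilon-greedy action selection over the enumerated actions.
            if rng.random() < eps:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda a: q[(s, a)])
            nxt, r, done = step(s, a)
            best_next = max(q[(nxt, a2)] for a2 in ACTIONS)
            q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
            s = nxt
    return q

if __name__ == "__main__":
    q = q_learn()
    # After learning, moving right (+1) should be preferred in every
    # non-goal state.
    policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
    print(policy)
```

Note that the table `q` grows as the product of the state and action counts; nothing in this scheme generalizes to a state the agent has not been allocated a row for, which is the first of the two problems the proposal targets.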