Many software developers view evaluation as a burden, both because it currently takes significant effort and because most programmers are not trained in it. In the context of the PI's area of expertise (planning and agent design in Artificial Intelligence), this project addresses two gaps in current methodology and pedagogy: the dearth of evaluation methods for AI software agents and the absence of empirical and evaluation methods in the undergraduate and graduate computer science curricula. Toward these ends, the research component creates new evaluation methods that support AI agent design and development in the context of two agent projects: a simulated robot and an information-gathering agent for the WWW. The teaching component integrates evaluation methods into the undergraduate and graduate CS curricula by incorporating modules into existing courses where possible and designing new courses where necessary. Both components draw on existing and new research projects to explore how evaluation can motivate design. The expected results are not only new software evaluation methods, documented examples of their use, and two thoroughly evaluated software agents, but also new practitioners experienced in evaluation.