This project develops and evaluates a system for assessing proofs in undergraduate mathematics and computer science classes via peer evaluation. It approaches the problem as an instance of human computation, using computer technology to harness the collective capability of large numbers of people to do useful work. This requires breaking the task of assessing a proof into pieces small enough that multiple nonexpert assessors can contribute, and it requires mechanisms to ensure the quality, uniformity, and integrity of the assessment process. The work builds on a prototype system, used for two years in computer science and mathematics courses, that handled large classes with teams of undergraduate graders. That experience demonstrated the feasibility of the basic strategy; this project focuses on ensuring the quality of the assessments and of the learning experience for the assessors. It also takes initial steps toward an assessment system that can scale to support web-scale courses.
This work creates structured frameworks for analyzing proofs, so that assessment can be decomposed into tasks suitable for nonexperts. It explores ways to maximize what students learn by critically evaluating each other's proofs, and it creates a platform for quantitative experiments on the most effective methods for teaching students to read and evaluate proofs.
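To make the decompose-and-aggregate idea concrete, the sketch below shows one way such a framework might be represented in code. It is purely illustrative, since the project description does not specify a data model: a proof is split into narrow rubric items, each judged independently by several nonexpert reviewers, with a simple majority vote serving as a quality-control mechanism and split panels flagged for expert review. All class and function names here are invented for illustration.

```python
# A minimal sketch (not the project's actual design) of decomposing proof
# assessment into rubric items judged by multiple nonexpert reviewers,
# with majority voting as a simple quality-control mechanism.
# All names are hypothetical illustrations.

from collections import Counter
from dataclasses import dataclass, field


@dataclass
class RubricItem:
    """One narrow yes/no question about a single step of a proof."""
    prompt: str                       # e.g. "Is the base case verified?"
    votes: list[bool] = field(default_factory=list)

    def record(self, judgment: bool) -> None:
        self.votes.append(judgment)

    def consensus(self, quorum: int = 3) -> bool | None:
        """Majority vote once a quorum of reviewers has responded.

        Returns None while too few votes exist or the panel is split,
        which would flag the item for expert review.
        """
        if len(self.votes) < quorum:
            return None
        winner, count = Counter(self.votes).most_common(1)[0]
        return winner if count > len(self.votes) / 2 else None


@dataclass
class ProofAssessment:
    """A proof decomposed into rubric items, grouped by proof step."""
    items: dict[str, list[RubricItem]]   # step label -> rubric items

    def unresolved(self) -> list[str]:
        """Prompts still lacking a clear consensus."""
        return [item.prompt
                for step_items in self.items.values()
                for item in step_items
                if item.consensus() is None]


if __name__ == "__main__":
    base = RubricItem("Is the base case n = 1 verified correctly?")
    step = RubricItem("Does the inductive step use the hypothesis?")
    assessment = ProofAssessment({"induction": [base, step]})

    for judgment in (True, True, False):   # three peer reviewers
        base.record(judgment)
    step.record(True)                      # only one vote so far

    print(base.consensus())                # True (2 of 3 agree)
    print(assessment.unresolved())         # the inductive-step item
```

Under this hypothetical structure, each reviewer answers questions narrow enough to require no expert judgment, while the aggregation step supplies the uniformity and integrity checks the project calls for.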