The project aims to develop a comprehensive and principled framework for reasoning about the behavior of artificial intelligence (AI) systems, which includes explaining their decisions, assessing their robustness, and providing formal guarantees on their behavior. These considerations are critical to the success of AI systems in real-world applications, such as ensuring the safety of self-driving cars. The project is particularly concerned with machine learning systems, which are learned from data and have an inherent numeric nature that contributes to their opacity and to the difficulty of reasoning about their behavior.
The project is based on a fundamental observation: Even though machine learning systems are numeric in nature, they often implement symbolic decision functions. The project will specifically aim to (1) compile the numeric machine learning systems into equivalent, symbolic, and tractable systems, and then (2) reason about the behavior of the machine learning systems by operating on their compiled symbolic versions. This allows one to bring a wealth of techniques from classical AI, and computer science more broadly, to bear on this important problem.
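To make the compile-then-reason idea concrete, the following minimal sketch (in Python) illustrates it on a hypothetical toy classifier: a step-activation network over binary features is compiled into an explicit symbolic representation of its decision function, and monotonicity and robustness queries are then answered on the compiled object rather than the numeric network. The network, weights, and queries here are illustrative assumptions, not the project's actual systems, which would target far richer models and tractable circuit representations.

```python
# A minimal sketch of compile-then-reason, assuming a toy step-activation
# network over three binary features. All names and numbers are hypothetical.

from itertools import product

WEIGHTS = [0.6, -0.4, 0.5]
BIAS = -0.3

def net(x):
    """Numeric classifier: True iff the weighted sum clears the bias."""
    return sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS > 0

# Step 1: compile the numeric network into an equivalent symbolic object,
# here the set of input assignments it maps to True (its Boolean function).
models = {x for x in product([0, 1], repeat=3) if net(x)}

# Step 2: reason about behavior on the compiled object, not the network.
def is_monotone_in(i):
    """Does flipping feature i from 0 to 1 ever flip the decision to False?"""
    return all(
        not (x in models and flip not in models)
        for x in product([0, 1], repeat=3)
        if x[i] == 0
        for flip in [tuple(1 if j == i else v for j, v in enumerate(x))]
    )

def robustness(x):
    """Minimum number of feature flips needed to change the decision on x."""
    label = tuple(x) in models
    return min(
        sum(a != b for a, b in zip(x, y))
        for y in product([0, 1], repeat=3)
        if (y in models) != label
    )

if __name__ == "__main__":
    print(sorted(models))         # the compiled symbolic decision function
    print(is_monotone_in(0))      # True: feature 0 only pushes toward True
    print(robustness((1, 0, 0)))  # 1: a single flip changes this decision
```

Enumerating all inputs is only feasible for this toy setting; the point of the sketch is the division of labor, where the numeric system is queried once during compilation and all subsequent reasoning is purely symbolic. Scalable versions of this idea would compile into tractable symbolic forms such as decision diagrams or Boolean circuits that support such queries efficiently.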
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.