Artificial Intelligence (AI) in general, and Machine Learning (ML) in particular, is playing an increasingly important role in many aspects of our lives. AI/ML model engineering (e.g., model training, tuning, and maintenance) is hence becoming an essential part of modern software-systems engineering. Just as software inevitably contains bugs and software debugging is a key step in software development, AI/ML models may exhibit buggy behaviors, and thus model debugging will be a critical step in software engineering. AI model bugs may lead to undesirable consequences such as low model accuracy and vulnerability to security attacks, which substantially hinder the application of AI models, especially in safety-critical areas. The current practice of AI model debugging focuses mainly on tuning parameters and providing additional training data; it does not attempt to diagnose root causes from observable symptoms and then repair the model accordingly. There are no general debugging tools that work across a broad range of models, and the current state of practice is to perform tedious, redundant tests on individual implementations, even at a per-training-session level.
AI models, especially neural network models, are essentially programs (e.g., in Python) that compute state-variable values, called neuron activations, through multiple program phases (called layers). The values of the neurons in a layer are computed from those of the previous layer through matrix multiplication followed by an activation function, essentially a thresholding function that determines whether values will be used in the computation of the next layer. The project will develop techniques that treat AI models as programs with specific semantics, so that AI model debugging can substantially benefit from analyzing these programs and their execution states. In this way, the substantial experience in software debugging built up by the software-engineering and program-analysis communities over decades of intensive R&D can be leveraged to build general and novel AI model debuggers. Just as software-debugging tools help developers inspect program states and identify root causes, these AI model debuggers will allow data engineers to compare the internal neuron-activation values of correctly classified and misclassified cases to identify the root-cause features that lead to misclassification, and will provide guidance in selecting additional training inputs to improve model behavior related to these features.
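To make the layer-by-layer computation concrete, the following sketch (in Python with NumPy) shows a minimal forward pass for a fully connected network that records the neuron activations at every layer; the ReLU activation, the function names, and the layer structure are illustrative assumptions, not a prescription from the project.

    import numpy as np

    def relu(x):
        # Activation function: thresholds values at zero, determining
        # which ones contribute to the next layer's computation.
        return np.maximum(0.0, x)

    def forward(x, weights, biases):
        # Each layer's neuron activations are computed from the previous
        # layer's via matrix multiplication plus the activation function.
        activations = [x]
        for W, b in zip(weights, biases):
            x = relu(x @ W + b)
            activations.append(x)
        # The recorded per-layer activations are the "program states"
        # that a model debugger can inspect.
        return activations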
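One way such a debugger might surface root-cause features is to contrast the activations recorded above across correctly classified and misclassified inputs; the mean-gap heuristic and the helper name compare_activations below are purely hypothetical, sketched only to illustrate the idea.

    def compare_activations(acts_correct, acts_wrong, layer):
        # Average each neuron's activation at the given layer over the
        # correctly classified and the misclassified inputs, then rank
        # neurons by the gap between the two averages; neurons with
        # large gaps point to features implicated in misclassification.
        mean_ok = np.mean([a[layer] for a in acts_correct], axis=0)
        mean_bad = np.mean([a[layer] for a in acts_wrong], axis=0)
        gap = np.abs(mean_ok - mean_bad)
        return np.argsort(gap)[::-1]  # most suspicious neurons first

The resulting ranking could then guide the selection of additional training inputs that exercise the implicated features.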
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.