Creating and distributing software written in Python, in a secure manner, is surprisingly difficult. And as many recent incidents demonstrate, the security of this software chain is dramatically vulnerable. Right now, in nearly all Python packaging and distribution tools, there are no mechanisms in place for someone who downloads software to understand whether a malicious party has not inserted or removed code, or if the code was even written by the right developers! This work will for the first time capture metadata about the steps of the Python software supply chain systematically. This project will carry information between the steps of the chain in a way that an external party can verify author signing and repository signing of packages. This project will also be breaking ground for researchers and developers who want to improve how other interpreted languages handle managing dependencies. The project's impacts are particularly strong in academia, science, and industry, where Python is the most widely used programming language; millions of users will be more protected against a variety of attacks.
This project transitions two security mechanisms -- backtracking dependency resolution and The Update Framework (TUF) -- into practical use in the core Python infrastructure. Backtracking dependency resolution ensures that users get understandable package dependency installation, even in the face of attacks or missing metadata. TUF ensures that even a compromise of the major package infrastructure will have severely limited impact on clients. Together, the resolver and TUF work will ensure that important research transitions into substantial security improvements for all Python software, and will positively impact millions of developers and many more users.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.