This research seeks to provide a new framework to support distributed programs that access large amounts of data stored across millions of servers. The framework consists of two parts: a set of extensible modules that provide basic system services, and higher-level policies built on top of these modules to coordinate distributed data access. The research applies this framework to four key subsystems: (1) dynamic data replication to improve performance and reliability and to reduce the need for human management of storage, (2) extensible and scalable data consistency to coordinate data access across multiple nodes, (3) scalable location services for mobile objects to track where objects are stored in the dynamic replication framework, and (4) market-based distributed resource management to balance large numbers of applications competing for limited system resources. The impact of this research will be a better understanding of the fundamental mechanisms required by large-scale data systems and a set of techniques that will support new classes of data-intensive applications in this challenging environment. The project also places a heavy emphasis on modernizing the teaching of operating systems and distributed systems to give students the skills and experience they need to understand and build large-scale distributed systems.