With the continuous growth of data production and the ever-expanding computing infrastructure, future extreme-scale data-intensive computing systems face unprecedented design and control challenges in meeting the continuously increasing demand for information processing. Scalability, robustness, continuous availability, and service quality are the key attributes needed to ensure that a system designed today can operate with the same efficiency at the extreme scale of the future. The goal of the research described in this proposal is to develop theoretical foundations and practical control algorithms that enable the scalable design and efficient management of future extreme-scale data-intensive computing systems. Specifically, the researchers will 1) identify the fundamental design principles needed to achieve scalability when developing network infrastructure and software systems at large scale, with a focus on understanding how performance degrades under limiting factors such as network structure, processor speeds, and buffering and storage capacities; and 2) develop distributed control strategies for operator placement, data storage, load shedding, and resource allocation that enable efficient in-network information processing. The project will produce a deeper, quantitative understanding of the fundamental design principles and control strategies needed to achieve scalability, robustness, and quality of service in future data-intensive information service systems. These advances will come through a collaborative effort spanning multiple disciplines, including performance modeling, networking, queueing, and optimization.