Despite continuous efforts and investments to upgrade the networking infrastructure of research and education institutions to meet the needs of large-scale science applications, data transfers on these networks often perform very poorly. Understanding the underlying reasons for poor transfer performance is important yet challenging because of the complex design of today's cyberinfrastructure. This project offers a set of novel models and algorithms to detect and mitigate performance issues in data transfers on research networks. The proposed suite of tools helps researchers and system administrators pinpoint the root cause of transfer performance problems so that corrective action can be taken swiftly to minimize the impact on ongoing transfers. The project will also integrate the research into all levels of education, including science projects with K-12 students, new curriculum modules for graduate- and undergraduate-level courses, and summer workshops specifically for minority groups.

Understanding the true underlying reasons for poor transfer performance is key to mitigating them and delivering the promised transfer speeds. However, the involvement of multiple end systems, dynamically changing background traffic, and the complexity of today's networking infrastructures make this a complicated and time-consuming process. This project develops a novel anomaly-detection and performance-optimization framework for end-to-end data transfers at scale. The framework helps to predict, understand, diagnose, and optimize wide-area file transfers in today's extreme-scale cyberinfrastructures. To achieve this goal, it derives deep-neural-network-based predictive models that relate transfer settings to throughput. These models are then used to estimate the optimal configuration for new transfers. The framework also periodically gathers performance metrics for end-system and network resources to keep track of system utilization. When a transfer anomaly is detected, the collected metrics are fed into anomaly-classification models to identify the root causes. Once the causes of the performance problems are identified, the framework launches a real-time optimization process that reconfigures the transfer settings to alleviate the impact of the anomalies.
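
The predict, detect, and classify loop described above can be illustrated with a minimal Python sketch. It is not the project's implementation: a per-setting historical average stands in for the deep-neural-network predictor, and a few threshold rules stand in for the anomaly-classification models. The setting tuple (concurrency, parallelism), the metric names, and all thresholds are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean


class ThroughputModel:
    """Placeholder predictor: mean observed throughput per transfer setting.

    Assumption: this lookup only stands in for the deep-neural-network
    models mentioned in the abstract.
    """

    def __init__(self):
        # Maps a setting, e.g. (concurrency, parallelism), to past throughputs.
        self._history = defaultdict(list)

    def observe(self, setting, throughput_gbps):
        """Record the throughput measured for one past transfer."""
        self._history[setting].append(throughput_gbps)

    def predict(self, setting):
        """Return mean historical throughput, or 0.0 for an unseen setting."""
        samples = self._history.get(setting)
        return mean(samples) if samples else 0.0

    def best_setting(self):
        """Return the observed setting with the highest predicted throughput.

        Raises ValueError if nothing has been observed yet.
        """
        return max(self._history, key=self.predict)


def is_anomalous(predicted_gbps, observed_gbps, tolerance=0.5):
    """Flag a transfer whose throughput is below a fraction of the prediction.

    The 0.5 tolerance is an arbitrary illustrative value.
    """
    return observed_gbps < tolerance * predicted_gbps


def classify_root_cause(metrics):
    """Hypothetical rule-based stand-in for the anomaly-classification models."""
    if metrics.get("packet_loss_pct", 0.0) > 1.0:
        return "network"
    if metrics.get("disk_util_pct", 0.0) > 90.0:
        return "storage"
    if metrics.get("cpu_util_pct", 0.0) > 90.0:
        return "cpu"
    return "unknown"


# Example run with made-up measurements (Gbps).
model = ThroughputModel()
model.observe((4, 2), 3.0)
model.observe((4, 2), 5.0)
model.observe((8, 4), 9.0)
model.observe((8, 4), 7.0)

setting = model.best_setting()
predicted = model.predict(setting)
if is_anomalous(predicted, observed_gbps=3.0):
    cause = classify_root_cause({"disk_util_pct": 97.0})
```

In a fuller system, the classified cause would drive the reconfiguration step, for example by re-running the setting search with the affected resource constrained; that step is omitted here because the abstract does not specify how it works.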

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Communication Foundations (CCF)
Type: Standard Grant (Standard)
Application #: 2007829
Program Officer: Almadena Chtchelkanova
Budget Start: 2020-08-01
Budget End: 2023-07-31
Fiscal Year: 2020
Total Cost: $224,982
Name: SUNY at Buffalo
City: Buffalo
State: NY
Country: United States
Zip Code: 14228