The goal of the project is to develop models and statistical inference techniques for space-time computer network traffic and environmental data. The PIs propose to extend their joint work on global network traffic modeling via multivariate spatio-temporal processes and to address the network kriging, prediction, and optimal monitoring design problems. New problems involving extremal dependence in computer network traffic, as well as in environmental data, will also be addressed. To do so, the PIs propose to use established techniques and to develop new tools involving max-stable processes and multivariate, functional, and hidden regular variation.
One of the main themes of the proposal is to understand and model the statistical aspects of traffic propagation in computer networks. This research would help predict, detect, monitor, and manage computer network traffic in a more principled way. The proposed methodology focuses on characterizing the global statistical behavior in both space and time, which would provide a more comprehensive picture of the entire network, namely the traffic loads on all links and routes, at concurrent as well as different points in time. This could enable practitioners to predict the traffic load on an unobserved link or route by monitoring a select set of links or routes. Another aspect of the proposed research involves applying extreme value theory to understand and model the statistical dependence of extreme delays and traffic loads in computer networks. This could help identify bottlenecks and "weak links" when unusually extreme traffic volumes arise. Related important problems arise in environmental applications, where extremes play a critical role. For example, quantifying the risk of floods requires an adequate model for the probability that extreme precipitation events occur at the same time at different spatial locations. The proposed research would help model and estimate such probabilities of concurrent extremes and evaluate important environmental risks such as pollution, floods, droughts, and hot spells.
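The idea of predicting the load on an unobserved link from a few monitored links can be illustrated with a minimal sketch of simple kriging, that is, the best linear predictor under a known mean vector and covariance matrix. This is an illustration under stated assumptions and not the PIs' methodology: the function name `simple_kriging`, the three-link network, and all numerical values are hypothetical, and in practice the mean and covariance would have to be estimated from traffic data.

```python
import numpy as np


def simple_kriging(mu, sigma, observed_idx, y_observed):
    """Best linear predictor of the unobserved link loads.

    mu           : mean traffic load per link (assumed known)
    sigma        : covariance matrix of the link loads (assumed known)
    observed_idx : indices of the monitored links
    y_observed   : measured loads, in the same order as observed_idx
    """
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    obs = np.asarray(observed_idx)
    unobs = np.setdiff1d(np.arange(mu.size), obs)

    s_oo = sigma[np.ix_(obs, obs)]      # covariance among monitored links
    s_uo = sigma[np.ix_(unobs, obs)]    # cross-covariance unobserved/monitored

    # Kriging weights s_uo @ inv(s_oo), computed without forming the inverse
    # (valid because s_oo is symmetric).
    weights = np.linalg.solve(s_oo, s_uo.T).T

    residual = np.asarray(y_observed, dtype=float) - mu[obs]
    prediction = mu[unobs] + weights @ residual
    # Prediction error variance for each unobserved link.
    pred_var = np.diag(sigma[np.ix_(unobs, unobs)] - weights @ s_uo.T)
    return unobs, prediction, pred_var


if __name__ == "__main__":
    # Hypothetical three-link network; links 0 and 2 are monitored.
    mu = [10.0, 10.0, 10.0]
    sigma = [[4.0, 2.0, 1.0],
             [2.0, 4.0, 2.0],
             [1.0, 2.0, 4.0]]
    links, pred, var = simple_kriging(mu, sigma, [0, 2], [12.0, 11.0])
    print(links, pred, var)
```

The prediction error variance returned for each unobserved link gives one natural criterion for the monitoring design problem: among candidate sets of links to monitor, prefer the set that makes these variances small.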
The project was motivated by practical problems arising in the study of Internet traffic, environmental, and financial data. The PIs have developed a new framework for modeling the statistical dependence of network traffic data measured at different sites and times on the network. In this context, efficient algorithms and software have been developed that allow network traffic engineers to perform optimal monitoring of the network for the purpose of traffic prediction. The developed methodology can be used to detect statistical anomalies and to help highlight possible security or infrastructure failures efficiently.
Another part of the project involved the statistical modeling and analysis of extreme events. The PI and a collaborator have developed a new unifying theory for the representation of random processes used for modeling extreme events. This theory is likely to generate many new models as well as new findings on the structure of existing models of extremes. Further, the PI and his student focused on multi-dimensional data coming from environmental, insurance, or financial applications. They developed new methods for fitting extreme value models and applied these models to analyze precipitation and weather extremes.
A third part of the project involved the fundamental problem of quantifying the risk associated with a large portfolio of assets. While much effort has already been devoted to this problem, the uncertainty about how to appropriately model the statistical dependence of the assets (especially in the regime of extreme losses) makes this an ongoing and difficult challenge. Failure to appropriately quantify the dependence among extreme losses can lead to underestimating risk and ultimately to cascading contagion effects in the financial system. The PI and a student have focused on the problem of providing lower and upper bounds for a popular measure of risk, the so-called value-at-risk. They have shown that upper and lower bounds for extreme value-at-risk can be obtained by solving a very high-dimensional optimization problem. These bounds are essentially universal in the sense that they are largely independent of the underlying statistical model for the dependence among the assets. The bounds can be made more precise by incorporating more data and expert information in a principled way. The results obtained in this way are rather unexpected. Their full theoretical development and subsequent implementation in practical statistical algorithms are likely to have a broad impact on the way risk is modeled, understood, and quantified.