Large-scale datasets generated by dynamical systems are encountered in many science and engineering disciplines. For instance, in climate, atmosphere, and ocean sciences, the dynamics take place in an infinite-dimensional phase space where the coupled nonlinear partial differential equations for fluid flow and thermodynamics are defined, and the observed data correspond to functions on that phase space, such as temperature or circulation over a given geographical region. Examples also abound in materials science and molecular dynamics. A major challenge is to use the vast amount of data collected by observational networks or output by large-scale numerical models to understand the operating physics and to make inferences about aspects of the system that are not accessible to observation, including the future state of the system. This project seeks to develop novel techniques for data analysis and prediction in dynamical systems, taking into account model error and spatiotemporal data relationships. Applications are proposed in two high-impact areas in climate-atmosphere-ocean science, namely, tracking and forecasting of multiscale convective waves in the tropical atmosphere, and reconstruction and forecasting of Arctic sea-ice thickness. This research will create, document, and make available software for analyzing large-scale data from complex dynamical systems. It will also contribute to curricular development and training of graduate students in this interdisciplinary arena.
The general framework of this research is dynamical systems operating in high-dimensional phase spaces, but generating data with low-dimensional, nonlinear geometric structures. Kernel methods form a natural mathematical framework to construct function spaces on these low-dimensional objects with a well-defined notion of smoothness, which can be used to carry out a variety of data analysis tasks such as dimension reduction, feature extraction, and prediction. For appropriately designed kernels, these tasks can be interpreted in terms of a Riemannian geometry induced on the data. Kernels also provide operators to extend functions on a reference dataset to another dataset of interest. The dynamical systems addressed in the project can represent either nature or a numerical model approximating nature. In many complex real-world applications, the low-dimensional data structures generated by nature and by the model will differ. To extract the features of the imperfect model that are maximally consistent with nature, or to assign weights in multi-model ensembles for prediction, this project will study a novel approach where kernel-based out-of-sample extension operators are used to define appropriate metrics for model error. With dynamics taken into account through Takens delay-coordinate maps and other features, these error metrics will be incorporated into modified kernels, biasing the geometry of the model data to extract states with high fidelity relative to nature. The project will use the modified kernels in regularized schemes for learning functional relationships between quantities of interest in dynamical systems. These methods will be applied in reconstruction and forecasting of Arctic sea-ice thickness from observations of oceanic and atmospheric variables, and in blended parametric-nonparametric forecasting of large-scale convective organization in the tropics.
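As a rough illustration of two ingredients named above, a delay-coordinate map and a regularized kernel scheme with out-of-sample extension, the following Python sketch may help. It is not the project's method: the Gaussian kernel, the ridge-regularized fit, the toy sine signal, and all function names and parameter values (delay_embed, gaussian_kernel, fit, extend, q, eps, lam) are assumptions chosen for illustration only, and the modified, error-aware kernels described in the text are not implemented here.

```python
import numpy as np

def delay_embed(x, q):
    """Delay-coordinate map: row i is (x[i], x[i+1], ..., x[i+q-1])."""
    x = np.asarray(x, dtype=float)
    n = x.shape[0] - q + 1
    return np.stack([x[i:i + n] for i in range(q)], axis=1)

def gaussian_kernel(a, b, eps):
    """Kernel matrix with entries exp(-|a_i - b_j|^2 / eps)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / eps)

def fit(X_ref, f_ref, eps, lam):
    """Ridge-regularized kernel coefficients on the reference dataset."""
    K = gaussian_kernel(X_ref, X_ref, eps)
    return np.linalg.solve(K + lam * np.eye(len(X_ref)), f_ref)

def extend(X_ref, coeffs, X_new, eps):
    """Out-of-sample extension: evaluate the fitted function at new points."""
    return gaussian_kernel(X_new, X_ref, eps) @ coeffs

# Toy scalar time series standing in for an observable (assumption).
t = np.linspace(0.0, 4.0 * np.pi, 60)
x = np.sin(t)

q = 5                             # number of delays (illustrative choice)
X_ref = delay_embed(x[:-1], q)    # delay-embedded states
f_ref = x[q:]                     # value one step after each delay window

coeffs = fit(X_ref, f_ref, eps=1.0, lam=1e-8)
f_fit = extend(X_ref, coeffs, X_ref, eps=1.0)   # evaluated on the reference set
```

Passing a different array of delay-embedded states as X_new to extend evaluates the learned function on another dataset, which is the sense of out-of-sample extension used in this sketch.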
A further goal of the project is to extend these ideas to operator-valued kernels (so-called multitask kernels) for analysis of vector-valued observables, such as spatially extended fields. Compared to the canonical scalar-valued kernels, these kernels are expected to have significantly higher skill in capturing spatiotemporal intermittency, with geometry and dynamics also playing a role through delay-coordinate maps. The project will apply these kernels to objectively extract traveling convective waves in the tropical atmosphere from large datasets acquired via remote sensing.
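To make the notion of an operator-valued kernel concrete, the sketch below uses the simplest textbook instance, a separable kernel K(x, y) = k(x, y) B, where k is a scalar Gaussian kernel and B is a symmetric positive semidefinite matrix coupling the output components. This is an assumed stand-in for illustration, not the kernel the project proposes; the function names, the random toy data, and the values of B, eps, and lam are all hypothetical.

```python
import numpy as np

def scalar_kernel(a, b, eps):
    """Scalar Gaussian kernel matrix with entries exp(-|a_i - b_j|^2 / eps)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / eps)

def fit_separable(X, Y, B, eps, lam):
    """Vector-valued ridge fit with the separable kernel k(x, y) * B.

    X is (n, d), Y is (n, m), B is a symmetric PSD (m, m) matrix.
    Returns one coefficient vector per sample, shape (n, m).
    """
    n, m = Y.shape
    K = scalar_kernel(X, X, eps)
    G = np.kron(K, B)   # block (i, j) of G equals k(x_i, x_j) * B
    c = np.linalg.solve(G + lam * np.eye(n * m), Y.reshape(-1))
    return c.reshape(n, m)

def predict_separable(X_ref, C, B, X_new, eps):
    """Evaluate f(x) = sum_j k(x, x_j) * B @ c_j at each row of X_new."""
    return scalar_kernel(X_new, X_ref, eps) @ C @ B

# Toy vector-valued observable with three components (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
Y = np.column_stack([np.sin(X[:, 0]), np.cos(X[:, 1]), X[:, 0] * X[:, 1]])

# Illustrative coupling between output components; symmetric and positive definite.
B = 0.5 * np.eye(3) + 0.5 * np.ones((3, 3)) / 3.0

C = fit_separable(X, Y, B, eps=1.0, lam=1e-3)
Y_fit = predict_separable(X, C, B, X, eps=1.0)
```

With B set to the identity matrix, this reduces to fitting each output component independently with the scalar kernel; a non-diagonal B is what lets information be shared across components in this simplified setting.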