The project is concerned with the estimation of volatility and related quantities for high frequency data using the nonparametric "latent semi-martingale model." The existence of market microstructure noise (statistically similar to measurement error) is crucial to this problem, as it substantially affects estimators. The main questions addressed by this project are the following: What information is there in the data? What quantities can be estimated consistently on the basis of daily data, or even five-minute data? A satisfactory answer to these questions will serve several purposes: (1) an understanding of what information (parameter estimates) researchers can hope to extract from the data; (2) data compression: because the amounts of data are huge, estimation of all relevant parameters will provide an (at least approximately) "sufficient" summary of the data; and (3) broader scientific and social goals, as discussed below. The data to be examined include financial transaction and quote data and, in some circumstances, order book data. High-dimensional data will also be considered. A major tool will be the use of statistical contiguity, which simplifies and clarifies the structure of the data.
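The effect of microstructure noise on volatility estimators can be illustrated with a toy simulation: a latent log-price follows a simple semimartingale (Brownian motion with constant volatility), and the observed price adds i.i.d. noise. The sketch below is purely illustrative and is not the project's estimator; the sample size, volatility, and noise level are assumed values chosen for demonstration. At very fine sampling, the realized variance is dominated by the noise, which is why coarser (e.g. five-minute) sampling is often used in practice.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model (illustrative assumptions, not the project's estimator):
# latent log-price = Brownian motion with constant volatility,
# observed price = latent price + i.i.d. microstructure noise.
n = 23400                     # one observation per second over a 6.5-hour day
sigma = 0.3 / np.sqrt(252)    # daily volatility implied by 30% annualized
noise_sd = 0.0005             # standard deviation of the noise

latent = np.cumsum(rng.normal(0.0, sigma / np.sqrt(n), n))
observed = latent + rng.normal(0.0, noise_sd, n)

def realized_variance(x, step):
    """Sum of squared returns, sampling every `step`-th observation."""
    r = np.diff(x[::step])
    return np.sum(r**2)

true_iv = sigma**2  # integrated variance of the latent process
for step, label in [(1, "1-second"), (300, "5-minute")]:
    rv = realized_variance(observed, step)
    print(f"{label} sampling: RV = {rv:.3e} (true IV = {true_iv:.3e})")
```

At 1-second sampling the realized variance is inflated by a noise term of order 2n times the noise variance, which here swamps the integrated variance; at 5-minute sampling the noise contribution is far smaller and the estimate is close to the true value, at the cost of discarding most of the data. Estimators designed for this latent-model setting aim to use all the data while remaining consistent under the noise.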
The availability of high frequency financial data has exploded in recent years. This has opened the possibility of estimating quantities like volatility on a daily basis with high precision. Such estimates are of substantial interest to investors, regulators, and policymakers. An academic literature on this topic is developing, drawing on areas ranging from finance through econometrics and statistics to pure probability. The main goal of this project is to find ways of turning these data into knowledge. Instead of massive, barely structured data, the project seeks to provide estimators of (economically or otherwise) interpretable parameters. The theoretical approach is to view the high frequency data in the context of continuous-time finance models, as used in asset pricing, portfolio management, options trading, and risk management. So far, such models have often relied on hypothetical high frequency data. By bringing them together with actual high frequency data, the project aims, in the long run, at a "grand unified theory" of finance with both theoretical and empirical components. As a corollary, this has the potential to induce transformational change by integrating risk management with business and regulatory decisions, and data with models. This may help avoid some of the gravest model-with-no-data mistakes of the last few years. Methods for high frequency data also have applications outside finance, such as in neuroscience, turbulence, and other areas where streaming data are available. Environmental science and monitoring also often present forms of high frequency data. Meanwhile, likelihood theory (which is central to this project) for time-dependent data connects finance and economics to a great variety of scientific endeavors, including biological and medical science, with mutual feedback among methods in these areas.