Finite-memory models of time series are used in statistics, information theory, bioinformatics and various other disciplines. Specific areas of application include lossless data compression, universal prediction of individual sequences, the comparison of dialects in linguistics, and the modeling of DNA sequences in genetics. The Markov random field model can be regarded as a generalization of the finite-memory property to spatial processes. Markov random fields are special Gibbs fields, so they provide essential models in statistical physics for interacting particle systems. They are also used in several other fields, including image processing and pattern recognition. Both model classes retain enough information for many applications while remaining general enough to model large amounts of data and simple enough to be computed in feasible time.

A discrete-time stochastic process is a Markov chain of order k if its memory depth is k, that is, if the conditional distribution of each symbol given the entire past depends only on the k most recent symbols. Estimating the order from a sample, a finite-length observation of the process, is a statistical model selection problem. The project considers the penalized maximum likelihood method and studies statistical inference on the Markov order for stationary ergodic and other processes. A stochastic process on a multidimensional integer lattice is a Markov random field if the conditional distribution at a site, given the values at all other sites, depends only on the values at the sites of a finite neighborhood, called the basic neighborhood. The project addresses the problem of statistically estimating the basic neighborhood from a sample, a single realization of the process observed in a finite region. Since large sample sizes are typical in applications, asymptotic behavior, such as consistency of the estimation procedures, is studied in addition to error probabilities for finite sample sizes. Computational complexity is a fundamental aspect of model selection problems, so this property of the developed statistical procedures is studied as well. The research includes collaborative work with researchers in the area, and it may broaden the applications of the results through collaboration with researchers from the areas of application, namely bioinformatics and engineering. The project provides an opportunity for graduate students to collaborate in research.
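To make the penalized maximum likelihood idea concrete, the following is a minimal sketch of Markov order estimation. The abstract does not say which penalty the project uses; the sketch assumes the common BIC-type penalty, half the number of free parameters times the logarithm of the sample size. The function names `log_max_likelihood` and `estimate_order_bic` and the cap `max_order` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log


def log_max_likelihood(seq, k, start):
    """Maximized log-likelihood of seq[start:] under an order-k Markov model.

    Each symbol is conditioned on the k symbols before it; requires start >= k.
    """
    context_counts = Counter()
    transition_counts = Counter()
    for i in range(start, len(seq)):
        context = tuple(seq[i - k:i])  # empty tuple when k == 0
        transition_counts[(context, seq[i])] += 1
        context_counts[context] += 1
    # The empirical transition frequencies maximize the likelihood.
    return sum(
        count * log(count / context_counts[context])
        for (context, _symbol), count in transition_counts.items()
    )


def estimate_order_bic(seq, max_order, alphabet_size=None):
    """Return (estimated order, score per candidate order).

    Assumption: BIC-type penalty 0.5 * (free parameters) * log(n).
    All candidate orders are scored on the same symbols, seq[max_order:],
    so that their likelihoods are comparable.
    """
    if alphabet_size is None:
        alphabet_size = len(set(seq))
    n = len(seq) - max_order
    scores = {}
    for k in range(max_order + 1):
        free_params = alphabet_size ** k * (alphabet_size - 1)
        scores[k] = -log_max_likelihood(seq, k, max_order) + 0.5 * free_params * log(n)
    return min(scores, key=scores.get), scores


if __name__ == "__main__":
    # A period-3 sequence: the next symbol is determined by the last two symbols.
    order, _ = estimate_order_bic("001" * 100, max_order=3)
    print(order)
```

Scoring every candidate order on the same stretch of the sample is one simple way to keep the likelihoods comparable; other conventions for handling the first few symbols exist. The sketch stores counts only for contexts that actually occur, but it still makes one pass over the sample per candidate order, so the cost grows with `max_order`.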

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Application #: 1407819
Program Officer: Gabor Szekely
Budget Start: 2014-08-15
Budget End: 2018-07-31
Fiscal Year: 2014
Total Cost: $150,000
Name: University of Kansas
City: Lawrence
State: KS
Country: United States
Zip Code: 66045