Algebraic statistics is concerned with the use of commutative algebra, algebraic geometry, and combinatorics in statistical inference. The connection between algebra and statistics arises from the fact that many statistical models for discrete random variables have the structure of algebraic varieties. This underlying algebraic structure can be exploited to develop new tools for analyzing statistical data, and it also suggests new research directions in algebra and combinatorics. The research undertaken by the PI concerns the algebraic structure of statistical models. In particular, it focuses on the ideals that define statistical models as algebraic varieties, on ways that ideal generators and Gröbner bases can be used as tools in statistical inference, and on the development of algebraic techniques for studying and computing these defining ideals. The particular models studied by the PI are log-linear models, phylogenetic models, and mixture models. Among the varieties that arise in this study are toric varieties, determinantal varieties, secant varieties, and many new varieties that deserve further study.
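As a minimal illustration of a statistical model with the structure of an algebraic variety, consider the standard textbook case of two independent binary random variables. It is chosen here for simplicity and is not drawn from the PI's own work. A joint distribution (p00, p01, p10, p11) belongs to the independence model exactly when the polynomial p00*p11 - p01*p10 vanishes, so the model is a determinantal variety inside the probability simplex. The function names and the example tables below are assumptions made for this sketch.

```python
from fractions import Fraction


def independence_minor(p):
    """Return the 2x2 determinant p00*p11 - p01*p10 of a joint table p."""
    return p[0][0] * p[1][1] - p[0][1] * p[1][0]


def product_distribution(row_marginal, col_marginal):
    """Build the joint table of two independent variables from their marginals."""
    return [[r * c for c in col_marginal] for r in row_marginal]


# Hypothetical example data: exact rational arithmetic avoids rounding error.
independent = product_distribution(
    [Fraction(1, 3), Fraction(2, 3)],
    [Fraction(1, 4), Fraction(3, 4)],
)

# A perfectly correlated table, which lies outside the independence model.
dependent = [
    [Fraction(1, 2), Fraction(0)],
    [Fraction(0), Fraction(1, 2)],
]

minor_independent = independence_minor(independent)  # vanishes on the model
minor_dependent = independence_minor(dependent)      # nonzero off the model
```

The determinant vanishes on the product distribution and is nonzero on the correlated table, which is the sense in which a single polynomial equation cuts out this model.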
Statistical models for discrete data are increasingly used throughout the social and biological sciences. Of particular note is the emergence of statistical techniques for the analysis of biological sequence data (i.e., DNA, RNA, and protein sequences). These statistical models are families of probability distributions inside a finite-dimensional space, and these families often have an algebraic structure. The PI proposes to further his work exploring the algebraic structure of these statistical models, thereby increasing the interaction between two important areas of the mathematical sciences, algebra and statistics.