This project is concerned with statistical models whose parameter spaces have singularities. The investigator studies how singularities affect the behavior of existing statistical methods and develops new techniques for the adequate assessment of statistical significance. The focus is on algebraic statistical models, that is, models whose parameter spaces are (semi-)algebraic sets. The class of algebraic models comprises many of the singular models employed in practice and can be studied using tools from computational algebraic geometry. Importantly, the well-behaved local geometry of semi-algebraic sets makes it possible to obtain general results without assuming regularity conditions that are difficult to verify. The statistical techniques under study include classical procedures from likelihood inference, such as likelihood ratio and Wald tests, as well as information criteria.
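As a small illustration of how a singularity can change the behavior of a likelihood ratio test, the following sketch simulates a toy example that is not taken from the project itself. It assumes a single observation X ~ N(theta, I) in the plane and the null hypothesis theta1 * theta2 = 0, whose parameter space (the union of the two coordinate axes) is an algebraic set with a singular point at the origin. In this toy model the likelihood ratio statistic is min(X1^2, X2^2). The function names, the choice of true parameters, and the simulation size are all illustrative assumptions.

```python
import random

# 95% quantile of the chi-square distribution with 1 degree of freedom,
# i.e. the critical value that standard (regular) theory would suggest.
CHI2_1_CRIT_95 = 3.841458820694124


def lr_statistic(x1, x2):
    """Likelihood ratio statistic for H0: theta1 * theta2 = 0.

    With X ~ N(theta, I), this equals the squared distance from (x1, x2)
    to the union of the two coordinate axes.
    """
    return min(x1 * x1, x2 * x2)


def rejection_rate(theta, n_sim=100_000, seed=0):
    """Monte Carlo estimate of how often the chi-square(1) cutoff rejects H0
    when the true parameter is `theta` (a point in the null hypothesis)."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        x1 = theta[0] + rng.gauss(0.0, 1.0)
        x2 = theta[1] + rng.gauss(0.0, 1.0)
        if lr_statistic(x1, x2) > CHI2_1_CRIT_95:
            rejections += 1
    return rejections / n_sim


# A point on one axis, far from the origin: locally the null set is smooth.
rate_smooth = rejection_rate((5.0, 0.0))
# The origin: the singular point where the two axes cross.
rate_singular = rejection_rate((0.0, 0.0))

print("rejection rate at a smooth point:", rate_smooth)
print("rejection rate at the singular point:", rate_singular)
```

At the smooth point the rejection rate should be close to the nominal 0.05, whereas at the origin the statistic is the minimum of two independent chi-square(1) variables, so the theoretical rejection rate drops to 0.05^2 = 0.0025. In this toy setting the test is therefore strongly conservative at the singularity, which is the kind of effect on significance assessment that the paragraph above describes.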
Modern scientific studies often require the analysis of data on several jointly observed variables. Statistical models of the dependence relationships among these variables are often formulated using additional variables that are not observable (hidden variables). A common feature of hidden variable models is that their statistical properties are not entirely understood, because the models lack the smoothness properties of regular models. This is the primary motivation for the project, which develops theory and methods that bear on problems such as determining the number and type of unobserved variables to include in a statistical model. Such problems arise in particular in the social sciences, where key concepts such as intelligence are not directly observable, and in computational biology, where hidden variables are employed, for example, when DNA of present-day species is used to validate evolutionary theories that involve extinct species. More broadly, the work is relevant for any study, medical or otherwise, in which the existence of influential unobserved variables cannot be excluded.