This proposal aims to develop new statistical learning tools for a challenging problem: understanding population-level variation, extracting features, and gaining knowledge from a set of complex data objects. Object Oriented Data Analysis (OODA) is an outgrowth of Functional Data Analysis, in which the basic elements of data analysis are curves. In OODA, the basic elements are complex data objects, including tree-structured objects. In medical image analysis, tree-structured objects have proved to be an efficient data representation when the study focuses on variation in branching structures. The proposed work is driven by a data set of human brain artery systems and is expected to have an impact on many other scientific fields that involve populations of tree-structured objects. The analysis of complex data objects, such as trees, general graphs, networks, and shapes, poses serious methodological challenges, because traditional statistical models for multivariate and functional data rely on linear operations in Euclidean or vector spaces. Such analysis therefore requires novel, nontraditional techniques, within a new statistical paradigm, for extracting patterns and information from data objects. The proposed work addresses some fundamental issues, including one-dimensional representation in tree space. The first goal of the project is to provide tools for data exploration and summarization. Next, the investigator will study probability distributions (mixture models) that can serve as the basis for statistical inference. The investigator will further study models for tree-structured data that explain the relationship between tree-structured covariates and a numerical response, and/or between numerical covariates and a tree-structured response. Kernel-based methods and logistic regression will be implemented for classification in tree spaces.

Highly sophisticated data collection processes developed in science and technology over the last two decades motivate the study of complex data objects. The proposed work will open up a new area of statistical research, lay a foundation for it, and enrich the toolkit available for the analysis of object oriented data. The investigator will continue to apply newly developed modeling procedures to the human brain artery data and help to improve existing brain tumor diagnosis procedures. The work is also expected to have a major impact on object oriented data analysis by fostering interdisciplinary research across scientific fields. The ideas and methods resulting from this proposal are expected to extend beyond the motivating example of human brain artery data and to give researchers deeper insight into the disciplines in which the data were collected.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 1106975
Program Officer: Gabor J. Szekely
Budget Start: 2011-09-01
Budget End: 2014-08-31
Fiscal Year: 2011
Total Cost: $159,398

Name: Colorado State University-Fort Collins
City: Fort Collins
State: CO
Country: United States
Zip Code: 80523