Finding and labeling semantic patterns in large spatial data sets is one of the most important problems facing computer scientists today. Massive spatial data sets are being acquired in almost every scientific discipline, including medicine, geology, biology, and astrophysics. Finding meaningful patterns in those data is often the bottleneck to scientific discovery. The proposed research will develop a transformative machine learning methodology in which the process of discovering semantic patterns in large spatial data sets is interactive and semi-autonomous. The proposed tools and algorithms give the user an interactive system that shows the most likely segmentations and labelings given the information provided so far, while allowing the user to provide additional information as he or she sees fit. The user might adjust a segmentation, provide a label, or specify an expected pattern. The system will adapt in real time to each of these inputs, adjusting its predictions throughout the data.
The broad impact of the proposed work will be enhanced through an integrated education and outreach plan. Beyond the published research results, the field will benefit from free distribution of research and education resources such as web pages, bibliographies, software, and data sets, including augmentations to WordNet. Further broad impacts include focused workshops and courses on shape analysis, machine learning, and visualization at both the university and professional levels. Finally, diversity enhancement programs will promote research opportunities for disadvantaged groups.