This project will develop a framework to represent, analyze, and interpret shapes extracted from images, supporting a wide range of biological investigations. The primary objectives are: (1) to develop a mathematical framework and computational tools for the quantification and analysis of shapes; (2) to integrate these computational models with machine learning and statistical inference methods to enable new discoveries, transforming imaging data into biological knowledge; (3) to deliver novel quantitative methodologies for shape analysis that start from a biological premise, rather than a purely geometric one. The aim is thus not only to quantitatively describe shape, but to develop methods for linking morphological variation to its underlying biological causes. To ensure that the project focuses on the methods that are most promising for biology and have the greatest breadth of application, model and tool development will be guided and supported by a set of diverse case studies, ranging from the sub-cellular to the organismal scale.
Shape represents a complex and rich source of biological information that is fundamentally linked to underlying mechanisms and function. However, in many disciplines of biology shape is still often examined qualitatively, an approach that is time-consuming and prone to human subjectivity. While ad hoc quantitative methods do exist, they are often inaccessible to non-experts and do not easily generalize to a wide variety of problems. The inability of biologists to systematically link shape to genetics, development, environment, function, and evolution often precludes advances in biological research spanning diverse spatial and temporal scales, from the movement of molecules within a cell to adaptive changes in organismal morphology. The primary goal of this project is to develop a new suite of widely applicable quantitative methods and tools for the study of biological shape, addressing the significant need for consistent and repeatable analysis of shape data.
The collaborative research team, drawn from eight institutions, provides a unique combination of expertise in biology, mathematics, and computation. The research goal is to transform shape-based visual characteristics of biological observations into new knowledge that may yield insights into biological processes in plants and in human disease. The University of Missouri team has developed new computational infrastructure and algorithms to manage biological object databases, organizing them by shape characteristics, dynamic motions, and biological semantics. We have developed computational packages for tropical pollen databases, protein 3D binding site search, and mitochondrial object databases, all of which have significant applications, such as ecological studies, novel drug development, and the study of neurodegenerative diseases. For example, content-based image retrieval for the tropical pollen database allows biological researchers to find similar images within the database, assisting in the identification of novel types. The ability to compare images with subtle visual differences potentially allows for finer delimitation of morphospecies and a more consistent taxonomy among analysts. By using the semantic modeling function in place of manual labeling, researchers can automatically annotate new grain images with best-fit morphological semantic labels for learning and discovery. This potentially speeds up the discovery and establishment of novel types.
For 3D shape analysis, a new protein binding site alignment algorithm was developed that compares a pair of sites using the geometrical and physicochemical properties of their local surface and structural regions. The algorithm explores different combinations of residue correspondences and groups sets of correspondences into initial alignments. This computational tool employs various techniques to eliminate unnecessary comparisons while also improving alignment quality. It will provide life sciences researchers with accurate alignments of protein-protein binding sites, which can aid protein docking studies and the construction of templates for predicting the structure of protein complexes, supporting an in-depth understanding of evolutionary and functional relationships.
To make a biological information system valuable for the life sciences community, data analytics have to provide explainable results that demonstrate the linkages between measurable features and the biological meaning of interest. Neither traditional classification nor information retrieval methods are sufficient for biomedical applications if such linkages cannot be established. In this project, we tackle a challenging research question: modeling mitochondrial dynamics in Drosophila segmental nerves. Mitochondria, essential organelles of eukaryotic cells, undergo frequent shape changes and actively move throughout the cell. Our computational approach supports shape analysis, pattern mining, and behavior retrieval, and may inform research on neurodegenerative diseases. Our contributions are expected to provide a platform for the larger biological and medical community to collaboratively share and compare the morphological variations of formal and informal morphotypes using 2D and 3D biomedical media in a broad spectrum of life sciences applications.
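As a rough illustration of the kind of shape-based retrieval described above for the pollen database, the sketch below ranks stored outlines by similarity to a query outline. It is not the project's method: it substitutes a standard textbook technique (Fourier descriptors of the centroid-distance signature), and the function names `fourier_descriptor` and `rank_by_shape`, the parameter `n_coeffs`, and the synthetic outlines are all hypothetical. It assumes each outline is already available as an ordered, evenly sampled list of boundary points.

```python
import numpy as np


def fourier_descriptor(contour, n_coeffs=8):
    """Descriptor of a closed outline given as an (N, 2) array of ordered points.

    Assumption: points are sampled evenly along the outline. The result is
    unchanged by translation, rotation, uniform scaling, and the choice of
    starting point on the outline.
    """
    pts = np.asarray(contour, dtype=float)
    # Distance of each boundary point from the centroid of the points
    # (removes translation and rotation).
    radii = np.linalg.norm(pts - pts.mean(axis=0), axis=1)
    # FFT magnitudes ignore the starting point (a circular shift only
    # changes the phase).
    mags = np.abs(np.fft.fft(radii))
    # Dividing by the zero-frequency term removes overall scale.
    return mags[1:n_coeffs + 1] / mags[0]


def rank_by_shape(query, database, n_coeffs=8):
    """Return (name, distance) pairs sorted from most to least similar."""
    q = fourier_descriptor(query, n_coeffs)
    dists = {
        name: float(np.linalg.norm(q - fourier_descriptor(c, n_coeffs)))
        for name, c in database.items()
    }
    return sorted(dists.items(), key=lambda item: item[1])


# Synthetic outlines standing in for segmented grain boundaries (illustrative only).
t = np.linspace(0.0, 2.0 * np.pi, 128, endpoint=False)
ellipse = np.column_stack([2.0 * np.cos(t), np.sin(t)])
circle = np.column_stack([np.cos(t), np.sin(t)])
r = 1.0 + 0.3 * np.cos(3.0 * t)
trefoil = np.column_stack([r * np.cos(t), r * np.sin(t)])

# Query: the ellipse after rotation, scaling, translation, and a shifted start point.
theta = 0.7
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta), np.cos(theta)]])
query = np.roll(3.0 * ellipse @ rot.T + np.array([5.0, -2.0]), 17, axis=0)

ranked = rank_by_shape(query, {"ellipse": ellipse, "circle": circle, "trefoil": trefoil})
for name, dist in ranked:
    print(f"{name}: {dist:.4f}")
```

The descriptor is deliberately simple so that every invariance can be traced to one line of code. Real micrographs would first need segmentation and boundary resampling, and subtle differences between morphospecies would likely call for richer features than this sketch provides.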