Graphs are used throughout the biological sciences to represent how objects of interest are interrelated. A common problem, and the focus of this project, is how to infer the properties of such a graph when only pairwise distances between objects are available. This project addresses the case of missing data; that is, the case in which pairwise distance information is not supplied for some pairs of objects. To do so, it introduces the concept of partially-supplied graphs and develops their mathematical properties. The emphasis is on uncovering the spectral properties of the Laplacian matrices that represent partially-supplied graphs, with the goal of relating their eigenvectors to the underlying graph structure. Subsequently, this research will exploit a connection between the Laplacian matrices of partially-supplied graphs and pairwise distance matrices with missing data. Doing so will provide a missing link between larger graphs of biological objects and smaller matrices representing a subset of the vertices of these graphs.
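As a rough illustration of how a Laplacian matrix and a pairwise distance matrix can be related, the sketch below uses a standard textbook fact rather than the project's own construction: on a tree with unit edge lengths, the resistance distance computed from the Laplacian equals the path distance. The 5-vertex tree, the split into "observed" leaves and "unobserved" internal vertices, and the helper names `laplacian` and `resistance_distances` are all hypothetical choices made for this example.

```python
import numpy as np

def laplacian(n, edges):
    """Combinatorial Laplacian L = D - A of an undirected weighted graph."""
    L = np.zeros((n, n))
    for u, v, w in edges:
        L[u, u] += w
        L[v, v] += w
        L[u, v] -= w
        L[v, u] -= w
    return L

def resistance_distances(L):
    """Resistance distances R[i, j] = M[i, i] + M[j, j] - 2 M[i, j],
    where M is the pseudoinverse of L (valid for a connected graph)."""
    n = L.shape[0]
    J = np.ones((n, n)) / n
    # For a connected graph, pinv(L) = inv(L + J) - J.
    M = np.linalg.inv(L + J) - J
    d = np.diag(M)
    return d[:, None] + d[None, :] - 2 * M

# Hypothetical tree: vertices 0, 1, 2 are leaves (assumed observed),
# vertices 3, 4 are internal (assumed unobserved). All edges have length 1.
edges = [(0, 3, 1.0), (1, 3, 1.0), (3, 4, 1.0), (2, 4, 1.0)]
L = laplacian(5, edges)

# Spectrum of the full Laplacian; the smallest eigenvalue is 0.
eigenvalues, eigenvectors = np.linalg.eigh(L)

# Distances restricted to the observed leaves only.
leaf_distances = resistance_distances(L)[:3, :3]
print(np.round(leaf_distances, 6))
```

In this toy case the 3-by-3 leaf submatrix is the kind of smaller distance matrix one would actually observe, while the 5-by-5 Laplacian describes the larger graph that contains the unobserved vertices.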

This project has an arc spanning more than fifty years. It is motivated by a particular type of data and a particular type of analysis that recur, decade after decade, throughout the biological sciences. As a result, this project will address a number of long-standing questions. One such question concerns evolutionary trees, which are graphs whose vertices are species, including those that are extinct. Fossils aside, only extant species yield data, and thus evolutionary trees can be thought of as partially-supplied. Using this observation, the project will reconcile two major branches of biological systematics that represent extant species data in seemingly different ways. At the same time, this research contributes a new perspective to the mathematical literature on graph theory by introducing graphs with missing data. Thus, this project sits directly at the interface between mathematics and biology, and it supports the training of a graduate student and a postdoctoral researcher as interdisciplinary scientists. More broadly, this research serves as an example of the synergy between fields, showing how biology can advance mathematics while mathematics advances biology. As such, it is hoped that dissemination of this work will draw mathematically gifted students toward the biological sciences.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Application #: 1122527
Program Officer: Mary Ann Horn
Budget Start: 2011-10-01
Budget End: 2015-09-30
Fiscal Year: 2011
Total Cost: $200,000
Name: North Carolina State University Raleigh
City: Raleigh
State: NC
Country: United States
Zip Code: 27695