Living cells show a great deal of heterogeneity, even when they have the same genes and grow in virtually identical environments. The underlying reason is that molecules in the cell bounce around, collide with each other, and sometimes react at random. This process is unpredictable in much the same way that the outcome of a dice throw is unpredictable. Cells can therefore differ greatly in their behavior, which is a main reason that they respond so differently to drugs and that subpopulations often survive, for example, cancer or antibiotic treatments. Recent technological developments now make it possible to measure the levels of components in individual cells, rather than just the average level over a population of cells. This provides an enormous amount of information, but it also forces data analysis to deal with many more complications when interpreting the results. This project will develop methods to analyze such data. Specifically, rather than basing interpretations on greatly simplified assumptions, which may or may not be valid, it will develop tools to interpret data in terms of biological mechanisms while remaining agnostic about all the processes that indirectly affect the system dynamics. In previous treatments those indirect effects would completely confound interpretations, whereas the approaches in this project eliminate such problems entirely. The approach is broadly applicable, and the project has broad consequences in basic cell and molecular biology, in health and medicine, and in ecology and population biology. Further, the project will train students at the interface between scientific disciplines, and generate principles and approaches that will be useful in undergraduate education.

Single-cell data on fluctuations in abundances contain a great deal of information about the underlying mechanisms. Conventional approaches to analyzing such data have relied on fits to specific models that try to make correct assumptions about all processes that substantially affect the observed fluctuations. Whether using simple toy models or detailed simulations, this poses a great problem because many of the indirect effects that can greatly affect the process of interest are not yet known. It has also been shown that model uniqueness is a great problem in this field: very different feasible models produce the same fit, yet lead to very different concrete conclusions about the underlying biology. In fact, it has been exceedingly difficult to mechanistically interpret even the simplest single-cell processes in the simplest organisms, suggesting that there is little hope of interpreting more complex systems, such as human cells embedded in tissue. Research through this project addresses this challenge by considering large classes of processes collectively, allowing the models to differ arbitrarily in all indirect effects. This is possible by considering properties of marginal distributions that are provably unaffected by many model assumptions. It makes it possible to rigorously dissect complex and sparsely characterized networks subject to changing environmental inputs, nonlinear feedback loops, and unknown sources of fluctuations, despite only having access to snapshots rather than time series. Because the presented approaches are analytical, and the marginal properties can be intuitively expressed in terms of the few explicit mechanistic assumptions, the approach is also conceptually transparent. Further, because the testable relations produced by the theory are rigorously invariant to indirect effects, they are particularly well suited for the analysis of sparsely characterized complex systems or the effects of drugs, separating direct from indirect effects even when the latter are complicated and dominant.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 1562497
Program Officer: Junping Wang
Project Start:
Project End:
Budget Start: 2016-04-01
Budget End: 2021-09-30
Support Year:
Fiscal Year: 2015
Total Cost: $635,740
Indirect Cost:

Name: Harvard University
Department:
Type:
DUNS #:
City: Cambridge
State: MA
Country: United States
Zip Code: 02138