Our research is motivated by recent advances in high-throughput technologies, such as DNA microarrays, which make it possible to record the complete genomic signals that guide the progression of cellular processes. Future predictive power and discovery in biology and medicine will come from the mathematical modeling of the rapidly growing number of these large-scale molecular biological datasets. To this end, we built the first data-driven predictive models for genomic data using frameworks from matrix computation. We illustrated these models in the analyses of, e.g., cell cycle expression data and the DNA-binding data of transcription factors and replication initiation proteins. The power of our models to predict previously unknown biological principles was demonstrated with a prediction of a novel mechanism of regulation that correlates replication initiation with cell cycle-regulated transcription in yeast. We now aim to validate this computational prediction experimentally by collecting and analyzing genome-wide expression data under conditions that are thought to decouple replication from cell cycle transcription in eukaryotes. These experiments will test the ability of our mathematical models to correctly predict biological principles, and will also illuminate the relation between replication and transcription during the cell cycle. We also aim to develop the first data-driven predictive tensor computation models for large-scale molecular biological data. The structure of these data is of an order higher than that of a matrix, especially when data from different studies are integrated. When such data are flattened into a matrix, much of the information in them is lost. We will study several possible tensor frameworks analytically, and implement algorithms to compute and visualize them. We will apply these mathematical tools to biological data from studies of cancer, cellular proliferation and the cell cycle.
This program will result in new insights into the interconnections among the biological programs of cancer, cellular proliferation and the cell cycle. Our goal is to enable a better understanding, and ultimately also control, of life processes on the molecular level. These models may become the foundation of a future in which biological systems are modeled as physical systems are today. The predicted mechanism of regulation may form the basis of a future in which the cell division cycle and cancer can be controlled.
|Sankaranarayanan, Preethi; Schomay, Theodore E; Aiello, Katherine A et al. (2015) Tensor GSVD of patient- and platform-matched tumor and normal DNA copy-number profiles uncovers chromosome arm-wide patterns of tumor-exclusive platform-consistent alterations encoding for cell transformation and predicting ovarian cancer survival. PLoS One 10:e0121396|
|Bertagnolli, Nicolas M; Drake, Justin A; Tennessen, Jason M et al. (2013) SVD identifies transcript length distribution functions from DNA microarray data and reveals evolutionary forces globally affecting GBM metabolism. PLoS One 8:e78913|
|Lee, Cheng H; Alpert, Benjamin O; Sankaranarayanan, Preethi et al. (2012) GSVD comparison of patient-matched normal and tumor aCGH profiles reveals global copy-number alterations predicting glioblastoma multiforme survival. PLoS One 7:e30098|
|Muralidhara, Chaitanya; Gross, Andrew M; Gutell, Robin R et al. (2011) Tensor decomposition reveals concurrent evolutionary convergences and divergences and correlations with structural motifs in ribosomal RNA. PLoS One 6:e18768|
|Ponnapalli, Sri Priya; Saunders, Michael A; Van Loan, Charles F et al. (2011) A higher-order generalized singular value decomposition for comparison of global mRNA expression from multiple organisms. PLoS One 6:e28072|
|Omberg, Larsson; Meyerson, Joel R; Kobayashi, Kayta et al. (2009) Global effects of DNA replication and DNA replication origin activity on eukaryotic gene expression. Mol Syst Biol 5:312|
|Omberg, Larsson; Golub, Gene H; Alter, Orly (2007) A tensor higher-order singular value decomposition for integrative analysis of DNA microarray data from different studies. Proc Natl Acad Sci U S A 104:18371-6|