Complex diseases, including cancer and multiple sclerosis, arise when multiple biological systems are impacted by molecular changes, such as gene mutation, gene silencing, or cellular modification by external agents. These molecular changes alter the cellular state, almost always including reprogramming of gene expression. Because biological systems are inherently complex and nonlinear, recovering knowledge of systems-level behavior often requires appropriate modeling of the system. In this proposal we focus on methods to improve our ability to infer biological process activity from high-throughput data and on the development of biomarkers of these processes. We have developed a Markov chain Monte Carlo algorithm for analysis of microarray data that combines knowledge of transcriptional regulation with simple models of signaling networks to identify on-target and off-target effects of therapeutics in cancer. Here we propose including prior knowledge in this algorithm by integrating data from pathways, gene mutations, methylation arrays, miRNA arrays, and protein-protein interactions (PPIs) to improve inference of cell behavior at the systems level.
Our first aim will develop an integrated model of signaling and transcriptional reprogramming that couples pathway knowledge with measurements of mutation status and miRNA levels to improve estimation of cell signaling.
Our second aim will integrate methylation measurements and transcription factor binding site information as prior data to refine inference of transcription factor targets, thereby improving gene set inference and estimation of upstream signaling.
Our third aim will develop a methodology to find reliable biomarkers from our models in light of the multiple regulation of genes. Successful completion of the proposed work will provide novel open-source tools for improved inference of transcriptional reprogramming of cells and for identification of its biomarkers. These tools will substantially improve inference of the specific changes that drive systems out of homeostasis and of treatment response in individual cases.

Public Health Relevance

Many diseases, including cancer and a range of autoimmune diseases, arise from errors in signaling networks that drive aberrant transcriptional reprogramming of cells. We focus here on methods to deduce aberrant signaling activity from transcriptional data, incorporating prior biological knowledge including known pathways, mutations, methylation status, miRNA expression, and protein-protein interactions. This will improve our ability to target aberrant pathway activity on an individualized basis and to determine the targeting of therapeutics designed to restore homeostasis, enhancing personalized medicine.

National Institutes of Health (NIH)
National Library of Medicine (NLM)
Research Project (R01)
Study Section
Biomedical Library and Informatics Review Committee (BLR)
Program Officer
Ye, Jane
College of New Jersey
Biostatistics & Other Math Sci
Schools of Arts and Sciences
United States
Rathi, Komal S; Gaykalova, Daria A; Hennessey, Patrick et al. (2014) Correcting transcription factor gene sets for copy number and promoter methylation variations. Drug Dev Res 75:343-7