Cross-hybridization is the single most important source of noise in gene expression measurements. Many current approaches to assessing gene signal fit data to statistical models; in contrast, the investigators will develop a thermodynamics-based algorithm to compute the hybridization free energy between a probe and its specific (i.e., intended) target as well as non-specific (i.e., unintended) target molecules. This new approach takes into account the fact that the thermodynamics of hybridization for two molecules in aqueous solution differs from that where one molecule (the probe) is tethered to a glass slide. A new probe-specific, position-dependent hybridization partition function will be computed by a dynamic programming algorithm, PPH, using free energy parameters from the labs of Turner, SantaLucia, and Sugimoto. The partition function is the Boltzmann-weighted sum over all possible partial as well as complete hybridizations between probe and target, including the secondary structure of both probe and target. Applying PPH to all probes and all (specific and non-specific) targets is not computationally feasible, so the algorithm PPHx will be developed to compute the hybridization free energy between a probe and a Markov model representing all non-specific targets. Ensemble free energies are immediately obtained from partition function values Z, and can be used to derive concentrations of messenger RNA from microarray fluorescence intensity values.
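
The relation between a partition function and an ensemble free energy can be illustrated with a minimal sketch. It is not the PPH algorithm: it assumes the free energies of the individual hybridization states are already known and simply applies Z = sum_i exp(-dG_i / RT) and dG_ensemble = -RT ln Z. The function names, the example free energies, and the temperature of 310.15 K are hypothetical choices made for illustration.

```python
import math

# Gas constant in kcal/(mol*K); free energies below are in kcal/mol.
R_KCAL = 0.0019872


def partition_function(state_free_energies, temperature_k=310.15):
    """Boltzmann-weighted sum over the supplied hybridization states."""
    rt = R_KCAL * temperature_k
    return sum(math.exp(-dg / rt) for dg in state_free_energies)


def ensemble_free_energy(z, temperature_k=310.15):
    """Ensemble free energy obtained from a partition function value Z."""
    rt = R_KCAL * temperature_k
    return -rt * math.log(z)


# Hypothetical free energies (kcal/mol) for one complete and two partial
# hybridizations of a single probe-target pair; these are not measured values.
example_states = [-12.0, -9.5, -7.0]
z = partition_function(example_states)
dg_ensemble = ensemble_free_energy(z)
```

Because every state contributes a positive Boltzmann weight, the ensemble free energy is never higher than the free energy of the most stable single state, which is why partial hybridizations matter when estimating binding.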

High-density oligonucleotide arrays (gene-expression arrays, tiling arrays, single-nucleotide polymorphism arrays, microRNA arrays, etc.) constitute a powerful tool in molecular biology for the discovery of genes and their function, with far-reaching applications in population biology, systems biology, pathobiology and other fields. In this method, fluorescently tagged cRNA or cDNA derived from messenger RNA is washed over a glass slide to which hundreds of thousands to millions of short cDNA probes are attached. Fluorescently tagged molecules hybridize to the probes, non-hybridized molecules are removed, and fluorescence intensities are measured by an optical scanning device. Despite technical improvements in various commercial platforms, it is still not possible to infer messenger RNA concentrations from microarray fluorescence intensity values, due to noise from cross-hybridization (also called non-specific binding). A computer algorithm to compute the cross-hybridization free energy will be developed and implemented, directly allowing one to estimate cross-hybridization effects.
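
To show how a cross-hybridization free energy could feed into an estimate of cross-hybridization effects, the sketch below uses a deliberately simplified competitive-binding picture. It is an assumption made for illustration, not the method proposed here: each binding mode is given an equilibrium constant K = exp(-dG / RT), probe occupancy is taken to be low, and the share of signal due to the intended target is concentration times K, normalized over both binding modes. All names and the default temperature of 318.15 K are hypothetical.

```python
import math

# Gas constant in kcal/(mol*K); free energies are in kcal/mol.
R_KCAL = 0.0019872


def equilibrium_constant(delta_g, temperature_k=318.15):
    """Equilibrium constant for a binding free energy delta_g."""
    return math.exp(-delta_g / (R_KCAL * temperature_k))


def specific_signal_fraction(dg_specific, dg_nonspecific,
                             conc_specific, conc_nonspecific,
                             temperature_k=318.15):
    """Share of probe signal attributed to the intended target.

    Assumes low probe occupancy, so each contribution is proportional
    to concentration times equilibrium constant.
    """
    specific = conc_specific * equilibrium_constant(dg_specific, temperature_k)
    nonspecific = conc_nonspecific * equilibrium_constant(dg_nonspecific, temperature_k)
    return specific / (specific + nonspecific)


# Hypothetical inputs: the specific duplex is more stable, but the
# non-specific background is 1000 times more concentrated.
fraction = specific_signal_fraction(-14.0, -9.0, 1e-12, 1e-9)
```

Even in this toy model, a weakly binding but abundant background can contribute a visible share of the measured intensity, which is the effect a cross-hybridization free energy is meant to quantify.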

Findings from this research will be made publicly accessible through a web server and distribution of source code, through journal publications, and through presentations at meetings. Because microarray technology is so widely used, this research will have a very broad impact on molecular biology (genetics, genomics, systems biology, population biology) as well as on disease pathology and drug dosage and design. Educational impact at the undergraduate and graduate level will be ensured by courses and training opportunities offered at Boston College, which provide special opportunities for women and minorities, with additional outreach provided in the weekly MIT Bioinformatics Seminar.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 0817971
Program Officer: Mary Ann Horn
Project Start:
Project End:
Budget Start: 2008-09-01
Budget End: 2011-08-31
Support Year:
Fiscal Year: 2008
Total Cost: $199,902
Indirect Cost:
Name: Boston College
Department:
Type:
DUNS #:
City: Chestnut Hill
State: MA
Country: United States
Zip Code: 02467