In the genome sequencing era, finding genes and their regulatory regions is fundamental to our basic understanding of biology and to the progress of future medical technologies such as gene therapy. Recent results suggest that the three-dimensional structure of DNA plays an essential role in gene regulation. The long-term objective of this project is to develop a versatile data mining software application for the visualization of DNA structure, the efficient prediction of regulatory sites, especially promoters, and the discovery of relationships between DNA structure and functional elements.
The specific aims are: (1) to integrate DNA structural scales, such as bendability or propeller twist angle, with our current machine-learning software environment (hidden Markov models, neural networks); (2) to train expert modules to discover structural signatures from primary sequence information; (3) to develop and implement an efficient promoter prediction/classification tool. The novelty of this approach is the combination of DNA structural information with machine-learning techniques to automatically extract and visualize relevant information from large amounts of raw data. The computational tools are built on an object-oriented foundation designed to scale with growing data volume and complexity.
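As an illustration of aim (1), the sketch below shows one plausible way a structural scale could be turned into a numeric profile along a sequence, which could then serve as input to a machine-learning model. It is a minimal sketch under stated assumptions, not the project's implementation: the function names `structural_profile` and `smooth` are hypothetical, the scale is assumed to be indexed by dinucleotide, and the values in `TOY_SCALE` are placeholders, not measured bendability or propeller twist values.

```python
from itertools import product
from typing import Dict, List


def structural_profile(seq: str, scale: Dict[str, float], k: int = 2) -> List[float]:
    """Look up each overlapping k-mer of `seq` in `scale`.

    Assumes every k-mer in the sequence has an entry in the scale;
    an unknown k-mer (for example one containing 'N') raises KeyError.
    """
    seq = seq.upper()
    return [scale[seq[i:i + k]] for i in range(len(seq) - k + 1)]


def smooth(values: List[float], window: int = 3) -> List[float]:
    """Sliding-window mean, a common way to reduce noise in a profile."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Placeholder scale: arbitrary numbers for all 16 dinucleotides.
# These are NOT real structural measurements; substitute a published scale.
TOY_SCALE = {
    a + b: float(i) for i, (a, b) in enumerate(product("ACGT", repeat=2))
}

if __name__ == "__main__":
    profile = structural_profile("ACGTTGCA", TOY_SCALE)
    print(profile)
    print(smooth(profile, window=3))
```

The smoothed profile has one value per window position, so it can be plotted along the sequence or passed to a classifier as a feature vector alongside the primary sequence.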
The proposed research will generate bioinformatics tools for the visualization of DNA structure and the detection of regulatory regions. In particular, the prediction and understanding of promoter regions constitute an important component of a complete gene-finding solution and a necessary step toward gene therapy. All biotechnology companies are strong potential users of such tools.