Understanding the processes that regulate the transcription of genes is central to understanding evolution, the development of multicellular organisms, and the response to pathological changes, including cancer and heart disease. This proposal aims to make substantial progress in developing and testing computational methods, and then applying them to experimental systems. We develop and deploy a battery of computational methods aimed at associating regulators with their targets, and inferring sequences that are targets for currently unidentified regulators. Testing and validation are carried out both retrospectively, against well-curated databases, and prospectively, using a variety of experimental methods on a selected set of predictions.
The specific aims include the following. 1. Develop and test innovative approaches for discovering new binding sites for well-studied regulators, as well as sites for currently unidentified regulators. The former method requires integrating numerous and often very large datasets and then pruning the features to identify those that are biologically most relevant. Our preliminary results suggest that doing so substantially improves performance over existing methods. 2. Implement all algorithms on the IBM Blue Gene/L. This is one of the fastest machines available, though implementing algorithms on it requires a fair amount of technical sophistication. Our current implementation increases compute power approximately 20-fold over standard 2 GHz processors. The use of Blue Gene/L in combination with (1) will put the research community in a position to make discoveries that are substantially greater in number and more reliable than is currently possible. 3. Apply and test the methods on (i) the full S. cerevisiae genome and (ii) the mammalian GABA-A receptor family. The former offers the advantages of being well studied, of providing a large set of data for testing, and of being relatively simple compared to the mammalian genome. GABA is the major inhibitory neurotransmitter in the central nervous system (CNS), and plays a key role in CNS development and disease.