The aim of this Program Project is to discover the function of most genes in the Dictyostelium genome. In Project III we will develop computational techniques to infer gene function and reconstruct gene networks from the high-throughput phenotyping, transcriptional profiling, and chromatin footprinting data collected in Projects I and II. Our hypothesis is that the increased precision and completeness of these new data sets, made possible by next-generation sequencing, will allow us to infer powerful predictive models. First, we will design and implement PIPA, a high-throughput sequencing data analysis pipeline. PIPA will be component-based and will integrate emerging tools from the community (R, Galaxy, Bowtie, TopHat, etc.). It will provide unified, easy-to-use web-based access to the Program's experimental data. Next, we will devise methods that query PIPA and consider transcription, competitive growth, and chromatin binding information to infer gene function. Integrative data mining will fuse these emerging hypotheses into consensus gene network models while considering available external data from other organisms. We will use consensus gene networks as scaffolds upon which we can predict gene function, propose additional experiments, and add layers of information from other experiments. We will also use the gene networks as background knowledge for experiment prioritization, the proposal of new mutant-based screens, and the development of new phenotype prediction models. Finally, we propose to implement the new methods within a modern server-based software architecture with visualization-rich interactive interfaces. The most significant aspect of this part of the project is the design of an infrastructure and interfaces that will make the entire planned data analytics transparent and operable by biologists with no computer science background. Our software will be freely available to the research community and well integrated with dictyBase, a primary Dictyostelium community resource.

Public Health Relevance

The lack of appropriate analytical methods reduces the utility of high-dimensional, genome-scale biological data. Using diverse, rich, high-quality phenotypic and transcriptional profiling data sets, we will devise new computational methods to accurately infer gene function, helping us to better understand biological processes and equipping other researchers with improved means to analyze their own biomedical data.

Agency
National Institutes of Health (NIH)
Institute
Eunice Kennedy Shriver National Institute of Child Health & Human Development (NICHD)
Type
Research Program Projects (P01)
Project #
5P01HD039691-12
Application #
8469542
Study Section
Special Emphasis Panel (ZHD1-DSR-N)
Project Start
Project End
Budget Start
2013-05-01
Budget End
2014-04-30
Support Year
12
Fiscal Year
2013
Total Cost
$134,584
Indirect Cost
$22,811
Name
Baylor College of Medicine
Department
Type
DUNS #
051113330
City
Houston
State
TX
Country
United States
Zip Code
77030
Li, Cheng-Lin Frank; Santhanam, Balaji; Webb, Amanda Nicole et al. (2016) Gene discovery by chemical mutagenesis and whole-genome sequencing in Dictyostelium. Genome Res 26:1268-76
Katoh-Kurasawa, Mariko; Santhanam, Balaji; Shaulsky, Gad (2016) The GATA transcription factor gene gtaG is required for terminal differentiation in Dictyostelium. J Cell Sci :
Zhang, Xuezhi; Zhuchenko, Olga; Kuspa, Adam et al. (2016) Social amoebae trap and kill bacteria by casting DNA nets. Nat Commun 7:10938
Chen, Xinlu; Köllner, Tobias G; Jia, Qidong et al. (2016) Terpene synthase genes in eukaryotes beyond plants and fungi: Occurrence in social amoebae. Proc Natl Acad Sci U S A 113:12132-12137
Zitnik, Marinka; Zupan, Blaz (2016) Collective pairwise classification for multi-way analysis of disease and drug data. Pac Symp Biocomput 21:81-92
Žitnik, Marinka; Zupan, Blaž (2015) Gene network inference by fusing data from diverse distributions. Bioinformatics 31:i230-9
Santhanam, Balaji; Cai, Huaqing; Devreotes, Peter N et al. (2015) The GATA transcription factor GtaC regulates early developmental gene expression dynamics in Dictyostelium. Nat Commun 6:7551
Brdar, Sanja; Crnojević, Vladimir; Zupan, Blaz (2015) Integrative clustering by nonnegative matrix factorization can reveal coherent functional groups from gene profile data. IEEE J Biomed Health Inform 19:698-708
Hirose, Shigenori; Santhanam, Balaji; Katoh-Kurasawa, Mariko et al. (2015) Allorecognition, via TgrB1 and TgrC1, mediates the transition from unicellularity to multicellularity in the social amoeba Dictyostelium discoideum. Development 142:3561-70
Rosengarten, Rafael D; Beltran, Pamela R; Shaulsky, Gad (2015) A deep coverage Dictyostelium discoideum genomic DNA library replicates stably in Escherichia coli. Genomics 106:249-55

Showing the most recent 10 out of 59 publications