Malaria is one of the most common human infectious diseases, with an estimated 300-500 million cases a year and between one and three million yearly deaths. Malaria is caused by protozoan parasites, with the most serious forms of the disease in humans caused by Plasmodium falciparum. The P. falciparum genome has been fully sequenced. Remarkably, only 55% of its identified proteins have any predicted or known functional annotations, and much of the organism's core machinery remains unidentified, which significantly hampers our understanding of this organism and of malaria. Since traditional bioinformatics approaches have had limited success in uncovering P. falciparum protein functions, the long-term goal of this research is to develop novel computational approaches that are more effective for this task. Our framework is centered on better identification of protein domains, the structural, functional and evolutionary units of proteins, and on linking uncovered P. falciparum protein domains to well-characterized domains associated with known protein functions. Our approaches leverage comparative genomics, graph-theoretic methods, and sensitive probabilistic profile-profile comparisons, all within a robust computational pipeline.
The specific aims of this proposal are (1) to uncover putative domains within P. falciparum protein sequences using homologous sequences in closely related genomes, and to use these to identify similarity to known functionally characterized protein domains; (2) to increase the number of P. falciparum proteins with predicted functional motifs and domains by exploiting the tendency of certain motifs and domains to occur together within the same sequence; and (3) to experimentally test a representative set of predictions, in order to uncover new P. falciparum biology and to evaluate our computational pipeline. The proposed techniques have significant potential for expanding the number of protein functional annotations within P. falciparum, and therefore for accelerating ongoing research efforts aimed at developing anti-malarial drug targets against the causative agent of human malaria.
Malaria is one of the most common human infectious diseases, with an estimated 300-500 million cases a year and between one and three million yearly deaths. The most serious forms of the disease in humans are caused by P. falciparum. The proposed research aims to significantly expand the number of protein functional annotations for P. falciparum, in order to accelerate our understanding of the causative agent of human malaria.
Ochoa, Alejandro; Singh, Mona (2017) Domain prediction with probabilistic directional context. Bioinformatics 33:2471-2478
Ochoa, Alejandro; Storey, John D; Llinás, Manuel et al. (2015) Beyond the E-Value: Stratified Statistics for Protein Domain Prediction. PLoS Comput Biol 11:e1004509
Ochoa, Alejandro; Llinás, Manuel; Singh, Mona (2011) Using context to improve protein domain identification. BMC Bioinformatics 12:90