In response to increased awareness of biodefense and the possible spread of newly emerging infectious pathogens, I propose to investigate the feasibility of predicting amino acid contacts for the three proteins of the replication/transcription complex of the order Mononegavirales (e.g., Ebola, rabies, and measles viruses). The complex is too large for current structure determination methods such as X-ray crystallography or NMR spectroscopy; therefore, an approach that correlates results from laboratory and bioinformatic analyses is a logical course of action. The proposed bioinformatic studies will proceed along three paths: prediction of disorder, determination of compensatory mutations, and assessment of evolutionary dynamics. Correlating the data obtained from these methods with experimental data from the published literature will maximize the chances of identifying the protein-protein contact points both within and between the P, N, and L proteins of the replication/transcription complex. However, this poses the challenge of integrating the findings from the different methods in a meaningful and comprehensive way. A traditional joint probability distribution approach would require O(2^n) space to represent the data, where n is the number of data sets. Given the vast amount of information to be incorporated, this method is computationally infeasible. Instead, Bayesian networks (also known as belief networks), a method from the field of artificial intelligence, will be used to correlate the results. Several experimental laboratories have agreed to test the residue contact predictions resulting from these studies for Ebola, VSV, and measles.
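The storage argument above can be illustrated with a minimal sketch. It assumes every data set is reduced to a single binary variable, and the network structure and node names are hypothetical, chosen only for illustration; they are not the network proposed here. The sketch compares the number of probabilities needed for a full joint table with the number needed when each evidence source depends only on a shared contact variable.

```python
from typing import Dict, List


def full_joint_parameters(n: int) -> int:
    """Independent probabilities in a full joint table over n binary variables."""
    return 2 ** n - 1


def bayes_net_parameters(parents: Dict[str, List[str]]) -> int:
    """Each binary node needs one probability per configuration of its parents."""
    return sum(2 ** len(node_parents) for node_parents in parents.values())


# Hypothetical structure: each evidence source depends only on whether
# a residue pair is in contact. All names are assumptions for illustration.
network = {
    "contact": [],
    "disorder": ["contact"],
    "compensatory_mutation": ["contact"],
    "evolutionary_dynamics": ["contact"],
    "published_experiment": ["contact"],
}

n = len(network)
print(f"full joint table: {full_joint_parameters(n)} probabilities")
print(f"Bayesian network: {bayes_net_parameters(network)} probabilities")
```

With this structure the count grows linearly as evidence sources are added, whereas the full joint table doubles with each one. A network with more parents per node would sit between the two extremes.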
Cleveland, Sean B; Davies, John; McClure, Marcella A (2011) A bioinformatics approach to the structure, function, and evolution of the nucleoprotein of the order Mononegavirales. PLoS One 6:e19275