Adaptive evolution (AE) is both a ?force of good? as it can help to optimize biological processes in industry, but it is also a ?force of frustration? when infectious diseases exploit AE to escape the host immune system or become resistant to drugs. It has long been assumed close to impossible to make predictions on AE due to the presumed predominating influences of random forces and events. However, the observation that evolutionary repeatability across traits and species is far more common than previously thought, suggests that AE, with the right data and approach, may become (partially) predictable. Indeed, we found through experiments with the bacterial pathogen Streptococcus pneumoniae on its response to antibiotics and the emergence of antimicrobial resistance, that in order to make AE predictable a detailed understanding of at least two aspects of the bacterial system are required: 1.) the genetic constraints of the system (i.e. the architecture of the organismal network); and 2.) where and how in the system stress is experienced and processed. We showed that by mapping out ~25% of the bacterium's network, determining phenotypic and transcriptional antibiotic responses, applying network analyses to capture and quantify the responses in a network context, and exploiting experimental evolution to pin-point adaptive mutations in the genome it becomes possible, by means of machine learning, to uncover hidden patterns in the data that make AE predictions feasible. This means that the network in interaction with the environment shapes the adaptive landscape, it limits available solutions and makes some solutions more likely than others, thereby driving repeatability and enabling predictability. In this proposal we build on these exciting developments with the goal to map out the constraints of S. pneumoniae's entire network and develop a machine learning model that can forecast adaptive evolution a priori, and on a genome-wide scale. To accomplish this, we combine in aim 1 parts of Tn-Seq, dTn-Seq and Drop-Seq to finalize a new tool Tn-Seq^2 (Tn-Seq squared) that is able to map genetic-interactions in high-throughput and genome-wide. We use Tn-Seq^2 to reconstruct the first genome-wide genetic interaction network for S. pneumoniae in the presence of 20 antibiotics.
In aim 2 we create 85 HA-tagged Transcription factor induction (TFI) strains and: a) Determine with ChIP-Seq the DNA-binding sites for all 85 TFs in S. pneumoniae; b) By overexpressing each TFI strain followed by RNA- Seq we determine each TFs regulatory signature; c) Use a Transcriptional Regulator Induced Phenotype screen in the presence of 20 antibiotics to untangle environment specific links between genetic and transcriptional perturbations and their phenotypic outcomes. Lastly, in aim 3, we train and test a variety of machine learning approaches to design an optimal model that predicts which genes in the genome are most likely to adapt in the presence of a specific antibiotic. The development of this predictive AE model, will not only be useful in predicting the emergence of antibiotic resistance, but the strategy should be valuable for most any biological field for which adaptive changes are important, ranging from biological engineering to cancer.
Adaptive evolution (AE) is the driving force behind the emergence of antibiotic resistance and if it were possible to predict AE before it happens, it could help in preventing resistance. Here we use cutting-edge existing and newly designed genomics tools and analytical approaches to develop a machine learning model that can predict AE a priori, and on a genome-wide scale.