Infectious disease surveillance programs provide public health workers with important information for predicting and understanding the emergence of epidemics, allowing for timely allocation of the resources needed to contain them. Collecting disease agent molecular sequence information is becoming widespread, especially in surveillance of infectious diseases caused by RNA viruses, such as influenza. Phylodynamics is an emerging statistical framework that allows epidemiologists to harness information present in disease agent sequences in order to shed light on the spatio-temporal population dynamics of these agents. Although sophisticated Bayesian inferential tools for phylodynamics have emerged in the last decade, these tools concentrate on sequence data alone, failing to integrate other sources of information (e.g., incidence time series data) into the phylodynamic framework. We claim that integrating multiple sources of information will make phylodynamic inference more precise, allowing for sharper predictions of disease dynamics and for statistical testing of scientific hypotheses. To test this assertion, we propose a series of new statistical methods for integration of multiple sources of information into Bayesian infectious disease phylodynamics. We will start by developing a new Bayesian method for estimation of population dynamics directly from genomic data that combines the coalescent process, a powerful tool from population genetics, with modern Gaussian process-based Bayesian nonparametric inference (Aim 1). Our preliminary results show that the new method is more accurate than state-of-the-art Bayesian phylodynamic methods. Moreover, the proposed Gaussian process framework will liberate us from drawbacks of the current methodology and will allow us to extend this approach further to estimate correlations between population size fluctuations and other time-varying variables of interest (Aim 2).
This extension is significant because estimating such correlations is of paramount importance to infectious disease epidemiologists and because all current phylodynamic methods are incapable of such estimation. We will also develop a new model to confront the currently ignored dependence of the times at which disease agent sequences are sampled on the disease dynamics (Aim 2). Explicit modeling of these sampling times should improve both the accuracy and the precision of phylodynamic inference. In all our modeling efforts, we will pay close attention to the computational feasibility of the proposed methods by designing efficient Markov chain Monte Carlo algorithms to perform Bayesian inference. To test our new methodology, we will analyze benchmark infectious disease data sets, where available external information about disease dynamics will help us validate our methods. In addition, we will mine publicly available databases in order to perform novel data analyses using our newly developed methodology (Aim 3). One of the main deliverables of this research will be open-source software implementing the proposed new Bayesian phylodynamic methods for integration of infectious disease sequence data with other sources of information.
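To make the role of the coalescent process concrete, the following is a minimal sketch of the standard coalescent log-likelihood of a genealogy's coalescent times given an effective population size trajectory Ne(t). It is not the proposed Gaussian process method: as a simplifying assumption, Ne(t) is taken to be piecewise constant, all sequences are assumed to be sampled at the same time (time 0, with time running backward), and the function and argument names (`inv_ne_integral`, `coalescent_loglik`, `breaks`, `values`) are hypothetical.

```python
import math
from bisect import bisect_right


def inv_ne_integral(a, b, breaks, values):
    """Integral of 1/Ne(u) over [a, b] for a piecewise-constant Ne.

    Assumption: Ne equals values[i] on [breaks[i], breaks[i+1]), breaks[0]
    is 0, and the last value extends to infinity.
    """
    total = 0.0
    i = bisect_right(breaks, a) - 1  # index of the piece containing a
    left = a
    while left < b:
        right = breaks[i + 1] if i + 1 < len(breaks) else math.inf
        upper = min(right, b)
        total += (upper - left) / values[i]
        left = upper
        i += 1
    return total


def coalescent_loglik(coal_times, breaks, values):
    """Log-likelihood of coalescent times under a variable-size coalescent.

    coal_times: increasing times (backward from sampling time 0) of the
    n - 1 coalescent events among n sequences sampled at time 0.
    While k lineages remain, pairs coalesce at rate C(k, 2) / Ne(t).
    """
    n = len(coal_times) + 1
    loglik = 0.0
    prev = 0.0
    for k, t in zip(range(n, 1, -1), coal_times):
        pairs = k * (k - 1) / 2
        ne_at_t = values[bisect_right(breaks, t) - 1]
        # Log density of the event at t, plus log probability of no event
        # on (prev, t) while k lineages were present.
        loglik += math.log(pairs) - math.log(ne_at_t)
        loglik -= pairs * inv_ne_integral(prev, t, breaks, values)
        prev = t
    return loglik


# Three sequences, Ne = 1 until time 0.2 and Ne = 2 further in the past.
print(coalescent_loglik([0.1, 0.5], breaks=[0.0, 0.2], values=[1.0, 2.0]))
```

In a Bayesian analysis, a likelihood of this form would be combined with a prior on the trajectory (a Gaussian process prior in the approach described above) and explored with Markov chain Monte Carlo; the sketch covers only the likelihood term.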

Public Health Relevance

Monitoring infectious disease dynamics is important for timely detection of infectious disease epidemics and for organizing a timely public health response to these epidemics. Disease agent sequence data are becoming an important source of information in infectious disease surveillance programs. We propose a series of new statistical methods for analyzing such sequence data. This new statistical methodology will enable epidemiologists to elucidate the population dynamics of infectious disease agents and to integrate sequence data with other data collected during infectious disease surveillance programs.

Agency: National Institutes of Health (NIH)
Institute: National Institute of Allergy and Infectious Diseases (NIAID)
Type: Research Project (R01)
Project #: 1R01AI107034-01
Application #: 8559213
Study Section: Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer: Gezmu, Misrak
Project Start: 2013-05-22
Project End: 2018-04-30
Budget Start: 2013-05-22
Budget End: 2014-04-30
Support Year: 1
Fiscal Year: 2013
Total Cost: $351,488
Indirect Cost: $60,692

Name: University of Washington
Department: Biostatistics & Other Math Sci
Type: Schools of Arts and Sciences
DUNS #: 605799469
City: Seattle
State: WA
Country: United States
Zip Code: 98195

Zhou, Bo; Moorman, David E; Behseta, Sam et al. (2016) A Dynamic Bayesian Model for Characterizing Cross-Neuronal Interactions During Decision-Making. J Am Stat Assoc 111:459-471
Koepke, Amanda A; Longini Jr, Ira M; Halloran, M Elizabeth et al. (2016) Predictive Modeling of Cholera Outbreaks in Bangladesh. Ann Appl Stat 10:575-595
Baele, Guy; Lemey, Philippe; Suchard, Marc A (2016) Genealogical Working Distributions for Bayesian Model Testing with Phylogenetic Uncertainty. Syst Biol 65:250-64
Karcher, Michael D; Palacios, Julia A; Bedford, Trevor et al. (2016) Quantifying and Mitigating the Effect of Preferential Sampling on Phylodynamic Inference. PLoS Comput Biol 12:e1004789
McCoy, Connor O; Bedford, Trevor; Minin, Vladimir N et al. (2015) Quantifying evolutionary constraints on B-cell affinity maturation. Philos Trans R Soc Lond B Biol Sci 370:
Crawford, Forrest W; Weiss, Robert E; Suchard, Marc A (2015) Sex, Lies and Self-Reported Counts: Bayesian Mixture Models for Heaping in Longitudinal Count Data via Birth-Death Processes. Ann Appl Stat 9:572-596
Chi, Peter B; Chattopadhyay, Sujay; Lemey, Philippe et al. (2015) Synonymous and nonsynonymous distances help untangle convergent evolution and recombination. Stat Appl Genet Mol Biol 14:375-89
Lange, Jane M; Hubbard, Rebecca A; Inoue, Lurdes Y T et al. (2015) A joint model for multistate disease processes and random informative observation times, with applications to electronic medical records data. Biometrics 71:90-101
Bedford, Trevor; Riley, Steven; Barr, Ian G et al. (2015) Global circulation patterns of seasonal influenza viruses vary with antigenic drift. Nature 523:217-20
Vrancken, Bram; Lemey, Philippe; Rambaut, Andrew et al. (2015) Simultaneously estimating evolutionary history and repeated traits phylogenetic signal: applications to viral and host phenotypic evolution. Methods Ecol Evol 6:67-82

Showing the most recent 10 out of 32 publications