The degree of bio-threat posed by a newly detected pathogen variant (genotype) that is genetically similar to a known pathogen may be assessed in terms of the cross-immunity between the two variants. Perfect cross-immunity suggests that the newly detected variant and the known variant are epidemiologically identical, whereas a lack of cross-immunity suggests that they are epidemiologically distinct. New epidemiological models will be developed for estimating cross-immunity in a two-strain system; these models may allow for a variable birth rate of the natural hosts, the possibility of vertical transmission, and a finite number of contacts per subject per unit time. Like many popular epidemiological models, the proposed models stipulate that the dynamics of the state vector follow a nonlinear partial differential equation (PDE). New computationally efficient methods are proposed for estimating such PDE models. The development of the proposed methodologies will be guided by the analysis of real longitudinal monitoring data on the prevalence of various Bartonella variants (genotypes) in a natural population of rodents (cotton rats).
The research team consists of two statisticians from two academic institutions and one epidemiologist from the CDC, who have worked closely together for a number of years. The proposed work will provide general tools for quantifying the epidemiological similarity between a newly detected pathogen variant and known bacterial species, contributing to the general problem of assessing the bio-threat associated with newly detected variants. The proposed estimation methods are generally applicable to PDE models used in epidemiological studies, as well as in other fields, e.g., finance. A computer package implementing the proposed methods will be made freely available to the public. The research team will maintain its strong record of training PhD students in cross-disciplinary research.
In the study of infectious disease dynamics, it is pertinent to recognize sub-species of a pathogen. Pathogen variants are often clustered into sub-species based on the distances between their genetic sequences. An important scientific question is whether genetically dissimilar pathogen variants also have epidemiologically dissimilar functions. From an epidemiological viewpoint, a host infected by one pathogen variant will develop cross-immunity against another variant that resembles the former in terms of epidemiological functions, whereas there is no cross-immunity between two pathogen variants that are distinct species. We have developed a new differential equation model for quantifying the degree of cross-immunity between two pathogen variants, which provides a framework for addressing this question. We demonstrated the efficacy of the new approach with a time-series dataset collected by a multi-year monitoring program on mixed infections by Bartonella in a wild population of cotton rats, and showed that a genetically based classification of the Bartonella variants into sub-species is consistent with the presence or lack of cross-immunity between those sub-species. More broadly, we developed new statistical methods for estimating nonlinear differential equation models with time-series data, which facilitate the estimation of the cross-immunity model. Besides epidemiology, nonlinear differential equation models arise in diverse disciplines including biology, chemistry, finance, genetics, and network analysis. Differential equation models may be formulated from subject-matter knowledge, or they may be empirically based. For instance, the so-called threshold diffusion model is a useful empirical model for capturing nonlinear data characteristics such as asymmetric periodic behavior, time-irreversibility, and volatility clustering.
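As a rough illustration of what a threshold diffusion can look like, the sketch below simulates a two-regime process by the Euler-Maruyama scheme. It assumes a common textbook form in which the process follows dX = (b0 + b1 X) dt + sigma dW within each regime, with (b0, b1, sigma) switching when X crosses a threshold r. The function name, the parameter values, and this specific functional form are illustrative assumptions and are not taken from the project itself.

```python
import numpy as np


def simulate_threshold_diffusion(n_steps, dt, r, params_low, params_high,
                                 x0=0.0, seed=0):
    """Euler-Maruyama simulation of a two-regime threshold diffusion.

    Within each regime: dX = (b0 + b1 * X) dt + sigma dW.
    params_low applies when X <= r, params_high when X > r.
    (Hypothetical helper for illustration only.)
    """
    rng = np.random.default_rng(seed)
    x = np.empty(n_steps + 1)
    x[0] = x0
    for t in range(n_steps):
        # Pick the regime from the current state relative to the threshold.
        b0, b1, sigma = params_low if x[t] <= r else params_high
        # Brownian increment over a step of length dt.
        dw = rng.normal(0.0, np.sqrt(dt))
        x[t + 1] = x[t] + (b0 + b1 * x[t]) * dt + sigma * dw
    return x


# Illustrative (made-up) parameters: the lower regime is pulled upward
# quickly with low noise, the upper regime is pulled downward slowly with
# higher noise, giving asymmetric excursions around the threshold.
path = simulate_threshold_diffusion(
    n_steps=5000, dt=0.01, r=0.0,
    params_low=(0.5, -1.0, 0.2),
    params_high=(-0.5, -0.3, 0.6),
)
```

The differing drift and volatility on either side of the threshold are what allow such a model to mimic asymmetric cycles and regime-dependent volatility.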
We have developed a new approach for estimating and testing a threshold diffusion model. We illustrated the usefulness of the threshold diffusion model by deriving a valuation model for pricing options such as variable annuities. We have also developed new methods for discovering relationships in potentially large datasets, for instance, predicting copy number variations from gene expression. Our approach is based on regularized estimation, which provides a data-driven trade-off between goodness of fit and ease of model interpretation. The developed techniques are useful for finding important factors affecting the dynamics of an infectious disease and for performing feature selection in other scientific and quantitative investigations.