The proposed research directly addresses the mission of NIH's BD2K initiative by developing tools to derive novel insights from available Big Data and by adapting sophisticated machine learning methodology to a framework familiar to biomedical researchers. The new methodology will be among the first to enable the use of machine learning techniques with time-to-event and continuous longitudinal outcome data, and will be the first such extension of the deep Poisson model. In essence, this project bridges the gap between biomedical and clinical researchers' need for advanced prognostic and predictive techniques and the unrealized potential of deep learning methods for longitudinally collected biomedical data. To facilitate adoption in clinical research, the results will be translated into terms familiar to applied practitioners through publications and well-documented software packages. Application of the methodology will be illustrated using data from the NIH dbGaP repository, thereby further promoting the use of open-access data sources.
Optimal risk models are essential to realizing the promise of precision medicine. This project develops novel machine learning methods for time-to-event and continuous longitudinal data to enhance risk model performance by exploiting correlations between large numbers of predictors and genetic data. This will enable biomedical researchers to better stratify patients by their likelihood of responding to multiple therapies.