The confluence of new data-driven machine learning (ML) approaches, increased computational power, and access to the wealth of electronic health records (EHRs) and other emergent data types (e.g., omics, imaging, mHealth) is accelerating the development of biomedical predictive models. Such models range from traditional statistical approaches (e.g., regression) to more advanced deep learning techniques (e.g., convolutional neural networks, CNNs), and span different tasks (e.g., biomarker/pathway discovery, diagnosis, prognosis). Two issues have become evident: 1) as there are no comprehensive standards to support the dissemination of these models, scientific reproducibility is problematic, given challenges in interpretation and implementation; and 2) as new models are put forth, methods to assess differences in performance, as well as insights into external validity (i.e., transportability), are necessary. Tools that move beyond the sharing of data and model "executables" are needed, capturing the (meta)data necessary to fully reproduce a model and its evaluation.

The objective of this R01 is the development of an informatics standard supporting the requisite information for scientific reproducibility of statistical and ML-based biomedical predictive models; from this foundation, we then develop new computational approaches to compare models' performance. We begin by extending the current Predictive Model Markup Language (PMML) standard to fully characterize biomedical datasets and harmonize variable definitions; to elucidate the algorithms involved in model creation (e.g., data preprocessing, parameter estimation); and to explain the validation methodology. Importantly, models in this PMML format will become findable, accessible, interoperable, and reusable (i.e., following FAIR principles). We then propose novel methods to compare and contrast predictive models, assessing transportability across datasets. While metrics exist for comparing models (e.g., c-statistics, calibration), the case-level information required to calculate these measures is often unavailable. We thus introduce an approach to simulate cases based on a model's reported dataset statistics, enabling such calculations. Different levels of transportability are then assigned to the metrics, determining the extent to which a selected model is applicable to a given population/cohort (i.e., helping answer the question, "Can I use this published model with my own data?").

We tie these efforts together in our proposed framework, the PREdictive Model Index & Exchange REpository (PREMIERE). We will develop an online portal and repository for model sharing around PREMIERE, and our efforts will include fostering a community of users to guide its development through workshops, model-thons, and other activities. To demonstrate these efforts, we will bootstrap PREMIERE with predictive models from a targeted domain (risk assessment in imaging-based lung cancer screening). To evaluate these developments, we will engage a range of stakeholders (model developers, users) to inform the completeness of our standard, and biostatisticians and clinical experts to guide assessment of model transportability.
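To make the dataset-characterization aim concrete, the sketch below (Python, standard library only) parses a hypothetical extended PMML fragment. DataDictionary, DataField, and Extension are genuine PMML 4.4 elements; the PREMIERE "summaryStatistics" extension, its attribute names, and all values are illustrative assumptions, not part of the existing standard or of this proposal's final design.

    # A minimal sketch of dataset-level summary statistics carried alongside
    # a standard PMML DataDictionary via PMML's Extension mechanism.
    # The "summaryStatistics" extension and all values are hypothetical.
    import xml.etree.ElementTree as ET

    PMML_NS = "{http://www.dmg.org/PMML-4_4}"

    fragment = """
    <PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
      <DataDictionary numberOfFields="2">
        <DataField name="age" optype="continuous" dataType="double">
          <Extension extender="PREMIERE" name="summaryStatistics">
            <Extension extender="PREMIERE" name="mean" value="62.4"/>
            <Extension extender="PREMIERE" name="sd" value="7.1"/>
          </Extension>
        </DataField>
        <DataField name="packYears" optype="continuous" dataType="double">
          <Extension extender="PREMIERE" name="summaryStatistics">
            <Extension extender="PREMIERE" name="mean" value="41.8"/>
            <Extension extender="PREMIERE" name="sd" value="18.9"/>
          </Extension>
        </DataField>
      </DataDictionary>
    </PMML>
    """

    root = ET.fromstring(fragment)
    for field in root.iter(PMML_NS + "DataField"):
        # Collect the leaf extensions (those carrying a value attribute).
        stats = {e.get("name"): e.get("value")
                 for e in field.iter(PMML_NS + "Extension") if e.get("value")}
        print(field.get("name"), stats)

Reusing PMML's existing Extension element, rather than new elements, is one design option that keeps such files readable by current PMML tooling.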
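To illustrate the case-simulation idea, the sketch below assumes a publication reports only per-class predictor means/SDs, class sizes, and logistic regression coefficients; it draws synthetic cases from those marginals and recovers a c-statistic. Treating predictors as independent Gaussians is a simplifying assumption, and all names and numbers are hypothetical.

    # A minimal sketch: simulate cases from reported summary statistics,
    # apply the published model, and compute a c-statistic.
    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical reported statistics: (mean, sd) per predictor, by outcome class.
    reported = {
        0: {"age": (60.1, 7.0), "packYears": (38.5, 18.2), "n": 900},
        1: {"age": (64.8, 6.6), "packYears": (47.3, 19.5), "n": 100},
    }
    coef = {"intercept": -4.2, "age": 0.04, "packYears": 0.03}  # hypothetical model

    # Draw synthetic cases for each class from the reported marginals.
    X, y = [], []
    for label, stats in reported.items():
        n = stats["n"]
        cols = [rng.normal(mu, sd, n) for mu, sd in (stats["age"], stats["packYears"])]
        X.append(np.column_stack(cols))
        y.append(np.full(n, label))
    X, y = np.vstack(X), np.concatenate(y)

    # Score the synthetic cases with the published coefficients.
    logit = coef["intercept"] + coef["age"] * X[:, 0] + coef["packYears"] * X[:, 1]
    score = 1.0 / (1.0 + np.exp(-logit))

    # c-statistic = P(score_case > score_control), computed pairwise.
    pos, neg = score[y == 1], score[y == 0]
    auc = (pos[:, None] > neg[None, :]).mean()
    print(f"simulated c-statistic: {auc:.3f}")

In practice, correlated predictors would require reported covariances or copula assumptions rather than independent draws; the same simulated cases could also support calibration metrics.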
PROGRAM NARRATIVE
With growing access to information contained in the electronic health record and other data sources, the application of statistical and machine learning methods is generating more biomedical predictive models. However, there are significant challenges to reproducing these models for purposes of comparison and application in new environments/populations. This project develops informatics standards to facilitate the sharing and reproducibility of these models, enabling a suite of comparative methods to evaluate model transportability.