The mission of the Gene Security Network (GSN) is to create a system that enables clinicians to use aggregated genetic and phenotypic data from clinical trials and treatment records to make the safest, most effective treatment decisions for each patient. A patient's unique response to clinical therapy depends on his or her genetic composition, as well as on the biomolecular nature of the disease process. Academic institutions are rapidly accumulating clinical data, representing a vanguard in the trend towards personalized medicine, but a lack of technology systems and format standards for the integration and validation of data makes it difficult to interpret and predict individual patient responses. For Phase I, we focus on three key components of the GSN mission: i) to create a standardized ontology and translation engine for efficient integration and validation of pharmacokinetic data, ii) to use the translation engine to integrate multiple sets of pharmacokinetic data into the standardized ontology, and iii) to develop statistical methods to perform data validation and outcome prediction with the integrated genetic and phenotypic data. To demonstrate the utility of our approach, we are collaborating with the PharmGKB Project at Stanford University. PharmGKB manages an openly shared Internet repository for clinical trial data with the intent to uncover how individual genetic variation contributes to distinctive reactions to pharmaceuticals. PharmGKB is a member of the NIH Pharmacogenetics Research Network (PGRN), and its database includes extensive pharmacokinetic and genomic records from cardiovascular, pulmonary, and cancer research. Here we focus on breast and colon cancer treatment, both of which could be considerably enhanced by the integration of diverse genetic and phenotypic data into a standardized ontology, validation of the data, and statistical analysis of the data to predict drug efficacy and side-effect profiles.
Underdetermined and ill-conditioned data sets are common for these diseases, as for many genotypic and phenotypic modeling problems, where the number of possible predictors (genes, proteins, or mutation sites) is large relative to the number of measured outcomes.
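To illustrate why such data sets are difficult to model, the following minimal sketch builds a synthetic problem with far more predictors than measured outcomes and fits it with a ridge penalty. The data, the sizes, the `ridge_fit` helper, and the choice of ridge regression are all assumptions made for illustration only; they do not describe the statistical methods to be developed in Phase I.

```python
import numpy as np

# Synthetic, hypothetical data: 20 measured outcomes, 200 candidate predictors.
rng = np.random.default_rng(0)
n_outcomes, n_predictors = 20, 200
X = rng.standard_normal((n_outcomes, n_predictors))

# Assume only the first five predictors truly influence the outcome.
true_w = np.zeros(n_predictors)
true_w[:5] = 1.0
y = X @ true_w + 0.1 * rng.standard_normal(n_outcomes)

# The rank of X cannot exceed the number of outcomes, so X^T X (200 x 200)
# is singular and ordinary least squares has no unique solution.
rank = np.linalg.matrix_rank(X)


def ridge_fit(X, y, alpha):
    """Solve (X^T X + alpha * I) w = X^T y, which is well posed for alpha > 0."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(p), X.T @ y)


w = ridge_fit(X, y, alpha=1.0)
print(f"rank of X: {rank} (predictors: {n_predictors})")
print(f"ridge solution has {w.shape[0]} finite coefficients: {np.isfinite(w).all()}")
```

The penalty term `alpha` makes the system invertible at the cost of shrinking the coefficients; in practice its value would have to be chosen by validation on held-out data.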
For Aim 1, we focus on creating a standardized ontology and translation engine for PharmGKB data.
For Aim 2, we concentrate on the integration and analysis of the pharmacokinetic data associated with PharmGKB's breast cancer and colon cancer data sets.
For Aim 3, we train statistical models on the integrated data to show how the data can be used to enhance the efficacy and safety of certain drugs. In subsequent phases, the prototype system will be extended to accommodate other forms of data and types of diseases. Functionality will also be provided for a clinician to select a trial, submit relevant data for a new patient, and view predictions and confidence bounds for key outcomes under different interventions for that patient, using models trained on the integrated trial data. Details will be provided in a Phase II application after completion of Phase I. The amount of data that clinicians must compile and digest to provide their patients with optimal care is rapidly expanding and is increasingly daunting. The Gene Security Network stands to significantly reduce this burden and greatly improve the speed and accuracy of clinical decision-making.