Every individual genome predisposes its carrier to some set of diseases. Despite extensive research efforts, however, the heritable causes of complex disease remain elusive. This is largely due to the inherent complexity of pathogenesis pathways and the interaction of individual genomic determinants with the environment. Elucidating the causative genetics of pathogenesis will spur the development of better treatments and prevention tactics that modulate the presence of individual-specific stressors. Here, we propose to build AVA, Dx (Analysis of Variation for Association with Disease), a computational method for defining the functional role of DNA variation in complex diseases. AVA, Dx will use exome sequence data to pinpoint the molecular pathways affected in disease and to predict individual disease predisposition. As a proof of concept, we will use the nearly two thousand available sequenced exomes of Tourette Disorder, Crohn's Disease, and Chronic Obstructive Pulmonary Disease cohorts to build separate AVA, Dx instances. For each disease cohort, we will first build a predictor of the impact of genetic variation on molecular gene function. This predictor will be unique in its ability to account for variant genotype when evaluating the impact of all kinds of gene-associated variants: rare and common, coding and non-coding. We will then encode each exome in our set as a vector of function impact scores for all genes. Based on this set of vectors, feature selection techniques will identify disease genes, i.e., genes whose exome-specific function changes correlate best with the clinical annotation of individual disease status (disease/healthy). We expect to find a sizeable set of novel disease genes in this manner. Finally, we will train a machine learning classifier to recognize the functional differences in the sets of selected genes and thereby distinguish the clinical status of newly sequenced exomes (individuals).
As the exome sequencing techniques used in our study vary by cohort, we will build flexibility with respect to experimental setup into our analysis framework. As a result, AVA, Dx techniques will be useful for drawing conclusions from existing sequencing data. AVA, Dx will generate experimentally testable hypotheses of disease pathogenesis by pinpointing the affected molecular functions. Moreover, AVA, Dx will be prognostic, allowing determination of disease predisposition prior to clinical diagnosis.
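
The three steps described above (encoding each exome as a vector of per-gene function impact scores, selecting the genes that best separate disease from healthy exomes, and classifying new exomes on the selected genes) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the proposed AVA, Dx implementation: the gene names and scores are toy data, the per-gene score is simply the maximum variant score, the feature selection is a difference of class means, and a nearest-centroid rule stands in for the trained classifier.

```python
from statistics import mean

def gene_score_vector(variant_scores, genes):
    """Collapse per-variant impact scores into one score per gene.

    Taking the maximum is an illustrative assumption; genes with no
    scored variants get 0.0.
    """
    return [max(variant_scores.get(g) or [0.0]) for g in genes]

def select_genes(vectors, labels, k):
    """Return indices of the k genes whose mean score differs most
    between disease (1) and healthy (0) exomes."""
    def gap(j):
        sick = [v[j] for v, y in zip(vectors, labels) if y == 1]
        well = [v[j] for v, y in zip(vectors, labels) if y == 0]
        return abs(mean(sick) - mean(well))
    ranked = sorted(range(len(vectors[0])), key=gap, reverse=True)
    return ranked[:k]

def train_centroids(vectors, labels, selected):
    """Compute one mean vector per class over the selected genes."""
    centroids = {}
    for cls in (0, 1):
        rows = [[v[j] for j in selected]
                for v, y in zip(vectors, labels) if y == cls]
        centroids[cls] = [mean(col) for col in zip(*rows)]
    return centroids

def predict(vector, selected, centroids):
    """Assign the class whose centroid is closest (squared Euclidean)."""
    x = [vector[j] for j in selected]
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda cls: dist(centroids[cls]))

# Toy cohort: hypothetical gene names and impact scores in [0, 1].
genes = ["GENE_A", "GENE_B", "GENE_C"]
exomes = [
    {"GENE_A": [0.9, 0.2], "GENE_B": [0.1]},  # disease
    {"GENE_A": [0.8], "GENE_C": [0.3]},       # disease
    {"GENE_B": [0.2], "GENE_C": [0.4]},       # healthy
    {"GENE_A": [0.1], "GENE_B": [0.1]},       # healthy
]
labels = [1, 1, 0, 0]

vectors = [gene_score_vector(e, genes) for e in exomes]
selected = select_genes(vectors, labels, k=1)
centroids = train_centroids(vectors, labels, selected)

new_exome = gene_score_vector({"GENE_A": [0.7]}, genes)
print([genes[j] for j in selected], predict(new_exome, selected, centroids))
```

In this toy cohort only GENE_A separates the two groups, so it is the single gene selected, and a new exome with a high GENE_A score is assigned to the disease class. A real analysis would need held-out evaluation, since selecting genes and training on the same exomes overstates performance.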

Public Health Relevance

Every person is genetically predisposed to a number of disorders that could significantly affect their life span or quality of life. One in four adults in the United States is diagnosable with a mental illness in any given year. Autoimmune disorders and chronic obstructive pulmonary disease each affect one in ten people. Despite extensive research efforts, however, the genetic causes of these and other complex diseases remain elusive. Here we propose to develop AVA, Dx (Analysis of Variation for Association with Disease), a novel computational method that leverages predictions of the functional effects of genome variants in disorder-specific genes to predict individual disease susceptibility. We will demonstrate proof-of-concept functionality of our method using genetic and clinical data from Tourette disorder, Crohn's disease, and chronic obstructive pulmonary disease patients and their families. AVA, Dx will motivate new experimentally testable hypotheses regarding the biological mechanisms of various diseases and provide a means for earlier prognosis, more accurate diagnosis, and the development of better treatments.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Biodata Management and Analysis Study Section (BDMA)
Program Officer
Ravichandran, Veerasamy
Rutgers University
Earth Sciences/Resources
United States