In clinical cancer genetics, molecular diagnostic testing is now commonly performed to look for pathogenic mutations in cancer susceptibility genes. A critical challenge in the field is interpreting whether a genetic variant causes disease or not. Lynch syndrome (LS), the most common hereditary colorectal cancer syndrome, is caused by germline mutations in one of four DNA mismatch repair (MMR) genes: MLH1, MSH2, MSH6, and PMS2. About 20-30% of the variants identified in MMR and other cancer susceptibility genes are missense or non-coding changes that may or may not be pathogenic, and whose effects on function and disease cannot be interpreted easily. They are designated "Unclassified Variants" or "Variants of Unknown Significance" (VUS). Classifying variants as pathogenic or neutral significantly improves the management of LS and other hereditary cancer syndromes by identifying which individuals carry a harmful genetic variant and thus benefit from screening and therapeutic measures. The scientific problem is to classify as either "pathogenic" or "not pathogenic" all MMR gene variants found by genetic testing for LS. Correct classification of variants requires integrating clinicopathologic, epidemiologic, bioinformatic, and in vitro data. The optimal way to use these methods is unknown. Our hypothesis is that clinical, in silico, and laboratory data can be integrated qualitatively and quantitatively to classify all variants in MMR genes. This study will use a large set of MMR variants to refine a method that integrates these data.
Aim 1. Development of reference sets of gene variants in MMR genes that are classified by clinical and epidemiological data as Likely Pathogenic, Likely Neutral, and Unknown. These sets will be used to calibrate and refine a classification model integrating multiple data types.
Aim 2. Analysis of individual data types to classify variants: to assign and calibrate predictive values and odds ratios for pathogenicity for multiple data types, including 1) clinical and family history, 2) tumor histology, 3) tumor immunohistochemistry for MMR proteins, 4) tumor microsatellite instability, 5) tumor MLH1 methylation and BRAF V600E mutation, 6) in vitro assessment of missense variants by functional assays, 7) in silico assessment of missense variants by sequence and structure-based algorithms, 8) in vitro assessment of exonic variants by splicing assays, and 9) in silico predictions of splice effects from exonic sequence variants.
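As an illustration only, the calibration described in Aim 2 could be sketched as estimating a likelihood ratio for a single binary data type (for example, presence of tumor microsatellite instability) from the Aim 1 reference sets. The function name, the pseudocount smoothing, and the counts below are assumptions made for this sketch and are not taken from the project; the actual calibration procedure is not specified here.

```python
def likelihood_ratio(path_pos, path_total, neutral_pos, neutral_total, pseudo=0.5):
    """Estimate LR = P(finding | pathogenic) / P(finding | neutral).

    Counts come from reference sets of Likely Pathogenic and Likely Neutral
    variants. `pseudo` is an assumed pseudocount that avoids division by zero
    when a finding never occurs in one reference set.
    """
    p_path = (path_pos + pseudo) / (path_total + 2 * pseudo)
    p_neutral = (neutral_pos + pseudo) / (neutral_total + 2 * pseudo)
    return p_path / p_neutral


# Hypothetical counts, for illustration only: the finding is present in
# 18 of 20 likely pathogenic and 2 of 20 likely neutral reference variants.
lr = likelihood_ratio(18, 20, 2, 20)
print(f"likelihood ratio: {lr:.2f}")
```

A ratio above 1 would count as evidence toward pathogenicity and a ratio below 1 as evidence toward neutrality; how large a reference set is needed for a stable estimate is one of the questions the calibration would have to address.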
Aim 3. Development of a model for integrating data. The model will pass through three stages: (i) a qualitative model, (ii) a quantitative Bayesian model that considers each data type independently, and (iii) a two-component mixture model that considers all validated data types simultaneously.
Relevance: Interpreting which genetic variants increase risk for hereditary cancer and which do not can be difficult. This research uses clinicopathologic, epidemiologic, in vitro, and in silico studies of MMR genes to interpret which genetic changes cause LS and which are harmless. Improving the interpretation of genetic variation will improve the management of hereditary cancers and other genetic diseases.
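Stage (ii) of Aim 3, a Bayesian model that treats each data type independently, can be sketched in general terms as multiplying a prior odds of pathogenicity by one likelihood ratio per data type. This is a minimal sketch under the independence assumption stated in the aim; the function name, the prior, and the likelihood ratios are hypothetical and do not represent the project's calibrated values.

```python
import math


def posterior_probability(prior, likelihood_ratios):
    """Combine a prior probability with independent likelihood ratios.

    Works in log-odds space: posterior odds = prior odds * product of LRs.
    Assumes 0 < prior < 1 and every likelihood ratio is > 0.
    """
    log_odds = math.log(prior / (1 - prior))
    log_odds += sum(math.log(lr) for lr in likelihood_ratios)
    odds = math.exp(log_odds)
    return odds / (1 + odds)


# Hypothetical inputs, for illustration only: a 10% prior and two
# independent data types, each with a likelihood ratio of 3.
p = posterior_probability(0.10, [3.0, 3.0])
print(f"posterior probability of pathogenicity: {p:.2f}")
```

Working in log-odds keeps the combination numerically stable when many data types are included. If data types are correlated, for example immunohistochemistry and microsatellite instability from the same tumor, this simple product would overstate the evidence, which is one motivation for a model that considers data types simultaneously.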
For patients suspected of having Lynch syndrome, the most common type of hereditary colon cancer, genetic testing has become a common and important part of clinical cancer genetics. However, about 20-30% of DNA changes (variants) that are found through such testing cannot be interpreted, and overcoming this problem is critical to improving the management of Lynch syndrome and all cancer predisposition syndromes. The goal of this project is to develop efficient methods for classifying genetic variants as either disease-causing or not, which will immediately improve the management of hereditary cancer syndromes and other genetic disorders.