Common genetic variants contribute to a wide spectrum of human phenotypes and to the risk of developing many types of disease. Differentiating causative from non-functional variants, and understanding the molecular mechanisms of the former, are important challenges. Alu interspersed repeats are a common type of structural variant in the human genome. Alu insertions are intrinsically capable of altering mRNA sequence and causing disease, even when the repeat lies in apparently non-coding, intronic sequence within the gene locus. Our hypothesis is that some inherited Alu insertion polymorphisms function as causative variants for human disease risk. We propose to narrow the list of candidates by identifying those Alu insertions that are associated with disease risk and that alter the structure of a relevant mRNA transcript. We will develop computational approaches to identify Alu exonization events in RNA-seq data, and we will experimentally evaluate Alu insertion polymorphisms for effects on mRNA splicing using reporter assays and genome editing strategies.
Genetic variants are likely to exert functional effects through alterations to messenger RNAs. They may alter the quantity of mRNA produced or alter its sequence. The purpose of the proposed studies is to identify genomic Alu Short INterspersed Element (SINE) insertion polymorphisms that alter mRNA sequences. We will use genome-wide association studies (GWAS) to prioritize those that are candidate causal variants for disease risk.