Eczema is a chronic inflammatory skin disease affecting 10-30% of children and 2-10% of adults, and its prevalence is increasing. Researchers have struggled to understand the genetic and environmental factors that appear to play a role in this increase. Most current studies focus on one or two specific types of genome-wide data in association with eczema, such as SNPs, DNA methylation levels, or gene expression levels. Current statistical methods have forced this limitation: to our knowledge, none allow clustering of subjects across multi-dimensional and multi-faceted variables. Therefore, in this project we aim to introduce a novel clustering approach that will cluster subjects across genes that carry SNP information, DNA methylation levels, and gene expression levels. The summarized clusters will then be tested for association with the risk of eczema or any allergic disease. Genetic patterns associated with eczema will then be explored further to better characterize the behavior of the genetic data and its uniqueness. The method will be implemented in R as a free, downloadable package, making it accessible and easy to use. We will apply the proposed method to the third-generation Isle of Wight cohort data. The third-generation cohort consists of children born from 2010 to the present to members of the original cohort, who were themselves born between 1989 and 1990 on the Isle of Wight in the UK. The data contain genome-wide datasets, disease and exposure status, and allergy information across multiple follow-ups. This study has implications not only for the statistical field, through a new clustering approach, but also for the eczema and allergy field, as no other studies have been able to analyze three genome-wide datasets in concert in association with the risk of allergic diseases.
The prevalence of eczema is increasing worldwide, and the disease affects 10-30% of children. The complete underlying cause of eczema is unknown but is thought to be a mixture of genetic and environmental factors. This project aims to analyze genetic and epigenetic data in concert using a novel statistical approach, in the hope of identifying patterns associated with eczema risk and allowing researchers to better understand the underlying biological mechanisms.