Identify serological marker-defined subgroups in CD patients using Big Data tools
Crohn's Disease (CD) is a complex disorder with wide phenotypic heterogeneity in disease onset, symptoms, location, and severity, as well as in response to treatment. Serological markers such as anti-CBir1, anti-OmpC and anti-I2 are linked to the heterogeneity of clinical features in CD. These markers offer an opportunity to better understand the contributions of the many risk factors, the innate and adaptive immune responses, and the phenotypic subgroups that underlie the heterogeneity in clinical features. Here we propose to apply the Self-Organizing Map (SOM), a Big Data tool that can identify latent subgroups in complex data in a non-linear and non-parametric way, to a large cohort of 3,812 CD patients to define subgroups in CD. We will examine the clinical features of the serologically defined subgroups to evaluate whether those subgroups reflect the heterogeneity in CD clinical phenotypes. We will also evaluate the effects of known CD risk factors, including smoking and previously identified genetic variants, in those subgroups to assess the relationship between the subgroups and heterogeneity in CD etiology. Furthermore, we will use the genetic, microbiome/metabolome and glycome data in this repository to screen for novel risk factors specific to a given subgroup, and will explore the underlying pathways and networks to better understand the distinct and overlapping pathogeneses of the subgroups. This proposal represents the first attempt to identify CD subgroups using comprehensive serological panels in a large cohort. It is also one of the first studies to use Big Data tools to understand the complex disease structure of CD. Moreover, this is the first study to systematically integrate serological markers, genetics, microbiome/metabolome, and glycome data to identify the factors driving CD heterogeneity.
In particular, this is the largest study in the world with both CD serological markers and glycome data, and the first study to integrate the glycome, which is known to play an important role in shaping human immunity, with serological markers in CD. The subgroups identified in the proposed analysis, which are expected to be more homogeneous, can help in developing individualized or tailored treatment strategies for CD patients. In addition, with the comprehensive integration of genetics, microbiome/metabolome and glycome data, those subgroups can help to identify novel risk factors and might lead to a better understanding of CD etiology.
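To illustrate how a Self-Organizing Map can group patients from a panel of markers, the following is a minimal sketch written for illustration only; it is not the implementation used in the proposed study, which the text above does not specify. It assumes a small rectangular grid, linearly decaying learning rate and neighborhood width, and Euclidean distance. The function names (train_som, assign_nodes), all parameter values, and the synthetic data standing in for standardized serological titers are hypothetical.

```python
import numpy as np


def train_som(data, grid_shape=(3, 3), n_iter=2000, lr0=0.5,
              sigma0=2.0, sigma_end=0.5, seed=0):
    """Train a small SOM with online (one sample at a time) updates.

    All defaults are illustrative assumptions, not values from the proposal.
    """
    rng = np.random.default_rng(seed)
    rows, cols = grid_shape
    # Fixed positions of the map nodes on the 2-D grid.
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)],
                      dtype=float)
    # Initialise each node's weight vector from a randomly chosen sample.
    weights = data[rng.integers(0, len(data), size=rows * cols)].astype(float)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                        # decaying step size
        sigma = sigma0 + (sigma_end - sigma0) * frac   # shrinking neighborhood
        x = data[rng.integers(0, len(data))]
        # Best-matching unit: the node whose weights are closest to x.
        bmu = np.argmin(((weights - x) ** 2).sum(axis=1))
        # Gaussian neighborhood measured on the grid, centred on the BMU.
        grid_dist2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
        h = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
        # Pull the BMU and its grid neighbors toward the sample.
        weights += lr * h[:, None] * (x - weights)
    return weights, coords


def assign_nodes(data, weights):
    """Return, for each sample, the index of its closest map node."""
    d2 = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


# Synthetic stand-in for standardized marker levels: 200 simulated
# "patients" with 4 markers, drawn from two artificial groups.
rng = np.random.default_rng(42)
synthetic = np.vstack([
    rng.normal(-3.0, 0.5, size=(100, 4)),
    rng.normal(3.0, 0.5, size=(100, 4)),
])

weights, coords = train_som(synthetic, grid_shape=(3, 3))
nodes = assign_nodes(synthetic, weights)
```

In a sketch like this, patients mapped to the same node or to neighboring nodes would be candidates for a common subgroup; how nodes are merged into subgroups and how the map size is chosen are separate design decisions not covered here.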
Crohn's disease, one of the inflammatory bowel diseases, affects millions of individuals worldwide. There are multiple subgroups of Crohn's disease, with symptoms that range from mild to severe, impair quality of life and can lead to significant disability. This project will apply a Big Data tool known as a Self-Organizing Map (SOM) to help identify subgroups of CD patients using multiple markers in the blood. We will also examine whether the identified subgroups have different clinical features and are associated with distinct risk factors. This will be the first study to use SOM to explore the relationships among the blood-marker-associated subgroups in Crohn's disease, multiple genetic risk factors and immune responses, in order to learn more about the causes of Crohn's disease and to develop improved and individualized methods for diagnosing and treating this range of bowel disorders.
Li, Dalin; Haritunians, Talin; Landers, Carol et al. (2018) Late-Onset Crohn's Disease Is A Subgroup Distinct in Genetic and Behavioral Risk Factors With UC-Like Characteristics. Inflamm Bowel Dis 24:2413-2422