Osteoarthritis (OA) is highly prevalent, contributes to substantial morbidity in the population, and lacks effective interventions to prevent onset and progression. Importantly, and like many other chronic conditions, OA is not a single disease but rather a heterogeneous condition consisting of multiple subgroups, or phenotypes, with differing underlying pathophysiological mechanisms. It is becoming increasingly clear that consideration of specific OA phenotypes in clinical studies and trials is critically needed to move the field forward. The overall goal of this line of work is to identify and understand potential phenotypes of knee osteoarthritis (KOA) to better inform future research efforts and treatments; this exploratory R21 project using OA Initiative (OAI) data will investigate novel methodology to support phenotyping in KOA. Successful treatments for OA will need to be targeted to, and tested in, specifically chosen OA phenotypes. Our hypothesis is that an understanding of KOA phenotypes, a key step toward Precision Medicine in OA, will lead to more successful clinical studies in the long term. To approach this important clinical problem, we propose a project in which we will apply innovative machine learning methods and validation strategies to data from the large, publicly available OAI cohort. We will leverage this large dataset, along with local expertise in statistics, biostatistics, and machine learning methodology, to tackle the problem of phenotyping this heterogeneous disease.
In Aim 1, we will use a data-driven, unsupervised learning approach to cluster features that best define and discriminate among phenotypes of KOA in the OAI dataset, using biclustering and a novel significance test (SigClust) developed by co-I Marron.
In Aim 2, we will test specific hypotheses of relevance to OA outcomes, such as differences between those with and without OA, or between those who do and do not develop new or worsening disease, using another set of machine learning methods (Direction-Projection-Permutation [DiProPerm] hypothesis testing and Distance-Weighted Discrimination [DWD]), also developed by co-I Marron, in the full cohort and in any clusters identified in Aim 1. To address these aims, this proposal involves interdisciplinary collaborations among experts in statistics, biostatistics, computer science, rheumatology, and epidemiology. This work will significantly impact the field by fulfilling a critical need to accurately define OA phenotypes, discover the key features associated with these phenotypes, link phenotype subgroups to underlying mechanisms, and use this information to inform and focus future clinical studies. In the long term, we expect that this strategy will lead to more personalized and successful management of the millions of people affected by OA.
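The general logic of a direction-projection-permutation test such as the DiProPerm method named in Aim 2 can be illustrated with a minimal sketch. This is not the project's implementation: the data are synthetic rather than drawn from the OAI, a simple mean-difference direction stands in for DWD, the test statistic (gap between projected group means) is one common choice among several, and all function names are hypothetical.

```python
import random


def mean_vector(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def mean_difference_direction(group_a, group_b):
    """Unit vector from the mean of group_b toward the mean of group_a.

    Assumption: this simple direction stands in for a DWD direction.
    """
    diff = [a - b for a, b in zip(mean_vector(group_a), mean_vector(group_b))]
    norm = sum(d * d for d in diff) ** 0.5
    if norm == 0:
        return [0.0] * len(diff)
    return [d / norm for d in diff]


def projected_mean_gap(group_a, group_b):
    """Direction and projection steps: gap between projected group means."""
    w = mean_difference_direction(group_a, group_b)
    proj_a = [sum(x * wi for x, wi in zip(row, w)) for row in group_a]
    proj_b = [sum(x * wi for x, wi in zip(row, w)) for row in group_b]
    return sum(proj_a) / len(proj_a) - sum(proj_b) / len(proj_b)


def diproperm_pvalue(group_a, group_b, n_perm=500, seed=0):
    """Permutation step: relabel, refit the direction, recompute the statistic."""
    rng = random.Random(seed)
    observed = projected_mean_gap(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        # The direction is re-estimated inside projected_mean_gap for every
        # relabeling, so the null distribution reflects the fitting step too.
        if projected_mean_gap(pooled[:n_a], pooled[n_a:]) >= observed:
            exceed += 1
    # Add-one correction keeps the empirical p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)


def simulate(n, dim, shift, rng):
    """Synthetic Gaussian features; purely illustrative, not OAI data."""
    return [[rng.gauss(shift, 1.0) for _ in range(dim)] for _ in range(n)]


data_rng = random.Random(1)
cases = simulate(30, 20, 0.8, data_rng)      # hypothetical "with OA" group
controls = simulate(30, 20, 0.0, data_rng)   # hypothetical "without OA" group
observed, p_value = diproperm_pvalue(cases, controls)
print(f"projected mean gap = {observed:.3f}, permutation p-value = {p_value:.4f}")
```

Re-estimating the direction within each permutation is the design choice that matters here: in high-dimensional data almost any two groups can be separated along some direction, so the observed separation is only meaningful relative to what random relabelings achieve.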

Public Health Relevance

Osteoarthritis is an enormous and increasing public health problem, and like many other chronic conditions it is not a single disease but a heterogeneous condition consisting of multiple subgroups, or phenotypes, with differing underlying mechanisms. The lack of appreciation of this heterogeneity has contributed to the failure of all attempts to date to develop disease-modifying osteoarthritis drugs; future trials will need to target specific OA phenotypes. There is a critical need to define and understand phenotypes in OA and link these to outcomes, leading to more personalized and successful management of this common and debilitating disease.

National Institutes of Health (NIH)
National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIAMS)
Exploratory/Developmental Grants (R21)
Study Section
Skeletal Biology Structure and Regeneration Study Section (SBSR)
Program Officer
Zheng, Xincheng
University of North Carolina Chapel Hill
Internal Medicine/Medicine
Schools of Medicine
Chapel Hill
United States