Are phenotype algorithms fair for underrepresented minorities within older adults? - Revision

Henderson, Victor; Periyakoil, Vyjeyanthi; Yesavage, Jerome

Abstract

Are phenotyping algorithms fair for underrepresented minorities within older adults? According to the latest population estimates from the US Census Bureau, the United States population is growing older and more diverse; its projections indicate that in 2035 there will be more people aged 65 and older than aged 18 and younger. In terms of diversity, the US Census Bureau projects that by 2060 the non-Hispanic White-alone population will shrink by 20 million, increasing the representation of currently underrepresented minority groups. These population milestones, paired with the rising use of artificial intelligence (AI) in medicine, present a distinctive and pressing problem: determining whether AI solutions are fair for underrepresented populations within older adults. The issue has been widely debated, with opinion pieces in the New York Times making it clear that, when built incorrectly, AI solutions can worsen health disparities in underrepresented populations because of the composition of their training sets. For example, it is easy to see how dermatology AI systems could be biased to perform better for lighter-skinned individuals, given the prevalence of such patients in the training datasets. Other biases stem from the lack of diverse research subjects, which contributes to considerably worse mortality outcomes for underrepresented groups. Electronic phenotyping algorithms are the cornerstone of automatic identification of selected patient groups for tasks such as disease classification, epidemiological studies, and clinical-trial recruitment. These algorithms are either rule-based or probabilistic (machine learning models) and are usually built from pooled patient populations (everyone combined); only a few, because of the nature of the phenotype, stratify populations in some way.
This proposed project seeks to identify bias in probabilistic electronic phenotyping algorithms for older populations and to create best practices and software tools to overcome it, with the goal of achieving better health outcomes.

Public Health Relevance

Are phenotyping algorithms fair for underrepresented minorities within older adults? We will determine racial and age bias in probabilistic phenotyping algorithms in both local and national datasets. After characterizing this bias, we will develop best practices and software tools that improve the phenotyping process and achieve fairness for underrepresented minorities.