Assignment of individuals to categories of race, ethnicity and ancestry impacts health and public policy, yet the practice remains both scientifically and culturally controversial. The established means of determining race and ethnicity, as commonly used for census and health questionnaires, is self-identification. However, evidence is accumulating from social science research that an individual's reported ancestry depends on social and cultural context. At the same time, modern genetic studies have identified robust markers of ancestry. Although some studies have shown good correlation between genetic ancestry and self-identified race and ethnicity (SIRE) for major ethnic groups in the United States, correspondence is weaker and classifications are inconsistent for heterogeneous and admixed groups, most notably for African American, Hispanic and multi-race individuals. While it is clear that current classifications based on SIRE do not align reliably with genetic ancestry, alternative means of self-identification have not been tested in the context of genetics. Contention over racial and ethnic definitions and the role of genetics in social science research has resulted in limited interdisciplinary work examining how individuals self-identify. This novel cross-disciplinary study will build on contemporary knowledge in population genetics and social demography to produce a framework for understanding ancestry that integrates both self-identification and genetic information. We will rigorously assess alternative approaches to measuring SIRE by directly comparing survey responses with genetic markers for the same individuals. We will design questionnaires that query multiple dimensions of ancestry and group identity. Based on these survey responses, we will use computational approaches to identify clusters of individuals and validate them against those determined from genetic analysis.
We will examine whether novel and multiple measures of race, ethnicity and ancestry allow for more consistent and salient classifications of human population groups than previously achieved. We will also develop novel self-identification methods for historically admixed populations for whom current means of self-identification and classification align poorly with genetic ancestry. We will measure the degree to which individuals' motivations for, or interest in, personal genealogy and ancestry play a role in their responses to questions regarding their own ancestry, and query respondents on general attitudes toward defining and interpreting genetic ancestry. Our findings will allow assessment of the factors contributing to variation between SIRE and genetic ancestry and development of best practices for future use of SIRE in medical and genetic research. Further, given the broad applications of SIRE data in the U.S., this study has the potential to transform methodological approaches to the study of race, ethnicity and ancestry not only across a wide range of disciplines but also in public policy data collection.
The established means of determining race and ethnicity for biomedical research and for census and health questionnaires is self-identification, but social science research has shown that an individual's reported ancestry depends on social and cultural context. Meanwhile, modern genetic studies have identified robust markers of ancestry. This study will build on contemporary knowledge from these two fields to produce a framework for understanding ancestry that integrates both self-identification and genetic information. Our findings will have significant cultural impact, encompassing broad areas of academic research and public policy.