Social scientists are increasingly using data from diverse, worldwide sources to understand the causes and consequences of social stratification, migration, economic development, cultural diversity, and violent conflict. This work frequently requires merging multiple datasets over fine-grained sociopolitical categories (e.g., ethnicities, languages, religions, or provinces). However, different datasets often encode corresponding sociopolitical categories with different variable names, at different scales, and at different points in time. These different encodings must be translated across datasets before merging, which creates a substantial bottleneck for interdisciplinary, comparative social science. This project will build a new data analysis platform called SocioMap to help overcome this bottleneck. SocioMap will provide tools for translating across a large and growing body of international social science datasets, and for building new datasets from existing datasets that store ethnic, religious, linguistic, and administrative boundaries in disparate and incompatible formats.
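The translation step described above can be pictured with a small sketch. This is not SocioMap's design or API; it is a minimal illustration, assuming a hypothetical crosswalk table that maps each dataset's local code to a shared category ID. The dataset names, codes, IDs, and values below are all invented placeholders.

```python
# Hypothetical crosswalk: (dataset name, local code) -> shared category ID.
# Every name and ID here is an invented placeholder, not real SocioMap content.
CROSSWALK = {
    ("survey_a", "Group A"): "cat_001",
    ("census_b", "GRP_A"): "cat_001",
    ("survey_a", "Group B"): "cat_002",
    ("census_b", "GRP_B"): "cat_002",
}


def translate(dataset, rows, key):
    """Attach a shared category ID to each row; unmapped codes get None."""
    return [
        {**row, "category_id": CROSSWALK.get((dataset, row[key]))}
        for row in rows
    ]


def merge_on_category(left, right):
    """Inner-join two translated row lists on the shared category ID."""
    index = {}
    for row in right:
        if row["category_id"] is not None:
            index.setdefault(row["category_id"], []).append(row)
    merged = []
    for left_row in left:
        for right_row in index.get(left_row["category_id"], []):
            merged.append({**left_row, **right_row})
    return merged


# Toy rows with made-up values; "Group C" has no crosswalk entry.
survey_a = [
    {"ethnicity": "Group A", "mean_income": 100},
    {"ethnicity": "Group B", "mean_income": 80},
    {"ethnicity": "Group C", "mean_income": 90},
]
census_b = [
    {"group": "GRP_A", "population": 5000},
    {"group": "GRP_B", "population": 3000},
]

merged = merge_on_category(
    translate("survey_a", survey_a, "ethnicity"),
    translate("census_b", census_b, "group"),
)
for row in merged:
    print(row["category_id"], row["mean_income"], row["population"])
```

Rows whose codes are missing from the crosswalk are dropped by the join rather than guessed at, which keeps untranslated categories visible as a gap to be filled instead of silently mismatched.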

SocioMap will be a user-friendly set of tools to help translate sociopolitical categories and classification schemes across multiple external datasets. This project will focus on four kinds of categories that are commonly used in social science research: ethnicities, languages, religions, and administrative boundaries. The beta version of SocioMap will be seeded with a critical mass of these categories, along with translations between them across dozens of common standards and hundreds of demographic surveys and censuses worldwide. Furthermore, SocioMap is designed to grow by permitting registered users to add new categories and translations for re-use in future projects. SocioMap’s tools will help users: (1) explore contextual information about specific sociopolitical categories, (2) translate and share categories from new datasets, standards, and published studies, (3) merge novel combinations of datasets for researchers’ custom research needs, and (4) document key aspects of the analytical workflow that often go unreported and, in so doing, make analyses more reproducible. SocioMap complements existing observational datasets by providing tools for linking these datasets with social, cultural, and demographic data from a diverse and growing body of new datasets. SocioMap’s capabilities will spur new interdisciplinary research, encourage new analyses of publicly funded data, and prompt re-assessments of the robustness and reproducibility of past findings. It will also create a foundation for future work with other kinds of entities (e.g., political parties, non-governmental organizations, industry and occupation classifications, and firms) and for reconciling sociopolitical entities over longer historical periods.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Behavioral and Cognitive Sciences (BCS)
Type: Standard Grant (Standard)
Application #: 2051369
Program Officer: Patricia Van Zandt
Budget Start: 2021-06-01
Budget End: 2023-05-31
Fiscal Year: 2020
Total Cost: $155,528

Name: Arizona State University
City: Tempe
State: AZ
Country: United States
Zip Code: 85281