Despite the prevalence of social network platforms and apps in daily life today, existing research on social media treats the "social" and the "media" separately, and fails to address important needs for intelligently managing and using social media, such as finding the information that users want, situating information in a social context that gives it meaning, and providing order and structure to an intricate and intertwined network of relationships. This interdisciplinary project will provide a holistic view of social media by combining socially intelligent language processing with linguistically motivated social network analysis. Specifically, the project will: (a) discover sociolinguistic communities and identify the demographic and sociological factors that underlie community membership; (b) discover cross-community linguistic variation at various levels and develop new computational tools for dialectometric and sociolinguistic analysis and for prediction of user interests and trends; and (c) recommend content and social connections across community boundaries, helping people broaden their perspectives with new information, opinions, and social relationships.

Intellectual merit: The project will lead to (a) new modeling formalisms that jointly incorporate linguistic information and social network metadata; (b) a new computational methodology for sociolinguistic investigation from raw text; and (c) flexible models of linguistic variation that capture temporal dynamics and move beyond simplistic bag-of-words approaches to higher-order phenomena such as multi-word expressions, syntax, and joint orthographic variation.

Broader impacts: The project will lead to advances in basic research in statistical machine learning, the social sciences, and language technology. It will also bring innovations and practical applications in all these areas, such as software that reasons intelligently about community structures and about linguistic patterns and conventions in social media. The findings will serve a wide range of needs, such as personalized information services and intelligence and security operations, which require precise and timely understanding of sociocultural events and trends. The project will also provide undergraduate research opportunities and outreach to high school students through summer programs.

National Science Foundation (NSF)
Division of Information and Intelligent Systems (IIS)
Standard Grant (Standard)
Program Officer: William Bainbridge
Carnegie-Mellon University
United States