In the past thirty years, changes in technology, business, and government practice have substantially altered the American occupational structure. Our project provides a foundation to understand the consequences of new occupations on the current economy and contemporary society, and to preserve unique data key to documenting these fundamental historical changes. Specifically, this project modernizes the occupational and industry data in the General Social Survey (GSS) from the 1970s to the present time.

The project has several goals. They dovetail with recent key NSF recommendations that encourage large infrastructure data sources such as the GSS to facilitate increased data access and dissemination. This can be done by presenting data and metadata according to a well-defined protocol, which will allow desirable modes of data access, search, download, and documentation. The project also meets the NSF challenge to retrofit historical or legacy data and metadata to become machine readable. This could open up vast amounts of data for dissemination and analysis once issues of confidentiality and disclosure are resolved.

To accomplish these goals, this project will (1) retrieve GSS respondents' detailed verbatim descriptions of their work activities, occupations, and industries from the physical questionnaire manuscripts of early GSS waves, (2) convert them into machine-readable form, (3) recode them to reflect 2010 occupation and 2007 industry categories developed by the U.S. Census, and (4) attach external data such as socioeconomic scores and prestige assessments to the recoded categories.
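Steps (3) and (4) can be illustrated with a minimal sketch. It assumes a coder has already assigned a category to each digitized verbatim. Everything named here is hypothetical: the `CodedResponse` record, the `attach_scores` helper, the placeholder category labels (`OCC_A`, `OCC_B`), and the score values are invented for illustration and are not real Census codes, real prestige scores, or the project's actual procedure.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical lookup table: recoded occupation category -> external score.
# Labels and values are placeholders, not real Census codes or prestige scores.
PRESTIGE_BY_CATEGORY = {
    "OCC_A": 55.0,
    "OCC_B": 40.0,
}


@dataclass
class CodedResponse:
    """One digitized verbatim plus the category a coder assigned to it."""
    respondent_id: int
    verbatim: str                     # machine-readable text from step (2)
    category: str                     # category assigned in step (3)
    prestige: Optional[float] = None  # external score attached in step (4)


def attach_scores(response: CodedResponse) -> CodedResponse:
    """Attach the external score for the response's category, if one exists.

    The verbatim is kept on the record so that it can be recoded later
    under a different classification standard.
    """
    response.prestige = PRESTIGE_BY_CATEGORY.get(response.category)
    return response


if __name__ == "__main__":
    example = CodedResponse(1, "fixes engines at a garage", "OCC_A")
    print(attach_scores(example))
```

Keeping the verbatim text next to the assigned category, rather than storing only the code, is what lets the same response be recoded under a later standard.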

The intellectual merit of digitizing occupational information and recoding occupational and industry categories in the process is that it enables researchers to use the full potential of the occupation and industry information recorded in the GSS over time. Doing so will enhance the value of the GSS as a resource for comparative and contemporary research on social inequality, mobility, and other fields and preserve its growing value as a historical database describing trends in U.S. society over two generations. Ensuring the longevity of such legacy data by converting hand-written text into machine-readable text, the project also develops an archive of verbatim descriptions that will allow future researchers to code them using other standards, including U.S. Census standards that may become available in upcoming decades.

Broader Impacts

The GSS is a public resource as well as a scientific one. Public media, especially newspapers, make extensive use of the GSS. By improving the quality of occupational and industry information in the GSS and ensuring that it is coded in a consistent way over time, this project will help journalists and citizens make sense of social trends and patterns. High schools and colleges also make extensive use of the GSS as a teaching tool. Teachers and students will get more out of these exercises when the data products this project produces reflect contemporary distinctions among occupations and industries as accurately and precisely as possible.

Project Report

To better understand important societal changes over the last 40 years, occupation and industry variables relating to respondents, spouses, fathers, mothers, and others were coded into one consistent classification using the latest industry and occupational classifications: the 2007 NAICS codes of industry and the 2010 Census codes of occupation. Verbatims from previous rounds (1972-2010) of the General Social Survey (GSS) were accessed and recoded into the consistent, updated classifications. Altogether, 348,624 new industry and occupation codes were assigned. In addition, the newly collected 2012 GSS data were coded directly into the new codes. Subsequent GSSs will continue to use these same codes and will thus extend the time series.

The new industry and occupation codes covering 1972-2012+ will greatly enhance research on intergenerational social mobility in general and occupational mobility in particular, on the nature and causes of growing income inequality, on the rewards and benefits of higher education, on the level of and variations in job satisfaction, and on other important trends. Both basic research and applied, policy-related research will be assisted by these new codes. The new codes will be released to the social-science community and the general public without charge as part of the 1972-2014 GSS cumulative file in early 2015.

National Science Foundation (NSF)
Division of Social and Economic Sciences (SES)
Standard Grant (Standard)
Program Officer
Patricia White
National Opinion Research Center
United States