A central challenge in conducting research on policy questions in economics is a lack of high-quality data. Survey datasets are limited by small sample sizes, often do not follow individuals over time, and suffer from poorly measured variables. Working in collaboration with the Internal Revenue Service (IRS) over the past two years, the PIs (Raj Chetty, John Friedman, and Emmanuel Saez) recently obtained access to the universe of individual tax records of the United States. This is the first time that researchers outside the government have worked with these data, which provide an unprecedented resource for academic and policy research. The dataset spans 1996-2008 and contains 1.7 billion tax records, including all forms (e.g., 1040s, W-2s, and 1099s). In addition to earnings, the data can be used to study housing purchases, investment decisions, business startups, college education choices of children, intergenerational income mobility, and many other outcomes. Thus, these data provide a wealth of information to study questions not only in the economics of taxation but also more broadly in public finance, labor economics, economics of education, finance, and macroeconomics.

This project has two objectives. The first is to prepare these data for academic and policy research. The data are stored in many different files and must be cleaned and structured for use in research projects. Key steps include: (1) addressing problems of missing tax forms and outliers, (2) merging age and death information from other government databases, (3) creating individual employer-employee matched wage earnings histories by linking W-2s to 1040s, (4) linking parents to children, (5) identifying college attendance, and (6) developing clear documentation of the merged dataset for future researchers. These first steps are highly labor intensive.
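Step (3) above can be illustrated with a minimal sketch. It assumes hypothetical, simplified record layouts (the field names `person_id`, `employer_id`, `year`, and `wages` are illustrative and do not come from the IRS files), and it is not the PIs' actual procedure. It groups W-2 wage records by individual and flags whether a 1040 was filed for the same person and year.

```python
from collections import defaultdict

def build_wage_histories(w2_records, filed_1040):
    """Group W-2 records into per-person wage histories.

    w2_records: iterable of dicts with hypothetical keys
                person_id, employer_id, year, wages.
    filed_1040: set of (person_id, year) pairs for which a 1040 exists.
    Returns {person_id: {(year, employer_id): {"wages": ..., "filed_1040": ...}}}.
    """
    histories = defaultdict(dict)
    for rec in w2_records:
        pid, year = rec["person_id"], rec["year"]
        key = (year, rec["employer_id"])
        entry = histories[pid].setdefault(
            key, {"wages": 0.0, "filed_1040": (pid, year) in filed_1040}
        )
        # Several W-2s from the same employer in one year are summed.
        entry["wages"] += rec["wages"]
    return dict(histories)

# Hypothetical toy data, for illustration only.
w2_records = [
    {"person_id": 1, "employer_id": "E1", "year": 2000, "wages": 20000.0},
    {"person_id": 1, "employer_id": "E1", "year": 2000, "wages": 5000.0},
    {"person_id": 1, "employer_id": "E2", "year": 2001, "wages": 30000.0},
    {"person_id": 2, "employer_id": "E1", "year": 2000, "wages": 12000.0},
]
filed_1040 = {(1, 2000), (1, 2001)}

histories = build_wage_histories(w2_records, filed_1040)
```

Keeping W-2 earners who did not file a 1040 (person 2 in the toy data) in the history, rather than dropping them, is one possible design choice for handling missing forms; the source does not say how such cases are treated.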

The second objective is to use these data to research three specific questions relevant to the effects of government policy: (1) How do income shocks affect individuals' labor and investment behavior? The PIs will investigate both traditional labor supply income effects and also the impact of income grants on entrepreneurship, homeownership, the decision to send children to college, and intergenerational income mobility. (2) What are the long-term effects of income support programs such as the Earned Income Tax Credit on earnings, children's education, and income mobility? (3) How do local economic shocks such as government stimulus or plant closures propagate through communities via general equilibrium and spillover effects? The analysis will suggest new theories and provide much more precise estimates of key parameters such as local fiscal multipliers and income elasticities.

The intellectual merit of this project is twofold. First, it will pave the way for all academic researchers to make use of the IRS data to study a wide array of policy and academic questions. The IRS views this project as a demonstration of how its data can be used for academic research, and a successful demonstration may lead to wider data access through a mechanism analogous to Census data centers. Second, the PIs will contribute to knowledge about long-run and general equilibrium impacts of income support and stimulus policies, which are difficult to study using existing data.

The broader impact of this project is to help policymakers design government policies that are more likely to maximize welfare. For instance, these results will help policymakers understand the long-term consequences of the Earned Income Tax Credit for the poor. Does this policy improve the income mobility of the poor or instead foster a cycle of dependence on government support? More generally, the basic investment of preparing these data for research will allow policymakers to obtain much more precise answers to a broad range of policy questions in the years to come.

Project Report

With the support of our NSF grant, we have completed three projects addressing important questions about the outcomes of tax and education policy.

The Earned Income Tax Credit (EITC) is the largest program in the U.S. that directs money towards low-income families with children, paying out nearly $50 billion per year in subsidies to low-income households. Our research exploits regional differences in knowledge about the EITC benefit structure to assess the impacts of that structure across the distribution of eligible incomes. As predicted by theory, we find that the EITC compresses the earnings distribution around the level of wages at which the credit is maximized. Workers receiving a subsidy increase the amount they earn, while those facing higher tax rates in the phase-out range cut back. Since we use double-reported W-2 wages as our measure of income, these responses reflect true changes in labor supply and not just income misreporting. Our findings also suggest that state EITC programs successfully support low-income families without discouraging work, but we find no increase in the incentive for the lowest earners to enter the labor force.

Another of our projects assessed the long-term impacts of early class size in a reevaluation of Project STAR data. Project STAR was an experiment that randomly assigned 12,000 students to classrooms between kindergarten and 3rd grade in Tennessee during the years 1985-1989. Though hundreds of papers have been written on the results, they have generally been confined to using test scores as the outcome of interest, and have drawn different conclusions depending on the horizon of test scores considered. Our research looks beyond test scores by linking students' STAR records to their adult incomes, allowing us to examine how early class size affects income later in life. We find that early test score gains translate into large long-run improvements for the STAR subjects. For each percentile point of test score gains, students earned 0.3% higher annual wages and were 0.1 percentage points more likely to attend college at age 20. These numbers imply that interventions in early education have tremendous returns.

Our third project looks at the role of teachers in generating positive long-run outcomes for students. There is considerable debate about the best way to measure and improve teacher quality. One method is to evaluate teachers based on their impacts on students' test scores, commonly termed the "value-added" (VA) approach. A teacher's value-added is defined as the average test-score gain for his or her students, adjusted for differences across classrooms in student characteristics such as prior scores. By tracking one million children from a large urban school district from 4th grade to adulthood, we evaluate the accuracy of standard value-added measures using several methods, including natural experiments that arise from changes in teaching staff. This study has two primary findings: first, that value-added measures accurately capture teachers' impacts on student achievement, and second, that teacher quality has a substantial impact on future earnings.
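The definition of value-added given above can be made concrete with a small worked sketch. It is a deliberately simplified illustration, not the estimator used in the study: it assumes prior score is the only student characteristic, adjusts for it with a single pooled least-squares line, and then averages each teacher's residuals. All names and the toy data are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b

def value_added(records):
    """records: iterable of (teacher_id, prior_score, current_score).

    Returns {teacher_id: mean residual of current score after
    adjusting for prior score with one pooled regression line}.
    """
    records = list(records)
    a, b = fit_line([r[1] for r in records], [r[2] for r in records])
    residuals = defaultdict(list)
    for teacher, prior, current in records:
        # Residual: how far the student scored above or below the
        # score predicted from the prior score alone.
        residuals[teacher].append(current - (a + b * prior))
    return {t: mean(v) for t, v in residuals.items()}

# Hypothetical toy data: two teachers, two students each.
records = [
    ("A", 40, 50), ("A", 60, 70),
    ("B", 40, 40), ("B", 60, 60),
]
va = value_added(records)
print(va)
```

In this toy data the pooled line predicts scores of 45 and 65, so teacher A's students score 5 points above prediction and teacher B's students 5 points below, giving value-added of +5 and -5 respectively.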

Agency: National Science Foundation (NSF)
Institute: Division of Social and Economic Sciences (SES)
Type: Standard Grant (Standard)
Application #: 1025490
Program Officer: Georgia Kosmopoulou
Budget Start: 2010-09-01
Budget End: 2013-08-31
Fiscal Year: 2010
Total Cost: $285,348

Name: Harvard University
City: Cambridge
State: MA
Country: United States
Zip Code: 02138