Experimental studies are powerful because the experimental framework allows one to examine causal effects. Some research questions, however, are not amenable to an experimental framework. For example, if one is interested in whether long-term smoking causes lung cancer, it is not possible to randomly assign people to smoke for 30 years while randomly assigning others to not smoke for 30 years. Instead, this research must rely on observational data where one simply observes the rate of lung cancer among smokers and non-smokers who have made individual decisions about whether they will smoke. To get around the lack of randomization, researchers have attempted techniques that manipulate observational data to resemble the randomized experimental framework by matching individuals from the treatment group to individuals from the control group who are similar in measurable ways (age, income, education level, previous medical history, etc.) except that the treated individual smokes while the control individual does not. There is no consensus on how best to proceed in matching individuals, and the problem grows increasingly difficult as one tries to match on larger numbers of attributes. This project provides a novel formulation of the matching problem. The key insight is that matching individuals is neither necessary nor sufficient for simulating randomization. In an experiment, randomization ensures that the treatment group and the control group do not differ systematically on any attribute, but there does not need to be a "twin" in the control group for each treated subject. The proposed procedure ensures systematically similar treatment and control groups by choosing treatment and control groups that maximize similarity in attributes. Moving away from the individual "twin" approach allows for the exploration of a wider range of possible and better suited treatment and control groups.
The PIs formulate procedures for addressing this problem that are superior to existing methods.
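
The group-level idea described above can be illustrated with a small sketch. This is not the PIs' procedure: the balance measure (sum of absolute differences in attribute means), the exhaustive search, the function names, and the toy data are all assumptions chosen only to show how a control group can be balanced against a treatment group without any individual "twin".

```python
from itertools import combinations


def column_means(rows):
    """Mean of each attribute across a group of individuals."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def imbalance(treated, controls):
    """Hypothetical balance measure (an assumption, not from the source):
    sum of absolute differences between the two groups' attribute means."""
    t_means = column_means(treated)
    c_means = column_means(controls)
    return sum(abs(t - c) for t, c in zip(t_means, c_means))


def select_control_group(treated, pool, size):
    """Return indices of the `size`-member subset of `pool` whose attribute
    means are closest to those of `treated`.

    Exhaustive search over all subsets: fine for a toy example, but the
    number of subsets grows combinatorially with the pool size.
    """
    best = min(
        combinations(range(len(pool)), size),
        key=lambda idx: imbalance(treated, [pool[i] for i in idx]),
    )
    return list(best)


# Toy data (invented for illustration): each tuple is (age, income in $1000s).
treated = [(30, 40), (50, 60)]
pool = [(20, 30), (60, 70), (41, 52), (90, 10)]

chosen = select_control_group(treated, pool, 2)
print(chosen)                                         # -> [0, 1]
print(imbalance(treated, [pool[i] for i in chosen]))  # -> 0.0
```

In this toy data no control is identical to any treated individual, yet the selected control group has exactly the same mean age and income as the treatment group. A realistic procedure would need a richer balance measure than means alone and a search strategy that scales beyond a handful of candidates.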

Enhancing the ability to make causal inferences from observational data will stimulate research in a wide variety of fields and enhance our understanding of a broad array of phenomena. In political science, research on causal relationships includes, but is not limited to, understanding the role of information on voters in advanced versus new democracies, the impact of different technologies for counting votes, whether proportional rather than majoritarian electoral systems are more effective for incorporating underrepresented groups, the extent to which degrees of campaign exposure affect the type of information individuals possess about politics, whether voter canvassing efforts are effective, and the effect of affirmative action on passing bar exams. The research questions are important and diverse, and the potential applications are limitless given a proper research design. In medicine and health, causal inference studies include applications to criminality rates related to gene patterns, the effect of generic substitution of presumptively chemically equivalent drugs, and the effect of maternal smoking on birth weight, to name but a few. Studies are certainly not limited to political science and health, and it would be simple to compile similar lists of interesting and pressing questions in many other fields of study. The value and potential impact of this research extend to a diverse scholarly community.

Agency: National Science Foundation (NSF)
Institute: Division of Social and Economic Sciences (SES)
Type: Standard Grant (Standard)
Application #: 0849170
Program Officer: Brian D. Humes
Project Start:
Project End:
Budget Start: 2009-07-01
Budget End: 2012-06-30
Support Year:
Fiscal Year: 2008
Total Cost: $50,975
Indirect Cost:
Name: Southern Illinois University at Edwardsville
Department:
Type:
DUNS #:
City: Edwardsville
State: IL
Country: United States
Zip Code: 62026