This project, funded by the Ethics and Values in Science, Engineering and Society component of the Science and Society Program, studies lapses of scientific integrity arising from failures to compare the accuracy of a preferred analysis method with that of alternative available methods. Contemporary applied science is driven by dramatic increases in both measurement capacity and computational power, resulting in a fundamental shift from a traditional hypothesize-and-test methodology to a search-databases-for-hypotheses methodology. The results of algorithmic data analysis based on large databases are often essentially impossible for independent groups of scientists to replicate or test, owing to practical considerations of cost, algorithm documentation, and data access. One of the bastions of scientific reliability, independent confirmation or refutation, is therefore increasingly diminished if not lost. Accordingly, it is important to the accuracy and reliability of automated or semi-automated science that professional standards be developed and inculcated across a wide range of sciences, requiring objective empirical testing, proofs of correctness under explicit assumptions, understanding of those assumptions in concrete applications, and consideration of alternative methods addressing similar goals. This project will describe cases in which validation of automated and semi-automated procedures has been done systematically and carefully, and contrast them with cases in which one or another important aspect of evaluating an algorithm's accuracy has been neglected. Cases will include work on coastal sea floor mapping, estimation of gene regulation networks, robotic determination of mineral composition, uses of factor analysis and regression in estimating effects of intelligence on behavior, estimates of the effects of low-level lead exposure on the intelligence of children, forecasts of wildfire, and estimates of pneumonia prognosis.
The investigation will articulate general standards for the validation of search and forecasting procedures, and consider a variety of reasons why such standards are sometimes not met, both in the particular examples and more generally. This work will result in a book, in a module for a widely used automated course on causal and statistical reasoning, and in policy essays. The broader impact of this research should be a heightened awareness of methodological issues in the deployment of automated search procedures in the applied sciences and, one hopes, an eventual emphasis in the instruction of students in many fields on the necessity of establishing the conditions for the accuracy of search and forecasting procedures, and on the obligation to use the most reliable feasible procedures available.

Agency
National Science Foundation (NSF)
Institute
Division of Social and Economic Sciences (SES)
Type
Standard Grant (Standard)
Application #
0551838
Program Officer
Laurel A. Smith-Doerr
Project Start
Project End
Budget Start
2006-06-01
Budget End
2008-09-30
Support Year
Fiscal Year
2005
Total Cost
$145,000
Indirect Cost
Name
Florida Institute for Human and Machine Cognition, Inc.
Department
Type
DUNS #
City
Pensacola
State
FL
Country
United States
Zip Code
32502