Despite evidence that innovation has become increasingly collaborative, our understanding of collaborative creativity and its impact on economic productivity remains incomplete. For example, how should firms structure the collaborations of their inventors, given that networks which enhance the generation of a new idea also appear to hamper the dissemination of that idea? How does knowledge flow within and between regions, and how can policy makers influence those flows for maximal social welfare? How does investment in scientific research result in peer-reviewed publication, the diffusion of knowledge, invention, and patenting, and ultimately in gains in economic productivity? These questions remain unanswered because social network data are difficult to gather, particularly across time, space, and boundaries. This proposal makes it possible to answer these questions by calculating and posting millions of relational data points, based on all co-authorship ties between inventors of U.S. patents, from 1963 through the present.
The database complements the National Bureau of Economic Research (NBER) patent database by providing the social networks of patent co-authorships. It creates a standard social network patent database at the individual inventor level and at aggregate levels, including the organizational, regional, and technological. It reduces barriers to entry for scholars who lack the requisite programming skills and hardware to create the data on their own, and it enables real-time graphing of patent co-authorship networks. It provides a website through which actual inventors can assess the accuracy of the algorithms used to uniquely identify them in the patent database. Finally, the project provides data on a public website accessible to scholars, business analysts, and students. Just as the original NBER database has unleashed broad and diverse scholarship on innovation (over 500 papers cited it by early 2008, according to Google Scholar), the social network database project is expected to unleash a similar wave of research, focused on collaborative creativity and the social networks of inventors and their organizations.
Broader Impacts: The research will publish social network data from all U.S. patent co-authorships (1963-present) for use by researchers, students, and business analysts. It will provide the ability to visualize these networks in real time, with a variety of variables represented by the color and size of the nodes and co-authorship links. It will help answer how managers should structure collaborative relationships and how information flows across organizational, regional, and technological boundaries. It will enable tracing the career productivity and mobility of millions of inventors around the world. In conjunction with other databases on research grants and scientific publication, it will illuminate the process of knowledge creation and dissemination at many levels of analysis, from the individual to the organizational and international.
The primary purpose of this project is to make social network data, based on the co-authorship of U.S. patents from 1963 through the present, available to researchers and the public at large. The plan, within six months of project start, is to make the data available from the Harvard MIT Data Center. The data will also be archived in the Henry A. Murray Research Archive. Raw data and the source code will be included in these postings.
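As an illustration of what co-authorship network data look like, the following is a minimal sketch, not the project's actual code. It assumes each patent has already been mapped to a list of uniquely identified inventors; the function name `coauthorship_edges`, the patent numbers, and the inventor identifiers are hypothetical. Every pair of inventors who share a patent receives one tie, and repeated collaborations accumulate as edge weight.

```python
from collections import Counter
from itertools import combinations


def coauthorship_edges(patents):
    """Count co-authorship ties: one tie per pair of inventors sharing a patent.

    `patents` maps a patent number to a list of inventor identifiers.
    Returns a Counter keyed by (inventor_a, inventor_b), with a < b.
    """
    edges = Counter()
    for inventors in patents.values():
        # Sort and de-duplicate so each undirected pair has one canonical key.
        for a, b in combinations(sorted(set(inventors)), 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical toy records (patent number -> disambiguated inventor IDs).
patents = {
    "P1": ["inv_a", "inv_b", "inv_c"],
    "P2": ["inv_a", "inv_b"],
    "P3": ["inv_d"],  # a sole inventor contributes no ties
}

edges = coauthorship_edges(patents)
for (a, b), weight in sorted(edges.items()):
    print(a, b, weight)
```

An edge list of this form can be aggregated further, for example by replacing inventor identifiers with their organization or region, to obtain networks at the aggregate levels described above.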
The ability to collect, organize, and analyze large amounts of patent data has become increasingly important in our effort to understand the processes of invention, innovation, and job creation. The U.S. patent database since 1975 is available to researchers, yet the individual inventors and their careers across this time period are not identified in the raw data. For example, the Principal Investigator in this study has two patents: one is credited to Lee Fleming, who lives in Fremont, CA, and works for Hewlett Packard, and the other is credited to Lee O. Fleming, who lives in Fremont, CA, and also works for Hewlett Packard. While a naïve reader could probably determine that this is the same person, this decision cannot be made manually across 40 years and millions of inventors and patents (think of someone having to compare every single inventor on every patent to every other inventor on every other patent, essentially an intractable "clustering" problem). To solve this problem, we applied automated methods (Bayesian supervised learning is the technical term) and have made the inventor careers available to the general public (http://dvn.iq.harvard.edu/dvn/dv/patent). The data have been downloaded almost 10,000 times by researchers and businesses. Much research, some of it still in progress and not yet published, has been enabled by these data. As an example, the Principal Investigator applied these data to determine the impact of noncompete agreements on inventor mobility (noncompetes are employment contracts that forbid an employee from working for a competitor, typically within the same state and for a period of time, for example, a few years). With colleagues, he determined that the enforcement of noncompetes causes a brain drain from states that enforce them to states that do not (think of the best inventors leaving Massachusetts to work in California). This occurs because inventors have better job prospects and greater career flexibility in states that do not enforce noncompetes.
The brain drain shows up in the cross-sectional data of the last 40 years, but more importantly from a research viewpoint, it shows up very strongly in a natural experiment in Michigan. In 1985, the Michigan legislature inadvertently flipped the legality of noncompetes from prohibited to allowed. Using this unexpected change, the PI established that noncompetes drive the best and most collaborative inventors out of states that enforce them and into states that do not. This is an important policy result, as such elite inventors are exactly those who drive technology economies and create jobs, in both established firms and, even more importantly, in start-ups and entrepreneurial ventures. Much of Silicon Valley's success has been attributed to California's refusal to enforce noncompetes, yet this was the first research to establish a causal linkage between noncompetes and brain drain. It could not have been done without the research to identify the inventor careers in the first place.
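To make the inventor identification problem concrete, the following is a simplified sketch under stated assumptions. It is not the Bayesian supervised learning method used in the project: it substitutes a hand-written matching rule, and the record fields, the rule in `same_person`, and the toy records are all hypothetical. It does show two ideas that make the comparison tractable: records are compared only within a block (here, a shared last name) rather than all against all, and pairwise matches are merged into clusters with a union-find structure.

```python
from collections import defaultdict
from itertools import combinations


def same_person(r1, r2):
    """Hypothetical rule: compatible names plus a shared city or assignee."""
    if r1["first"].lower() != r2["first"].lower():
        return False
    m1, m2 = r1["middle"].lower(), r2["middle"].lower()
    if m1 and m2 and m1 != m2:
        return False  # two different middle initials: treat as distinct people
    return r1["city"] == r2["city"] or r1["assignee"] == r2["assignee"]


def cluster(records):
    """Return one cluster label per record, merging matched pairs."""
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Blocking: only records with the same last name are ever compared.
    blocks = defaultdict(list)
    for i, rec in enumerate(records):
        blocks[rec["last"].lower()].append(i)

    for idxs in blocks.values():
        for i, j in combinations(idxs, 2):
            if same_person(records[i], records[j]):
                parent[find(i)] = find(j)

    return [find(i) for i in range(len(records))]


# Hypothetical inventor records, one per inventor-patent pair.
records = [
    {"first": "Lee", "middle": "", "last": "Fleming",
     "city": "Fremont", "assignee": "Hewlett Packard"},
    {"first": "Lee", "middle": "O.", "last": "Fleming",
     "city": "Fremont", "assignee": "Hewlett Packard"},
    {"first": "Lee", "middle": "", "last": "Fleming",
     "city": "Boston", "assignee": "Acme"},
    {"first": "Ann", "middle": "", "last": "Smith",
     "city": "Fremont", "assignee": "Hewlett Packard"},
]

labels = cluster(records)
print(labels)
```

A learned model would replace the fixed rule with a match probability estimated from labeled examples, but the blocking and clustering steps would play the same role.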