This project will improve the process of patent search by combining methodologies for measuring the patent space developed at the University of Michigan School of Information with methodologies developed at IBM's Almaden Research Center for browsing and exploring topics and concepts within a large document collection. This combined methodology, called Patent Cartography, will leverage multiple taxonomies, related terms, network analytics, visualization, and user interaction to navigate, explore, and map the patent space.

As the rate of patenting has increased, so too has the problem of patent thickets, or dense webs of overlapping intellectual property rights that an organization must hack its way through in order to commercialize new technology. In certain industries characterized by cumulative innovations and multiple blocking patents, the existence of such densely concentrated patent rights can have the perverse effect of stifling innovation rather than encouraging it. A recent Federal Trade Commission report notes that in certain industries, the large number of issued patents makes it virtually impossible to search all the potentially relevant patents, review the claims contained in each of those patents, and evaluate the infringement risk or the need for a license. For many firms the only practical response to this problem of unintentional and sometimes unavoidable patent infringement is to file hundreds of patents each year so as to have something to trade during cross-licensing negotiations. In other words, the only rational response to the large number of patents in a given field may be to contribute to it.

Given that 200,000 US patents are issued each year, with new patents issuing each week, any attempt to analyze and comprehend the dynamic topology and interconnectedness of patent space will almost certainly have to be based on information technology. The process of patent search, however, has progressed little, even with the advent of searchable patent databases. Although it has been automated, the process has not been reengineered and thus remains functionally similar to the one developed in the nineteenth century.

End-users lack the tools needed to conduct exhaustive patent searches, while professional patent searchers are overwhelmed by an ever-increasing workload driven by growing awareness of the importance of intellectual property. The need for effective end-user patent search capabilities will only increase over time, yet that need cannot be met unless the traditional search process is reengineered.

Using sets of existing professionally generated patent searches as a baseline, this project will compare end-user searches conducted with existing tools against end-user searches conducted with the new methodologies. Several classes of end-users have been identified, differing in their prior knowledge of and experience with the patent system, and this project will compare the search results produced by each class of user with the professional patent search results. At the conclusion of the study, we will have not only a better understanding of the applicability of Patent Cartography and visualization techniques to the process of patent search, but also an understanding of how familiarity and experience with the patent system affect a user's ability to harness the potential of Patent Cartography.

This project will expand the boundaries of document retrieval beyond simple keyword and category search into multidimensional analysis that combines mixed-initiative data and text mining with visualizations. It will combine content analysis with network analysis and evaluate visualizations based on both proximity and dependency data. It will examine search processes and search strategies in the context of large document collections and develop models of task-based information searching, an area that has been generally under-explored. Finally, the project will develop an ethnographic account of the patent search process by observing patent search experts.

This project will extend the scope of research on searching, analyzing, and visualizing large document collections. Patent collections are one example of large digital library repositories that contain both structured and unstructured data to which these techniques can be applied. Other sources include Medline, Research Abstracts, repositories of the World Wide Web, and other enterprise repositories. This search process could be leveraged in health care, life sciences, pharmaceuticals, market research, competitive intelligence, knowledge discovery, and the tracking of technology maturity and innovation, to name just a few.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 0855352
Program Officer: William Bainbridge
Project Start:
Project End:
Budget Start: 2008-09-01
Budget End: 2011-08-31
Support Year:
Fiscal Year: 2008
Total Cost: $178,220
Indirect Cost:
Name: University of Houston
Department:
Type:
DUNS #:
City: Houston
State: TX
Country: United States
Zip Code: 77204