This Small Business Innovation Research Phase I project combines methods from natural language processing (NLP) with regression and classification techniques from statistics and machine learning to determine the feasibility of associating opinions with outcomes in business and industry. The research objectives are the following: 1) determine whether automatically extracted opinion information is associated with security value trajectories, with asset value trajectories, or with some other measurable value (e.g., market penetration of a product or product line), and 2) use predictive models to investigate which specific media sources and opinion holders are most influential and to describe their influence on the outcome. The research builds on previous opinion-extraction research in which information extraction and machine learning techniques from natural language processing were adapted to handle subjective language. This project focuses on research in statistical modeling, where features/predictors derived from automatically extracted opinions will be used to augment predictions of interest to information analysts and decision makers in business and industry. If successful, the project will result in the development of services that allow decision makers to better understand who and what is influencing their company, customers, competitors, and marketplace, in an environment where trend-setting content originates from an exploding number of information sources.
Although this SBIR project focuses on the uses of automated opinion analysis in business and the financial market, the techniques and services that will be developed are domain independent: they can just as easily be applied to opinions and outcomes in other industries or in politics, regulatory policy, foreign policy, sports, and entertainment. The methods might also be used to track opinions on narrower topics of interest to users of the service, e.g., climate change, urbanization, sustainable architecture, or a favorite presidential candidate. The market opportunity for text analytics is projected to grow from the current $700 million to $2 billion over the next three years.