This project explores an emerging phenomenon of significant importance to academic research: public academic information resources on the social web are growing rapidly in both size and number. The goal of this project is to develop a quality assessment and association discovery framework for online academic information, and ultimately to establish a novel framework that supports researchers in accessing, organizing, utilizing, and exchanging all types of academic information. The proposed assessment framework will use the rich behaviors expressed in social reference data to design quality assessment measures that are tightly connected to the new data. The framework will draw on publicly available academic information from social reference sites, and will examine association rules that connect two articles based on time periods, topics, and social-network-based measures. The dynamic information environment of the web adds considerably to the challenges associated with this research. However, if successful, the results will be of considerable value to many communities of domain researchers.
For many years, scholarly communication has been based on paper-based dissemination channels, over which a citation-oriented academic quality evaluation framework presides. The new challenge in supporting scholarly communication is how to utilize the new and rapidly expanding public academic information resources on the social Web. This is motivated by the large quantity of scholarly communication now taking place on the social Web rather than through traditional paper-based channels. In addition, the rapid pace at which academic information is generated and used makes traditional quality and impact measures based on citation analysis too slow, and thus out of sync with the latest academic developments. With the support of our NSF grant titled "Tapping into Public Academic Information on the Social Web: Towards a Novel Academic Recommendation Framework" (IIS-EAGER 1052773), the iRiS Lab at the University of Pittsburgh’s School of Information Sciences, led by Prof. Daqing He, has conducted research to address these challenges. A data collection called PASS has been created, containing both traditional publication information and online social academic usage data (called social reference) from Web sites such as CiteULike; we used PASS to explore and identify the advantages and limitations of using social reference to study scholarly communication. The explorations demonstrate that social reference makes a unique contribution to modern fast-paced scholarly communication, and that it can be used to complement traditional citation-based methods. Several social reference-based academic influence measures have been validated. Another major contribution of this project is the development of several task-oriented recommendation algorithms using the PASS data. The iRiS Lab began by conducting experiments with real researchers and data about their reading behavior on the social Web.
The results demonstrate researchers’ strong interest in reading cross-disciplinary articles, and show that recommendation algorithms can perform more effectively once the issue of how to find cross-disciplinary articles is resolved. Through further studies using the PASS data, it was found that an article’s crucialness (i.e., the temporary state in which the article is useful but comparatively unknown in the community) can play a significant role in scholarly communication, yet it differs from the article’s popularity (measured by number of citations). This novel aspect of article quality, as identified by crucialness, can be better modeled using social reference data. Next, the iRiS Lab extended existing latent topic models with author associations on social networks, so that two novel models (called RATM and ATM-Regu) were developed and evaluated using the PASS data. ATM-Regu outperformed all other models, including two state-of-the-art baselines: ATM and TBMP-Regu. Finally, collaborative information seeking has become increasingly important because of the complexity of current academic projects. The iRiS Lab developed a set of search tactics that describe both the search and the collaborative activities in collaborative information seeking, and constructed HMM-based transition models for conceptualizing the critical steps in the process. During the course of the project, the iRiS Lab has developed several prototype systems related to scholarly communication. iRiS-IPS (http://crystal.exp.sis.pitt.edu:8080/acm/index__.jsp?uid=0) models people discovery (such as finding collaborators, identifying external reviewers, and so on) as an interactive exploratory process, in which users work with an integrated interface to specify the importance of different factors for selecting candidates. CollabSearch (http://crystal.exp.sis.pitt.edu:8080/CollaborativeSearch) is a web search system for either a single user or a team of users.
Its interface and backend data flow support both individual search and team collaboration. CollabSearch provides novel virtual working spaces in which a team can resolve complex information seeking tasks that require collaboration. Overall, the achievements of this project benefit the whole scholarly community, regardless of discipline or methodology, by providing new online academic resources, clarifying their effects and implications, and developing novel algorithms and systems. Further, the insights, algorithms, and systems developed here are a first step toward making academic information recommendation an important element of a global information infrastructure.
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.