PROPOSAL NUMBER: DMS-0505599
INSTITUTION: Rutgers University New Brunswick
NSF PROGRAM: STATISTICS
PRINCIPAL INVESTIGATOR: Madigan, David
PROPOSAL TITLE: Bayesian Methods for Large-Scale Applications
The investigators work on Bayesian statistical methods for large-scale applications. Three applications provide the backdrop for the work. "Text categorization" concerns the automatic assignment of documents to predefined categories and requires ultra-high-dimensional supervised learning models. "Authorship attribution" uses similar methodology but attempts to identify the authors of anonymous documents. The "localization" problem uses signal characteristics to locate users in wireless networks. The investigators focus on technical challenges that span these applications, including sequential Bayesian analysis, non-linear optimization, and novelty detection algorithms.
In both the business and scientific realms, computing advances have drastically altered the role of data analysis. Historically, analysts produced data locally to address specific research questions. Now, ubiquitous computing and cheap storage have decoupled the production of data from the research questions. Data of all kinds are produced and deposited in remotely accessible databases with myriad questions in mind, both foreseen and unforeseen. Statistics has historically focused on squeezing the maximum amount of information out of limited data. The investigators' work focuses instead on so-called Bayesian statistical methods for these emerging larger-scale applications.