The Department of Statistics at the University of Wisconsin at Madison will host a Workshop on Nonparametric Statistics for Big Data (June 4-6, 2014, Madison, WI). Nonparametric statistics is a fundamental area of statistics, at the interface of mathematics, statistics, data mining, engineering, and computer science. The complexity and scale of big data pose tremendous challenges for knowledge discovery and, at the same time, demand more powerful and flexible analysis techniques. In recent years, the field of nonparametric statistics has seen significant development in theory, methods, and computation to address emerging issues in big data analysis. New breakthroughs in nonparametric theory have broadened the horizon of classical large-sample asymptotic inference to accommodate high and ultra-high dimensional situations. A variety of cutting-edge statistical methodologies and state-of-the-art computational algorithms have been created for big data visualization, geometric representation, dimension reduction, and modeling and inference. These tools have had a significant impact on the sciences, engineering, and industry. A broad range of topics will be covered in the workshop, including sparse nonparametric regression, regularization and feature selection, high-dimensional inference and theory, spatial and environmental statistics, image analysis, and functional data analysis, as well as related topics in statistical machine learning such as supervised learning, clustering, network analysis, large-scale optimization, computational biology, and bioinformatics.

Due to recent technological advances, Big Data are collected ubiquitously in many scientific investigations, such as in biological, genomic, medical, climate, social, and environmental sciences. Given the complexity and huge range of systems being measured, a wealth of new "nonparametric" tools has emerged that require few assumptions and adapt to the patterns found in big data. This workshop will bring together broad interdisciplinary expertise from mathematics, statistics, computer science, machine learning, engineering, and biomedical research to highlight cutting-edge research from nationally and internationally renowned scholars and researchers. The workshop will use NSF funding for travel awards to attract graduate students and young researchers, with special attention to women and underrepresented minorities. It will create a unique opportunity for young researchers to interact with leading scientists. Through 30 plenary talks (expository, intermediate, and advanced), open floor discussions, and two poster sessions, the workshop will promote new connections and collaborations. Further, this workshop provides an important review that will highlight future research directions in nonparametric statistics for big data analysis.

Workshop web site: www.stat.wisc.edu/workshop-npbigdata

Project Report

The workshop had four major goals: (1) to bring together researchers in mathematics, statistics, computer science, machine learning, engineering, biomedicine, and other related fields to address recent developments and emerging issues in big data; (2) to promote opportunities for information exchange and dissemination of new results; (3) to discuss new ideas and future research directions for nonparametric statistics in big data; and (4) to provide opportunities for young researchers to present their work and interact with leading experts and scientists.

Major Activities: The two-and-a-half-day conference program consisted of 10 invited sessions of talks and a poster session. There were 22 research talks in total, and the graduate students with travel support were invited to present their work at the poster session. The sessions covered a wide array of topics, from regularization and smoothing to semiparametric modeling to high-dimensional and big data analysis.

Specific Objectives: The conference aimed to encourage interdisciplinary research collaborations, particularly between theoretical and applied areas. Several of the talks included substantive analysis in various scientific disciplines, including electronic wafer acceptance, earth depth measurement, gene expression for tumor diagnosis, and chromatographic fingerprints. Slide decks from all speakers were donated to the program and are available via the conference web site (www.stat.wisc.edu/workshop-npbigdata). In addition, photographs from the conference are available for download as a mosaic. Several talks focused on recent results in the theory and applications of nonparametric methods for big data analysis, ranging from spectral clustering and variable selection to spatial data and functional data analysis, with applications in geosciences, climate, biology, finance, energy, and medicine.
The breadth of these results showed the great impact of nonparametric statistics on big data analysis. Further impact includes forming new research directions, establishing new collaborations among researchers, and educating the next generation of statisticians and data scientists. The conference had an interdisciplinary focus that was evident in the presented talks. Participants spoke on recent applications of nonparametric theory and methods to genomics (Berkeley Drosophila Genome Project, the fruit fly project), medicine (tumor diagnosis, mammalian eye diseases, child metabolic data and nutrition study), biology (mucus modeling), dynamical system identification, finance (stock data), climate modeling (temperature change), energy (electronic energy demand), and geosciences (earth depth estimation). The major impact was to demonstrate the wide applicability of nonparametrics and smoothing methods to statistics and other areas of science. The conference aided in the training and recruitment of the next generation of statisticians and data scientists. Fifty-nine graduate students attended the conference and were exposed to cutting-edge research in nonparametrics for big data and its applications to scientific and societal problems. The presentation and discussion activities provided excellent opportunities for young researchers to interact with their peers and senior researchers and to develop their own careers as data scientists. The conference "Nonparametrics for Big Data" helped to strengthen the knowledge, training, and career prospects of young statisticians and data scientists. Their enhanced scientific abilities and awareness will help to improve the scientific, industrial, and economic competitiveness of the United States.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 1419219
Program Officer: Gabor Szekely
Budget Start: 2014-06-01
Budget End: 2015-05-31
Fiscal Year: 2014
Total Cost: $10,000
Name: University of Wisconsin Madison
City: Madison
State: WI
Country: United States
Zip Code: 53715