Modern technologies in science and engineering generate data that are growing in both size and complexity. Understanding such data with statistical and computational methods, and in particular identifying trends in them, is both important and challenging. Examples include large-scale sequence data from genomics and market data from economics. In genomics, researchers search for copy number variations (CNVs) by examining hundreds of thousands of biomarker measurements along the whole genome. In financial engineering, it is useful to identify and interpret abrupt changes in stock prices. In these examples, a primary goal is to discover structural changes in massive sequence data. In this collaborative project, the investigators intend to develop and study strategies for analyzing complex sequence data that are theoretically sound, flexible in practice, and portable, and to apply the proposed algorithms to real data for scientific discovery.

The investigators aim to develop scalable and flexible algorithms to identify and infer structural changes in contemporary high-throughput data. In particular, they will work on (a) fast change-point detection techniques that are flexible enough to handle non-Gaussian and dependent data; (b) a new statistical framework for joint analysis of multiple-sequence data; (c) theoretical foundations for change-point inference that can assign significance levels to detected change points and achieve control of the false discovery rate (FDR); and (d) applications to CNV data for scientific discovery. Moreover, the investigators plan to develop user-friendly, publicly accessible software for the proposed methods so that researchers can apply them directly to their own research problems.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 1722562
Program Officer: Christopher Stark
Project Start:
Project End:
Budget Start: 2017-07-01
Budget End: 2020-06-30
Support Year:
Fiscal Year: 2017
Total Cost: $47,484
Indirect Cost:
Name: University of South Carolina at Columbia
Department:
Type:
DUNS #:
City: Columbia
State: SC
Country: United States
Zip Code: 29208