One characteristic of the information age is the exponential growth of information and its ready availability through networks, which has given rise to the "Big Data" era. Whereas the conventional approach to such large volumes of data relies on statistics such as averages, this project takes the opposite perspective: most of the value in the information lies in the parts that deviate from the average, the unusual and atypical. Familiar examples include valuable paintings or writings that deviate from the norm. The same could be true for venture development and scientific research. The goal of this project is to investigate atypicality from both a theoretical and a practical point of view. The proposed work uses the information-theoretic concept of description length, in a general sense, to characterize atypicality. The project is organized around two thrusts. The first is theoretical development to make precise what should be considered atypical from an information-theoretic point of view. The second is algorithm development to implement this theory, in particular fast algorithms that can process big data sets.
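As an illustration of how description length might characterize atypicality, the following is a minimal sketch and not the project's actual method. It assumes binary data, a "typical" model that is i.i.d. Bernoulli with a known parameter, and a fixed overhead (in bits) for marking where a specially coded segment begins and ends. All function names and the default overhead value are hypothetical. Under these assumptions, a segment is flagged as atypical when coding it with its own fitted model takes fewer bits than coding it with the typical model.

```python
import math


def typical_bits(segment, p_one):
    """Bits needed to encode a binary segment under the assumed
    typical model: i.i.d. Bernoulli with P(1) = p_one, 0 < p_one < 1."""
    ones = sum(segment)
    zeros = len(segment) - ones
    return -(ones * math.log2(p_one) + zeros * math.log2(1.0 - p_one))


def binary_entropy(q):
    """Entropy in bits of a Bernoulli(q) source."""
    if q in (0.0, 1.0):
        return 0.0
    return -(q * math.log2(q) + (1.0 - q) * math.log2(1.0 - q))


def atypical_bits(segment, overhead_bits=8.0):
    """Two-part description length of a non-empty segment coded with its
    own fitted Bernoulli parameter: data cost, plus a parameter cost of
    0.5 * log2(n), plus an assumed fixed overhead for delimiting it."""
    n = len(segment)
    q = sum(segment) / n  # empirical fraction of ones
    return n * binary_entropy(q) + 0.5 * math.log2(n) + overhead_bits


def is_atypical(segment, p_one=0.5, overhead_bits=8.0):
    """Flag the segment when its own code is shorter than the typical code."""
    return atypical_bits(segment, overhead_bits) < typical_bits(segment, p_one)
```

For example, with a fair-coin typical model, a run of 64 ones costs 64 bits under the typical code but far fewer under its own code, so `is_atypical([1] * 64)` is true, whereas a balanced segment such as `[0, 1] * 32` is not flagged because the overhead outweighs any saving.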

Large data sets, mainly from biology and medicine, including ECG (electrocardiogram) recordings and genomic data, will be analyzed in simulations, as such analysis can lead to new diagnostic methods for human cardiac disease.

Budget Start: 2014-04-15
Budget End: 2016-03-31
Fiscal Year: 2014
Total Cost: $79,641
Name: University of Hawaii
City: Honolulu
State: HI
Country: United States
Zip Code: 96822