This U.S.-France cooperative research project between Hichem Frigui's research group at the University of Memphis and the IMEDIA research team led by Nozha Boujemaa at the French National Institute for Research in Informatics and Applied Mathematics (INRIA) in Rocquencourt focuses on developing effective clustering algorithms for categorizing massive multi-modal scientific data collections. The proposal addresses theoretical aspects of clustering algorithms and their application to analyzing and organizing two scientific data sets: (1) scientific text and related images in botanical biodiversity data produced in the last 50 years; and (2) gene expression data from the Arabidopsis thaliana genome project.
Intellectual Merit: The project will advance knowledge in data clustering methods by integrating fuzzy set theory and statistical estimators. The investigators will develop new algorithms that identify clusters of data in subspaces, that combine multi-modal features, and that use a small number of labeled samples to guide the clustering process.
Broader Impacts: The new algorithms, along with the scientific data sets, will be combined with the U.S. investigator's content-based image retrieval (CBIR) prototype funded under his NSF-CAREER project. The CBIR prototype will be used in educational activities for freshman engineering and computer science students as well as high school students. The methods developed will be useful for applications in information security, bioinformatics, and content-based multimedia, and for other large data sets. Through this award, U.S. students will have the opportunity to develop research skills in an international research environment and to initiate partnerships with French researchers for future collaboration.