This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. The subproject and investigator (PI) may have received primary funding from another NIH source, and thus could be represented in other CRISP entries. The institution listed is for the Center, which is not necessarily the institution for the investigator.

Clustering is considered one of the most powerful techniques for discovering relationships among records such as genes, networks, or other biomedical data. Because a large number of clustering algorithms exist and no single algorithm can be chosen computationally to suit a variety of situations and data sets, new ways to tackle this problem are needed. The exponential growth of data exacerbates the problem and drives the need for new approaches. Our work addresses questions about the relationships among records across different algorithms and groups, and enables scientists to examine multiple results in order to draw meaningful conclusions. We created new analytical and visual tools and techniques that provide insight into the results of single and multiple clustering algorithms. Our work facilitates the analysis of high-dimensional data by using visual techniques for data presentation. Our new clustering measure gives information on the proximity of records and provides a means of projecting them visually, including the largest number of common cluster memberships among the records. We continue to develop our suite of tools with visual and analytical approaches. We integrated new visualizations into the system, together with implementations of algorithms and additional interaction methods. We implemented a propagation mechanism for inter-suite communication and tested the tools on a variety of data sets to determine their scalability.
We compared our tools with other algorithms, none of which is capable of presenting record and group comparisons using a single technique, visually and/or analytically. We use our tools in several research studies, from preliminary to confirmatory, and we present a case study of a transitional cell carcinoma of the bladder data set that combines statistical analysis, visualization, and data mining.
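The idea of relating records by their common cluster memberships across several algorithms can be illustrated with a simple co-membership count. The sketch below is an assumption made for illustration, not the project's clustering measure, which this abstract does not specify; the function names `co_membership_counts` and `co_membership_distance` and the toy label vectors are hypothetical.

```python
from itertools import combinations


def co_membership_counts(labelings):
    """Count, for each pair of records, how many clusterings group them together.

    `labelings` is a list of label sequences, one per clustering result,
    each giving a cluster label for every record in the same record order.
    """
    n = len(labelings[0])
    counts = [[0] * n for _ in range(n)]
    for labels in labelings:
        if len(labels) != n:
            raise ValueError("every clustering must label the same records")
        for i, j in combinations(range(n), 2):
            if labels[i] == labels[j]:
                counts[i][j] += 1
                counts[j][i] += 1
    return counts


def co_membership_distance(labelings):
    """Turn the counts into a dissimilarity in [0, 1] (illustrative choice).

    Two records that share a cluster in every clustering get distance 0;
    two records that never share a cluster get distance 1.
    """
    m = len(labelings)
    counts = co_membership_counts(labelings)
    n = len(counts)
    return [
        [0.0 if i == j else 1.0 - counts[i][j] / m for j in range(n)]
        for i in range(n)
    ]


# Hypothetical results of three clustering algorithms on four records.
example_labelings = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
example_counts = co_membership_counts(example_labelings)
example_distance = co_membership_distance(example_labelings)
```

A dissimilarity matrix of this kind could then be passed to a projection method such as multidimensional scaling to place records with many shared memberships close together; which projection the tools described here actually use is not stated in the abstract.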