This Small Business Innovation Research (SBIR) Phase I research project will develop a collection of privacy-sensitive distributed data mining algorithms for immediate application in domains that deal with sensitive private data. Privacy is a growing concern in many data monitoring and mining applications, such as network intrusion detection, fraud detection, and counter-terrorism intelligence gathering. To date, however, no commercial data mining system is capable of analyzing potentially distributed multi-party data in a privacy-sensitive manner. This research will develop technology to meet that immediate need: data mining algorithms that can work without direct access to the original sensitive data. The research will focus on privacy-preserving statistical computing and clustering techniques that are particularly suitable for security-related threat management applications. The algorithmic approach is based on a combination of random projection and secure multi-party computation techniques. Deliverables will include a collection of privacy-sensitive algorithms, documentation of their performance, and a demonstration.
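The random projection component of the approach can be illustrated with a minimal sketch. The abstract does not specify the project's algorithms, so everything here is an assumption for illustration: the Gaussian projection matrix, the dimensions `d` and `k`, and the helper names `random_projection_matrix`, `project`, and `distance` are hypothetical. The sketch shows only the general idea that a data owner could release projected records instead of the originals while pairwise distances, which clustering relies on, are approximately preserved.

```python
import math
import random


def random_projection_matrix(d, k, rng):
    """Return a d x k matrix with i.i.d. Gaussian entries of variance 1/k.

    This scaling makes squared distances equal in expectation before and
    after projection (an illustrative choice, not the project's method).
    """
    sigma = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, sigma) for _ in range(k)] for _ in range(d)]


def project(row, matrix):
    """Multiply a length-d row vector by a d x k matrix."""
    k = len(matrix[0])
    return [sum(row[i] * matrix[i][j] for i in range(len(row))) for j in range(k)]


def distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


rng = random.Random(0)  # fixed seed so the run is repeatable
d, k = 200, 100         # hypothetical original and projected dimensions

# Three synthetic "sensitive" records; real data would stay with its owner.
records = [[rng.uniform(0.0, 1.0) for _ in range(d)] for _ in range(3)]

# The projection matrix is kept secret by the data owner.
R = random_projection_matrix(d, k, rng)
projected = [project(r, R) for r in records]

original_dist = distance(records[0], records[1])
projected_dist = distance(projected[0], projected[1])
ratio = projected_dist / original_dist  # close to 1 for reasonably large k

print(f"original distance:  {original_dist:.3f}")
print(f"projected distance: {projected_dist:.3f}")
```

A distance-based clustering routine could then run on `projected` rather than on `records`. Random projection by itself is not a formal privacy guarantee, which is presumably why the approach pairs it with secure multi-party computation.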
Successful completion of this project will open up many new possibilities, particularly in security and threat management for counter-terrorism, that are not feasible today because of concerns about the privacy of ordinary citizens. Privacy-preserving data mining has numerous potential applications, with substantial potential benefits for security and economic efficiency. It also offers transparency to providers of information, allowing them to understand and control the revelation of sensitive features.