This Small Business Innovation Research Phase I project aims to research next-generation digital data recovery techniques. The problem of restoring lost data from a damaged digital device arises routinely in digital forensics and data recovery. In many advanced cases of digital storage failure, currently available file recovery techniques, which rely on disk storage information, fail. The objective of the Phase I project is to further research and refine reassembly techniques, already developed at Polytechnic University, that do not rely on disk information but are instead based on the statistical properties of the file contents themselves. To build a commercially viable tool, research is required to discover domain-specific techniques for identifying the type of each fragment, and additional research is needed to develop efficient, scalable algorithms for recovering a wide range of file types. This research will require introducing techniques from expert systems, data modelling, and combinatorial optimization. Funding will also be used to test the viability of the research by implementing it in a developed data recovery tool. By the end of Phase I, it is anticipated that enough domain-specific recovery techniques will have been researched to market the data recovery tool to existing digital forensic vendors.
The problem of recovering information from bits and pieces of digital data, in the absence of storage meta-information to tie the pieces together, is equivalent to having hundreds or thousands of jigsaw puzzles mixed together. Identifying whether a piece of data belongs to a specific file or file type is a daunting challenge. In addition, one must determine not only which pieces belong to which file but also the order in which the pieces must be placed to reconstruct each file. There has been no serious work undertaken to tackle the problem of recovering fragmented digital data. The preliminary research conducted at Polytechnic University has not only been groundbreaking but has also demonstrated the viability of developing domain-specific techniques to identify the type of data fragments and of using file-type-specific algorithms to reconstruct files. The funding from Phase I would be used to advance data fragment classification techniques as well as file-type-specific techniques for enhanced recovery. The funding will also be used to develop file-type-specific recovery algorithms for email, word processing, database, and multimedia files.
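As an illustration only, the idea of classifying a fragment from the statistical properties of its contents can be sketched as follows. This is a minimal sketch under stated assumptions, not the technique developed at Polytechnic University: the function names, the three class labels, and the thresholds (0.95 and 7.5) are hypothetical choices made for this example.

```python
import math
from collections import Counter


def shannon_entropy(fragment: bytes) -> float:
    """Entropy of the fragment's byte histogram, in bits per byte (0.0 to 8.0)."""
    if not fragment:
        return 0.0
    total = len(fragment)
    counts = Counter(fragment)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def printable_ratio(fragment: bytes) -> float:
    """Fraction of bytes that are printable ASCII, tab, LF, or CR."""
    if not fragment:
        return 0.0
    printable = sum(1 for b in fragment if 32 <= b < 127 or b in (9, 10, 13))
    return printable / len(fragment)


def guess_fragment_class(fragment: bytes) -> str:
    """Coarse guess at a fragment's class from content statistics alone.

    The thresholds below are illustrative assumptions, not measured values.
    """
    if printable_ratio(fragment) > 0.95:
        return "text-like"  # e.g. email bodies, plain-text documents
    if shannon_entropy(fragment) > 7.5:
        return "compressed-or-encrypted-like"  # e.g. compressed multimedia data
    return "other-binary"


if __name__ == "__main__":
    email_like = b"Subject: quarterly report\r\nPlease see the attached file.\r\n" * 8
    uniform_bytes = bytes(range(256)) * 4  # every byte value equally frequent
    zero_block = b"\x00" * 512  # an unused or wiped sector
    for name, frag in [("email", email_like), ("uniform", uniform_bytes), ("zeros", zero_block)]:
        print(name, guess_fragment_class(frag))
```

No storage meta-information is consulted here; only the bytes of the fragment are examined. A practical classifier would presumably need far richer, file-type-specific models than two summary statistics, which is the kind of domain-specific research the proposal describes.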
With the surge in the usage and capacity of digital storage devices, the need for better and more efficient data recovery techniques is becoming more apparent. It is anticipated that funding obtained from this proposal can result in the development and dissemination of an expert system that can be licensed as libraries to the vendors of disk analysis tools in the digital forensic market. Using these libraries, existing tools can be enhanced to handle the recovery of fragmented digital data. The digital forensic market will benefit greatly from the additional information recovered, as it can be crucial to the needs of the intelligence, law enforcement, and security sectors. The ultimate goal of the project is to develop a stand-alone, next-generation recovery tool that uses the latest recovery techniques researched in Phase I and beyond. Such a tool would cater to the needs of both the digital forensic market and the broader data recovery market.