The rapid proliferation of Internet-connected devices, especially Internet-of-Things (IoT) devices, has led to mounting concerns regarding their security and the security of the Internet. This project seeks to harness big data analytics and machine/deep learning to enhance Internet measurement techniques and the associated information processing, making them more scalable and efficient and enabling them to produce more actionable information. It aims to develop techniques for automated monitoring and data analysis to gain insight into the range of Internet-connected devices, their security vulnerabilities, and the ever-changing activities of malicious entities on the public Internet. The ensuing information will help software/hardware vendors and Internet-connected entities identify vulnerabilities and protect themselves against cyber-attacks, and will support the move toward a more secure and transparent Internet.
This project aims to significantly advance the state of the art in using active and passive measurements to (1) effectively monitor and track Internet devices, (2) accelerate scans and improve their efficacy, and (3) design and develop an intelligent honeypot that can learn responses mimicking a wide range of vulnerable devices, in order to fool attackers into engaging and revealing their attack vectors. The project seeks to develop software, as well as deep learning and other machine learning models, to build new Internet measurement capabilities and to process datasets captured from passive/active measurements, distilling them into data that is consumable by machine learning algorithms and instrumental in security analysis and network monitoring. The resulting automated tools can monitor the Internet continuously to maintain an up-to-date view of the devices/machines that comprise the Internet, of susceptible and infected devices, and of vulnerabilities that are being actively exploited in the wild. The final results of this project are a generalized framework of interconnected components that applies deep learning to active/passive network measurements to gain actionable insights into the Internet and its security, a set of scalable tools that model and enable real-time decision making regarding Internet addresses and network traffic, and a large number of raw and curated datasets shared with the research community while protecting the privacy and security of all parties involved. Automatically detecting software/hardware vulnerabilities as exploits are observed by these techniques allows vendors and network administrators to address critical vulnerabilities, while also enhancing intrusion detection and DDoS mitigation techniques. The data can also transform risk assessment techniques for gauging the security of networks by exposing host-level risk factors, both for self-assessment and for third-party assessment, e.g., by security vendors and cyber insurance underwriters.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.