Machine learning and cybersecurity are traditionally treated as separate subfields of computer science. In practice, though, machine learning has become an essential tool for IT security professionals seeking to prevent attacks and detect vulnerabilities. Just as new methods of extracting knowledge and making predictions from perpetually growing and changing datasets have revolutionized science and business, they’re creating novel defenses (and, in some cases, novel attacks) in the ongoing struggle between hackers and security teams.
But many mid-career professionals in cybersecurity built their skills at a time when “machine learning” was still an esoteric approach and data was not as integral to security operations. To close this knowledge gap, the Center for Data and Computing (CDAC) convened a trio of UChicago CS faculty (Nick Feamster, Yuxin Chen, and Blase Ur) to produce a new online Machine Learning for Cybersecurity certificate course that will be offered for the first time this fall.
“AI has been taking over the security area in products, startups and methods for the last ten years,” said Feamster, CDAC faculty director and Neubauer Professor of computer science. “We want to give professionals the skills to correctly apply these models to their data and solve various security problems that they’re dealing with every day, from malware and denial of service attacks to botnets and phishing scams.”
The four-session certificate course will teach information security managers, DevOps engineers, software developers, and system administrators the fundamentals of machine learning for security, grounded in real-world situations. While many off-the-shelf ML tools promise to bring IT departments up to the current state of the art, applying those tools without understanding how they work can do more harm than good.
“There are a lot of people who work in computer security who were trained a long time ago, and the world is very different now,” said Ur, assistant professor of computer science at the University of Chicago. “They might have heard some of the buzzwords and they might know how to use some of these black-box tools, but they don’t really understand the nuances of what they’re doing, or the foundational aspects of it. You can actually start doing things that put your organization at risk if you don’t really understand what’s going on under the hood.”
The course consists of four units: Foundations of Machine Learning and Data Science for Security; Data-Driven Network and Computer Security; Machine Learning in the Presence of Adversaries; and Ethics, Fairness, Responsibility, and Transparency in Data-Driven Cybersecurity. Each unit will be taught with a combination of video instruction, case studies, interactive Jupyter notebook exercises, and live group discussions.
“We want to paint a roadmap towards how these security professionals can think about solving problems,” said Chen, assistant professor of computer science. “So once they see data from a practical challenge, they can start analyzing their data and thinking about which models could work and which models are not applicable, reasoning from the bottom up to build out the system rather than trying out different approaches blindly.”
The course itself is also a research project, assessing the state of machine learning education for cybersecurity professionals and developing new, adaptive forms of online instruction. Through human-centered interviews, the team will talk to cybersecurity experts at the vanguard of applying ML approaches in industry and academia, then draw on those insights to build real-world case studies and hands-on exercises for the course.
Data for the course will come from sources including UChicago IT Services and the CDAC Internet of Things (IoT) Lab. These sources will be used to generate synthetic data on normal and anomalous network activity, providing students with realistic datasets on which to test machine learning approaches. Additionally, Chen will apply his expertise in “machine teaching” to optimize future iterations of the course, creating appropriate educational pathways for students with different levels of experience and skill.
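To make the idea of testing an approach on synthetic normal and anomalous network activity concrete, here is a minimal sketch in Python. It is not taken from the course materials: the function names, the packets-per-second framing, the traffic parameters, and the choice of a median-based outlier score are all assumptions made for illustration.

```python
import random
import statistics


def make_synthetic_traffic(n_normal=500, n_anomalous=10, seed=0):
    """Return (values, labels) of hypothetical packets-per-second readings.

    Label 0 marks normal activity, label 1 marks an anomalous burst.
    All distribution parameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    normal = [(rng.gauss(100.0, 10.0), 0) for _ in range(n_normal)]
    anomalous = [(rng.gauss(400.0, 20.0), 1) for _ in range(n_anomalous)]
    samples = normal + anomalous
    rng.shuffle(samples)
    values = [v for v, _ in samples]
    labels = [y for _, y in samples]
    return values, labels


def flag_anomalies(values, threshold=3.5):
    """Flag readings whose modified z-score (median/MAD based) exceeds threshold."""
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        # No spread in the data, so nothing can be called an outlier.
        return [False] * len(values)
    return [0.6745 * abs(v - median) / mad > threshold for v in values]


values, labels = make_synthetic_traffic()
flags = flag_anomalies(values)

# Because the data is synthetic, the true labels are known and the
# detector can be scored directly.
true_pos = sum(1 for f, y in zip(flags, labels) if f and y == 1)
false_pos = sum(1 for f, y in zip(flags, labels) if f and y == 0)
print(f"detected {true_pos} of {sum(labels)} anomalies, {false_pos} false alarms")
```

The appeal of synthetic data in a classroom setting is visible in the last few lines: since the labels are known by construction, students can count hits and false alarms for any detector they try, which is rarely possible with raw operational logs.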
“The teaching process itself is also creating new data for us to observe how the students actually learn,” Chen said. “In the long term, we aim to build customized and adaptive course modules that are personalized to different student responses based on assessments of their exercises.”
For more information on the Machine Learning for Cybersecurity course, visit mlccertificate.uchicago.edu. The course and research project are supported by a grant from the National Science Foundation EAGER program, which funds exploratory, early-stage research.