Scott McKellar is currently a Technical Consultant at Mimecast, where he has been since early 2019. Scott has worked in the technology industry for fifteen years and is passionate about technology and security. He enjoys understanding his customers’ and prospects’ often-complex business challenges and aligning them with technology to solve problems and add value. Prior to his role at Mimecast, Scott headed up the technology team at Discovery Technology (a Data#3 company), a leading Australian Wi-Fi analytics SaaS and IaaS provider.
There’s always been a lot of excitement around Machine Learning and Artificial Intelligence in the cybersecurity community, and we’ve already explored the facts and fiction behind some of their biggest claims.
While the theories behind these new technologies seem sound, what do they actually look like in practice? Let’s explore some of the specific, proven, real-world applications of AI and ML in cybersecurity.
Understanding the difference between AI and ML
Before we get started, we should quickly outline the difference between Artificial Intelligence (AI) systems and Machine Learning (ML). Let’s talk about AI first.
Herb Roitblat, one of Mimecast’s resident data scientists, defines AI as “a form of computerised problem-solving with the means to solve a problem, but without the rules to do it.”
Artificial intelligence is a system's ability to "learn" from its inputs and infer rules from them. That enables the system to better interpret and react to similar inputs in the future.
This capability is enabled by Machine Learning: the combination of hardware, software and algorithms that ‘learn’ from previous inputs.
In a more practical sense, Dr. Jim Davis, Professor of Computer Science and Engineering at Ohio State University explains, “Modern machine learning is data-driven and with the data you can do auto-discovery of categories and classification, such as types of malicious or unwanted emails.”
The practical cybersecurity applications of AI and ML
There is a huge range of systems available today that utilise AI techniques for email and web security. These systems complement more traditional analytic and detection techniques by cutting down the time required for analysis, blocking known risks, and flagging likely risks to users or cybersecurity professionals, whose feedback in turn trains and improves the AI.
Let’s take a look at successful real-world applications of AI/ML in cybersecurity.
Image checking and filtering - Deep learning, a subset of ML, is being used to identify not-safe-for-work images and other imagery, such as logos, to improve filtering and phishing detection. AI/ML filtering tools are also being used to spot deepfakes and flag phishing emails and the like.
Detecting outbound email attacks - ML models are also used to detect unusual and potentially risky patterns in a sender’s email frequency. This could be a sign of a cyberattacker using an organisation’s email for outbound attacks.
Malicious URL detection – Algorithms can analyse a URL’s structure and content to check for anything unusual, which is great for detecting malicious URLs.
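To make the idea concrete, here is a minimal, illustrative sketch (not any vendor’s actual model) of the kind of lexical features a malicious-URL classifier might be fed. The feature names and thresholds are assumptions chosen for the example; a production system would learn from many more signals.

```python
import math
import re
from urllib.parse import urlparse

def url_features(url: str) -> dict:
    """Extract simple lexical features often used as classifier inputs."""
    host = urlparse(url).hostname or ""
    # Shannon entropy of the hostname: algorithmically generated domains
    # tend to score higher than human-chosen names.
    counts = {c: host.count(c) for c in set(host)}
    entropy = -sum((n / len(host)) * math.log2(n / len(host))
                   for n in counts.values()) if host else 0.0
    return {
        "length": len(url),
        "num_dots": host.count("."),
        "has_ip_host": bool(re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", host)),
        "num_digits": sum(c.isdigit() for c in url),
        "host_entropy": round(entropy, 3),
    }

print(url_features("http://198.51.100.7/login/secure-update"))
print(url_features("https://www.example.com/about"))
```

A trained model would weight features like these (a raw IP address as the host, an unusually long or high-entropy hostname) rather than apply hard-coded rules.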
Detecting data leaks - Several AI/ML techniques like content matching, image recognition and statistical analysis are being used to detect sensitive data leakage during channel monitoring.
Website categorisation – AI/ML tools can and do use supervised learning (i.e. trained on human-labelled examples) to categorise websites, detect high-risk sites and enforce policies. This application is useful for both email and web security controls, which use site categorisation as part of policy-based decision-making.
Spam detection - Neural networks (AI systems loosely modelled on the way our brains’ interconnected neurons process information) are used to help identify spam and other forms of unwanted, but non-malicious, email.
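The core idea of learning weights from labelled examples can be shown with a single artificial neuron (a perceptron) over bag-of-words features. This is a deliberately tiny sketch, not a real spam filter: the training data, tokeniser and features are all assumptions for illustration, and production systems use far larger networks.

```python
# A single-neuron (perceptron) classifier over word-presence features.
# Label 1 = spam, 0 = ham. Weights start at zero and are nudged by errors.

def tokens(text: str) -> set:
    return set(text.lower().split())

def train(examples, epochs=20):
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, label in examples:
            score = bias + sum(weights.get(t, 0.0) for t in tokens(text))
            error = label - (1 if score > 0 else 0)   # -1, 0 or +1
            if error:
                bias += error
                for t in tokens(text):
                    weights[t] = weights.get(t, 0.0) + error
    return weights, bias

def is_spam(text, weights, bias):
    return bias + sum(weights.get(t, 0.0) for t in tokens(text)) > 0

data = [
    ("win a free prize now", 1),
    ("claim your free money", 1),
    ("meeting agenda for monday", 0),
    ("quarterly report attached", 0),
]
w, b = train(data)
print(is_spam("free prize money", w, b))       # spam-like words score high
print(is_spam("monday meeting report", w, b))  # ham-like words score low
```

Each misclassification shifts the weights toward the correct answer; that error-driven update is the same principle, scaled down, that deep neural networks use.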
DNS-based data exfiltration – Use of AI to detect the malicious use of external DNS calls by malware to sneakily exfiltrate data.
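DNS tunnelling typically encodes stolen data into the leftmost labels of a query, making them unusually long and close to random. As an illustrative heuristic only (the thresholds below are assumptions, and real detectors learn them from traffic rather than hard-coding them), such lookups can be flagged by label length and entropy:

```python
import math

def label_entropy(label: str) -> float:
    """Shannon entropy of a DNS label; encoded payloads score high."""
    if not label:
        return 0.0
    counts = {c: label.count(c) for c in set(label)}
    return -sum((n / len(label)) * math.log2(n / len(label))
                for n in counts.values())

def looks_like_exfil(query: str, max_label_len=40, entropy_threshold=4.0) -> bool:
    # Data smuggled out over DNS is usually base32/base64-encoded into the
    # subdomain labels, so we check every label left of the registered domain.
    labels = query.rstrip(".").split(".")
    return any(len(l) > max_label_len or label_entropy(l) > entropy_threshold
               for l in labels[:-2])

print(looks_like_exfil("www.example.com"))
print(looks_like_exfil(
    "mzxw6ytboi2gs3thmvwxg2lom5zg6idthe4dqmjsgm3dqnbv.evil-tunnel.net"))
```

An ML-based detector would combine signals like these with query volume and timing per host rather than relying on two fixed thresholds.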
Identifying and categorising customer-reported phishing emails – Pre-sorting and categorising emails submitted by customers to improve the efficiency of an SOC. When you look at a big network overall, there can be thousands of phishing emails flying around at any given time. AI/ML tools can categorise and sort those, simplifying things greatly for the human cybersecurity team.
Spear phishing - Predictive URL classification models based on ML algorithms can identify patterns that reveal a malicious sender’s emails. The AI/ML tool is trained to spot micro behaviors such as email headers, subsamples of body data and punctuation patterns to judge whether an email is likely to be a phish.
Watering hole detection - Attacks that are designed to compromise targeted users by infecting websites they typically visit can be detected using ML. Path traversal detection algorithms are used to detect these malicious domains and monitor them for rare or extraordinary redirect patterns to and from a site’s host.
Malicious webshell identification - Web shells can modify websites to route transaction data through a different path. ML models can be trained to distinguish normal behavior from malicious behavior, and malicious files can be executed on a monitored standalone system in order to train the model further. ML algorithms can be used to pre-emptively identify web shells and isolate them from the system before they do anything dangerous.
Ransomware - AI neural networks combined with deep learning algorithms can detect previously unknown ransomware through micro-behavior training. A large set of ransomware samples, together with an even larger set of clean files, is used to identify key features, which are then categorised into subsets to train the model. When ransomware attacks a system, the offending file can be checked against the trained model and automated security actions taken before it encrypts the whole file system or locks access to the computer.
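One micro behavior such models pick up on is that encrypted output is statistically close to random. As a hand-rolled illustration of that single signal (the 7.5 bits/byte threshold is an assumption for the example, and real products combine many behaviors), a monitor could measure the entropy of file writes:

```python
import math
import os

def byte_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 for constant data, ~8.0 for random)."""
    if not data:
        return 0.0
    counts = [0] * 256
    for b in data:
        counts[b] += 1
    return -sum((n / len(data)) * math.log2(n / len(data))
                for n in counts if n)

def ransomware_like_write(data: bytes, threshold=7.5) -> bool:
    # Ciphertext looks like random bytes, so a burst of writes above
    # ~7.5 bits/byte is one (noisy) signal of mass file encryption.
    return byte_entropy(data) > threshold

text = b"quarterly report: revenue grew modestly in all regions. " * 20
print(ransomware_like_write(text))             # plain text: low entropy
print(ransomware_like_write(os.urandom(4096))) # random bytes mimic ciphertext
```

On its own this heuristic would false-positive on legitimate compressed or encrypted files, which is exactly why trained models weigh it alongside other behaviors.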
Remote exploitation - Malicious attacks that target one computer or a network of computers in order to gain access to the system can happen in various ways, including DDoS, DNS poisoning and port scanning. ML algorithms can analyse system behavior and identify abnormal instances that do not correlate with typical network behavior.
It's official: machine learning and AI tools are already being used in cybersecurity, and they are proving their worth every day. AI/ML are making cybersecurity systems more proactive and easier to use, reducing repetitive tasks, lowering costs and, at the end of the day, reducing risk by making detection processes more effective.
But there is a caveat: these tools can only do these things if the underlying data that supports the machine learning provides a complete picture of the environment. AI and ML are only as good as their inputs; the GIGO principle still applies: “garbage in, garbage out.”
You’ll still need human cybersecurity experts who can monitor and train the ML technology, and users will still need to take action when security threats are identified. AI will likely automate some processes, and the above list of current applications clearly demonstrates how. But AI and ML are unlikely to completely replace security strategy in the near future, or possibly ever. It’s a continually evolving tool to assist cybersecurity teams in their fight against the constantly expanding cybersecurity threats that go hand in hand with our increasingly digitised world.