Across the IT security industry, there’s a growing recognition that enterprises can no longer detect and stop security attacks without the help of automation.
Today’s security teams are dedicated, highly trained, and equipped with more advanced tools than ever before. But they’re facing a steady stream of attacks that are carefully crafted by determined and well-funded adversaries. Detecting, let alone stopping, attacks against a vast and varied IT infrastructure is increasingly difficult.
Automation can help enterprise Security Operations Centers (SOCs) with three critical activities:
- Alert Triage (sorting through data that suggests the presence of a threat)
- Threat Hunting (analyzing data carefully to confirm the existence of a threat and discover its characteristics)
- Incident Response (responding to and, if necessary, recovering from a threat or incident)
The most difficult of these activities is threat hunting, which today is usually performed manually by security analysts as they pore through data, log into various systems, consult data feeds, and try to determine, as quickly as possible, whether a threat exists.
All three activities require decision making. To perform alert triage, analysts have to decide whether or not an alert indicates a real incident. Then, in the threat hunting stage, analysts have to decide whether an incident constitutes a real threat. Finally, in the incident response stage, analysts end up spending a good deal of time confirming that the incident is a real incident and deciding how best to respond to it.
When it comes to automating alert triage, threat hunting, and incident response, the hardest nut to crack turns out to be automating decision making. SIEMs use rules to perform some rudimentary automation, but they are not powerful or sophisticated enough to replicate a skilled analyst’s decision-making process. Just ask security teams who get alerts from SIEMs: a vast percentage of these alerts turn out to be false positives.
Beyond the most rudimentary level, the logic required for decision making is too sophisticated and complex to automate with a scripting language. Fortunately, the gap between what an analyst can do and what can be easily scripted can be bridged by applying machine learning techniques. Machine learning is an artificial intelligence (AI) technique that uses statistical models to learn to recognize patterns in data without having been explicitly programmed to do so.
To see just how quickly and easily machine learning can be applied in a SOC, let’s consider an example of a common type of triage performed in a SOC: determining whether suspicious emails are spam (annoying but probably harmless) or phishing attacks (definitely harmful to the organization).
Putting Machine Learning to Work for Phishing Triage
In this example, we’ll take a collection of email messages forwarded to an enterprise SOC for analysis and use machine learning to determine which of these messages really contain phishing attempts.
The emails arrive at the SOC from users who suspect them to be phishing. The job for the SOC is to evaluate each email more rigorously to determine which are genuinely dangerous.
Building a Machine Learning Workflow with LogicHub
In this example, we’ll use the Flow Builder in the LogicHub Intelligent Security Automation platform to train an automated flow for analyzing email. We’ll train the system to characterize email as spam or phishing using one set of labeled data, and then test it against another set of labeled data.
In the image below, you can see the flow we’ve built for an automated analysis on the left. This flow was built using the drag-and-drop interface in the LogicHub platform. No Python scripting was required.
On the right, you can see examples of incoming emails tentatively classified as spam. There’s a similar set of emails suspected of being phishing. (We’ve obscured the name of the company that received this email.)
In the next step, we apply labels in the context of the flow: suspected spam is labeled ‘spam’ and suspected phishing is labeled ‘phishing.’
Next, we mix these two streams, compiling a single collection containing ‘All Emails.’ Why? Because we’re going to re-evaluate all of them to learn whether or not their original classification was correct.
To perform our automated analysis, we need to create a new tag for sorting the email, so we assign a random tag to each email.
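For readers who want to see what these steps amount to outside the drag-and-drop interface, the labeling, mixing, and random tagging could be sketched in a few lines of Python. This is only an illustration, not how the LogicHub platform implements them: the sample subjects, the field names (label, random_tag), and the fixed seed are all assumptions.

```python
import random

# Hypothetical sample data; these subjects are invented for illustration.
suspected_spam = [
    {"subject": "Huge discounts this week"},
    {"subject": "You won a prize"},
]
suspected_phishing = [
    {"subject": "Verify your account password"},
]


def label_stream(emails, label):
    """Return copies of the emails with the preliminary label attached."""
    return [{**email, "label": label} for email in emails]


def merge_and_tag(spam, phishing, seed=0):
    """Combine both labeled streams and give each email a random tag in [0, 1)."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the sketch is repeatable
    all_emails = label_stream(spam, "spam") + label_stream(phishing, "phishing")
    return [{**email, "random_tag": rng.random()} for email in all_emails]


all_emails = merge_and_tag(suspected_spam, suspected_phishing)
```

The random tag carries no information about the email itself, which is exactly why it is useful: any later split based on it is independent of the preliminary spam or phishing label.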
Separating Training Data from Test Data
Next, we split the email again into two streams. This time, they’re sorted by their random tags, not by their preliminary identification as spam or phishing.
Why are we creating two streams? We’re going to use one stream to train our machine learning algorithm to distinguish spam from phishing on the basis of internal characteristics of the email. Machine learning works by training an analytical “machine” to discover patterns, including similarities and differences, from a collection of training data. So, to build our analytical model, we’re going to use some of the email received by the SOC as training data.
Then we’ll apply the machine learning model to the test data, which comprises the other email messages we’ve received, and assess the accuracy of the original classification of the email.
(Incidentally, it’s important that we not use the training data as test data. If we use the same data set for both, then we’re testing the machine on the same data we trained it on. Its results will look better than they should, and they won’t suggest the accuracy of the model when dealing with unfamiliar data.)
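A minimal sketch of the split, assuming each email already carries a random tag between 0 and 1, might look like the following. The 70/30 threshold, the field names, and the sample rows are assumptions made for illustration; the article does not state what proportion the flow uses.

```python
def split_by_tag(emails, train_fraction=0.7):
    """Send emails whose random tag is below the threshold to training, the rest to test."""
    train = [e for e in emails if e["random_tag"] < train_fraction]
    test = [e for e in emails if e["random_tag"] >= train_fraction]
    return train, test


# Hypothetical emails that have already been labeled and tagged.
emails = [
    {"id": 1, "label": "spam", "random_tag": 0.12},
    {"id": 2, "label": "phishing", "random_tag": 0.85},
    {"id": 3, "label": "spam", "random_tag": 0.40},
    {"id": 4, "label": "spam", "random_tag": 0.71},
    {"id": 5, "label": "phishing", "random_tag": 0.05},
]

train, test = split_by_tag(emails)

# Every email lands in exactly one stream, so no message is both trained on and tested on.
train_ids = {e["id"] for e in train}
test_ids = {e["id"] for e in test}
```

Because the two list comprehensions use complementary conditions, the training and test streams cannot overlap, which is the property the parenthetical note above asks for.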
In the image below, you can see an example of our test data and how our machine learning model analyzed it. Each message is still tagged with its original label (‘spam’ or ‘phishing’). But now there’s an additional label (lhub_label) that presents the machine learning model’s analysis of the message, taking into account all the characteristics of the message.
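To make the idea of a model-assigned label concrete, here is a toy word-count classifier (multinomial naive Bayes with add-one smoothing) written in plain Python. It is a stand-in chosen for illustration: the article does not say which built-in classifier the LogicHub flow uses, and the training sentences, the body field, and the function names are all hypothetical. Only the lhub_label column name comes from the example above.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def train_naive_bayes(training_emails):
    """Count emails and words per label from the labeled training stream."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for email in training_emails:
        label_counts[email["label"]] += 1
        word_counts[email["label"]].update(tokenize(email["body"]))
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocabulary


def predict(model, text):
    """Return the label with the highest log-probability, using add-one smoothing."""
    word_counts, label_counts, vocabulary = model
    total_emails = sum(label_counts.values())
    scores = {}
    for label, n_emails in label_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_emails / total_emails)
        for word in tokenize(text):
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            )
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical training stream (invented text, not real SOC data).
training = [
    {"body": "cheap watches big sale discount", "label": "spam"},
    {"body": "limited offer discount sale today", "label": "spam"},
    {"body": "verify your account password now", "label": "phishing"},
    {"body": "confirm your password to restore account access", "label": "phishing"},
]
model = train_naive_bayes(training)

# A test email keeps its original label and gains the model's lhub_label.
test_emails = [{"body": "please verify your password", "label": "spam"}]
for email in test_emails:
    email["lhub_label"] = predict(model, email["body"])
```

In this toy run the test email arrives labeled ‘spam’ but the model assigns it ‘phishing’, which is the kind of disagreement between the two columns that the flow goes on to examine.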
Accurate, Real-Time Analysis for Phishing Triage
For the first two emails shown below, the original label (spam) matches the lhub_label (spam). But for the third email, the machine learning analysis produced a different result, classifying the message as phishing.
From there, our flow divides the messages into four categories:
- True positives (emails that were phishing and were correctly identified as phishing)
- True negatives (emails that were spam, that is, not phishing threats, and were correctly identified as spam)
- False positives (emails that were spam but that had been misidentified as phishing)
- False negatives (emails that were phishing but that had been misidentified as spam)
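The four-way sort above can be sketched as a small function that compares two labels per email. The sketch assumes that phishing is the positive class and that the label from the labeled data set serves as the reference against which the model’s lhub_label is judged; the sample rows are hypothetical.

```python
from collections import Counter


def categorize(reference_label, model_label, positive="phishing"):
    """Name the category for one email, treating phishing as the positive class."""
    if model_label == positive:
        return "true_positive" if reference_label == positive else "false_positive"
    return "false_negative" if reference_label == positive else "true_negative"


# Hypothetical rows: (label from the labeled data set, lhub_label from the model).
rows = [
    ("phishing", "phishing"),  # phishing correctly identified
    ("spam", "spam"),          # spam correctly identified
    ("spam", "phishing"),      # spam misidentified as phishing
    ("phishing", "spam"),      # phishing misidentified as spam
]

counts = Counter(categorize(reference, predicted) for reference, predicted in rows)
```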
If we examine some of the messages in the True Positives stream (shown below), we can see that they really are phishing messages.
And if we view the Results of the complete flow, we can see how well our machine learning model performed.
Our machine learning model, which can easily be assembled using built-in classifiers of the LogicHub platform, found:
- 6 phishing threats
- 31 spam messages
- 1 false positive classification (spam incorrectly flagged as phishing)
- 1 false negative classification (phishing incorrectly flagged as spam)
Overall, the model performed with a precision of 0.86 and an accuracy of 0.95. (Precision is the share of emails flagged as phishing that really were phishing: 6 of 7. Accuracy is the share of all emails that were classified correctly: 37 of 39.)
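Both figures follow directly from the four counts reported above, using the standard definitions of precision and accuracy. The short check below uses only those counts; the function and variable names are ours.

```python
def precision(tp, fp):
    """Share of emails flagged as phishing that really were phishing."""
    return tp / (tp + fp)


def accuracy(tp, tn, fp, fn):
    """Share of all emails that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Counts reported for the flow: 6 phishing, 31 spam, 1 false positive, 1 false negative.
tp, tn, fp, fn = 6, 31, 1, 1

model_precision = round(precision(tp, fp), 2)     # 6 / 7   -> 0.86
model_accuracy = round(accuracy(tp, tn, fp, fn), 2)  # 37 / 39 -> 0.95
```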
From our work with SOCs, we know that these results match or exceed the results achieved by security analysts. And they were achieved with the benefit of speed: the analysis was completed in seconds, not minutes or hours.
And if the model needs fine tuning? Analysts can use the LogicHub platform to correct the results of the analysis, retraining the algorithm and enabling the machine learning model to perform with even greater accuracy in the future.
The Importance of Machine Learning for Security Automation
I hope this example gives you a sense of the advantages of machine learning for automating the threat analysis performed every day in enterprise SOCs.
The LogicHub Intelligent Security Automation platform’s machine learning model was able to quickly distinguish harmful phishing messages from spam, enabling the SOC team to take appropriate corrective actions.
The model performed with 95% accuracy. To achieve these results, no Python programming was needed. Instead, using a drag-and-drop interface, an analyst was able to build an analytical flow, which was trained on live data in the SOC. Once built, this flow can be used repeatedly, copied, modified, and elaborated on as needed, enabling analysts to respond more quickly to genuine threats.
This shows the power of machine learning: fast, accurate results were achieved with minimal programming, and analysts were spared the need to evaluate every message by hand.
In any overworked SOC today, LogicHub’s machine learning approach can eliminate the vast majority of false positives received by the SOC and enable the SOC to perform 10X more threat hunting.