The majority of security teams still rely on rules to detect threats. Typically, these teams maintain a central repository of security events and implement rules that create an alert whenever a rule’s condition matches. This rules-based technology has been in use for years, and while Security Operations teams have come to rely on it for alerting, they are discovering daily that it has several limitations.

Limitation #1: Rules are written to catch attacks you already know about

Rules rely in large part on the author’s a priori knowledge of the attack: what the attack looks like and how to find it in a stream of security events. A rule can be defined too tightly, failing to match even variations of the same attack. The opposite is also true: a rule written too broadly, or without sufficient data to determine whether an event is a real incident, generates false positives that an analyst has to weed out manually.
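To make that brittleness concrete, here is a minimal sketch in Python of a too-tight rule and a too-broad rule over the same event stream. The event fields and both rules are hypothetical, for illustration only.

    # A minimal sketch of rule-based matching. Event fields
    # ("process", "command_line") and both rules are hypothetical.

    def too_tight_rule(event):
        # Matches only the exact command line the author has seen before.
        return event["command_line"] == "powershell -enc JAB3AGMA"

    def too_broad_rule(event):
        # Matches any PowerShell execution, including legitimate
        # admin activity, so an analyst must weed out false positives.
        return event["process"] == "powershell.exe"

    events = [
        # The same attack with a different flag spelling: the tight rule misses it.
        {"process": "powershell.exe",
         "command_line": "powershell -EncodedCommand JAB3AGMA"},
        # Benign admin use: the broad rule flags it anyway.
        {"process": "powershell.exe",
         "command_line": "powershell Get-ChildItem C:\\Logs"},
    ]

    for event in events:
        print(too_tight_rule(event), too_broad_rule(event))
    # Prints "False True" twice: the tight rule misses the attack
    # variant, while the broad rule alerts on everything.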

The bottom line: with rules-based systems you miss both variations of old attacks and new attacks for which you haven’t yet established rules.

Limitation #2: Most teams use a small rule set

90% of large enterprises (those with 10K+ employees) have fewer than 400 rules. Building such rule sets is time-consuming, and they need constant updating and revision. It typically takes 4-8 hours to fully establish a sophisticated rule; some security teams have dedicated content authors for just this task. Despite these professionals’ best efforts, there are simply not enough rules to detect a large percentage of today’s growing volume of suspicious and malicious activity.

Limitation #3: Investigation flows are hard to write

To catch attack patterns not seen before, you have to be able to investigate more deeply. You want to look at the context of an event and pivot to related events by entity, by time window, or by other events that might provide supporting evidence. You then need to bring all these steps together to conclude whether the event or alert is a real incident. Rules (or queries/searches in some systems) are typically capable of encoding one of these steps, but they fall short when it comes to encoding the entire investigation flow.
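As a rough sketch of what such a flow involves — the event fields, thresholds, and structure below are all assumptions, not any particular product’s API:

    # A rough sketch of an investigation flow. Each step is simple
    # enough for a rule or query to encode; chaining them into a
    # single verdict is the hard part. Fields and thresholds are
    # hypothetical.

    from datetime import timedelta

    def investigate(alert, all_events):
        # Step 1: context - events on the same host within 30 minutes.
        window = timedelta(minutes=30)
        related = [e for e in all_events
                   if e["host"] == alert["host"]
                   and abs(e["time"] - alert["time"]) <= window]

        # Step 2: pivot on an entity - how many hosts did this user touch?
        hosts_for_user = {e["host"] for e in all_events
                          if e["user"] == alert["user"]}

        # Step 3: supporting evidence among the related events.
        evidence = [e for e in related if e["action"] == "credential_access"]

        # Step 4: combine the steps into one conclusion.
        return bool(evidence) and len(hosts_for_user) > 3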

Limitation #4: It is impossible to model billions of events with rules

Since rules are focused on encoding known patterns of attack or suspicious activity, each typically captures tens to hundreds of events. Many enterprises generate billions of events every day. And if it takes an analyst half a day to write a sophisticated rule and put it into operation, rules by themselves are clearly not sufficient to model billions of events.

Limitation #5: Rules can’t differentiate between known and unknown events at scale

Because rules can’t model millions of events, they only understand what is a “known bad”. They can’t model “known good”, which is what 99.99% of all events in an enterprise are. As a result, new and unknown activity gets buried in millions of “known good” events. Not all new activity is bad, but if we cannot even separate “new and unknown” activity from “known”, we will certainly miss “unknown bad”, which is an even smaller subset of “unknown” activity.
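To illustrate what modeling “known good” could look like, here is a minimal sketch that baselines previously seen activity. The fingerprint fields are an assumption; a real system would normalize events far more carefully.

    # A minimal sketch of separating "known" from "unknown" activity
    # by fingerprinting events against baselines of confirmed verdicts.
    # The fingerprint fields are illustrative assumptions.

    def fingerprint(event):
        # Reduce an event to the features that define "the same activity".
        return (event["user"], event["process"], event["dest_port"])

    known_good = set()   # fingerprints analysts have confirmed benign
    known_bad = set()    # fingerprints analysts have confirmed malicious

    def triage(event):
        fp = fingerprint(event)
        if fp in known_bad:
            return "known bad"    # alert immediately
        if fp in known_good:
            return "known good"   # suppress: the 99.99%
        return "unknown"          # surface for review instead of burying it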

Going beyond rules

If we are to truly reduce our chances of missing a breach not just by 20% or 50% but by 10-50x, we need a far more efficient way to model millions of events. And we need a way to efficiently classify millions, even billions, of events into “okay” and “not okay” buckets.

The Future of Threat Detection

Optimally, an organization’s cyber analysts would be able to:

  • Model and learn millions of events far more efficiently, so efficiently that one person could easily do so in a single day;
  • Efficiently triage millions of events and classify them as “OK” or “Not OK”, again on a timeframe of days, not months.

How threat detection would work in that scenario:

  • You would be able to classify most of your events as “Known Good” and “Known Bad”
  • You would be notified about “Unknowns (good or bad)” and “Known Bad”
  • Every day you would classify “Unknowns” into “Known Good” and “Known Bad”, as sketched below
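A sketch of that daily cycle, in the same hypothetical style as above — fingerprint() and label_unknown() stand in for feature extraction and analyst (or automated) review:

    # A sketch of the daily classification cycle: known goods are
    # suppressed, known bads and unknowns are surfaced, and every
    # unknown gets a verdict that shrinks tomorrow's unknown set.
    # All names are illustrative.

    def daily_cycle(events, fingerprint, known_good, known_bad, label_unknown):
        alerts = []
        for event in events:
            fp = fingerprint(event)
            if fp in known_good:
                continue                      # suppressed: the vast majority
            alerts.append(event)              # known bad or unknown: notify
            if fp not in known_bad:           # unknown: classify it today
                if label_unknown(event) == "bad":
                    known_bad.add(fp)
                else:
                    known_good.add(fp)
        return alerts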

The efficacy of this process depends on two factors:

  • Speed: how long it takes to automatically classify a million events into “Known Good” and “Known Bad”
  • Effectiveness: any classification process can look perfect on its training data, but you want to apply it to new data as well, which means it will produce some false positives and false negatives. We want to make sure those error rates are very low (see the sketch below).
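A rough sketch of measuring both factors on held-out, labeled events — classify() and the labels are assumptions standing in for any concrete classifier and ground truth:

    # A sketch of evaluating a classifier on a non-empty, held-out
    # labeled set. classify() returns "good" or "bad"; labeled_events
    # pairs each event with its true verdict. Both are assumptions.

    import time

    def evaluate(classify, labeled_events):
        start = time.monotonic()
        false_positives = false_negatives = 0
        for event, truth in labeled_events:
            verdict = classify(event)
            if verdict == "bad" and truth == "good":
                false_positives += 1     # wasted analyst triage
            elif verdict == "good" and truth == "bad":
                false_negatives += 1     # a missed attack
        elapsed = time.monotonic() - start
        n = len(labeled_events)
        return {
            "events_per_second": n / elapsed if elapsed > 0 else float("inf"),
            "false_positive_rate": false_positives / n,
            "false_negative_rate": false_negatives / n,
        }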

At LogicHub we’re fundamentally rethinking and redesigning how all of the above can be achieved, not just as a technical accomplishment but as a contribution to an organization’s efficiency and success.