Security threats show no signs of abating, so security analysts need every useful tool they can get to detect, analyze, and stop threats as quickly as possible. In this post, I’m going to talk about an underutilized tool for threat detection: baselines.

A baseline is a set of behavioral data that serves as a reference for establishing normal IT activities, making it easier for security analysts to identify anomalies that indicate the presence of threats.

When analyzed with rich contextual detail, baselines make threat detection faster and more accurate. The key to their usefulness is building baselines that are sufficiently complex, taking into account enough factors (contextual details) to ensure that benign anomalies are not mistaken for genuine threats.

Context Matters: Regular Haystacks, Irregular Needles

We rarely have black-and-white information that makes a security threat obvious and unmistakable. Instead, we have a torrent of details to sort through and examine. To understand the significance of any detail (everything from an account login to a file transfer to a CPU spike), we need contextual data. We need to know what’s normal in a situation. More specifically, we need to know what’s likely and what’s unlikely in a given situation.

We can begin with this premise: If something happens regularly, it’s less likely to be a threat. In their busy work in Security Operation Centers (SOCs), analysts should pay attention first to the anomalies, the things that don’t happen regularly. The threats are going to be the exceptions, the activities that haven’t occurred regularly before, even if their difference or uniqueness is not especially jarring at first glance.

This premise helps us prioritize how we use our time, which is perhaps the most critical resource in a SOC.

Regardless of what we find when we investigate an anomaly, we’re always learning. If we determine that an event is an anomaly but not actually a threat, that’s useful information for us to have for interpreting alerts in the future.

Building Baselines to Model What’s Likely

To recognize what’s likely and what’s unlikely in a particular context, we need to build and analyze baselines. There’s an art to this.

Here are seven ways to use baselines effectively for threat detection.

  1. Begin by recognizing the goal of baselines for alert triage: we need to prioritize our investigations, identify the issues that genuinely deserve attention, and make security analysts more productive.
    If you look at all the processes running on a system that is behaving erratically, you might find 200 things to look at. But if you know that 195 of these things are normal (that is, there’s a high probability of them occurring just as they are), then you can focus your attention on the 5 things that are anomalous. These 5 things might be benign or they might be threats, but you can determine which more quickly now, because you’ve eliminated the need to examine everything else happening on that system.

  2. Use historical data—carefully and consistently.
    To figure out if something is normal, it’s useful to have data going back 30 days or more. Keep in mind, though, that patterns might vary within this 30-day period. A 30-day period at the end of the fiscal quarter might differ from a 30-day period at the beginning of the quarter. There could be weekly cycles or patterns tied to special events. When comparing the frequency of activities, make sure you’re not comparing apples to oranges. Don’t assume that the finance department’s behavior is identical in the first month of the quarter and the last. Don’t compare a day’s activities to a month’s, and don’t compare the activities of a three-person department with those of a thirty-person department.

  3. Beware of averages.
    The average frequency of an event can be misleading. For example, if a user accesses an ERP system three times a week, two weeks in a row, that might appear normal. But if every access in the first week occurred during business hours, and two of the accesses in the second week occurred in the middle of the night, something could be suspicious about the second week. Look at all the relevant details before deciding that averages are reliable predictors of threat status.

  4. Build a baseline model that uses probabilities.
    If something has happened repeatedly before, you can likely whitelist it, flagging it as something not to worry about. But you don’t have to treat any occurrence as a Boolean activity (happened/didn’t happen). Sophisticated threat detection platforms enable you to work with probabilities. If something is 95% likely to happen, and it happens, that’s fine. But if something else was only 5% likely to happen, and it happens, that might merit investigation. Make sure you adopt a threat model that can take probabilities into account when analyzing baselines.

  5. Profile the behavior of applications, systems, and users—whatever is useful.
    Don’t limit your baseline model to just user behavior. Broaden the scope of your behavioral analysis to include applications, servers, and end-user devices. Your goal should be to collect whatever data will prove helpful for threat detection. Make sure the threat analysis platform you’re using supports a broad range of factors of various types.

  6. Find a threat analysis platform that supports factor analysis.
    Factor analysis is a well-established statistical method that describes the variability among a set of observed, correlated variables in terms of a smaller set of unobserved variables known as factors. Essentially, it enables organizations to identify and rank possible causes behind a set of observations. Automating this analysis for alert triage reduces the workload of security analysts and reduces Mean Time to Detection (MTTD) for threats.

  7. Apply baselines and factor analysis not just for threat detection but also for alert triage.
    Many IT security tools raise an alert whenever they detect an anomaly, suggesting that a security analyst investigate it. That’s useful, but today’s Security Operation Centers (SOCs) are inundated with alerts, most of which turn out to be false positives. Use baselines and factor analysis for alert triage as well, winnowing the list of alerts down to the very few that genuinely merit attention. This is an application of baselining that is too rarely used in SOCs today.
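
To make points 3 and 4 concrete, here is a minimal Python sketch of a probability-based baseline. It is an illustration under stated assumptions, not a description of how any particular platform works: the function names, the two time-of-day buckets ("business" and "off_hours"), the smoothing step, and the 5% threshold are all hypothetical choices made for this example. The idea is to score each new event by how often the same user touched the same resource in the same context in the past, instead of relying on a raw weekly average.

```python
from collections import Counter
from datetime import datetime, timedelta

def hour_bucket(ts: datetime) -> str:
    """Coarse context (an assumption): weekday 09:00-18:00 vs. everything else."""
    return "business" if ts.weekday() < 5 and 9 <= ts.hour < 18 else "off_hours"

def build_baseline(history):
    """Count past events per (user, resource, bucket) and per (user, resource)."""
    counts, totals = Counter(), Counter()
    for user, resource, ts in history:
        counts[(user, resource, hour_bucket(ts))] += 1
        totals[(user, resource)] += 1
    return counts, totals

def likelihood(baseline, user, resource, ts, smoothing=1.0):
    """Smoothed share of this user's past accesses that fell in the same bucket.

    Add-one smoothing over the two buckets keeps unseen contexts above zero.
    A user/resource pair with no history at all scores 0.5 here, so pairs
    that have never been seen would need separate handling.
    """
    counts, totals = baseline
    seen = counts[(user, resource, hour_bucket(ts))]
    total = totals[(user, resource)]
    return (seen + smoothing) / (total + 2 * smoothing)

# Hypothetical history: one ERP access at 10:00 on each weekday for four weeks.
start = datetime(2024, 1, 1, 10, 0)  # a Monday
history = [
    ("alice", "erp", start + timedelta(days=7 * week + day))
    for week in range(4)
    for day in range(5)
]
baseline = build_baseline(history)

day_event = ("alice", "erp", datetime(2024, 1, 30, 10, 30))   # Tuesday morning
night_event = ("alice", "erp", datetime(2024, 1, 31, 2, 15))  # Wednesday, 02:15

THRESHOLD = 0.05  # hypothetical cut-off for "unlikely enough to investigate"
flagged = [e for e in (day_event, night_event) if likelihood(baseline, *e) < THRESHOLD]

for event in (day_event, night_event):
    print(event[2].isoformat(), round(likelihood(baseline, *event), 3))
```

With this made-up history, the daytime access scores about 0.95 and the 02:15 access scores about 0.045, so only the night-time access is flagged, even though both weeks could show the same average number of accesses. A real baseline would use more contextual factors than time of day and would compare like with like, as described in point 2.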

Building Baseline Models with LogicHub

At LogicHub, we’ve built the LogicHub Security Automation Platform to apply machine learning to the problem of threat analysis and threat detection. Our platform enables security analysts to build and refine baseline models for their organizations, so they can quickly triage alerts, reducing thousands of alerts to the few that require attention and might signal the presence of a genuine threat.

Some of our baseline models, such as phishing triage, are broadly applicable across organizations. Others can be built and customized for the IT patterns specific to an organization. The LogicHub platform makes it easy for a SOC to customize baseline models to account for the IT patterns in its organization, providing analysts with a highly accurate basis for performing alert triage.

To learn how the LogicHub platform can help your security team build baseline models and apply factor analysis to eliminate 95% of false positives and make your SOC 10X more effective, please contact us.