When it comes to monitoring, some of the most important things can also be the simplest. Whether it be small analyst activities like basic triage or larger automated ones like whitelisting, actions speak louder than words. When deciding what the most helpful actions are, though, it’s important to first think of ways to measure the effects that those actions have on your network.

This is where metrics and statistical analysis come in. With a few simple measurements, any organization can make an efficient plan for detection, triage, and response, even on limited resources. The biggest problems become easier to prioritize and the smaller ones are easier to plan for later.

The following is a brief examination of the top five metrics for effective triage, case creation, and response, as compiled from both an operational and managerial perspective.

Coverage and Visibility

Before an analyst even enters a case, it’s important to make sure that the cases they’re receiving cover everything the organization wants covered. Operationally, checking this scope confirms that detections aren’t missing any key points in the network, that responses are full and complete, and that analysts have enough data to make a proper judgement. From a managerial standpoint, visibility metrics (such as a list or count of visible systems, applications, and users) create a better roadmap for integrating the rest of the environment and can provide valuable data on network blind spots.

As actions are taken on cases and tuning is performed, visibility should increase. Keeping this metric healthy requires constant awareness and explicit maintenance goals.
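
As a rough illustration, visibility can be tracked as the share of known assets that actually send data to monitoring. The sketch below assumes two hypothetical lists, an asset inventory and the assets currently visible to the SOC; the function names and hostnames are invented for the example.

```python
def coverage_ratio(inventory, monitored):
    """Fraction of inventoried assets that are visible to monitoring."""
    inventory = set(inventory)
    if not inventory:
        return 0.0
    return len(inventory & set(monitored)) / len(inventory)


def blind_spots(inventory, monitored):
    """Inventoried assets that monitoring cannot currently see."""
    return sorted(set(inventory) - set(monitored))


# Hypothetical data: four known assets, three of them reporting.
inventory = ["web-01", "web-02", "db-01", "hr-laptop-07"]
monitored = ["web-01", "db-01", "web-02"]

print(coverage_ratio(inventory, monitored))  # 0.75
print(blind_spots(inventory, monitored))     # ['hr-laptop-07']
```

Tracking the blind-spot list alongside the ratio gives the integration roadmap something concrete to work from.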

Time-to-Decision

Sometimes an analyst’s ability to research a case is hindered. Perhaps they don’t have all the data they need at hand, or the method to find that data is obscured. Both issues increase the time it takes to reach a decision and are likely to muddy metrics based on escalations, visibility, and overall effectiveness.

To avoid problems with Time-to-Decision metrics, it’s important to keep an eye on documentation accuracy and on how easily analysts can reach the resources they need. When either slips, time is lost on what could be a crucial detection.
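
One way to measure this, sketched under the assumption that each case record carries a creation timestamp and a decision timestamp (the field names `opened_at` and `decided_at` are hypothetical), is to take the median and worst-case time per case:

```python
from datetime import datetime
from statistics import median


def decision_minutes(cases):
    """Minutes from case creation to analyst decision, skipping undecided cases."""
    return [
        (c["decided_at"] - c["opened_at"]).total_seconds() / 60
        for c in cases
        if c.get("decided_at") is not None
    ]


# Hypothetical case records; case 4 has no decision yet and is skipped.
cases = [
    {"id": 1, "opened_at": datetime(2022, 7, 1, 9, 0), "decided_at": datetime(2022, 7, 1, 9, 15)},
    {"id": 2, "opened_at": datetime(2022, 7, 1, 9, 5), "decided_at": datetime(2022, 7, 1, 11, 5)},
    {"id": 3, "opened_at": datetime(2022, 7, 1, 10, 0), "decided_at": datetime(2022, 7, 1, 10, 45)},
    {"id": 4, "opened_at": datetime(2022, 7, 1, 10, 30), "decided_at": None},
]

minutes = decision_minutes(cases)
print(median(minutes))  # 45.0
print(max(minutes))     # 120.0
```

The median is used here rather than the mean so that a single stalled case does not hide the typical experience; the maximum then shows how bad the stalled cases get.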

Response Rate

We’ve all had our overwhelming moments: too many projects, too little sleep, too little coffee. For an analyst, that ‘burnout’ when facing a mountain of cases can result in a poor response rate - the rate at which an analyst makes useful responses (or any response at all). The underlying causes of this burnout range from an excess of volume to a lack of detail in a case, or even an excess of detail. Whatever the cause, an operation suffers from a lack of useful responses in more ways than one.

For instance, every other metric suffers in some way from a poor response rate. Other analysts looking for more information on a case may be left disappointed and have to repeat the research, the overall time to complete a case may grow because more cases need to be redone, and visibility stops mattering when no useful product comes from it.

In order to avoid a poor response rate, operations teams should make a special effort to increase automation, decrease false positives, decrease overall case count, and decrease complexity of threat investigation tools.
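
A minimal sketch of the measurement, assuming each case is flagged with whether it received any response and whether that response was judged useful (both flags are hypothetical fields, and how "useful" is judged is left to the team):

```python
def response_rates(cases):
    """Share of cases with any response, and share with a useful response."""
    total = len(cases)
    if total == 0:
        return {"any_response": 0.0, "useful_response": 0.0}
    responded = [c for c in cases if c["responded"]]
    useful = [c for c in responded if c["useful"]]
    return {
        "any_response": len(responded) / total,
        "useful_response": len(useful) / total,
    }


# Hypothetical case records.
cases = [
    {"id": 1, "responded": True, "useful": True},
    {"id": 2, "responded": True, "useful": False},
    {"id": 3, "responded": False, "useful": False},
    {"id": 4, "responded": True, "useful": True},
]

print(response_rates(cases))
# {'any_response': 0.75, 'useful_response': 0.5}
```

A wide gap between the two rates suggests analysts are responding but without enough time or detail to say anything useful.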

Time-to-Escalation

Unlike Time-to-Decision, the Time-to-Escalation metric focuses on the speed at which cases move past being researched. Are methods of remediation ready at hand? Is interoperability with departments outside of the SOC sufficient for remediation the SOC can’t perform? Are the tools to escalate a problem fast and easy to use?

Each of these questions is an important consideration when thinking about the time it takes to escalate an incident. Without the tools to escalate quickly, cases are left hanging in limbo, neither being actioned nor waiting to be remediated.
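
The sketch below is one possible way to put numbers on this. It assumes the clock starts when the analyst decides a case needs escalation and stops when the escalation is actually handed off; the field names and the one-hour limbo threshold are hypothetical choices, not a standard.

```python
from datetime import datetime, timedelta


def escalation_minutes(cases):
    """Minutes from the decision to escalate until the hand-off happened."""
    return [
        (c["escalated_at"] - c["decided_at"]).total_seconds() / 60
        for c in cases
        if c.get("escalated_at") is not None
    ]


def cases_in_limbo(cases, now, max_wait=timedelta(hours=1)):
    """IDs of cases awaiting escalation for longer than max_wait."""
    return [
        c["id"]
        for c in cases
        if c.get("escalated_at") is None and now - c["decided_at"] > max_wait
    ]


# Hypothetical records; every case here was judged to need escalation.
cases = [
    {"id": 1, "decided_at": datetime(2022, 7, 1, 9, 15), "escalated_at": datetime(2022, 7, 1, 9, 25)},
    {"id": 2, "decided_at": datetime(2022, 7, 1, 11, 5), "escalated_at": datetime(2022, 7, 1, 11, 35)},
    {"id": 3, "decided_at": datetime(2022, 7, 1, 10, 45), "escalated_at": None},
]

print(escalation_minutes(cases))                            # [10.0, 30.0]
print(cases_in_limbo(cases, datetime(2022, 7, 1, 12, 30)))  # [3]
```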

Ratio of Success

This metric may have been one of the first to come to mind when considering measurements of effectiveness. When a case arrives, the primary goal is to resolve the problem that led to it. If a SOC’s Ratio of Success is high, a given case is more likely to end in remediation, which should in turn reduce the number of future cases arising from the same problem. This success rate works hand-in-hand with the prior metrics to quickly cut the number of cases, and the time spent working them, down to only the most necessary interactions.

A higher ratio of success will also result in better visibility over time. As cases are resolved and issues patched, more of the network becomes familiar.
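
As a simple illustration, the ratio can be computed as remediated cases over total cases. This assumes each case carries a hypothetical `remediated` flag; a team might reasonably choose a different denominator, such as only confirmed incidents.

```python
def ratio_of_success(cases):
    """Fraction of cases that ended in a remediation."""
    if not cases:
        return 0.0
    return sum(1 for c in cases if c["remediated"]) / len(cases)


# Hypothetical case records: four of five were remediated.
cases = [
    {"id": 1, "remediated": True},
    {"id": 2, "remediated": True},
    {"id": 3, "remediated": False},
    {"id": 4, "remediated": True},
    {"id": 5, "remediated": True},
]

print(ratio_of_success(cases))  # 0.8
```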
