SIEM Next-Genness (2.0)
I previously wrote about SIEM next-genness, but I mixed and matched between features and architectures. That's a problem because a product can have a lot of features yet be sluggish, clunky, and a nightmare to manage, or it can be super well architected without letting you do much with it.
So I now give you a matrix of capabilities in rough order of next-genness, from legacy to 😵‍💫 too-next-gen.
As the big asterisk notes, take this as an illustration rather than an assessment. If you want proper peer-reviewed and fact-checked assessments, please read my research.
Architecturally
This is a list in order of how SIEM products are deployed and designed to store and process data.
Hardware deployment - Box in closet
Horizontal scalability - More boxes in closet
Virtualized deployment - a virtual machine that can either run on a box in your closet, or as a box in somebody else's closet (like AWS's). If you have unused boxes in your closet, you can deploy additional virtual machines to scale your SIEM.
Containerized deployment - a container that can either run on a box in your closet that also has an operating system, or in somebody else's box in a closet that has an operating system. Cool thing is that you can manage the containers rather flexibly with the Greek word for pilot.
Cloud native - these solutions are built on top of hyperscaler infrastructure and may even be available in cloud marketplaces. Cloud-native products can either be deployed in your own hyperscaler environment or consumed as SaaS, where the vendor manages the environment and you just give the web UI to your security analysts.
Microservices architecture - at the risk of sparking the monolith vs microservices debate, this indicates "newer" SIEMs that are deployed in containers, use K8s for scaling, and run components independently of each other to help mitigate knock-on effects. Most (if not all) microservices-based SIEMs are consumed as-a-service, because otherwise the customer needs to inherit the complexity of managing the monstrosity.
Native data pipeline management - This refers to applying filtering, normalization, routing, and correlation of data prior to ingestion/storage. It's a way to reduce both costs and volumes. There's a whole market for SIEM data pipelines (look at Cribl and co). Some SIEMs are starting to develop these features natively.
Decoupled/Distributed SIEM - This refers to keeping data management and threat analysis separate. For example, you can use a security data lake to store your logs at an arguably cheaper price, and the "analysis" module pulls data as it is queried. I link here Anton Chuvakin's blog post as I agree with his take and I only reference things I agree with.
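To make the pipeline idea a bit more tangible, here is a minimal Python sketch of filtering, normalizing, and routing events before they hit storage. Everything in it is an assumption for illustration: the field names, the severity sets, and the two destinations ("siem" and "data_lake") are hypothetical and do not reflect any vendor's schema or API.

```python
# Hypothetical raw events; the keys are illustrative, not a real log format.
RAW_EVENTS = [
    {"src": "fw01", "sev": "DEBUG", "msg": "heartbeat"},
    {"src": "fw01", "sev": "warn", "msg": "port scan from 10.0.0.7"},
    {"src": "ad01", "sev": "ERROR", "msg": "5 failed logins for user alice"},
]

DROP_SEVERITIES = {"debug"}             # filtering: noise that never reaches storage
HOT_SEVERITIES = {"error", "critical"}  # routing: what goes to the pricier SIEM index


def normalize(event: dict) -> dict:
    """Map source-specific keys onto one common shape."""
    return {
        "source": event["src"],
        "severity": event["sev"].lower(),
        "message": event["msg"],
    }


def run_pipeline(events: list) -> dict:
    """Filter, normalize, and route events; returns destination -> list of events."""
    destinations = {"siem": [], "data_lake": []}
    for raw in events:
        event = normalize(raw)
        if event["severity"] in DROP_SEVERITIES:
            continue  # dropped before ingestion, so it costs nothing to store
        target = "siem" if event["severity"] in HOT_SEVERITIES else "data_lake"
        destinations[target].append(event)
    return destinations


routed = run_pipeline(RAW_EVENTS)
for name, batch in routed.items():
    print(f"{name}: {len(batch)} event(s)")
```

The cost argument shows up in the routing step: only the events worth analyzing right away land in the expensive store, and the rest go to cheaper storage where a decoupled analysis module could still query them later.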
Featurally
This is a list in order of newer and supposedly more useful features to detect and respond to threats.
Machine Learning - some good old pre-GenAI statistical analysis to identify deviations from the baseline.
Deterministic Response and Automation - workflows or scripts that usually come from a SOAR acquisition or, rarely, from in-house development. This boils down to doing a bunch of API work using workflow logic. But mind you, good workflow engines are very comprehensive, and most of those natively available in SIEMs are not that advanced.
Detection-as-code - allowing your engineers to write their detection rules in YAML and use Sigma rules.
Self-tuning detection - your engineers may be too busy writing their detection rules in YAML, so the tool can optimize these rules by itself based on investigation results and flagged false positives.
Copilots - using LLMs to provide a natural language interface for the product. Analyst says what they want, and the copilot can spew out an SQL query or navigate the product to perform some actions. I think copilots are on the low end of LLM productivity gains.
Self-writing detection - using LLMs to write detection rules. Particularly suitable if the product supports detection-as-code.
AI-based response and automation - architecting LLMs with memory, retrieval, and guardrails to investigate threats, propose remediation, carry it out, and perhaps do all that autonomously.
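For readers who have not seen detection-as-code in practice, here is a rough Python sketch of the idea: the rule is plain data that lives in version control, and a small matcher applies it to events. The rule below is hypothetical and only loosely shaped like what a YAML rule file might parse into. It is not the Sigma specification, and the field names (`process`, `encoded`, `host`) are made up for illustration.

```python
# A hypothetical rule, shaped like the dict a YAML rule file might parse into.
RULE = {
    "title": "PowerShell with encoded command",
    "level": "high",
    "selection": {
        "process": ["powershell.exe", "pwsh.exe"],  # list = any of these values
        "encoded": True,                            # scalar = exact match
    },
}


def rule_matches(rule: dict, event: dict) -> bool:
    """True when every selection field matches (AND across fields, OR within a list)."""
    for field_name, expected in rule["selection"].items():
        allowed = expected if isinstance(expected, list) else [expected]
        if event.get(field_name) not in allowed:
            return False
    return True


# Hypothetical events, already normalized to the field names the rule expects.
EVENTS = [
    {"host": "ws-12", "process": "powershell.exe", "encoded": True},
    {"host": "ws-12", "process": "powershell.exe", "encoded": False},
    {"host": "ws-40", "process": "chrome.exe", "encoded": False},
]

alerts = [event for event in EVENTS if rule_matches(RULE, event)]
for alert in alerts:
    print(f'[{RULE["level"]}] {RULE["title"]} on {alert["host"]}')
```

Because the rule is just data, it can be reviewed, diffed, and tested like any other code. That is also what makes the self-tuning and self-writing items plausible: a tool or an LLM only has to edit a structured file rather than click through a UI.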


