The future of security operations depends on AI agents, not LLMs

What are AI agents?

“Is it an Agent, or just a Program?” is a widely cited paper, with more than 5,000 citations, that highlights how hard this seemingly intuitive question is to answer. Stuart Russell and Peter Norvig, in their well-regarded AI textbook “Artificial Intelligence: A Modern Approach,” define an agent as “anything that can be viewed as perceiving its environment through sensors and acting upon that environment through actuators.” For example, a wall thermostat measures the ambient temperature and turns on the heat when the temperature drops below a set value. Unlike programs, agents exhibit autonomy when interacting with their environment, using actions to achieve established goals.

The terms agent, action and environment related to AI agents are also key to an important field of machine learning (ML) called reinforcement learning (RL). RL agents learn to optimize actions in an environment to achieve a goal. Challenged with a metric to optimize, RL agents evaluate their environment and often discover strategies that prove better than human approaches, sometimes sacrificing short-term rewards for long-term gains.

For example, RL together with deep learning has led to breakthroughs such as AlphaGo. AlphaGo defeated the human world champion in a game called Go, which has a substantially larger action space than chess. AlphaGo won by using techniques such as Monte Carlo Tree Search (MCTS) to explore sequences of actions and predict those most likely to win, without needing exhaustive search. 

Large language models (LLMs) also benefit from RL. For example, RL from human feedback (RLHF) helps align LLM outputs with human expectations. In addition, recent advances use MCTS-like techniques to explore multiple candidate solutions to math and coding problems with verifiable answers.

Stand-alone LLMs are generally considered programs, even though they use agentic elements during training. Certain LLM architectures can be made deterministic by adjusting decoding parameters, and the variability in outputs often results from artificial inputs or system factors such as batch processing on distributed systems. While LLMs lack true autonomy and learning, it can be argued they “learn” by being continuously retrained on new data and refined by user feedback. However, the point at which LLMs can be connected to digital environments, track user behavior, and adjust their outputs to optimize metrics still seems a ways off, though not unreachable.
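The determinism point is easy to demonstrate: most output variability disappears once sampling is turned off. Below is a minimal sketch assuming the Hugging Face transformers API; the model name is purely illustrative.

```python
# Greedy (deterministic) decoding with Hugging Face transformers.
# The model name is illustrative; any causal LM behaves the same way here.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("A wall thermostat turns on the heat when", return_tensors="pt")

# do_sample=False picks the highest-probability token at every step,
# so repeated runs on the same hardware yield the same output.
output_ids = model.generate(**inputs, do_sample=False, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```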

AI agent fixes for LLM shortcomings

Three inherent features of LLMs result in shortcomings when they are used in stand-alone mode. AI agents can compensate for each of them and improve performance, particularly in applications related to security operations.

Sequential generation problems for LLMs

LLMs are limited by the way they generate responses. They answer word-by-word (or token-by-token), with each word influenced by the ones before it. If an LLM begins to hallucinate, it continues to build on the incorrect information until the end of its response. LLMs don’t truly “understand” their outputs; they simply predict the most probable next word. Contextual confusion is inevitable, particularly when words have different meanings in different fields. Unlike AI agents, LLMs cannot pause and correct themselves, so it’s important to restart or edit earlier parts of a conversation rather than continuing a faulty response.

AI agents, on the other hand, can work in pairs. While one generates an answer, the other validates its accuracy, regenerating content if hallucinations occur. The same LLM can be used under the hood for both roles, but from a human perspective it’s easier to model this as a pair of specialized agents. Optionally, different agents can use different LLMs for performance, latency and cost benefits. For example, a security summarizer AI agent could use a simpler, more cost-effective model fine-tuned for the task.
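A minimal sketch of this generation-validation pattern follows. The call_llm helper is a placeholder for whatever LLM client is actually in use; the prompts and the retry limit are illustrative assumptions, not a specific product’s API.

```python
# Sketch of a generator-validator agent pair. Both roles can share one LLM or
# use different ones for cost and latency reasons; call_llm is a placeholder.
def call_llm(system_prompt: str, user_prompt: str) -> str:
    raise NotImplementedError("wire this to the LLM provider of your choice")

def generate_answer(question: str) -> str:
    return call_llm("You are a security analyst. Answer concisely.", question)

def validate_answer(question: str, answer: str) -> bool:
    verdict = call_llm(
        "You are a reviewer. Reply only PASS or FAIL. "
        "FAIL if the answer contains claims not supported by the question.",
        f"Question: {question}\nAnswer: {answer}",
    )
    return verdict.strip().upper().startswith("PASS")

def answer_with_validation(question: str, max_attempts: int = 3) -> str:
    # Regenerate until the validator accepts the answer or attempts run out.
    for _ in range(max_attempts):
        answer = generate_answer(question)
        if validate_answer(question, answer):
            return answer
    return "Unable to produce a validated answer."
```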

LLM limited reasoning capabilities

Another problem with LLMs is that they often jump straight to a solution, skipping intermediate steps. Users typically want concise answers, and the cost per token encourages brevity, but this approach can reduce the quality of the output.

Numerous studies show that asking LLMs to think step-by-step (chain of thought) significantly improves performance on logical questions. Some LLMs employ “scratch space” or memory tokens to track intermediate results or list pros and cons before giving a final answer. Agentic flows, however, offer much finer control over LLM outputs. For example, an AI agent might prompt the generation-validation team to explore multiple perspectives, record intermediate steps, and assess progress. This approach is particularly valuable for problems that require iterative exploration and lack a well-defined path, such as a security investigation: agents can interact with the environment using tools like a computer terminal or vision-language models and quickly test multiple strategies. This mirrors RL’s approach to optimizing actions.
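The scratchpad-plus-tools pattern described above can be sketched as a simple loop. call_llm is again a placeholder, the NEXT_COMMAND/FINAL protocol is an illustrative convention, and the tool here is a stubbed shell command; none of this is tied to a particular framework.

```python
# Sketch of an agentic investigation loop: keep intermediate notes in a
# scratchpad, and let the model either request a tool call or conclude.
import subprocess

def call_llm(prompt: str) -> str:
    raise NotImplementedError("wire this to an LLM provider")

def run_command(cmd: list[str]) -> str:
    # Tool: run a (read-only) shell command and return its output.
    return subprocess.run(cmd, capture_output=True, text=True).stdout

def investigate(alert: str, max_steps: int = 5) -> str:
    scratchpad = [f"Alert: {alert}"]
    for _ in range(max_steps):
        step = call_llm(
            "Think step by step. Given the notes below, reply with either\n"
            "NEXT_COMMAND: <shell command>  or  FINAL: <conclusion>\n\n"
            + "\n".join(scratchpad)
        )
        if step.startswith("FINAL:"):
            return step.removeprefix("FINAL:").strip()
        if step.startswith("NEXT_COMMAND:"):
            cmd = step.removeprefix("NEXT_COMMAND:").strip().split()
            scratchpad.append(f"$ {' '.join(cmd)}\n{run_command(cmd)}")
    return "Investigation inconclusive after the maximum number of steps."
```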

Limited relevant knowledge and planning capabilities for LLMs

LLMs struggle to answer questions about data that was unavailable during their training. As a result, they often rely on retrieval-augmented generation (RAG), fetching external data and adding it to the input question. AI agents, on the other hand, enhance RAG systems by validating and filtering the retrieved documents and providing multiple perspectives on the environment, including graph-based approaches. AI agents are also more flexible than LLMs at constructing knowledge graphs from collections of unnormalized documents, and they help with planning (which has proven challenging for LLMs) by breaking complex tasks into smaller, more manageable subtasks that are handed to LLM-backed agents.
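The document-filtering role an agent plays in a RAG pipeline can be sketched as follows. call_llm and vector_search are placeholders for an LLM client and a vector store; the YES/NO relevance check is an illustrative convention.

```python
# Sketch of agent-filtered RAG: retrieve candidate documents, let a filtering
# agent discard irrelevant ones, then answer using only what survives.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("wire this to an LLM provider")

def vector_search(query: str, k: int = 10) -> list[str]:
    raise NotImplementedError("wire this to a vector store")

def is_relevant(question: str, document: str) -> bool:
    verdict = call_llm(
        f"Question: {question}\nDocument: {document}\n"
        "Reply YES if the document helps answer the question, otherwise NO."
    )
    return verdict.strip().upper().startswith("YES")

def answer_with_rag(question: str) -> str:
    candidates = vector_search(question)
    relevant = [doc for doc in candidates if is_relevant(question, doc)]
    context = "\n\n".join(relevant) if relevant else "No relevant documents found."
    return call_llm(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")
```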

AI agents in security operations 

Security operations, or SecOps, is an umbrella term for teams more popularly known as the SOC, NOC, AppSec, GRC, IAM operations, Threat Hunters, and a long tail of niche areas. These teams work 24/7 to keep their organizations secure by investigating and responding to security alerts, disrupting threat actors, patching vulnerabilities, conducting security training and reviews, assessing third-party risk, answering customers’ security questionnaires, and more.

Traditional approaches to automation have had limited success in this area because security decisions require organizational context that either lives only in people’s heads or is captured in unstructured documents. In addition, the security landscape evolves rapidly, so decisions require correlating and pattern matching across diverse data points in real time. Due to these limits, security operations has remained heavily manual. But trained SecOps personnel are hard to find, so organizations are caught between a rock and a hard place.

Enter AI agents. There are many SecOps tasks that AI agents are beginning to take on today, including triaging alerts like a SOC Analyst, behavior-based hunting for threat actors like a Threat Hunter, answering security questionnaires, and assessing vendor risk. As LLM generation costs decrease and speed improves, it will become feasible to run network environments where blue and red agent teams continuously hone their skills. Blue team AI agents could become ubiquitous and able to deliver cost-effective defense against increasingly complex red team attacks. Agentic flows will continue to enhance red teaming, particularly in areas such as jailbreaking, even as regulations scrutinize offensive uses of LLMs. Security experts who can draw insights from blue team AI agents across multiple environments will be invaluable, especially as the ease of use of agentic tools risks eroding individual practitioners’ knowledge.
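As one concrete illustration of the alert-triage example above, a first-pass triage step might look like the sketch below. The field names, verdict labels, and call_llm helper are hypothetical and do not describe any particular product’s API.

```python
# Sketch of an LLM-backed first-pass alert triage step.
import json

def call_llm(prompt: str) -> str:
    raise NotImplementedError("wire this to an LLM provider")

def triage_alert(alert: dict) -> dict:
    response = call_llm(
        "You are a SOC analyst. Given the alert below, return JSON with keys "
        "'verdict' (true_positive | false_positive | needs_human), "
        "'severity' (low | medium | high) and 'rationale'.\n"
        + json.dumps(alert, indent=2)
    )
    try:
        return json.loads(response)
    except json.JSONDecodeError:
        # Fall back to a human when the model output cannot be parsed.
        return {"verdict": "needs_human", "severity": "unknown",
                "rationale": "Model output was not valid JSON."}
```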

The potential of AI in security operations

GenAI has tremendous potential to automate laborious security operations. However, it takes expertise to get predictable results and contain the risks. The most promising path forward is AI agents built by security experts that use LLMs internally. Agents are not just an interesting technology; they are rapidly becoming a necessity in security operations: as the rate and complexity of attacks continue to rise and staffing remains limited, AI agents are the only practical way to close the gap. Security operations teams would be wise to evaluate and pilot AI agents in 2025.


