The MITRE ATT&CK Gap: Anthropic LLM ATT&CK Navigator Insights • William OGOU Cybersecurity Blog

Your threat intelligence feeds are losing their predictive power. For years, the cybersecurity industry relied on frameworks like MITRE ATT&CK to categorize threat actors by their technical sophistication and the breadth of techniques they employed. But artificial intelligence has fundamentally broken that scale.

In a landmark analysis covering March 2025 to March 2026, Anthropic partnered with Verizon to dissect the behavior of 832 malicious accounts banned for cyber-policy violations. Mapping 13,873 observed actions against the MITRE ATT&CK framework revealed a chilling reality: the barrier to entry for post-compromise exploitation has vanished. The percentage of medium-to-high-risk AI-enabled actors jumped from 33% to 56% in under a year.

Worse, AI is stripping away the unique fingerprints that defenders use to attribute attacks to specific hacker groups. Here is a deep dive into Anthropic’s findings, the introduction of the ARiES risk score, and why the cybersecurity industry must rewrite its taxonomy for the age of autonomous agents.

LLM ATT&CK Navigator Data

What to Remember

Post-Compromise Shift: Threat actors are moving from using AI for basic malware generation (T1587) to complex live-environment operations like lateral movement and account discovery (T1087).
Attribution is Dying: AI homogenizes Tactics, Techniques, and Procedures (TTPs). When an LLM generates the attack path and code dynamically, a script kiddie’s telemetry looks identical to that of a nation-state APT.
Scaffolding Beats Skill: Technical sophistication no longer dictates an actor’s risk level. The new metric for danger is agentic scaffolding: how well an attacker orchestrates an AI to act autonomously.
The MITRE Gap: There are no ATT&CK IDs for “Autonomous Killchain Orchestration” or “AI-directed pivot decisions.” The framework must evolve to track agentic behavior.

The Data: AI Pushes Amateurs into Live Exploitation

Historically, the attack lifecycle operated on a steep skill curve. Anyone could buy a phishing kit or a commodity remote access trojan (RAT), but successfully navigating a compromised Windows domain (executing Kerberos ticket attacks or pivoting via WMI) required deep technical expertise.

Anthropic’s LLM ATT&CK Navigator data proves this is no longer true.

In the first half of the study, AI usage was heavily concentrated in the preparatory phases. 69% of actors used AI for T1587 (Develop Capabilities), mostly writing custom scripts or obfuscating payloads (T1027).

However, in the second half of the year, a distinct shift occurred. While initial access techniques like phishing (T1566) actually decreased by 8.6%, post-compromise techniques spiked. AI-assisted Account Discovery (T1087) rose by 8.9%, and Automated Exfiltration (T1020) increased by 6.2%.
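A shift like this can be measured by comparing, per technique, the share of actors who used it in each period. The following is a minimal sketch with made-up toy data, not Anthropic's dataset or method. The function names `technique_share` and `share_shift` are hypothetical, and the sketch assumes the change is expressed in percentage points, which the report summary above does not specify.

```python
from collections import Counter


def technique_share(actors: dict[str, set[str]]) -> dict[str, float]:
    """Return the percentage of actors that used each ATT&CK technique ID."""
    if not actors:
        return {}
    counts = Counter(t for techniques in actors.values() for t in techniques)
    return {t: 100.0 * n / len(actors) for t, n in counts.items()}


def share_shift(
    first: dict[str, set[str]], second: dict[str, set[str]]
) -> dict[str, float]:
    """Percentage-point change in actor share per technique (second minus first)."""
    before = technique_share(first)
    after = technique_share(second)
    all_ids = sorted(set(before) | set(after))
    return {t: round(after.get(t, 0.0) - before.get(t, 0.0), 1) for t in all_ids}


# Toy data (invented for illustration): actor ID -> techniques observed.
first_half = {
    "actor-a": {"T1587", "T1566"},
    "actor-b": {"T1587", "T1027"},
    "actor-c": {"T1566"},
    "actor-d": {"T1587"},
}
second_half = {
    "actor-e": {"T1087", "T1020"},
    "actor-f": {"T1587", "T1087"},
    "actor-g": {"T1566", "T1587"},
    "actor-h": {"T1087"},
}

shift = share_shift(first_half, second_half)
for technique_id, delta in shift.items():
    print(f"{technique_id}: {delta:+.1f} points")
```

In the toy data, the preparatory techniques lose share while Account Discovery (T1087) and Automated Exfiltration (T1020) gain it, which is the same direction of travel the report describes.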

The data highlights that lateral movement is the ultimate indicator of a high-risk actor. The 54 actors in the dataset who used AI for lateral movement averaged an ARiES (AI Risk Enablement Score) of 56.4, nearly 10 points above the mean. AI is effectively acting as a senior red team mentor, holding the hands of low-skill operators as they navigate live, hostile enterprise networks.

The Threat Intel Crisis: Why AI Breaks TTP Attribution

If an AI writes the malware, chooses the lateral movement path, and executes the commands, who is the attacker?

For decades, threat intelligence teams attributed cyberattacks to specific Advanced Persistent Threats (APTs) by tracking their unique Tactics, Techniques, and Procedures (TTPs). APT28 might use a specific PowerShell obfuscation routine; Lazarus Group might favor custom SMB exploits. These human “tells” allowed defenders to cluster activity and anticipate next steps.

Generative AI destroys this paradigm.

The Homogenization of Code: When a low-tier cybercriminal and a nation-state operator both ask Claude Code to generate a payload for an SSRF vulnerability, the resulting code lacks the idiosyncratic flaws, variable naming conventions, or structural quirks of human malware authors. The code is homogenized, syntactically clean, and sterile.
Dynamic Attack Paths: Human attackers rely on playbooks. They execute steps A, B, and C because that is what they trained on. Autonomous AI agents, however, assess the environment dynamically. If an AI agent hits a honeypot, it pivots in real time based on system variables, creating a highly randomized attack path that defies traditional playbook clustering.
The Skill-Level Illusion: The Anthropic report explicitly notes that the number of techniques an actor uses no longer correlates with their actual skill. The least-skilled actors deployed 16 distinct techniques; the most skilled used 20. When an entry-level attacker can orchestrate a 20-step, multi-tactic intrusion using an LLM, their network telemetry looks completely indistinguishable from a highly resourced state-sponsored team.

Threat attribution based solely on TTPs is rapidly becoming impossible.
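To make the clustering problem concrete, here is a small sketch of one common way to compare actors: Jaccard similarity over the sets of technique IDs each one used. The technique sets below are invented for illustration and do not come from the Anthropic report; the point is only that when two very different operators delegate to the same kind of agent, a set-overlap measure has nothing left to separate them.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0  # two empty sets are treated as identical
    return len(a & b) / len(a | b)


# Hypothetical technique sets for two human crews with distinct playbooks.
human_crew_a = {"T1566", "T1059", "T1027"}
human_crew_b = {"T1190", "T1021", "T1003"}

# Hypothetical technique sets for two AI-driven intrusions run by
# operators of very different skill levels.
agent_run_low_skill = {"T1190", "T1087", "T1021", "T1020"}
agent_run_state_team = {"T1190", "T1087", "T1021", "T1020"}

print(jaccard(human_crew_a, human_crew_b))                  # 0.0
print(jaccard(agent_run_low_skill, agent_run_state_team))   # 1.0
```

A score of 0.0 lets an analyst tell the two human crews apart; a score of 1.0 gives no basis for attributing the two agent-driven intrusions to different actors.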

Agentic Scaffolding: The New Metric for Danger

If technical skill and TTP breadth no longer define a high-risk actor, what does? Anthropic’s analysis points to a new metric: Agentic Scaffolding.

The difference between a script kiddie and a critical threat is no longer whether they can write a buffer overflow; it is how effectively they wrap an AI model in tools (like Model Context Protocol servers) to allow it to operate autonomously.

Take the case of GTG-1002, a threat actor disrupted in November 2025. GTG-1002 achieved a perfect ARiES risk score of 100. Yet they used only 30 MITRE techniques, a number comparable to many medium-risk actors.

What made GTG-1002 a maximum-risk threat was the scaffolding:

Autonomous Tool Use: They deployed Claude Code on a Kali Linux machine, giving the AI access to penetration testing tools via MCP servers.
No Human Checkpoints: The AI scanned networks, discovered admin portals, exploited an SSRF vulnerability, and dumped AWS Secrets Manager credentials completely autonomously.
Human Intent, AI Execution: The human operator only provided the strategic goal (e.g., “Map the network and exfiltrate the proprietary workflow database”). The AI made all the tactical pivot decisions.

The highest-risk actors are transitioning from using AI as an advisor to using it as an execution engine.

The MITRE ATT&CK Gap: Missing the Autonomous Killchain

The MITRE ATT&CK framework is the bedrock of modern SOC operations, but Anthropic’s report exposes a critical blind spot: the framework is built to categorize what happened, not who or what made the decision.

Anthropic observed 13,873 actions that mapped to existing MITRE IDs. But the behaviors that actually made attacks like GTG-1002 devastating simply do not exist in the taxonomy.

There is currently no ATT&CK ID for:

Autonomous Killchain Orchestration: The act of an agent chaining Discovery, Credential Access, and Lateral Movement without human input.
AI-Directed Pivot Decisions: An agent altering its attack path dynamically based on real-time terminal output.
LLM Tool-Augmented Operations: The specific integration of offensive security tools into an LLM context window via MCP.

Anthropic is currently in discussions with MITRE to evolve the framework. Until new cross-cutting categories are created that address agentic behavior, defenders will struggle to build SIEM rules that accurately flag autonomous AI intrusions.
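Until such categories exist, one stopgap is to annotate each observed action with who made the decision, alongside its ATT&CK ID. The sketch below is purely illustrative: the `DecisionMaker` enum, the `ObservedAction` record, and the `longest_autonomous_run` helper are hypothetical names, not part of ATT&CK or of anything Anthropic or MITRE has published, and the timeline is invented.

```python
from dataclasses import dataclass
from enum import Enum


class DecisionMaker(Enum):
    """Hypothetical annotation: who chose to perform the action."""
    HUMAN = "human"
    AI_ADVISED = "ai_advised"        # human acted on AI advice
    AI_AUTONOMOUS = "ai_autonomous"  # agent acted without a human checkpoint


@dataclass(frozen=True)
class ObservedAction:
    technique_id: str   # existing ATT&CK technique ID
    tactic: str         # ATT&CK tactic the action falls under
    decided_by: DecisionMaker


def longest_autonomous_run(actions: list[ObservedAction]) -> list[ObservedAction]:
    """Return the longest consecutive run of actions the agent decided alone."""
    best: list[ObservedAction] = []
    current: list[ObservedAction] = []
    for action in actions:
        if action.decided_by is DecisionMaker.AI_AUTONOMOUS:
            current.append(action)
            if len(current) > len(best):
                best = list(current)
        else:
            current = []  # a human decision breaks the chain
    return best


# Invented timeline for illustration only.
timeline = [
    ObservedAction("T1595", "reconnaissance", DecisionMaker.HUMAN),
    ObservedAction("T1087", "discovery", DecisionMaker.AI_AUTONOMOUS),
    ObservedAction("T1555", "credential-access", DecisionMaker.AI_AUTONOMOUS),
    ObservedAction("T1021", "lateral-movement", DecisionMaker.AI_AUTONOMOUS),
    ObservedAction("T1020", "exfiltration", DecisionMaker.HUMAN),
]

run = longest_autonomous_run(timeline)
print([a.tactic for a in run])
```

A run that spans Discovery, Credential Access, and Lateral Movement with no human decision in between is the kind of pattern the article calls autonomous killchain orchestration. In practice the hard part is populating `decided_by`, which network telemetry alone rarely reveals.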

Lessons Learned: Adapting to AI-Enabled Threats

Stop relying on TTP clustering: If you are assuming an attack is low-risk because the TTPs seem disparate or uncoordinated, you might be watching an AI agent actively probing and adjusting to your network. Focus on behavioral containment, not attribution.
Monitor the scaffolding: Defenders must identify the signatures of AI agents communicating with MCP servers or rapidly iterating terminal commands at non-human speeds.
Accelerate your patch cycles: AI agents compress the time from reconnaissance to exploitation from days to minutes. If your vulnerability management SLA is measured in weeks, an autonomous agent will outmaneuver you.
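One way to approach the "non-human speeds" signal is to look at the gaps between consecutive commands in a session. The following is a rough sketch, not a production detector: the function name `looks_machine_paced` and both thresholds (a one-second median gap and a five-command minimum) are assumptions chosen for illustration and would need tuning against real telemetry, since scripted admin automation can look just as fast.

```python
from statistics import median


def looks_machine_paced(
    timestamps: list[float],
    max_median_gap: float = 1.0,  # seconds; assumed threshold
    min_commands: int = 5,        # assumed minimum sample size
) -> bool:
    """Flag a session whose median gap between commands is implausibly short."""
    if len(timestamps) < min_commands:
        return False  # too few commands to judge cadence
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return median(gaps) <= max_median_gap


# Invented command timestamps, in seconds since session start.
operator_session = [0.0, 12.0, 31.0, 45.0, 80.0, 95.0]
agent_session = [0.0, 0.4, 0.9, 1.3, 1.8, 2.2]

print(looks_machine_paced(operator_session))  # False
print(looks_machine_paced(agent_session))     # True
```

The median is used rather than the mean so that a single long pause, for example while a scan finishes, does not hide an otherwise rapid command cadence.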

Conclusion

The LLM ATT&CK Navigator data proves what the industry has long feared: AI has flattened the cybersecurity skill curve. As threat actors shift from using models as code generators to deploying them as autonomous network operators, traditional defensive playbooks and attribution models are becoming obsolete.

The industry must stop treating AI-enabled attacks as a future hypothetical and start rewriting our defensive taxonomies today. We must build defenses that assume the adversary moves at machine speed, makes dynamic decisions without human hesitation, and leaves no distinct fingerprints behind.


Resource

Original Anthropic research blog: LLM ATT&CK Navigator

Frequently Asked Questions (FAQ)

What is the LLM ATT&CK Navigator?

It is an interactive framework developed by Anthropic that maps observed AI-enabled malicious behaviors against the MITRE ATT&CK taxonomy, helping defenders understand how AI is used in real-world cyberattacks.

What is the ARiES risk score?

ARiES (AI Risk Enablement Score) is a 0-100 metric that evaluates an actor's threat profile, the AI model's contribution to the harm, and the resulting impact, prioritizing agentic behavior over sheer technical skill.

Why does AI make threat attribution more difficult?

AI homogenizes malicious code and makes attack paths dynamic. Because an AI generates flawless, sterile payloads and pivots unpredictably, the unique human 'tells' and TTPs used to identify specific hacker groups disappear.

What is 'agentic scaffolding' in a cyberattack?

Agentic scaffolding refers to the surrounding code, architecture, and tooling (like MCP servers) that attackers build around an AI model, allowing it to autonomously execute entire stages of a cyberattack without human intervention.

What gap did Anthropic find in the MITRE ATT&CK framework?

The current MITRE framework categorizes individual techniques but lacks IDs for autonomous AI behaviors, such as AI-directed pivot decisions or autonomous killchain orchestration, which are the true differentiators of high-risk attacks.