Weekly Musings Top 10 AI Security Wrapup: Issue 27 | February 20-26, 2026
Pentagon, Prompt Injection, and China’s AI Playbook: The Week AI Security Got Loud
The week of February 20, 2026, delivered a reckoning. Not in the abstract, conference-keynote sense. In the concrete sense of “your AI vendor might get declared a supply chain risk by your own government.” While security teams spent their days triaging a GitHub Copilot prompt injection that could take over your entire repository, Anthropic’s CEO was in a room with Pete Hegseth, being threatened with the Defense Production Act. Simultaneously, industrial-scale AI distillation attacks by Chinese labs were exposed, a financially motivated hacker with modest skills used commercial GenAI to breach 600 firewalls across 55 countries, and IBM’s annual X-Force report confirmed what most of us already knew but hadn’t put numbers to. This was not a normal week. Welcome to the AI security present tense.
If you’re a CISO trying to explain to your board why your vendor risk program now includes monitoring geopolitical standoffs between AI labs and cabinet secretaries, I feel for you. This newsletter exists precisely for that conversation. Bookmark the archive at rockcybermusings.com and check rockcyber.com for the advisory work that goes deeper.
1. Pentagon Threatens to Blacklist Anthropic Over “Woke AI” Refusal to Drop Safeguards
On February 24, Defense Secretary Pete Hegseth met with Anthropic CEO Dario Amodei and issued an ultimatum: strip the safety restrictions from Claude or face cancellation of Anthropic’s $200 million DoD contract, designation as a “supply chain risk,” and potential compulsion under the Defense Production Act. The sticking points are Claude’s restrictions against use in autonomous weapons without human oversight and mass domestic surveillance, positions Amodei has publicly defended for months. By February 25, the Pentagon had reached out to Boeing and Lockheed Martin, asking for an assessment of their dependence on Claude, a formal first step toward the supply chain risk designation. The designation, normally reserved for adversarial foreign vendors like Huawei, would effectively blacklist Anthropic across the defense industrial base. Claude is currently the only AI model cleared for use in classified U.S. military settings (NPR, Axios).
Why it matters
The “supply chain risk” label would cascade across DoD prime contractors, potentially forcing them to strip Claude from pipelines where it’s embedded in sensitive workflows, regardless of whether Anthropic’s position changes.
This sets a precedent for government compulsion of AI lab design decisions, specifically the argument that AI safety guardrails constitute a national security liability rather than an asset.
Competing models from Google, OpenAI, and xAI have already agreed to “all lawful use” terms, meaning Anthropic could lose classified-domain market share to firms that accepted no comparable restrictions.
What to do about it
If your organization operates in the defense contracting space, audit how deeply Claude is integrated in classification-adjacent workflows and assess your switching timeline if Anthropic loses its clearance standing.
Brief your board on the federal AI governance trajectory: the administration’s “innovation-first” stance is actively hostile to AI safety conditions as a contractual requirement.
Watch the DPA invocation question closely. If the administration successfully compels an AI lab’s design choices via DPA, your vendor agreements with any AI provider become significantly less predictable.
Rock’s Musings
Let’s be clear about what’s actually happening. The administration is demanding that an AI company strip restrictions on autonomous lethal force and mass domestic surveillance. Anthropic said no. Hegseth called that “woke AI.” I call it one of the most consequential AI governance fights in U.S. history, playing out while most of the industry’s attention was focused on vulnerabilities in developer tooling.
The era of assuming your AI vendor’s ethics are your ethics is over. When a government can designate your AI provider as a supply chain risk for maintaining safety policies, those policies become a business continuity variable. Build your AI governance program like that’s true, because it is.
2. Anthropic Exposes Industrial-Scale Claude Distillation Attacks by Three Chinese AI Labs
On February 23, Anthropic published a detailed disclosure identifying three Chinese AI laboratories, DeepSeek, Moonshot AI, and MiniMax, as having conducted coordinated campaigns to extract Claude’s capabilities at industrial scale. The three labs generated over 16 million exchanges with Claude through approximately 24,000 fraudulent accounts, bypassing Anthropic’s restrictions on access from China using commercial proxy services running what Anthropic calls “hydra cluster” architectures, sprawling networks designed to mix distillation traffic with legitimate requests. Anthropic attributed each campaign with high confidence using IP correlation, request metadata, and infrastructure indicators. MiniMax drove the heaviest traffic, over 13 million exchanges, and Anthropic detected its campaign while still active, watching MiniMax pivot within 24 hours of Claude’s new model release to begin extracting the latest capabilities. DeepSeek’s requests specifically targeted reasoning capabilities and censorship-safe alternatives to politically sensitive queries, consistent with training models to evade content restrictions (Anthropic blog, The Register, TechCrunch, CNN).
Why it matters
Models built through illicit distillation strip the safety guardrails implemented by the originating lab. Chinese actors can acquire frontier AI capabilities without the restrictions that prevent those systems from assisting with bioweapons synthesis, offensive cyber operations, or disinformation at scale.
The scale of extraction, 16 million exchanges across 24,000 accounts, demonstrates this is not opportunistic scraping. It is a structured intelligence collection operation against American AI infrastructure.
Anthropic’s disclosure adds evidentiary support for stricter AI export controls and industry-wide distillation detection infrastructure, neither of which currently exists in a coordinated form.
What to do about it
Review any AI products or internal tools that wrap or call third-party model APIs. If those calls touch frontier models, understand whether your data is insulated from aggregation risks similar to these distillation patterns.
Engage your legal and procurement teams on supply chain risk assessments for AI model providers used by your organization, specifically their ToS enforcement posture and detection capabilities.
Track the policy response. Anthropic is explicitly calling for “a coordinated response across the AI industry, cloud providers, and policymakers.” That framing is the precursor to regulatory proposals.
Rock’s Musings
The distillation story gets presented as an IP theft problem. It is that, but only partly. The deeper problem is the safety bypass. When MiniMax runs 13 million exchanges through Claude to train its own model, it’s not paying licensing fees, but it’s also not inheriting Claude’s constitutional restrictions against helping with bioweapons or cyberattacks. It gets the capability without the cage.
The “hydra cluster” architecture Anthropic describes, roughly 24,000 fraudulent accounts blending distillation traffic with legitimate use, is a detection problem that no single company can solve alone. This is what a coordinated AI security posture looks like when it’s absent. The industry needs shared threat intelligence on distillation patterns the same way it shares threat intel on ransomware actor TTPs. We don’t have that yet.
3. RoguePilot: Passive Prompt Injection in GitHub Codespaces Enables Full Repository Takeover
On February 24, Orca Security disclosed RoguePilot, a passive prompt-injection vulnerability in GitHub Codespaces that enabled attackers to achieve a full repository takeover by embedding malicious instructions in a GitHub Issue. No exploitation of Codespaces itself was required. When a developer opened a Codespace from a poisoned Issue, GitHub Copilot was immediately prompted with the Issue’s description and executed the hidden instructions, which were concealed inside HTML comments invisible to human reviewers. The attack chain exfiltrated the GITHUB_TOKEN by manipulating Copilot to access a symlinked sensitive file and append the token to a schema download request, leaking it to an attacker-controlled server. Microsoft patched the vulnerability following responsible disclosure, and no CVE had been assigned at time of reporting. Researcher Roi Nisimi at Orca called it a new class of AI-mediated supply chain attack (Orca Security, SecurityWeek, The Hacker News).
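The hidden-comment technique described above is simple enough to screen for. The sketch below is a minimal, hypothetical pre-screen (the function names and keyword list are mine, not Orca's or GitHub's) that flags Issue bodies containing HTML comments with imperative, agent-directed language:

```python
import re

# HTML comments render as nothing in GitHub's markdown view, which is how
# RoguePilot hid its instructions from human reviewers. This hypothetical
# pre-screen surfaces them before an agent ever reads the Issue.
HTML_COMMENT = re.compile(r"<!--(.*?)-->", re.DOTALL)

def hidden_comment_spans(issue_body: str) -> list[str]:
    """Return the text of every HTML comment embedded in an Issue body."""
    return [m.group(1).strip() for m in HTML_COMMENT.finditer(issue_body)]

def looks_suspicious(issue_body: str) -> bool:
    """Flag hidden comments containing imperative, agent-style instructions."""
    verbs = ("ignore", "run", "execute", "fetch", "download", "append", "curl")
    return any(
        any(v in span.lower() for v in verbs)
        for span in hidden_comment_spans(issue_body)
    )
```

This is a screening heuristic, not a fix. The durable mitigation is architectural: don't feed untrusted user-generated content to an agent that holds tokens and terminal access.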
Why it matters
The attack required no special privileges. Anyone who could create or view a GitHub Issue in a targeted repository could trigger it, placing it within reach of anonymous threat actors and insider risks alike.
This demonstrates that AI agents granted God Mode permissions (terminal access, file reads, API tokens, and network connectivity) cannot reliably distinguish developer instructions from adversarial content embedded in data.
Microsoft patched this specific chain, but the root problem, AI agents treating user-generated content as trusted instructions, persists across every developer tool that integrates an LLM agent into an active workspace.
What to do about it
Rotate any GITHUB_TOKENs generated in Codespaces environments. Even if your team wasn’t targeted, token hygiene is your first action.
Audit your developer tooling inventory for any AI assistants that ingest user-generated content, Issues, comments, pull requests, wikis, and operate with elevated permissions. RoguePilot will not be the last of its class.
Push your security engineering teams to review json.schemaDownload settings and symlink sandboxing defaults in any LLM-integrated development environments in your stack.
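For teams on VS Code-based tooling, the schema-download vector called out above can be switched off outright. A minimal settings fragment (VS Code's actual setting is `json.schemaDownload.enable`; confirm against your editor's documentation before rolling it out):

```jsonc
{
  // Disable automatic JSON schema downloads so an agent-manipulated
  // schema URL cannot double as an exfiltration channel.
  "json.schemaDownload.enable": false
}
```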
Rock’s Musings
This is the AI agent security problem crystallized into a single attack chain. You have a model that’s been given broad permissions to be helpful in a developer context, and the moment that model processes untrusted content from a public-facing surface like a GitHub Issue, the permissions become the attack surface. The model doesn’t know to distrust the Issue. It was designed to process it.
I’ve been telling clients since the agentic AI wave hit that “helpful” and “privileged” is a dangerous combination without an explicit trust boundary model. RoguePilot proves it doesn’t take a sophisticated threat actor or a zero-day. It takes someone who understands that AI agents read everything they’re shown and act on it.
4. GenAI Democratizes Cybercrime: Low-Skill Actor Breaches 600+ FortiGate Devices Across 55 Countries
Amazon Threat Intelligence published a report on February 20 documenting a Russian-speaking, financially motivated threat actor, assessed as an individual or small group, who used multiple commercial generative AI services to compromise over 600 FortiGate devices in 55 countries between January 11 and February 18, 2026. No FortiGate vulnerabilities were exploited. The campaign targeted exposed management interfaces and weak single-factor credentials, using AI to automate scanning, script generation, configuration parsing, and victim prioritization. Amazon’s CISO CJ Moses described the custom tooling as bearing hallmarks of AI-generated code: redundant comments, naive JSON parsing, and failures under edge cases. When targets proved too hardened, the actor abandoned them and moved on, a pattern consistent with AI-augmented scale rather than technical depth. Post-compromise activity included Active Directory compromise, credential dumping, and targeting of Veeam backup infrastructure (AWS Security Blog, BleepingComputer, Cybersecurity Dive, The Record).
Why it matters
The campaign demonstrates that AI services lower the barrier to entry for offensive cyber operations at scale. A single actor achieved what would historically require a well-resourced team.
The activity didn’t rely on any novel exploits. Exposed management ports and missing MFA are the root causes. AI just made systematic exploitation of those gaps cheap and fast.
Post-compromise actions, Active Directory compromise and backup targeting, align with ransomware pre-positioning. The technical unsophistication of the actor does not constrain the damage ceiling.
What to do about it
Immediately audit all perimeter devices, especially FortiGate, Palo Alto, and similar appliances, for management interfaces exposed to the internet on any port.
Enforce MFA across all VPN and admin access without exceptions. This campaign succeeded entirely because MFA was absent.
Harden backup infrastructure separately from production networks. Backup access should require distinct credentials, not shared Active Directory credentials, and should not be reachable from compromised VPN paths.
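The perimeter audit in the first step above can start with something as blunt as a reachability check. This is a hedged sketch, not a scanner: the port list reflects commonly cited FortiGate admin/SSL-VPN ports and is an assumption, and it should only ever be pointed at infrastructure you own or are authorized to test.

```python
import socket

# Common FortiGate admin / SSL-VPN ports; adjust for your appliance fleet.
MGMT_PORTS = (443, 8443, 10443, 22)

def reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        return s.connect_ex((host, port)) == 0

def exposed_ports(host: str) -> list[int]:
    """List which management ports answer at all from this vantage point."""
    return [p for p in MGMT_PORTS if reachable(host, p)]
```

A port that answers from the public internet is not automatically compromised, but in this campaign it was the entire prerequisite: exposed interface plus single-factor credentials.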
Rock’s Musings
Amazon’s description of this actor as achieving “operational scale that would have previously required a significantly larger and more skilled team” is the AI security risk thesis for 2026 stated as a fact. The threat actor’s own documentation, left exposed on their own infrastructure because of poor OpSec, acknowledges when targets are too hardened to exploit. They just move on. That’s not a skill constraint, it’s a volume play.
The lesson here is not new. Missing MFA and exposed management interfaces have been on the “fix this now” list for a decade. What’s new is that ignoring them now feeds a threat actor pipeline that doesn’t get tired, doesn’t need to be patient, and doesn’t cost much to operate. The economics of cyber offense just changed again.
5. IBM X-Force 2026 Threat Index: AI Accelerates Exploitation Speed, 300K ChatGPT Credentials Exposed
IBM published the 2026 X-Force Threat Intelligence Index on February 25, reporting a 44% increase in attacks initiated through the exploitation of public-facing applications, driven by missing authentication controls and AI-enabled vulnerability discovery. Vulnerability exploitation became the leading cause of incidents, accounting for 40% of cases in 2025. Active ransomware and extortion groups increased 49% year over year. Supply chain and third-party compromises nearly quadrupled since 2020. On the AI-specific threat front, X-Force identified over 300,000 exposed ChatGPT credentials from infostealer malware, signaling that AI platforms have reached the same credential risk profile as core enterprise SaaS systems. IBM also noted that North Korean IT worker schemes are using AI for synthetic identity creation and translation to operate across global marketplaces (IBM X-Force, UK Newsroom).
Why it matters
The ChatGPT credential exposure figure, 300,000 accounts, tells you that enterprise AI platform access is now a target of credential theft campaigns. Account compromise of AI tools creates follow-on risks including manipulated outputs, data exfiltration, and prompt injection by threat actors who gain access.
The 44% jump in public-facing application exploitation reflects that AI tools are helping attackers identify missing auth controls faster than organizations can patch them.
North Korean actors using AI for synthetic identity creation to penetrate companies as fake remote IT workers is an active, documented threat that most security teams are unprepared to detect.
What to do about it
Treat enterprise AI platform accounts like you treat cloud admin accounts. Enforce MFA, monitor for credential exposures via dark web intelligence, and rotate credentials regularly.
Add AI service accounts to your privileged access management inventory. ChatGPT, Claude, Gemini, and Copilot accounts with sensitive system access are high-value targets.
Build screening processes for remote technical hires that account for AI-augmented identity fabrication. AI-content detection on resumes is insufficient. Require live video verification with behavioral assessment.
Rock’s Musings
Three hundred thousand exposed ChatGPT credentials. Let that settle for a moment. A year ago, someone might have asked why stealing a ChatGPT account matters. Now, with AI tools embedded in code pipelines, document workflows, and agentic systems with access to production data, the answer is obvious. Your AI account is your data account.
IBM’s broader finding on supply chain compromise quadrupling since 2020 is the decade-long trend nobody fixed. We’ve known this curve for years. AI is not changing the attack surface, it’s accelerating movement across an attack surface that was already too wide.
6. OpenAI’s February Threat Report: Chinese Law Enforcement Uses ChatGPT to Target Japan’s Prime Minister
OpenAI published a new threat disruption report on February 25, detailing recent cases of AI misuse by nation-state and criminal actors. The report’s lead case involved a Chinese law enforcement operator using a ChatGPT account to attempt to undermine support for Japan’s Prime Minister through a coordinated influence operation. OpenAI’s principal investigator Ben Nimmo described the operation as unusually revealing of China’s strategy for covert influence operations and transnational repression. OpenAI banned the associated accounts and shared indicators with industry partners. The report also documented cases of AI-generated content being used across multiple platforms simultaneously, with threat actors using different AI models at different stages of their operational workflow (OpenAI, Axios).
Why it matters
Government-affiliated actors using commercial AI tools for foreign influence operations demonstrate that the threat is not limited to custom AI systems. Commodity access is sufficient.
The targeting of a democratic leader’s public legitimacy via AI-assisted influence operations previews what election and leadership integrity risks look like at scale.
OpenAI’s two-year run of publishing these threat reports is generating actionable public threat intelligence, but the reports also hint at how much abuse AI platforms are contending with that never gets reported publicly.
What to do about it
Build operational awareness of influence operations into your executive protection and media monitoring programs. AI-generated content can now target company leadership and board members with the same sophistication used against government officials.
If your organization operates in sectors with geopolitical exposure, energy, defense, or critical infrastructure, factor AI-assisted influence operations into your threat model.
Monitor OpenAI’s threat disruption report series as a recurring intelligence source. It’s public, primary, and reflects actual activity on a major AI platform.
Rock’s Musings
The interesting detail in this report is that the Chinese law enforcement operator used ChatGPT as one tool in a broader workflow that touched multiple AI platforms. That’s consistent with what Amazon observed in the FortiGate campaign and what Anthropic documented with the distillation attacks. Threat actors don’t pick one AI tool. They build multi-model workflows just like enterprise teams do.
The policy implication is significant. Restricting any single AI platform’s use by foreign actors doesn’t solve the problem. The capability is distributed. Coordination across AI providers for threat intelligence sharing is the only approach that has a realistic shot at tracking these operational workflows.
7. Anthropic Launches Claude Code Security: AI-Powered Vulnerability Scanning for Enterprise
On February 21, Anthropic began rolling out Claude Code Security, a new capability allowing Claude Code to scan software codebases for vulnerabilities and suggest patches. The feature is in limited research preview for Enterprise and Team customers. The announcement triggered a two-day sell-off in cybersecurity stocks, with GitLab dropping 8% and JFrog falling 25% amid fears that AI-native vulnerability scanning would cannibalize dedicated code-scanning platforms. Bank of America analysts pushed back on the severity of the threat, arguing that the tool poses a significant risk only to code-scanning specialists rather than broader security platforms, and noting that AI-based tools lack the visibility, control, and reliability to replace end-to-end security programs (WIU Cybersecurity Center, CNBC).
Why it matters
AI-native vulnerability scanning built directly into the development workflow is fundamentally different from dedicated SAST/DAST tools. If it achieves comparable accuracy, the vendor selection calculus for code security tools shifts.
The market reaction, wiping $4.3 billion from GitLab’s market cap in two days, reflects real investor concern about AI disruption of security tooling categories, not just hype.
For CISOs, this is a sign to evaluate your code scanning vendor relationships against the trajectory of AI-native alternatives, not to replace them immediately, but to understand your exit options.
What to do about it
Request a technical briefing from your code scanning vendor on how they are differentiating from AI-native alternatives. Price, integration, and accuracy benchmarks matter.
Pilot Claude Code Security with a representative engineering team on a low-risk codebase. Generate your own internal benchmark data rather than relying on vendor comparisons.
Reframe your security tool evaluation process to account for AI-native alternatives in every category, not just code scanning. This is not the last product of this type.
Rock’s Musings
The stock market reaction was overstated. Bank of America is right that AI-powered code scanning doesn’t replace end-to-end security platforms, at least not yet. But the trajectory matters more than the present capability gap. When Anthropic can embed vulnerability detection directly into the coding workflow where developers are already working, the friction of adopting a separate scanning tool becomes a competitive disadvantage for established vendors.
The question I’m asking clients is simpler: if your developers are already using Claude Code daily, and Claude Code Security reduces their friction for addressing vulnerabilities without switching tools, what is the case for a standalone scanning product that adds another context switch?
8. Malicious npm Campaign SANDWORM_MODE Uses 19 Packages to Harvest Crypto Keys and CI Secrets
On February 23, supply chain security firm Socket disclosed an active campaign it codenamed SANDWORM_MODE, a cluster of at least 19 malicious npm packages designed to harvest cryptocurrency keys, CI/CD secrets, and API tokens from developer environments. The packages used dependency confusion and typosquatting to trick developers who intended to install legitimate packages into pulling the malicious look-alikes instead. Socket described the campaign as a “Shai-Hulud-like” supply chain worm designed to propagate through developer ecosystems by targeting shared package dependencies across interconnected projects (The Hacker News, WIU Cybersecurity Center).
Why it matters
CI/CD secret theft is the direct path to production system compromise. Developer machines and pipelines hold credentials with scope that far exceeds what individual credentials should carry.
Supply chain attacks on npm are not novel, but the targeting of cryptocurrency infrastructure alongside CI secrets reflects a broadened target set, both immediate financial gain and longer-term access to production pipelines.
Organizations with open-source dependencies in their build pipelines and no dependency integrity validation are running a systemic risk that this campaign is designed to exploit.
What to do about it
Run a dependency audit on your npm packages today, specifically checking for recently added packages matching the naming patterns of heavily used libraries. Tool your CI pipeline to flag any package published in the last 90 days with rapid download acceleration.
Enforce software bill of materials requirements across your internal build pipelines and require cryptographic attestation for all published packages.
Restrict CI/CD secrets to minimal scope, rotate them on a regular schedule, and audit which pipelines are storing credentials with broader access than their task requires.
Rock’s Musings
Supply chain attacks on developer tooling have a compounding quality that distinguishes them from most other threat categories. When you compromise a developer’s machine or CI pipeline, you get access to the code before it ships. That means you get to touch production systems through a path that looks like normal development activity. Security controls built around the production perimeter don’t see it coming.
SANDWORM_MODE landing one week after Cline CLI’s supply chain compromise and in the same month as RoguePilot is not a coincidence. The developer toolchain is the current hot frontier. If your security program doesn’t have explicit coverage of the npm ecosystem, your CI/CD secrets posture, and your AI-assisted development tooling, you have a gap that multiple active campaigns are targeting right now.
9. AI-Driven Cyberattacks Now Breach Systems in an Average of 72 Minutes
A study published February 23 via BusinessWorld Online and cited in a February cybersecurity roundup found that AI-driven cyberattacks now breach target systems in an average of 72 minutes from initial contact. The figure illustrates how AI tooling is compressing the exploitation timeline by automating vulnerability identification, credential testing, lateral movement planning, and exploitation scripting in near-real time. The finding aligns with Amazon’s FortiGate campaign documentation and IBM X-Force’s 44% jump in application exploitation figures published the same week, suggesting a convergence of independent research pointing to the same fundamental shift (BusinessWorld Online, Advanced IT Technologies cybersecurity roundup).
Why it matters
A 72-minute average breach timeline means that the traditional “detect, investigate, respond” model is effectively broken at its current operational tempo. You need detection and automated containment that operates in minutes, not hours.
The compression of attack timelines is a direct consequence of AI tooling automating what previously required skilled human intervention at each step.
Your incident response plans and SOC SLAs were probably not written with a 72-minute adversary dwell window in mind. That’s the process gap this creates.
What to do about it
Benchmark your current mean time to detect against the 72-minute figure. If your average detection time exceeds that, you are consistently behind the adversary timeline.
Implement automated network-isolation triggers for anomalous credential usage and lateral-movement indicators that don’t require human approval in the initial response phase.
Run a tabletop exercise against a 72-minute breach scenario. The point is forcing your team to confront the tempo of modern AI-assisted attacks on paper before they face it live.
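The tabletop math above is worth writing down explicitly. A back-of-envelope sketch (the stage times in the example are illustrative placeholders, not benchmarks):

```python
# Does your current detect -> triage -> contain pipeline fit inside a
# 72-minute adversary window? Negative slack means the attacker wins the race.
ADVERSARY_WINDOW_MIN = 72

def response_budget(detect_min: float, triage_min: float, contain_min: float) -> float:
    """Minutes of slack left against the 72-minute average breach timeline."""
    return ADVERSARY_WINDOW_MIN - (detect_min + triage_min + contain_min)

# Example: 30 min to detect, 25 to triage, 20 to contain leaves -3 minutes.
```

Plugging in your real SOC SLAs is the fastest way to make the process gap concrete for leadership.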
Rock’s Musings
Seventy-two minutes. Think about what your SOC is doing at 3:17 a.m. on a Tuesday when that timer starts. The alert might not even be triaged by the time the attacker has Active Directory. This is the operational reality that makes the continuous, automated detection and response investment non-negotiable.
The productivity framing of AI security tools, “AI helps your analysts work faster,” is a secondary benefit at best. The primary reason to deploy AI-assisted detection is that the adversary is already using it, and your organization cannot match the attack tempo with humans alone.
10. Exposed LLM Endpoints Expanding the Attack Surface for Organizations Running Their Own Models
On February 23, threat researchers published an analysis in The Hacker News documenting how organizations deploying their own large language models are creating new internal attack surfaces through unprotected LLM endpoints and supporting APIs. The research observed that modern security risks for LLM deployments originate less from the models themselves and more from the infrastructure serving them: APIs, orchestration layers, model registries, and tool-calling endpoints. Each new LLM endpoint expands the attack surface, often with no authentication controls, minimal logging, and no separation from internal data systems (The Hacker News, WIU Cybersecurity Center).
Why it matters
Organizations are deploying internal LLMs without applying the same security posture they would apply to any other internal API. Unauthenticated model endpoints with access to internal data are exactly the kind of target that opportunistic and targeted attackers seek.
LLM orchestration layers, the middleware connecting models to internal tools, databases, and external services, represent a new class of attack surface with no established hardening standard.
The vulnerability exposure is not primarily in the model weights or training data. It is in the infrastructure decisions made during deployment, which are often made by teams with no security review.
What to do about it
Conduct an inventory of every internal LLM endpoint in your environment, including development and experimental deployments. Apply authentication requirements to all of them without exception.
Include AI infrastructure, model endpoints, orchestration APIs, vector databases, and tool-calling integrations in your penetration testing scope. Most current pentest methodologies do not cover these surfaces.
Establish a minimum security baseline for any internal AI deployment, including authentication, logging, rate limiting, and network segmentation, before any model reaches non-development environments.
Rock’s Musings
This is the AI security gap that will generate the embarrassing breach stories of 2027. Right now, teams are spinning up internal RAG systems, agentic workflows, and model fine-tuning pipelines on infrastructure that wasn’t designed for adversarial access. The model is not the weak link. The unprotected API sitting in front of it, with access to your internal document store, is the weak link.
The pattern is familiar. Every time a new technology category emerges, the same thing happens: deployment races ahead of security posture, and the community learns the hard way. We’re early in the AI infrastructure deployment curve. The window to get ahead of this is narrowing fast.
The One Thing You Won’t Hear About But You Need To
Infostealer Malware Now Targeting OpenClaw AI Agent Configuration Files and Gateway Tokens
On February 24, The Hacker News reported on a new infostealer variant specifically designed to steal OpenClaw AI agent configuration files, API tokens, and gateway credentials from developer and enterprise systems where OpenClaw is installed. The malware targets OpenClaw’s local storage paths and the WebSocket Gateway daemon credentials, providing persistent, privileged access to the systems on which the agent is installed. The timing of this discovery follows the Cline CLI supply chain compromise that forced thousands of unauthorized OpenClaw installations in mid-February, and a separately disclosed critical vulnerability in OpenClaw versions before 2026.1.29 (CVE-2026-25253, CVSS 8.8) that allowed unauthenticated operator-level access through a crafted WebSocket handshake (The Hacker News).
Why it matters
AI agent credentials are a new, high-value target class. An attacker with OpenClaw gateway tokens has a persistent foothold in the target environment with broad system permissions, the same permissions OpenClaw uses to perform its legitimate autonomous tasks.
The convergence of a supply chain attack forcing OpenClaw installations, a critical unpatched vulnerability in older versions, and a dedicated infostealer targeting its credentials represents a rare three-vector alignment against a single AI platform.
Organizations unaware that OpenClaw is present in their environment are running exposed agent infrastructure without knowing it exists. That includes anyone who installed Cline CLI during the February 17 compromise window and didn’t remediate.
What to do about it
Run an immediate audit for OpenClaw installations across your developer and CI/CD environments. If you find instances you didn’t authorize, treat them as compromised until proven otherwise.
Verify all OpenClaw installations are running version 2026.1.29 or later to eliminate the CVE-2026-25253 authentication bypass exposure.
Restrict OpenClaw’s network egress to the destinations it explicitly needs, and audit the permissions granted to its agent execution context. AI agents with broad permissions and persistent daemons should be treated as privileged processes, not user applications.
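The audit step above can be sketched in a few lines of Python. Everything in this snippet is hypothetical: OpenClaw’s actual configuration paths and file names are not documented here, so treat `SUSPECT_NAMES` as placeholders and substitute the artifact names your EDR or asset inventory actually identifies before running this fleet-wide.

```python
"""Sketch of an audit pass for unauthorized OpenClaw agent installs.

The file names below are illustrative assumptions, not documented
OpenClaw artifact locations; substitute the names your deployment
actually uses before scanning production hosts.
"""
import os
from pathlib import Path

# Hypothetical artifact names an OpenClaw install might leave behind.
SUSPECT_NAMES = {"openclaw.json", "gateway.token", "agent-config.yaml"}


def find_agent_artifacts(root: Path) -> list[Path]:
    """Walk `root` and return every file whose name matches a known
    agent artifact name (case-insensitive)."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower() in SUSPECT_NAMES:
                hits.append(Path(dirpath) / name)
    return sorted(hits)
```

Running `find_agent_artifacts(Path.home())` on developer workstations and CI runners surfaces candidates for review; any hit on a host where OpenClaw was never authorized should be triaged as a potential compromise until proven otherwise.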
Rock’s Musings
Nobody is writing the headline “Infostealer Targets AI Agent Credentials.” But they should be. This is the next phase of the credential theft ecosystem, and it was visible in advance to anyone watching how quickly OpenClaw gained deployment share in enterprise developer environments.
AI agents have privileged access to your systems by design. That’s what makes them useful. It’s also what makes their credentials extraordinarily valuable to attackers. When an infostealer can steal an AI agent’s configuration and token in the same way it steals browser-stored passwords, you’ve added a new category of credential to your exposure surface. Your existing credential monitoring program almost certainly doesn’t cover this yet.
References
Amazon Web Services Security. (2026, February 20). AI-augmented threat actor accesses FortiGate devices at scale. https://aws.amazon.com/blogs/security/ai-augmented-threat-actor-accesses-fortigate-devices-at-scale/
Anthropic. (2026, February 23). Detecting and preventing distillation attacks. https://www.anthropic.com/news/detecting-and-preventing-distillation-attacks
Allyn, B. (2026, February 24). Hegseth threatens to blacklist Anthropic over ‘woke AI’ concerns. NPR. https://www.npr.org/2026/02/24/nx-s1-5725327/pentagon-anthropic-hegseth-safety
Swan, J., & Borsuk, A. (2026, February 25). Trump admin moves toward blacklisting Anthropic in AI safeguards fight. Axios. https://www.axios.com/2026/02/25/anthropic-pentagon-blacklist-claude
The Hacker News. (2026, February 24). RoguePilot flaw in GitHub Codespaces enabled Copilot to leak GITHUB_TOKEN. https://thehackernews.com/2026/02/roguepilot-flaw-in-github-codespaces.html
Orca Security. (2026, February 24). RoguePilot: Critical GitHub Copilot vulnerability exploit. https://orca.security/resources/blog/roguepilot-github-copilot-vulnerability/
SecurityWeek. (2026, February 24). GitHub Issues abused in Copilot attack leading to repository takeover. https://www.securityweek.com/github-issues-abused-in-copilot-attack-leading-to-repository-takeover/
The Hacker News. (2026, February 23). Anthropic says Chinese AI firms used 16 million Claude queries to copy model. https://thehackernews.com/2026/02/anthropic-says-chinese-ai-firms-used-16.html
CNBC. (2026, February 24). Anthropic accuses DeepSeek, Moonshot and MiniMax of distillation attacks on Claude. https://www.cnbc.com/2026/02/24/anthropic-openai-china-firms-distillation-deepseek.html
TechCrunch. (2026, February 23). Anthropic accuses Chinese AI labs of mining Claude as US debates AI chip exports. https://techcrunch.com/2026/02/23/anthropic-accuses-chinese-ai-labs-of-mining-claude-as-us-debates-ai-chip-exports/
IBM. (2026, February 25). IBM 2026 X-Force Threat Intelligence Index. https://uk.newsroom.ibm.com/ibm-2026-x-force-threat-index
BleepingComputer. (2026, February 21). Amazon: AI-assisted hacker breached 600 Fortinet firewalls in 5 weeks. https://www.bleepingcomputer.com/news/security/amazon-ai-assisted-hacker-breached-600-fortigate-firewalls-in-5-weeks/
Cybersecurity Dive. (2026, February 23). AI helps novice threat actor compromise FortiGate devices in dozens of countries. https://www.cybersecuritydive.com/news/ai-cyberattacks-fortigate-amazon/812830/
OpenAI. (2026, February 25). Disrupting malicious uses of AI. https://openai.com/index/disrupting-malicious-ai-uses/
CNBC. (2026, February 23). Cybersecurity stocks drop for a second day as new Anthropic tool fuels AI disruption fears. https://www.cnbc.com/2026/02/23/cybersecurity-stocks-anthropic-ai-crowdstrike.html
WIU Cybersecurity Center. (2026, February 25). Cybersecurity news. https://www.wiu.edu/cybersecuritycenter/cybernews.php
The Hacker News. (2026, February 23). Malicious npm packages harvest crypto keys, CI secrets, and API tokens. https://thehackernews.com/2026/02/malicious-npm-packages-harvest-crypto.html
The Hacker News. (2026, February 24). Infostealer steals OpenClaw AI agent configuration files and gateway tokens. https://thehackernews.com/2026/02/infostealer-steals-openclaw-ai-agent.html
BusinessWorld Online. (2026, February 23). AI-driven cyberattacks now breach systems in 72 minutes, study finds. Cited in Advanced IT Technologies, February 2026 cybersecurity news roundup. https://www.advancedittechnologies.com/post/february-2026-cybersecurity-news-roundup-major-breaches-ai-driven-attacks-critical-vulnerabiliti
The Hacker News. (2026, February 23). How exposed endpoints increase risk across LLM infrastructure. https://thehackernews.com/2026/02/how-exposed-endpoints-increase-risk.html
Dark Reading. (2026, February 21). 600+ FortiGate devices hacked by AI-armed amateur. https://www.darkreading.com/threat-intelligence/600-fortigate-devices-hacked-ai-amateur
The Record. (2026, February 23). Russian-speaking hackers used gen AI tools to compromise 600 firewalls, Amazon says. https://therecord.media/gen-ai-fortigate-hackers-russia