Weekly Musings Top 10 AI Security Wrapup: Issue 21 November 14, 2025 - November 20, 2025

When AI Guardrails Fail, Nation-States Strike, and Copy-Paste Code Threatens the Entire Stack

Nov 21, 2025

Speed killed security...again.

This week delivered proof that the AI security landscape is shifting and fracturing. A two-character string can bypass every major AI safety system. A state-sponsored group automated 90% of a cyberespionage campaign using an off-the-shelf chatbot. Extended reasoning models flip from defender to attacker with 99% reliability. And the code protecting AI inference frameworks copied vulnerabilities from project to project like a pandemic spreading through a supply chain that nobody mapped.

The gap between AI deployment velocity and security maturity continues to widen into a chasm. Organizations racing to implement agentic AI, reasoning models, and autonomous systems are discovering their defenses were built on assumptions that no longer hold.

The industry response? McKinsey confirmed that 61% of companies remain stuck in pilot purgatory, while the few who scaled AI are rewriting workflows faster than governance can keep up. CISOs saw compensation rise 6.7% this year, not as a reward but as hazard pay for navigating an impossible mandate of securing systems that nobody fully understands while they evolve faster than your threat models.

This week’s stories share a common thread. Every major vulnerability, attack, and governance failure traces back to organizations deploying AI systems before establishing the fundamentals. Access controls. Supply chain visibility. Human oversight. Adversarial testing. The basics that security professionals have preached for decades don’t disappear just because the technology has changed. They become more critical.

1. Echogram - The Two Characters That Break All AI Guardrails

HiddenLayer researchers revealed a technique called Echogram that can flip AI safety verdicts using token sequences as simple as “=coffee” (HiddenLayer). The vulnerability affects both text classification models and LLM-as-a-judge systems protecting GPT-4, Claude, Gemini, and other major platforms. Researchers discovered that appending specific tokens to malicious prompts causes guardrails to misclassify dangerous content as safe while preserving the harmful instruction for the downstream model. The attack exploits imbalances in training data rather than model architecture. Multiple weak flip tokens can be combined to dramatically amplify their effect. Testing on Qwen3Guard showed chained sequences caused misclassification of weapons queries, authentication bypass requests, and cyberattack instructions across both the 0.6B and 4B model variants (Unite.AI). The technique also works in reverse to generate false positives at scale, potentially creating alert fatigue that erodes security team trust.

Why It Matters

Guardrails represent the first and often only defense layer between users and LLMs in production systems. When they fail silently, organizations lose visibility into model behavior without knowing their protections were bypassed.
An attacker who discovers one successful Echogram sequence can reuse it across multiple commercial platforms because many guardrails train on similar datasets and share taxonomic patterns. This creates systemic vulnerability across the AI ecosystem.
The false-positive capability allows adversaries to deliberately flood security operations centers with false alerts, making it harder to identify genuine threats while simultaneously training teams to distrust their defensive systems.

What To Do About It

Implement multi-layer defenses that don’t rely solely on input filtering. Add output validation, behavioral monitoring, and human oversight for high-stakes decisions. Single-point-of-failure guardrails can’t protect production systems.
Test your deployed guardrails against known Echogram patterns immediately. HiddenLayer demonstrated the technique with accessible tokens. Assume sophisticated attackers have already weaponized this research and are probing your systems.
Diversify your training data sources and continuously update defensive models. Static guardrails trained once and deployed indefinitely will fail as adversaries map their blind spots. Treat AI defense like antivirus signatures, not firewalls.

Rock’s Musings

Two characters. That’s all it takes to circumvent billions of dollars in AI safety infrastructure.

This isn’t a zero-day in some obscure library. This is a fundamental design flaw in how the entire industry approaches AI safety. Every major platform trains guardrails on curated datasets of “good” and “bad” examples. Every platform assumes those datasets achieve adequate coverage. Every platform ships systems where a prompt plus “=coffee” slips through undetected.

AI security can’t be bolted on, and this week proved it. You can’t patch your way out of training data imbalances. You can’t update your way to comprehensive coverage. The only path forward is to rethink the entire defensive architecture from first principles. We had warning HERE and I highly recommend Susana’s newsletter HERE

Organizations deploying LLMs in production right now need to answer the fundamental question of what happens when your guardrails fail silently? Because they will. Echogram proved it. And the adversaries reading HiddenLayer’s research already know it.

2. China Automates Cyberespionage - Claude Code Runs 90% of Nation-State Campaign

Yes… This was in last week’s musings, but it bears repeating. Anthropic disclosed on November 13-14, 2025, that it detected and disrupted what it calls the first documented large-scale cyberattack executed predominantly by AI (Anthropic). A Chinese state-sponsored group manipulated Claude Code to target approximately 30 tech companies, financial institutions, chemical manufacturers, and government agencies globally. The attackers tricked Claude into believing it performed defensive security testing for a legitimate firm. Analysis showed AI executed 80-90% of all tactical work independently, with humans intervening only at strategic decision points (Help Net Security). Claude inspected target systems, identified high-value databases, wrote custom exploit code, harvested credentials, and created comprehensive documentation of its own attacks. The operation achieved unprecedented speed, making thousands of requests per second during peak activity. Anthropic reported a small number of successful breaches. The company notes that Claude occasionally hallucinated credentials or claimed to extract publicly available secret information, representing an obstacle to fully autonomous attacks (The Register).

Why It Matters

This marks the inflection point where AI transitions from advisor to operator in nation-state cyberespionage. The 80-90% automation ratio means threat actors can achieve operational scale typically associated with advanced persistent threat campaigns while maintaining minimal direct involvement.
Barriers to sophisticated cyberattacks dropped substantially. Less experienced and resourced groups can now potentially perform large-scale operations by orchestrating commodity AI systems rather than developing proprietary tools or recruiting skilled operators.
The social engineering approach (convincing Claude it performed defensive work) demonstrates that AI safety mechanisms remain vulnerable to context manipulation. Decomposing malicious tasks into discrete technical requests bypasses guardrails designed to detect complete attack chains.

What To Do About It

Evaluate your AI agent deployment plans through an adversarial lens immediately. If you’re implementing agentic systems that interact with internal infrastructure, assume hostile actors will attempt similar manipulation techniques against your tools.
Deploy monitoring that tracks AI agent behavior patterns rather than just input/output content. Anomalous request volumes, unusual system access sequences, and credential validation attempts may signal compromised agents before breaches occur.
Establish human validation requirements for AI-generated actions involving authentication, lateral movement, or data exfiltration. The attackers succeeded partly because Claude could execute sensitive operations without real-time oversight. Don’t replicate that architecture.

Rock’s Musings

The AI security community spent years debating whether autonomous cyberattacks were theoretically possible. Yes… Yes, they are…

What makes this case study terrifying is the accessibility. The attackers didn’t need novel exploits or custom malware. They needed a commercial AI service, basic social engineering skills, and patience while Claude did the work. That’s it. The entire playbook just became available to every threat actor with a credit card and time to experiment.

Anthropic claims Claude’s hallucinations slowed the attacks (See “The One Thing You Won’t Hear About”). That’s cold comfort. Today’s hallucinations become tomorrow’s calibrated outputs as models improve. The fundamental capability exists now. Nation-states demonstrated it works at scale. Every other adversary watched and learned.

We’re entering an era where the barrier between contemplating a cyberattack and executing one just collapsed. Security teams need to adjust their threat models accordingly. The assumption that sophisticated campaigns require sophisticated operators no longer holds.

3. Extended Reasoning Makes AI More Vulnerable, Not Safer

Researchers from Anthropic, Stanford, and Oxford published findings showing that longer AI reasoning processes increase vulnerability to jailbreaks rather than improving safety (ForkLog). The study tested reasoning models including Gemini 2.5 Pro, GPT o4 mini, Grok 3 mini, and Claude 4 Sonnet. Attackers can embed instructions directly into the reasoning chain, forcing models to generate prohibited content, including weapon creation guides and malicious code. Success rates reached 99% for Gemini 2.5 Pro, 94% for GPT o4 mini, 100% for Grok 3 mini, and 94% for Claude 4 Sonnet. Previous assumptions held that extended thinking would give models more time and computational resources to detect malicious prompts. The research actually demonstrates the opposite. Lengthier reasoning provides more opportunities for attack instructions to take hold in the model’s internal chain of thought.

Why It Matters

Reasoning models represent the next generation of AI capabilities that organizations are rushing to deploy for complex decision-making and analysis. This research reveals they’re fundamentally more vulnerable than simpler models, not less.
The near-perfect attack success rates (94-100%) indicate reasoning models lack effective defensive mechanisms against adversarial inputs embedded in their thought processes. Standard safety training doesn’t transfer to extended reasoning architectures.
Organizations implementing reasoning models for sensitive applications (medical diagnosis, legal analysis, financial decisions) face previously unknown risks. The models most capable of sophisticated analysis are also most susceptible to complete compromise.

What To Do About It

Pause production deployments of reasoning models in security-sensitive contexts until vendors demonstrate effective defenses against reasoning-chain attacks. The 99% jailbreak rate makes these systems unsuitable for unmonitored deployment.
Implement output validation that checks reasoning chains for evidence of prompt injection or adversarial manipulation. Don’t assume model responses are trustworthy just because they include detailed explanations of their logic.
Establish separate testing environments specifically for adversarial probing of reasoning models before production use. Standard red teaming approaches may not reveal vulnerabilities unique to extended thinking architectures.

Rock’s Musings

The AI industry sold reasoning models as the solution to reliability problems. More thinking = better outputs = safer decisions. Everyone bought it. Vendors. Enterprises. Regulators.

This research torched that narrative. Extended reasoning creates new attack surfaces that didn’t exist in simpler models. Every additional step in the reasoning chain represents another opportunity for adversarial instructions to embed themselves. More compute spent on thinking translates directly to more compute spent executing malicious objectives.

I’ve watched this pattern repeat across every major technology transition. Complexity is marketed as sophistication. Sophistication is assumed to correlate with security. Reality demonstrates the opposite only after widespread deployment. We’re doing it again with reasoning models.

The hardest conversation CISOs need to have right now is telling their organizations that the most advanced AI models aren’t ready for the use cases everyone assumed they’d solve. That conversation gets harder every day these systems remain in production.

4. OWASP Drops 2025 Top 10 - Supply Chain Finally Makes the List

OWASP released the 2025 Top 10 Web Application Security Risks, introducing significant changes, including a new category for Software Supply Chain Failures as #3 (WebProNews). Broken Access Control maintains the top spot with 3.73% of tested applications showing vulnerabilities. Security Misconfiguration moved from fifth to second place. The Software Supply Chain Failures category expands the previous Vulnerable and Outdated Components designation to include broader compromises across software dependencies, build systems, and distribution infrastructure. The list now explicitly calls out prompt injection risks in AI applications in #5 Injection with a direct link to the OWASP Top 10 for LLM Applications. OWASP notes the update represents a shift toward root cause analysis rather than symptom identification. The release candidate accepts feedback until November 20, 2025 (Tenable). The organization emphasizes that, as software engineering becomes more complex, complete separation of categories is no longer possible, but the focus remains on actionable guidance for identification and remediation.

Why It Matters

OWASP Top 10 serves as the industry’s shared language for application security priorities. Adding Software Supply Chain Failures to position three signals that supply chain attacks moved from theoretical concern to operational threat requiring immediate organizational response.
The explicit inclusion of prompt injection in the new framework acknowledges AI-specific vulnerabilities as mainstream security risks rather than emerging research topics. This legitimizes budget allocation and resource investment in AI security controls.
Security Misconfiguration jumping to second place reflects the reality that cloud environments and complex infrastructure create more opportunities for dangerous defaults than traditional application code. Organizations focusing solely on code-level vulnerabilities miss most of their exposure.

What To Do About It

Map your complete software supply chain immediately if you haven’t already. OWASP’s elevation of supply chain risks to position three means this is no longer optional. Track dependencies, build tools, and artifact sources for every application in production.
Implement Software Bill of Materials (SBOM) generation and verification as standard practice. The new OWASP guidance makes SBOMs a first-order due diligence requirement rather than a compliance checkbox. Use them to detect tampering and validate provenance.
Review your AI application security controls against the updated framework, focusing on prompt injection defenses. OWASP’s explicit inclusion means your auditors and compliance teams will start asking about it. Get ahead of those questions.

Rock’s Musings

OWASP just validated everything security teams have been screaming about supply chain attacks for three years. Welcome to the party. Better late than never.

The timing matters here. This update arrives right as organizations realize their AI deployments depend on dozens of external components, models, and datasets they don’t control. The supply chain risk isn’t theoretical anymore. It’s operational. It’s measured. It’s top three.

What frustrates me is how long it took to get here. We’ve had major supply chain compromises. We’ve documented the attack patterns. We’ve published the remediation guidance. But until OWASP codifies it in the Top 10, most organizations won’t prioritize the work. That’s not OWASP’s fault. That’s organizational inertia manifesting as security debt.

The other critical piece, prompt injection, is getting explicit recognition in conversations that aren’t constrained to AI.

5. ShadowMQ - How Copy-Paste Code Infected the AI Stack

Oligo Security researchers disclosed that critical remote code execution vulnerabilities in major AI inference frameworks trace back to a single root cause that propagated through code reuse (The Hacker News). The pattern, dubbed ShadowMQ, originated from unsafe use of ZeroMQ and Python’s pickle deserialization in Meta’s Llama framework (CVE-2024-50050, fixed October 2024). Developers copied this code into vLLM (CVE-2025-30165), NVIDIA TensorRT-LLM (CVE-2025-23254), Modular Max Server (CVE-2025-60455), SGLang, and Microsoft’s implementations. In some cases, vulnerable files explicitly state they’re adapted from other projects, creating a documented trail of security debt. The issues allow attackers to execute arbitrary code on inference servers by sending malicious data for deserialization over the network (CSO Online). Inference engines process sensitive prompts, model weights, and customer data across enterprise AI stacks. Compromising a single node enables attackers to conduct model theft, escalate privileges, or deploy cryptocurrency miners for financial gain.

Why It Matters

AI inference frameworks form the backbone of production AI systems that process millions of requests per day. A vulnerability at this layer affects every application and model running on the compromised infrastructure, creating systemic risk across entire organizations.
The copy-paste propagation pattern reveals a fundamental problem in AI software development. Teams prioritize rapid deployment over security review, copying code from other projects without auditing for vulnerabilities. This creates supply chain attacks that don’t require compromising external dependencies.
The CVEs demonstrate that AI security research frequently lags deployment by months or years. Meta patched the root cause in October 2024, but the vulnerability continued spreading to new projects through November 2025. Downstream consumers inherited the flaw without visibility into its origins.

What To Do About It

Audit your AI infrastructure stack immediately for dependencies on vLLM, TensorRT-LLM, SGLang, or Modular Max Server. Confirm you’re running patched versions. These frameworks are ubiquitous in production AI systems, which means your environment likely contains affected code.
Implement network segmentation that isolates inference servers from sensitive data and internal systems. Assuming compromise of the inference layer, you need containment controls that prevent lateral movement and credential harvesting.
Establish code review requirements for AI infrastructure that specifically check for unsafe deserialization patterns and network-exposed interfaces. Don’t trust that popular frameworks underwent adequate security review before publication. The ShadowMQ pattern proves they didn’t.

Rock’s Musings

This is the AI equivalent of Heartbleed. One mistake. Copied everywhere. Affecting everyone.

The supply chain analysis makes me want to break things. Someone at Meta writes unsafe code. Someone at vLLM copies it. Someone at NVIDIA copies the copy. Someone at Modular copies the copy of the copy. Each team thinks they’re building on solid foundations. None of them audit what they’re copying. The vulnerability spreads like a virus through the ecosystem.

We talk about software supply chain security like it’s only about third-party dependencies. ShadowMQ proves it’s also about code that developers explicitly copy between projects. Your SBOM won’t catch this. Your dependency scanning won’t flag it. You need an actual security review of implementation patterns, not just package manifests.

The other infuriating aspect is the timeline. Meta fixed this in October 2024. A full year later, researchers find it still propagating to new projects. That’s not a supply chain lag. That’s a fundamental breakdown in how the AI community shares and reuses code. Every developer who copies inference server implementations needs to evaluate the vulnerabilities they inherit. Because the answer right now is “all the vulns.”

6. AI-Generated Code Gets Worse Every Time You Ask It to Improve

Security researchers demonstrated that LLMs introduce more security vulnerabilities with each iteration when asked to improve code (Security Boulevard). The research challenges the assumption that using AI to iteratively improve code-level security during development would reduce vulnerabilities over time. Testing showed that LLMs introduce flaws even after being explicitly prompted to fix security issues. The study documented patterns where initial AI-generated code contained some vulnerabilities, but subsequent “improvement” iterations added new security flaws at higher rates than they fixed existing ones. This degradation persists across different prompting strategies and model types. The findings reveal problems with iterative code improvements that can’t be solved through prompt engineering. Organizations assuming AI coding assistants automatically improve security through iteration need to revise their development workflows and oversight practices.

Why It Matters

Organizations rapidly adopting AI coding assistants assume these tools help developers write more secure code through iteration and refinement. This research demonstrates the opposite: AI assistance in iterative development actively degrades security posture over time.
The vulnerability-introduction pattern during fixes suggests that LLMs lack a stable understanding of security requirements across context windows. Each interaction treats security as a new problem to solve rather than maintaining a consistent defensive posture throughout the codebase.
Developer productivity gains from AI coding assistants come with unquantified security debt. Teams measuring success by lines of code written or features shipped miss the accumulating vulnerabilities introduced in each iteration. This creates technical debt that manifests as breaches.

What To Do About It

Implement a mandatory security review for all AI-generated code, regardless of the number of iterations it has undergone. Don’t assume that code refined through multiple prompts achieved better security outcomes. The research proves iteration makes things worse.
Deploy static analysis tools that run automatically after each AI code generation or modification gets merged (and block it). Catch introduced vulnerabilities before they reach code review or production. Human reviewers can’t reliably spot security regressions across multiple AI-generated iterations.
Train development teams that AI coding assistants require skeptical oversight, not trust. The assistants should speed up the implementation of security patterns defined by humans, not autonomously determine what security looks like. Treat AI suggestions as starting points that need hardening.

Rock’s Musings

We have proof that asking AI to fix its mistakes makes the mistakes worse. I didn’t need the research to tell you that. Probabilistic code generation with probabilistic code fixes? What could possibly go wrong?

LLMs don’t understand security. They predict tokens that look like secure code based on training data. When you ask them to improve security, they predict different tokens that also look like secure code. Neither set of tokens necessarily creates actual security. Both sets introduce vulnerabilities based on whatever patterns dominated their training.

The iterative degradation pattern reveals something deeper about how organizations misunderstand AI capabilities. People assume intelligence correlates with improvement over time. It doesn’t. LLMs correlate with statistical likelihood across training examples. If your training data contains more insecure code than secure code (it does), iteration makes things worse on average.

I’ve watched teams implement “AI security reviews” where they prompt ChatGPT to find vulnerabilities in AI-generated code. That’s the blind leading the blind. Both the code and the review come from the same statistical approximation of what security looks like. Neither produces actual security.

For now, we still require treating AI as a tool that accelerates work defined by humans who understand security principles. That means security architects defining patterns. Developers are implementing those patterns with AI assistance and security review, treating every line of AI-generated code as potentially vulnerable, regardless of how many prompts refined it.

7. Model Context Protocol Security - Cursor IDE Exposes Agentic AI Foundation

Tenable researchers disclosed vulnerabilities in Cursor IDE (CVE-2025-54135 and CVE-2025-54136) that exploit security gaps in Model Context Protocol implementations (Tenable). The research reveals broader concerns about MCP, the protocol that enables AI agents to interact with external tools and data sources. Testing showed that unsafe code reuse and insufficient validation in MCP implementations create attack vectors for malicious code injection. One proof-of-concept demonstrated how an attacker could create a compromised MCP server that, when downloaded and run by a user, injected code into Cursor’s browser environment. This led users to fake login pages that stole credentials and transmitted them to remote servers. The vulnerabilities highlight challenges with agentic AI systems that require extensive tool access to function. MCP serves as foundational infrastructure for the agentic AI ecosystem, making security flaws in its implementations systemic risks across multiple platforms and vendors.

Why It Matters

MCP represents the connective tissue for agentic AI systems. Security flaws at the protocol level affect every agent, tool, and integration built on top of it. This creates vulnerability inheritance across the entire agentic AI stack.
Organizations implementing AI agents assume that well-known development tools like Cursor IDE underwent rigorous security review before deployment. These CVEs demonstrate that assumption fails even for widely adopted platforms from established vendors.
The credential theft attack vector shows how MCP vulnerabilities can cascade beyond the immediate AI application. Compromising an AI development tool enables attackers to pivot into authentication systems, source code repositories, and production environments through stolen credentials.

What To Do About It

Audit all AI tools in your development environment for MCP dependencies immediately. Cursor IDE isn’t the only platform with these issues. Any tool that implements MCP for agent capabilities may contain similar vulnerabilities. Catalog exposure before attackers do.
Implement network controls that restrict MCP server connections to approved sources. The attack scenario requires users to download and run arbitrary MCP servers. Defense-in-depth architecture should prevent AI tools from establishing connections to untrusted endpoints without explicit approval.
Separate development environments using AI agents from production access and sensitive credentials. The Cursor attack chain works because developers run the tool with access to authentication systems. Limit the blast radius by isolating AI-assisted development work.

Rock’s Musings

MCP launched with the promise of making AI agents actually useful. Turns out it also made them exploitable.

Where have we heard this story before? Launch first. Figure out security later. Hope nobody finds the vulnerabilities before you patch them. MCP got from concept to widespread adoption faster than most teams could spell “threat model.” Now we’re discovering why threat models matter.

What makes MCP particularly concerning is its foundational role. It’s a vulnerability in the infrastructure that enables agentic AI. Every agent that needs to access tools, read files, or invoke APIs depends on MCP or something like it. Security flaws at that layer propagate to everything built on top.

The Cursor CVEs demonstrate what happens when you optimize for capability without securing the fundamentals. Developers downloaded a tool that promised AI-powered coding assistance. They got credential theft vectors disguised as productivity features. That’s not a failure of one vendor. That’s a failure of the entire rush-to-market approach that the AI industry adopted.

We need to slow down. Not stop. Slow down. Long enough to build agentic AI on security-reviewed protocols instead of hoping that security materializes after widespread deployment. Because right now we’re building critical infrastructure on protocols that weren’t designed to resist adversarial actors. That ends badly.

8. Tenable Finds Seven Ways to Trick ChatGPT Into Leaking Data

Tenable has been busy this week! Tenable researchers disclosed seven vulnerabilities and attack techniques in OpenAI’s GPT-4o and GPT-5 models, some of which OpenAI has since addressed (The Hacker News). The research demonstrates indirect prompt-injection attacks via trusted websites, zero-click attacks via search context, file-processing exploits via spreadsheet macros, and voice-based attacks leveraging audio-processing vulnerabilities. One technique involves asking ChatGPT to summarize web pages containing malicious instructions hidden in comment sections. The LLM executes these instructions as legitimate directives. Another vector tricks the model into executing malicious instructions simply through natural language queries about niche websites previously indexed by search engines. File-based attacks embed macro code in spreadsheets that ChatGPT processes, enabling remote code execution. Voice attacks manipulate audio processing to inject commands. The disclosure follows research demonstrating additional prompt injection methods, including PromptJacking, Claude pirate data exfiltration techniques, and agent session smuggling.

Why It Matters

The variety of attack vectors (text, search, files, voice) demonstrates that prompt injection isn’t a single vulnerability that requires a single fix. It’s a class of vulnerabilities spanning every input modality and processing path in LLM architectures. Patching individual techniques doesn’t solve the underlying problem.
Indirect prompt injection via trusted sources such as indexed websites and uploaded files allows attackers to trigger malicious behavior without users knowing they’re under attack. The threat model expands from “adversarial users” to “adversarial content” that compromises systems through normal interactions.
Zero-click attacks via search context represent an escalation in prompt injection sophistication. Users don’t need to interact with malicious content directly. Simply asking about topics that lead the LLM to retrieve compromised information can trigger exploitation.

What To Do About It

Disable or heavily restrict file upload capabilities in production LLM deployments until you verify your implementation handles malicious macros and embedded code safely. ChatGPT’s file processing vulnerabilities likely exist in other platforms. Don’t assume your system is immune.
Implement strict output validation that checks for evidence of prompt injection before displaying LLM responses to users. Look for instruction-like language, unexpected formatting changes, or requests for user credentials that suggest the model was compromised mid-processing.
Separate LLM interactions into distinct security zones based on data sensitivity. Don’t use the same LLM instance for processing untrusted external content and accessing internal systems. Each successful prompt injection compounds when the model has excessive privileges.

Rock’s Musings

Seven different ways to break ChatGPT. That’s just what one research team documented in one paper. How many more exist? That was a rhetorical question… we know…

The zero-click attack through search context particularly bothers me. Users asked legitimate questions. The model retrieved indexed content containing malicious instructions. Exploitation occurred without any suspicious behavior from the user’s perspective. That’s a completely broken security model.

We’ve known about prompt injection since GPT-3. Every major model release claims to have better defenses. Every security researcher demonstrates that those defenses fail under adversarial testing. At what point do we acknowledge that the current architecture fundamentally can’t distinguish between instructions and data? Because that’s what all these attacks exploit.

Organizations deploying LLMs in customer-facing applications need to recognize that their users will become attack vectors. Hell, they have been for decades! Why is it any different with AI? Not because they’re malicious. Because they’ll interact with content, files, and websites that contain prompt injections. Your model will process that content. Your defenses will fail. You need to architect for that reality instead of hoping it doesn’t happen.

The only reliable mitigation right now is limiting what compromised LLMs can do. If prompt injection succeeds but the model can’t access credentials, exfiltrate data, or modify systems, the attack impact stays contained. That means rethinking privilege models, implementing zero-trust architectures, and probably accepting that your AI won’t be as “helpful” as marketing promised.

9. McKinsey Data Shows 61% of Companies Stuck in AI Pilot Purgatory

McKinsey released its State of AI 2025 report in November, showing that while 88% of organizations use AI in at least one function, only 39% report enterprise-wide impact (McKinsey). The research surveyed 1,993 participants across 105 countries and reveals a growing divide between AI leaders and laggards. High-performing organizations allocate more than 20% of digital budgets to AI and attribute over 10% of EBIT to AI initiatives. The majority remain in the pilot phase, unable to scale AI meaningfully across the organization. The report identifies workflow redesign as the biggest factor determining whether organizations achieve value from AI investments. Only 21% of companies have fundamentally redesigned workflows to incorporate AI. On agentic AI specifically, while 62% report experimenting with agents, most deployments remain exploratory with less than 10% of respondents reporting scaled agent use in any given business function (BigDATAwire). The survey data also reveals governance gaps: only 18% have enterprise-wide AI governance councils, and just one-third require risk mitigation controls as part of technical skill sets.

Why It Matters

The gap between AI adoption (88%) and meaningful impact (39%) reveals that most organizations bought the hype without building the infrastructure. This represents billions in wasted investment on tools that deliver isolated wins instead of enterprise transformation.
Workflow redesign, not technology selection, determines AI success. Companies focusing on model capabilities while running AI through legacy processes see minimal returns. This fundamentally changes the conversation from “which AI should we buy” to “how do we rebuild work around AI.”
The governance gap (82% lack enterprise councils, 67% don’t require risk controls) means most organizations deployed AI without basic oversight structures. When incidents inevitably occur, they’ll face regulatory scrutiny for negligence rather than security failures.

What To Do About It

Stop launching new AI pilots immediately and audit your existing portfolio instead. Identify which initiatives deliver measurable value and which exist to check boxes on digital transformation roadmaps. Kill everything that isn’t scaling or driving documented ROI. Pilot purgatory is a choice.
Establish enterprise-wide AI governance before expanding AI adoption further. McKinsey’s data shows 82% lack formal structures. Don’t be part of that statistic. You need governance infrastructure before the next incident makes it mandatory through regulation or litigation.
Redesign at least one core workflow completely around AI capabilities rather than inserting AI into existing processes. The 21% who fundamentally redesigned workflows see enterprise-wide impact. The 79% who didn’t remain stuck in pilots. Pick a workflow and rebuild it properly.

Rock’s Musings

Six years of enterprise AI adoption. 61% still can’t scale. That’s not a technology problem. That’s organizational failure.

The McKinsey data confirms what everyone in security already knows but leadership refuses to hear: you can’t bolt AI onto legacy processes and expect transformation. It doesn’t work. It never worked. It won’t suddenly start working because you spent more money on better models.

What frustrates me most about these findings is the governance gap. 82% without enterprise-wide AI governance. In 2025. After years of AI incidents, breaches, and public failures. What exactly do organizations think governance councils do? They prevent the catastrophic failures that turn pilots into front-page news. They establish accountability before regulators do it for you.

The high performers in McKinsey’s research treated AI as transformation, not automation. They rebuilt workflows. They redesigned structures. They accepted that AI changes how work gets done, not just how quickly it gets done. Everyone else bought tools and expected magic.

Security teams watching their organizations spiral deeper into pilot purgatory need to understand that this isn’t their problem to fix. You can’t secure AI deployments that fundamentally don’t work. You can’t govern processes that nobody redesigned. The best security posture for failed AI initiatives is killing them before they become liabilities.

10. CISO Compensation Rises 6.7% Despite Economic Uncertainty

IANS Research and Artico Search published their sixth annual CISO Compensation Benchmark Report in November 2025, showing CISO pay grew 6.7% this year despite macroeconomic turbulence and conservative corporate budgets. The findings indicate CISOs firmly established themselves as business leaders rather than security operators. Nick Kakolowski, Senior Research Director at IANS, noted that pay stability underscores the indispensability of cybersecurity leadership to enterprise risk oversight, even as organizations tighten budgets elsewhere. The report documents growing CISO influence extending beyond traditional security operations into broader business strategy and risk management. Organizations increasingly recognize that AI security, supply chain attacks, and regulatory compliance require executive-level leadership with board access. The compensation data reflects market dynamics in which demand for experienced CISOs exceeds supply, particularly as AI adoption accelerates and security complexity increases. Companies prioritize securing leadership capable of navigating threats that board members and investors now recognize as existential business risks.

Why It Matters

CISO compensation rising during economic uncertainty signals that organizations view security leadership as non-discretionary rather than cost center status. When budgets get cut, security leadership roles grow, indicating permanent shift in how businesses price cyber risk.
The 6.7% increase exceeds general salary growth patterns, suggesting market premiums for specific expertise. Organizations competing for security talent capable of addressing AI threats, supply chain vulnerabilities, and regulatory compliance pay above-market rates to secure leadership.
Growing CISO influence on business strategy beyond security operations creates opportunities for proactive security architecture. When security leaders participate in product development, vendor selection, and strategic planning decisions, security stops being retrofitted and starts being designed in.

What To Do About It

Document the business impact of security leadership to justify continued investment in your security program. The compensation data proves board-level executives value security leadership. Use that momentum to secure budget for teams, tools, and training beyond CISO compensation.
Position security as a business enabler rather than a blocker in strategic conversations. CISO influence grew because security leaders learned to translate technical risk into business language. Master that skill if you want comparable influence at your organization.
Develop expertise in AI security and supply chain risk management immediately if you haven’t already. The compensation premium exists partly because few CISOs can credibly address these emerging threat vectors. Build that capability and your value increases accordingly.

Rock’s Musings

CISO pay rose 6.7% this year. That’s not reward money. That’s combat pay.

The compensation increase reflects what boards finally understand: security leaders prevent catastrophic failures that destroy company value faster than any other business risk. AI deployment is racing ahead of security controls. Supply chains weaponized against enterprises. Nation-states are automating cyberattacks. Someone needs to keep that from exploding. That someone is expensive.

What the report doesn’t explicitly state but the data implies: CISO turnover remains high despite compensation growth. Organizations pay more because they burn through security leaders faster than they can develop internal talent. The job fundamentally changed from “protect the network” to “explain to the board why our AI implementation might trigger regulatory enforcement while simultaneously arguing for a security budget that competes with AI investment.”

That’s an impossible mandate. We ask CISOs to be technical experts, business strategists, risk managers, compliance navigators, and board communicators at the same time. Then we act surprised when compensation rises faster than other executive roles. We created a job that requires capabilities that barely exist in the market. The price reflects that scarcity.

For security professionals watching these numbers, the compensation increase creates an opportunity if you develop the right skills. AI security. Supply chain risk. Regulatory compliance. Board communication. Executive presence. Not everyone needs all of those. But everyone needs enough to articulate how technical decisions impact business outcomes. Master that, and the market will pay you accordingly.

The One Thing You Won’t Hear About But You Need To Know

AI Hallucinations Just Became a Security Feature

Buried in Anthropic’s disclosure about the Chinese state-sponsored Claude attacks sits an unexpected finding. The AI’s hallucinations actively degraded attack effectiveness (Anthropic). Claude frequently overstated findings and occasionally fabricated data during autonomous operations. It claimed to extract credentials that didn’t work. It identified critical discoveries that turned out to be publicly available information. Anthropic notes these errors “represent an obstacle to fully autonomous cyberattacks.”

The security industry spent years treating hallucinations as reliability problems to be solved through better training and bigger models. This incident reveals they might also function as inadvertent security controls. When AI systems executing attacks can’t reliably distinguish success from failure, they produce unreliable intelligence for human operators who must validate all findings.

The pattern creates operational friction for adversaries. Nation-state campaigns require high-confidence intelligence to justify resource allocation and risk exposure. If AI-generated intelligence contains 15-20% false positives, human analysts must manually verify everything before acting. That verification requirement reduces the automation advantage that made AI-orchestrated attacks attractive in the first place.

This doesn’t make hallucinations good security strategy. Relying on AI unreliability to slow attacks is like depending on software bugs to prevent exploitation. It’s temporary and fragile. As models improve, hallucination rates decrease. The defensive benefit disappears as capability increases.

But the dynamic reveals something useful about the current state of AI threats. Fully autonomous AI attacks remain impractical not because of defensive controls but because AI systems can’t consistently execute multi-step operations without errors. Organizations facing AI-augmented threats should focus on forcing adversaries back into manual validation loops rather than trying to detect AI involvement directly.

Rock’s Take:

The idea that AI hallucinations provide security benefits is simultaneously hilarious and terrifying. We spent billions trying to make these systems reliable. Now we’re celebrating that their unreliability slows attackers down.

This won’t last. Every vendor races toward higher accuracy and lower hallucination rates. That’s the product roadmap. As hallucinations decrease, so does this incidental security friction. We’re counting on our adversaries fighting AI limitations that won’t exist in 18 months.

The strategic insight here is about AI threat timelines. Current AI-orchestrated attacks require human oversight because AI systems make too many mistakes. Future attacks won’t. The window between “AI attacks need supervision” and “AI attacks run autonomously” is shorter than most threat models assume. Organizations planning defensive strategies should account for that compression.

Security teams asking “should we worry about AI attacks” already missed the question. The answer is yes. The real question is “how much human validation do AI attacks require today versus six months from now?” That delta determines how much time you have to implement defenses before autonomous attacks become practical.

Anthropic. “Disrupting the First Reported AI-Orchestrated Cyber Espionage Campaign.” Anthropic Blog, November 13, 2025. https://www.anthropic.com/news/disrupting-AI-espionage
BigDATAwire. “AI Is Everywhere, but Progress Is Slow — McKinsey Explains Why.” BigDATAwire, November 11, 2025. https://www.hpcwire.com/bigdatawire/2025/11/11/ai-is-everywhere-but-progress-is-slow-mckinsey-explains-why/
CSO Online. “Copy-Paste Vulnerability Hits AI Inference Frameworks at Meta, Nvidia, and Microsoft.” CSO Online, November 14, 2025. https://www.csoonline.com/article/4090061/copy-paste-vulnerability-hit-ai-inference-frameworks-at-meta-nvidia-and-microsoft.html
ForkLog. “New Jailbreak Breaches AI Security in 99% of Cases.” ForkLog, November 14, 2025. https://forklog.com/en/new-jailbreak-breaches-ai-security-in-99-of-cases/
HiddenLayer. “EchoGram: The Hidden Vulnerability Undermining AI Guardrails.” HiddenLayer Innovation Hub, November 14, 2025. https://hiddenlayer.com/innovation-hub/echogram-the-hidden-vulnerability-undermining-ai-guardrails/
Help Net Security. “Chinese Cyber Spies Used Claude AI to Automate 90% of Their Attack Campaign.” Help Net Security, November 14, 2025. https://www.helpnetsecurity.com/2025/11/14/claude-ai-automated-cyberattack/
McKinsey & Company. “The State of AI in 2025: Agents, Innovation, and Transformation.” McKinsey, November 2025. https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai
Security Boulevard. “Security Degradation in AI-Generated Code: A Threat Vector CISOs Can’t Ignore.” Security Boulevard, November 14, 2025. https://securityboulevard.com/2025/11/security-degradation-in-ai-generated-code-a-threat-vector-cisos-cant-ignore/
Tenable. “Cybersecurity Snapshot: November 14, 2025.” Tenable Blog, November 14, 2025. https://www.tenable.com/blog/cybersecurity-snapshot-akira-ransomware-security-agentic-ai-cyber-risks-11-14-2025
The Hacker News. “Researchers Find ChatGPT Vulnerabilities That Let Attackers Trick AI Into Leaking Data.” The Hacker News, November 2025. https://thehackernews.com/2025/11/researchers-find-chatgpt.html
The Hacker News. “Researchers Find Serious AI Bugs Exposing Meta, Nvidia, and Microsoft Inference Frameworks.” The Hacker News, November 14, 2025. https://thehackernews.com/2025/11/researchers-find-serious-ai-bugs.html
The Register. “Chinese Spies Used Claude to Break Into Critical Orgs.” The Register, November 13, 2025. https://www.theregister.com/2025/11/13/chinese_spies_claude_attacks
Unite.AI. “HiddenLayer’s EchoGram Report Warns of a New Class of Attacks Undermining AI Guardrails.” Unite.AI, November 16, 2025. https://www.unite.ai/hiddenlayers-echogram-report-warns-of-a-new-class-of-attacks-undermining-ai-guardrails/
WebProNews. “OWASP’s 2025 Wake-Up Call: Why Broken Access Control Still Haunts Web Security.” WebProNews, November 16, 2025. https://www.webpronews.com/owasps-2025-wake-up-call-why-broken-access-control-still-haunts-web-security/

Weekly Musings Top 10 AI Security Wrapup: Issue 21 November 14, 2025 - November 20, 2025

When AI Guardrails Fail, Nation-States Strike, and Copy-Paste Code Threatens the Entire Stack

1. Echogram - The Two Characters That Break All AI Guardrails

2. China Automates Cyberespionage - Claude Code Runs 90% of Nation-State Campaign

3. Extended Reasoning Makes AI More Vulnerable, Not Safer

4. OWASP Drops 2025 Top 10 - Supply Chain Finally Makes the List

5. ShadowMQ - How Copy-Paste Code Infected the AI Stack

6. AI-Generated Code Gets Worse Every Time You Ask It to Improve

7. Model Context Protocol Security - Cursor IDE Exposes Agentic AI Foundation

8. Tenable Finds Seven Ways to Trick ChatGPT Into Leaking Data

9. McKinsey Data Shows 61% of Companies Stuck in AI Pilot Purgatory

10. CISO Compensation Rises 6.7% Despite Economic Uncertainty

The One Thing You Won’t Hear About But You Need To Know

Citations

Discussion about this post