<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[RockCyber Musings]]></title><description><![CDATA[AI and Cyber Geek]]></description><link>https://www.rockcybermusings.com</link><image><url>https://substackcdn.com/image/fetch/$s_!y2c3!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faaa51f40-9ed4-4093-898e-0bdb99086a7a_827x827.png</url><title>RockCyber Musings</title><link>https://www.rockcybermusings.com</link></image><generator>Substack</generator><lastBuildDate>Mon, 29 Jun 2026 17:14:19 GMT</lastBuildDate><atom:link href="https://www.rockcybermusings.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Rock Lambros]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[rockcyber@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[rockcyber@substack.com]]></itunes:email><itunes:name><![CDATA[Rock Lambros]]></itunes:name></itunes:owner><itunes:author><![CDATA[Rock Lambros]]></itunes:author><googleplay:owner><![CDATA[rockcyber@substack.com]]></googleplay:owner><googleplay:email><![CDATA[rockcyber@substack.com]]></googleplay:email><googleplay:author><![CDATA[Rock Lambros]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Weekly Musings Top 10 AI Security Wrapup: Issue 43 June 19 -June 25, 2026]]></title><description><![CDATA[The Week Five Spy Agencies Said the AI Cyber Threat Is Months Away and Everyone Kept Shipping]]></description><link>https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260619-20260625</link><guid 
isPermaLink="false">https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260619-20260625</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Fri, 26 Jun 2026 12:50:05 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!71X_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b84bbbc-e9e3-4af4-aa52-552643ebcf9c_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!71X_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b84bbbc-e9e3-4af4-aa52-552643ebcf9c_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!71X_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b84bbbc-e9e3-4af4-aa52-552643ebcf9c_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!71X_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b84bbbc-e9e3-4af4-aa52-552643ebcf9c_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!71X_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b84bbbc-e9e3-4af4-aa52-552643ebcf9c_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!71X_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b84bbbc-e9e3-4af4-aa52-552643ebcf9c_1024x1024.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!71X_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b84bbbc-e9e3-4af4-aa52-552643ebcf9c_1024x1024.png" width="1024" height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6b84bbbc-e9e3-4af4-aa52-552643ebcf9c_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1233556,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/203626618?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b84bbbc-e9e3-4af4-aa52-552643ebcf9c_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!71X_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b84bbbc-e9e3-4af4-aa52-552643ebcf9c_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!71X_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b84bbbc-e9e3-4af4-aa52-552643ebcf9c_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!71X_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b84bbbc-e9e3-4af4-aa52-552643ebcf9c_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!71X_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b84bbbc-e9e3-4af4-aa52-552643ebcf9c_1024x1024.png 
1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Five intelligence agencies told you the offensive AI threat is months out, not years. The same week, OpenAI shipped a cyber model that finds and patches bugs at machine speed, Anthropic&#8217;s two best models sat dark for the thirteenth straight day under a government order, and a startup raised $30 million to babysit the AI agents already loose in your environment. The common thread across this week&#8217;s musings is that the gap between a capability landing and that capability hurting you has collapsed. 
Governments now react in days, vendors react in hours, and your governance program still reacts in quarters.</p><p>This week handed CISOs a rare gift: clarity. The intelligence community said the quiet part out loud, the labs proved both sides of the dual-use coin in one news cycle, and researchers showed the medical AI you trust to read a scan will expose the patients it trained on. The people telling you to slow down and the people telling you to go faster are now describing the same threat model, and it moved while you were writing policy. Here is what happened between June 19 and June 25, and what you do about it.</p><h3>1. Five Eyes Tells CISOs the AI Cyber Threat Is Months Away, Not Years</h3><p>On June 23, the cyber agencies of the United States, United Kingdom, Canada, Australia, and New Zealand issued a rare joint statement warning that frontier AI will reshape offensive hacking on a timeline measured in months, not years (CISA). The statement carried signatures from NSA Cybersecurity Directorate head David Imbordino and acting CISA Director Nick Andersen, and urged leaders to assess cyber risk, prioritize foundational controls, empower cyber leaders, and stay engaged (CyberScoop). It landed days after Washington forced Anthropic to suspend its most capable models, which suggests the alarm connects to something the agencies already saw.</p><p><strong>Why it matters</strong></p><ul><li><p>A coordinated Five Eyes statement is not routine. 
These agencies rarely co-sign a public warning unless the risk is real and near.</p></li><li><p>&#8220;Months, not years&#8221; repudiates every roadmap that assumed you had until 2028 to harden against AI-accelerated attacks.</p></li><li><p>The recommendations are deliberately boring, which means the agencies think most organizations still fail at fundamentals.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Benchmark your patch SLA against CISA&#8217;s three-day mandate for high-risk flaws.</p></li><li><p>Map which crown-jewel systems fall first to an attacker who finds and chains vulnerabilities at machine speed.</p></li><li><p>Brief your board with the actual statement, so the urgency comes from five governments rather than your slide deck.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Most government advisories read like a committee trying not to get blamed. This one is different. When the NSA and four allied agencies put their names on &#8220;months, not years,&#8221; they spend credibility they normally hoard, which tells me this is driven by classified testing they cannot show you. My worry is how it lands. Every vendor will now sell you an &#8220;AI threat&#8221; product on the back of this statement, and most solve nothing the agencies flagged. If you walk out of this week having bought a shiny detection tool instead of fixing your patch pipeline, you misread the memo.</p><h3>2. OpenAI Ships GPT-5.5-Cyber and Proves the Dual-Use Argument in Real Time</h3><p>On June 22, OpenAI expanded its Daybreak initiative with the full release of GPT-5.5-Cyber, a Codex Security plugin that builds vulnerability scanning into developer workflows, and a &#8220;Patch the Planet&#8221; program for open-source projects (OpenAI). GPT-5.5-Cyber scored 85.6% on CyberGym, OpenAI&#8217;s benchmark for reproducing known vulnerabilities, which the company called its highest single-model score (SiliconANGLE). 
Patch the Planet launched with Trail of Bits, HackerOne, and more than 30 projects including cURL, Go, and Python, framed around moving from finding bugs to shipping verified fixes.</p><p><strong>Why it matters</strong></p><ul><li><p>The capability that patches vulnerabilities for defenders finds them just as well for attackers, and OpenAI shipped its version the same week Anthropic&#8217;s got banned.</p></li><li><p>Automated patch generation at this quality shifts the bottleneck from discovery to validation and deployment.</p></li><li><p>The security of cURL or Python could soon depend on AI-generated fixes that maintainers have to trust.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Test AI-assisted patch tooling in a sandbox against your own code, and measure the false-fix rate, not the discovery rate.</p></li><li><p>Require human review of any AI-generated patch touching authentication, cryptography, or data handling.</p></li><li><p>Track which dependencies join programs like Patch the Planet, and revisit your software bill of materials.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The timing is almost too perfect. Washington banned Anthropic&#8217;s Mythos for finding vulnerabilities, and three days later, OpenAI shipped a model that does the same thing and called it a public service. The capability is dual-use down to the silicon, and you cannot ban one side without losing the other. What worries me is the verification gap. When an AI proposes a patch to cURL, and the maintainer is a volunteer with a day job, the review meant to catch a subtle regression may not happen, and you inherited a class of supply chain risk nobody has priced yet. I write more about dual-use AI risk at <a href="https://www.rockcyber.com/">RockCyber</a>.</p><h3>3. 
Anthropic&#8217;s Best Models Hit Day 13 of a Government-Ordered Blackout</h3><p>As of June 25, Anthropic&#8217;s Fable 5 and Mythos 5 remained fully offline for every user on earth, with staff confirming zero traffic nearly two weeks after a US export-control directive forced the shutdown (Fortune). The Bureau of Industry and Security directive barred access by any foreign national, including Anthropic&#8217;s own non-citizen employees, and because the company could not screen by nationality, the suspension had to cover everyone. During a June 11 Senate hearing, Sen. Mark Warner, vice chair of the Intelligence Committee, said Gen. Joshua Rudd, who leads the NSA and U.S. Cyber Command, had told him Mythos &#8220;broke into almost all of our classified systems, not in weeks but in hours.&#8221; A U.S. official later told the Associated Press the model identified vulnerabilities within hours during an authorized testing exercise, while cautioning that finding the flaws did not mean the model could exploit them in that window (CNBC/AP).</p><p><strong>Why it matters</strong></p><ul><li><p>This is the first time the US applied export controls directly to an AI model rather than to chips, setting a precedent every frontier lab now plans around.</p></li><li><p>A model can be revenue-generating one day and a controlled item the next, turning availability into a supply chain risk you cannot contract away.</p></li><li><p>Nationality-based controls hit Anthropic&#8217;s own foreign-national employees, exposing how blunt the mechanism becomes against software anyone can call.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Inventory every business process that depends on a single frontier model, and document a fallback to a second provider.</p></li><li><p>Add &#8220;model could be pulled by government order&#8221; to your vendor risk register for any AI capability in a critical workflow.</p></li><li><p>If you operate across borders, get ahead of how nationality-based access rules would 
apply to your own staff.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>What I care about is the precedent, not the panic. Banning a model for all foreign nationals, including your own employees, is the policy equivalent of swinging a sledgehammer at a fly that has already left the room. More than 120 cybersecurity professionals, including names from Nvidia and Google, signed a letter arguing the capability is far from unique and that pulling the best defensive tool while adversaries keep building theirs is a net loss. Both things can be true. The model is dangerous, and the ban probably hurts defenders more than the adversaries it targets. The deeper problem is durability. A model that anchors a critical workflow can vanish overnight under a government order, leaving you with no recourse and no notice. If you run anything important on a frontier model, the rug can now get pulled by a government, not only by a vendor.</p><h3>4. Researchers Show Medical AI Will Expose the Patients It Trained On</h3><p>On June 24, Nature published research by German scientists showing that AI models used to diagnose medical conditions are highly vulnerable to membership inference attacks, in which an adversary queries a model to determine whether a specific person&#8217;s data was in its training set (The Register). Across many medical datasets, the attacks achieved near-perfect success rates for individual patients even when the attack&#8217;s aggregate performance looked no better than random guessing. 
Risk climbed with model capacity, and underrepresented groups faced disproportionate exposure, since confirming that a patient&#8217;s data was used serves as a direct proxy for their diagnosis.</p><p><strong>Why it matters</strong></p><ul><li><p>Membership inference turns a deployed diagnostic model into a privacy leak that reveals sensitive medical facts about a named person.</p></li><li><p>The disparate impact on underrepresented groups means the harm is uneven, raising ethical and regulatory exposure.</p></li><li><p>Aggregate privacy metrics hid the risk entirely, so teams that checked average privacy measured the wrong thing.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Assess any sensitive-domain model for membership inference at the individual level, not at the aggregate level, before deployment.</p></li><li><p>Gate diagnostic models behind authentication and rate limiting to block the query volume these attacks need.</p></li><li><p>Where a model trains on a narrow sensitive cohort, treat membership inference as a reportable risk and apply differential privacy.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>This one bothers me more than the flashy stuff, because it hits a system everyone assumes is safe. A radiology model that helps a doctor read a scan feels benign, and nobody in the procurement meeting asked whether it would betray the patients in its training data. The people most exposed are the ones already underrepresented in the data, so this is a fairness problem wearing a privacy problem&#8217;s clothes. The detail that should change your behavior is the aggregate-versus-individual gap, because the model can look perfectly private on average while leaking specific patients with near-perfect reliability. I have seen this in security, where the dashboard is green and the breach is already underway. Healthcare AI buyers need membership inference testing on the checklist now.</p><h3>5. 
Mini Shai-Hulud Resurfaces and Camps Inside Your AI Coding Agent</h3><p>On June 19, the self-propagating supply chain worm Mini Shai-Hulud resurfaced in a fresh wave, with researchers tracking over 1,600 exfiltration repositories across 21 compromised GitHub accounts (StepSecurity). The worm steals credentials from one CI/CD pipeline, enumerates every package that the maintainer controls, and publishes infected versions of each. Its payload reads GitHub Actions runner memory to extract secrets, harvests credentials from more than 100 file paths, installs persistence hooks in Claude Code and VS Code that survive reboots, and exfiltrates through legitimate channels like GitHub&#8217;s own GraphQL API using branch names drawn from the Dune universe (Akamai).</p><p><strong>Why it matters</strong></p><ul><li><p>The worm plants persistence inside AI coding assistants, so your developer&#8217;s agent becomes a re-infection vector that survives a clean package rollback.</p></li><li><p>It exfiltrates through GitHub&#8217;s own infrastructure, so detection that trusts GitHub by default will miss it.</p></li><li><p>A self-propagating credential thief turns one compromised developer into a fan-out event across the open-source ecosystem.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Audit your CI/CD runners for the persistence hooks this worm installs in Claude Code and VS Code, since a rollback does not remove them.</p></li><li><p>Rotate any credential that touched a build pipeline in the window, and assume secrets in runner memory were harvested.</p></li><li><p>Treat AI coding agents as privileged software with secret access, and scope their permissions instead of trusting them.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The Dune branch names are a nice touch, and I would almost respect the craftsmanship if it were not stealing everyone&#8217;s secrets. What matters is the architectural lesson. 
This worm figured out that the AI coding assistant in every developer&#8217;s environment is a perfect place to hide, because we collectively decided to trust those tools with deep access and almost no scrutiny. Your agent can read your secrets, write to your repos, and survive a reboot. That is not a productivity tool, that is a privileged service account with a friendly chat interface, and the worm needs no novel exploit because it uses the access we already handed the agent. If you cannot answer what secrets your AI tools reach and what persists after a wipe, you have a gap this worm was built to walk through. I dig into agent permission models at <a href="https://rockcybermusings.com/">RockCyber Musings</a>.</p><h3>6. New Report Finds 76% of Organizations Have Already Pulled Back AI Behavior in Production</h3><p>On June 24, Aikido Security published its 2026 State of AI in Security and Development report, drawing on responses from 450 security leaders, developers, and AppSec engineers across Europe and the US (Help Net Security). The headline number is the one that should stop you. 76% of organizations had to stop, restrict, or roll back AI-driven behavior in the past 12 months, which means the gap between deploying AI and trusting it is now a measured operational fact, not a worry. Another 71% said AI or automation made a security issue harder to detect, investigate, or fix. 
Only about a third of security teams hold both the authority to stop a release and the responsibility when something goes wrong.</p><p><strong>Why it matters</strong></p><ul><li><p>A 76% rollback rate means AI is shipping into production faster than teams can validate it, and the correction is happening after deployment instead of before.</p></li><li><p>The 71% who say AI made incidents harder to investigate point to a detection and forensics gap that grows as AI touches more of the stack.</p></li><li><p>The authority-responsibility split, where only a third of teams hold both, is a governance failure that guarantees nobody owns the decision to hit the brakes.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Measure your own rollback rate for AI-driven features over the past year, because if you are near the 76% figure you have a validation problem, not a tooling problem.</p></li><li><p>Close the authority-responsibility gap by naming who can stop a release and making that same person accountable for the outcome.</p></li><li><p>Shift AI security testing from periodic and manual to continuous, because the report&#8217;s core finding is that slower validation cannot keep pace with AI-accelerated development.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>I trust survey numbers about as far as I can throw them, so I read past the headline to the question underneath, and this one holds up. Three-quarters of organizations had to yank AI behavior back out of production after they shipped it. That is not a story about bad tools; it is a story about deploying first and validating never, then discovering the problem in production, where it costs the most to fix. The number that actually worries me is the authority-responsibility split. 
When only a third of security teams can both stop a release and own the consequences, you have built an organization where the person who sees the risk cannot pull the brake, and the person who can pull the brake does not feel the burn. That is how you end up explaining a rollback to your board instead of preventing one. The fix is unglamorous and organizational, not technical. Decide who owns the kill decision, give that person real authority, and make them accountable for the call, because a control nobody is empowered to use is theater. Read the survey, find your own rollback rate, and if it is anywhere near 76%, the problem is your release gate, not your model.</p><h3>7. Researchers Name the AI Safety Risk Nobody Is Measuring: Affective Harm</h3><p>On June 22, <strong><a href="https://arxiv.org/abs/2606.23380">researchers posted a paper to arXiv</a></strong> proposing &#8220;affective safety&#8221; as a distinct and underdeveloped class of AI safety concern, arguing the field has concentrated on epistemic and physical harms like misinformation and reliability while ignoring the risks that arise when AI engages with human emotional life (arXiv). The authors build a taxonomy of affective harms that includes affective self-alienation, fairness and bias harms in emotional contexts, and relational harms. 
Their argument is that as AI grows more emotionally engaging and embedded in people&#8217;s relationships, the harms from that engagement are real, measurable, and falling through the cracks of every existing safety framework.</p><p><strong>Why it matters</strong></p><ul><li><p>Affective harm is a category most enterprise AI risk frameworks lack a row for, which means it is unmeasured and ungoverned in deployed systems.</p></li><li><p>As companies deploy emotionally engaging AI in customer service, mental health, and HR, the relational harms this paper names become live liability questions.</p></li><li><p>A named taxonomy is the first step toward regulation, because regulators cannot enforce against harms the field has not defined.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>If you deploy AI in any emotionally sensitive context like wellness or coaching, start assessing affective harm rather than waiting for a compliance line.</p></li><li><p>Add relational and emotional-impact questions to your AI impact assessments, especially for systems that interact with vulnerable users.</p></li><li><p>Track this research thread, because the gap between &#8220;academics named it&#8221; and &#8220;regulators require it&#8221; keeps shrinking.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>I almost cut this one because it is academic, early, and easy to wave off as soft. Then I thought about how many companies are racing to deploy emotionally engaging AI into support, wellness, and mental health with zero framework for measuring whether that engagement harms anyone. That is a governance gap you can drive a truck through, and this paper outlines it. The honest uncertainty is that I do not yet know how large the affective harm risk is, and neither does anyone else, because nobody has been measuring it. The pattern in AI risk is consistent. 
A harm gets named in a paper, dismissed as theoretical, then shows up as a lawsuit eighteen months later while everyone acts surprised, the way membership inference and prompt injection both did. If you deploy AI that engages people emotionally, start thinking about this while it is cheap to address.</p><h3>8. OpenAI&#8217;s Codex Was Quietly Burning Out Developers&#8217; SSDs</h3><p>On June 22 and 23, reports surfaced that OpenAI&#8217;s Codex coding agent had been writing roughly 640 terabytes per year to users&#8217; solid-state drives through a flawed local logging implementation, consuming drive endurance fast enough to threaten hardware lifespans (The Register). One developer measured about 37 TB written in 21 days of uptime, and analysts estimated the bug plausibly burned low single-digit millions of dollars of SSD endurance across users during the March-to-June window. The diagnostic logging ran on by default and stayed on the device, with most users unaware it existed.</p><p><strong>Why it matters</strong></p><ul><li><p>A coding agent silently degrading hardware shows that AI tools run with deep system access and their defaults can cause real, costly harm with no malice involved.</p></li><li><p>Default-on logging that nobody reviewed is the same pattern attackers exploit, where excessive privilege and silent background activity go unnoticed for months.</p></li><li><p>The financial damage was real and distributed, raising questions about liability when a vendor&#8217;s default wears out your hardware.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Audit what your AI developer tools write to local disk and whether diagnostic logging is on by default.</p></li><li><p>Treat AI agent defaults as a security and cost decision, and disable background telemetry you did not consent to.</p></li><li><p>Add AI tooling to endpoint monitoring so silent high-volume disk or network activity triggers an alert.</p></li></ul><p><strong>Rock&#8217;s 
Musings</strong></p><p>Nobody got hacked here, and that is exactly why I included it. This is the mundane face of the agent access problem. We hand these tools deep access, they run with defaults we never inspected, and the cost shows up months later as worn-out hardware nobody accounted for. Swap &#8220;burns your SSD&#8221; for &#8220;exfiltrates your secrets&#8221; and you have the Mini Shai-Hulud story from earlier in this same newsletter: same root cause, different symptom. The lesson is not that Codex is evil; it is that Codex is sloppy, and sloppy is the more common threat. If you are not monitoring what your AI tools do on the endpoint, you are trusting a vendor&#8217;s default to be both benign and competent, and this week proved you cannot count on either.</p><h3>9. A New Open-Source CLI Sniffs Out Stale AI Dependency Advice Before It Bites</h3><p>On June 23, The Register reported on a new open-source command-line tool built to detect stale AI-generated override advice in dependency management (The Register). AI coding assistants routinely tell developers to add override entries to silence transitive dependency vulnerabilities, then never tell the developer to verify whether that override still makes sense once the underlying package updates. 
Over time those overrides pile up as invisible technical debt that can mask a real vulnerability or pin a dependency to a broken state, and the CLI flags stale entries so teams can review them instead of trusting advice that expired months ago.</p><p><strong>Why it matters</strong></p><ul><li><p>AI assistants give point-in-time advice that ages badly, and dependency overrides are a place where stale advice silently reintroduces risk.</p></li><li><p>An override added to suppress a vulnerability warning can outlive the reason it was added, masking a real flaw behind an AI-suggested workaround.</p></li><li><p>The tooling response shows the community building guardrails specifically for the failure modes of AI-assisted development.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Run a dependency-override audit and flag every entry an AI assistant suggested without an expiry or review date.</p></li><li><p>Add a recurring review of override entries to your dependency hygiene process, because AI advice does not refresh itself.</p></li><li><p>Tell developers that an AI suggestion to suppress a vulnerability warning requires follow-up verification, not a fire-and-forget commit.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>This is small, and I love it, because it attacks a real failure mode instead of a hypothetical one. AI assistants are confident, fast, and frozen in time. They give you advice that was correct the moment they gave it, then walk away, and so does the developer who took it. Six months later you have an override masking a vulnerability that got patched upstream, and nobody knows it is there. AI-assisted development creates a new category of debt, which is advice debt, where every suggestion is a tiny unverified assumption baked into your codebase. Most are harmless, some rot, and a tool that surfaces the rotten ones is unglamorous engineering that moves the needle. 
It is open source, so there is no excuse not to look.</p><h3>10. OpenAI's Patch the Planet Puts AI-Authored Fixes Into the Open-Source Code You Already Depend On</h3><p>On June 22, alongside the GPT-5.5-Cyber launch, OpenAI introduced Patch the Planet, a program to find and fix vulnerabilities in widely used open-source software using AI, founded with Trail of Bits and run in collaboration with HackerOne and project maintainers (OpenAI). More than 30 open-source projects are committed to participating, including cURL, Go, Python, Sigstore, and pyca/cryptography. The program shifts AI security work from finding bugs to submitting fixes, meaning AI-authored patches start flowing into the foundational libraries that underlie most of the software your organization runs. OpenAI framed it as helping under-resourced maintainers move from findings to fixes faster (SiliconANGLE).</p><p><strong>Why it matters</strong></p><ul><li><p>AI-authored patches entering cURL, Go, and Python means the security of your dependency tree now partly rests on fixes a model wrote and a volunteer reviewed.</p></li><li><p>A confident wrong fix to a cryptography library is more dangerous than an open vulnerability, because it ships with the authority of a merged patch.</p></li><li><p>The liability and ownership question is unsettled, since nobody has defined who answers for an AI-generated fix that introduces a regression downstream.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Track which of your critical open-source dependencies participate in AI-assisted patch programs, and treat that as a new input to your software bill of materials.</p></li><li><p>Do not assume a merged upstream patch was human-authored or human-reviewed to the depth you would require internally.</p></li><li><p>Add provenance questions to your dependency review, specifically whether a fix in a security-critical library was AI-generated and who validated it.</p></li></ul><p><strong>Rock&#8217;s 
Musings</strong></p><p>I want this to work, because under-resourced open-source maintainers are one of the quiet structural risks in everything we build, and throwing capable help at cURL or pyca/cryptography is a defensible idea. The part that keeps me up is the review step nobody wants to fund. The whole model assumes a maintainer carefully checks each AI-authored patch before it merges, and in the real world that maintainer is often one tired volunteer with a day job and an inbox full of issues. An AI that submits a plausible-looking fix to a cryptography library is not a gift if the person on the other end cannot afford the time to verify it line by line. We have spent years learning that subtle bugs in foundational libraries cause the worst incidents, and now we are pointing automated patch generation straight at those libraries. That is either a major win or a new and quiet class of supply chain risk, and which one it becomes depends entirely on review capacity that no program has actually funded yet. Watch this closely, and do not assume an upstream fix is safe just because it merged.</p><h3>The One Thing You Won&#8217;t Hear About But You Need To</h3><h3>11. The UK Quietly Flipped Automated Decision-Making From Banned to Allowed</h3><p>The headlines focused on the Five Eyes statement and the Anthropic ban, but the move that reshapes how AI is deployed across the entire G7 economy slipped through almost unnoticed. On June 19, the UK&#8217;s Data (Use and Access) Act 2026 received Royal Assent, bringing into force a substantial rewrite of the regulation of automated decision-making (ICO). The Act replaces the old Article 22 regime, under which automated decisions with significant effects were largely prohibited, with a model in which such decisions are permitted by default, subject to appropriate safeguards, while restrictions are narrowed to special category data such as health or biometrics. 
Organizations can now lean on a wider range of lawful bases, and the ICO has started work on a statutory code of practice for AI with a mandatory children&#8217;s data component.</p><p><strong>Why it matters</strong></p><ul><li><p>The default flipped from prohibition to permission, a major liberalization that makes automated decision-making on UK subjects easier at scale.</p></li><li><p>The burden shifts from &#8220;can we do this at all&#8221; to &#8220;can we prove our safeguards work,&#8221; which is a higher bar than most teams meet.</p></li><li><p>The carve-out for special category data and children means the highest-risk decisions still carry the heaviest obligations, and that is where enforcement concentrates.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Revisit your lawful basis for automated decisions about UK individuals, since legitimate interests is newly available but requires a documented balancing test.</p></li><li><p>Build the safeguard evidence the new regime demands, including meaningful human review and clear individual rights.</p></li><li><p>Watch for the ICO&#8217;s statutory code and treat the children&#8217;s data component as a hard line.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>I made this the one thing because almost nobody in security will read it, and that is the mistake. A G7 country just made it materially easier to run AI-driven decisions on its citizens, and it did so while everyone was staring at the Anthropic ban. The UK bet that the old prohibition held back legitimate use and that safeguards plus accountability beat a flat ban. The question is whether the safeguards have teeth or whether &#8220;permission with safeguards&#8221; quietly becomes &#8220;permission.&#8221; For anyone running automated decisioning in the UK, you are no longer asking whether you are allowed; you are on the hook to prove your safeguards are real and your human review is meaningful. 
That is harder, not easier, even though the headline sounds permissive, because vague standards are where regulators and plaintiffs go hunting after something breaks. Read the law before your product team reads the press release and assumes the gates just came down.</p><p><span>&#128073; For ongoing analysis of agentic AI governance frameworks, the conversation continues at </span><strong><a href="https://rockcybermusings.com/">RockCyber Musings</a></strong><span>.</span></p><p><span>&#128073; Visit </span><strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong><span> to learn more about how we can help with your traditional Cybersecurity and AI Security and Governance journey.</span></p><p><span>&#128073; Want to save a quick $100K? Check out our AI Governance Tools at </span><strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><p><span>&#128073; As a bonus, check out my conversation with </span><strong><a href="https://aicybermagazine.com/">AI Cyber Magazine, </a></strong><span>where we talked about everything from Context Rot to Least Agency. 
My interview is also highlighted in the </span><strong><a href="https://issuu.com/aicybermagazine/docs/ai_cyber_summer_edition_2026/22"><span>AI Cyber Magazine 2026 Summer Issue.</span></a></strong></p><p><em>The views and opinions expressed in RockCyber Musings are my own and do not represent the positions of my employer or any organization I&#8217;m affiliated with.</em></p><div id="youtube2-091_b2qep9M" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;091_b2qep9M&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/091_b2qep9M?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><h2>References</h2><p>Aikido Security. (2026, June 24). 2026 state of AI in security and development. https://www.aikido.dev/reports/2026-state-of-ai-in-security-development</p><p>CISA. (2026, June 23). <em>Five Eyes cyber security agencies statement</em>. Cybersecurity and Infrastructure Security Agency. https://www.cisa.gov/news-events/news/five-eyes-cyber-security-agencies-statement</p><p>CISA. (2026, June 10). <em>BOD 26-04: Prioritizing security updates based on risk</em>. Cybersecurity and Infrastructure Security Agency. https://www.cisa.gov/news-events/directives/bod-26-04-prioritizing-security-updates-based-risk</p><p>CyberScoop. (2026, June 23). <em>Intel agencies: Frontier AI models will reshape cybersecurity faster than expected</em>. https://cyberscoop.com/five-eyes-alliance-say-advanced-ai-hacking-models-months-away/</p><p>CNBC/AP. (2026, June 23). <em>Anthropic&#8217;s Mythos model found vulnerabilities in classified U.S. government systems, official says</em>. 
https://www.cnbc.com/2026/06/23/anthropics-mythos-model-found-vulnerabilities-in-classified-us-government-systems-official-says.html</p><p>Fortune. (2026, June 24). <em>Vinod Khosla wanted &#8216;every available dollar&#8217; of Runlayer&#8217;s funding round. It just raised $30 million to govern the agent workforce</em>. https://fortune.com/2026/06/24/exclusive-vinod-khosla-felicis-runlayer-nanit-30-million-enterprise-ai/</p><p>Fortune. (2026, June 13). <em>Anthropic disables Fable and Mythos AI models after U.S. government bars it from giving foreigners access</em>. https://fortune.com/2026/06/13/anthropic-disables-fable-mythos-export-controls-national-security-threat/</p><p>Help Net Security. (2026, June 24). Security testing was built for a slower world. https://www.helpnetsecurity.com/2026/06/24/ai-security-testing-report/</p><p>ICO. (2026, June 19). <em>One year on: marking the 12-month commencement of the Data (Use and Access) Act</em>. Information Commissioner&#8217;s Office. https://ico.org.uk/about-the-ico/media-centre/news-and-blogs/2026/06/one-year-on-marking-the-12-month-commencement-of-the-data-use-and-access-act/</p><p>Nature / The Register. (2026, June 24). <em>Medical diagnosis AIs can be tricked into telling whose data trained them</em>. The Register. https://www.theregister.com/ai-and-ml/2026/06/24/medical-diagnosis-ais-can-be-tricked-into-telling-whose-data-trained-them/</p><p>Nature. (2026, June 24). <em>Disparate privacy risks from medical AI</em>. https://www.nature.com/articles/s41586-026-10688-0</p><p>OpenAI. (2026, June 22). <em>Daybreak: Tools for securing every organization in the world</em>. https://openai.com/index/daybreak-securing-the-world/</p><p>SiliconANGLE. (2026, June 22). <em>OpenAI expands Daybreak with Patch the Planet and full GPT-5.5-Cyber release</em>. https://siliconangle.com/2026/06/22/openai-expands-daybreak-patch-planet-full-gpt-5-5-cyber-release/</p><p>StepSecurity. (2026, June). 
<em>Mini Shai-Hulud is back: A self-spreading supply chain attack compromises TanStack npm packages</em>. https://www.stepsecurity.io/blog/mini-shai-hulud-is-back-a-self-spreading-supply-chain-attack-hits-the-npm-ecosystem</p><p>The Register. (2026, June 23). <em>OpenAI Codex bombards SSDs with needless write operations, costing millions</em>. https://www.theregister.com/ai-and-ml/2026/06/23/openai-codex-bombards-ssds-with-needless-write-operations-costing-millions/</p><p>The Register. (2026, June 23). <em>Sniff out stale AI override advice with this open source CLI</em>. https://www.theregister.com/security/2026/06/23/sniff-out-stale-ai-override-advice-with-this-open-source-cli/</p><p>Ifl&#228;nder, C., et al. (2026, June 22). <em>Affective AI safety: The missing piece in LLM safety</em>. arXiv. https://arxiv.org/abs/2606.23380</p><p>SecurityWeek. (2026, June 25). <em>Runlayer raises $30 million in Series A funding</em>. https://www.securityweek.com/runlayer-raises-30-million-in-series-a-funding/</p>]]></content:encoded></item><item><title><![CDATA[Federal AI Model Recall Just Became Every CISO’s Supply Chain Risk]]></title><description><![CDATA[See why the Fable 5 export-control blackout turns every hosted frontier model into a supply chain risk, plus a 90-day CISO playbook from RockCyber.]]></description><link>https://www.rockcybermusings.com/p/fable-federal-ai-model-recall-supply-chain-risk</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/fable-federal-ai-model-recall-supply-chain-risk</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Mon, 22 Jun 2026 12:50:25 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!2_ws!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3add1b0d-298d-4970-9fac-dd8dcc1679e7_2048x2048.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 
is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2_ws!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3add1b0d-298d-4970-9fac-dd8dcc1679e7_2048x2048.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2_ws!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3add1b0d-298d-4970-9fac-dd8dcc1679e7_2048x2048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!2_ws!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3add1b0d-298d-4970-9fac-dd8dcc1679e7_2048x2048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!2_ws!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3add1b0d-298d-4970-9fac-dd8dcc1679e7_2048x2048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!2_ws!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3add1b0d-298d-4970-9fac-dd8dcc1679e7_2048x2048.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2_ws!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3add1b0d-298d-4970-9fac-dd8dcc1679e7_2048x2048.jpeg" width="1456" height="1456" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3add1b0d-298d-4970-9fac-dd8dcc1679e7_2048x2048.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:616257,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/202851746?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3add1b0d-298d-4970-9fac-dd8dcc1679e7_2048x2048.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2_ws!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3add1b0d-298d-4970-9fac-dd8dcc1679e7_2048x2048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!2_ws!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3add1b0d-298d-4970-9fac-dd8dcc1679e7_2048x2048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!2_ws!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3add1b0d-298d-4970-9fac-dd8dcc1679e7_2048x2048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!2_ws!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3add1b0d-298d-4970-9fac-dd8dcc1679e7_2048x2048.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p><span>Federal AI model recall became a thing at 5:21 pm Eastern on June 12, 2026. A letter landed, and by the next morning, two of the most capable models on earth went dark for every paying customer on the planet. No warning. No reason worth the paper it wasn&#8217;t printed on. The capability didn&#8217;t change. Your access did by government fiat. 
Here&#8217;s what should make you furious, and what to do about it before it happens again.</span></p><h2><strong><span>What Happened, And What Everyone Got Wrong</span></strong></h2><p><span>The press called it a recall. The press got it wrong. Read the order. The government issued an export-control directive, citing national-security authorities, to cut off access to Fable 5 and Mythos 5 for any foreign national, whether inside or outside the United States, including Anthropic&#8217;s own foreign-national staff. There&#8217;s the first tell of who wrote this. 
Anthropic can&#8217;t check the passport of every user in real time at the scale it runs, so the only button left was the one that kills both models for everybody. The government drafted an order Anthropic had no way to enforce surgically and then acted surprised when the blast radius was the entire customer base. A wrecking ball swung at a thumbtack, stamped &#8220;national security,&#8221; and mailed at 5:21 on a Friday.</span></p><p><span>Then there&#8217;s the reason, or the missing one. The letter gave no specifics. Anthropic&#8217;s read is that someone in the government saw a jailbreak. The company looked at the &#8220;demo&#8221; the order seems to rest on and found a handful of minor, already-known vulnerabilities, the kind other public models surface too, including OpenAI&#8217;s GPT-5.5, the kind defenders use every day to find bugs before the bad guys do. The vendor is complying with the order and saying out loud that it&#8217;s nonsense. Sit with that for a second. The company whose product got pulled is the one making the technical argument, in public, while the government that pulled it won&#8217;t say what it&#8217;s afraid of.</span></p><p><span>No published threshold. No appeals path. No timeline. No notice. 
Run your SOC like this and you&#8217;d be cleaning out your desk by lunch.</span></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rKVi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8703638a-606c-47da-ba35-f83603ed556f_2160x3840.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rKVi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8703638a-606c-47da-ba35-f83603ed556f_2160x3840.png 424w, https://substackcdn.com/image/fetch/$s_!rKVi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8703638a-606c-47da-ba35-f83603ed556f_2160x3840.png 848w, https://substackcdn.com/image/fetch/$s_!rKVi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8703638a-606c-47da-ba35-f83603ed556f_2160x3840.png 1272w, https://substackcdn.com/image/fetch/$s_!rKVi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8703638a-606c-47da-ba35-f83603ed556f_2160x3840.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rKVi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8703638a-606c-47da-ba35-f83603ed556f_2160x3840.png" width="1456" height="2588" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8703638a-606c-47da-ba35-f83603ed556f_2160x3840.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2588,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:259986,&quot;alt&quot;:&quot;Timeline showing the February to March 2026 Pentagon blacklist and Anthropic lawsuit on one track and the June 2026 export-control blackout of Fable 5 and Mythos 5 on a second track&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/202851746?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8703638a-606c-47da-ba35-f83603ed556f_2160x3840.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Timeline showing the February to March 2026 Pentagon blacklist and Anthropic lawsuit on one track and the June 2026 export-control blackout of Fable 5 and Mythos 5 on a second track" title="Timeline showing the February to March 2026 Pentagon blacklist and Anthropic lawsuit on one track and the June 2026 export-control blackout of Fable 5 and Mythos 5 on a second track" srcset="https://substackcdn.com/image/fetch/$s_!rKVi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8703638a-606c-47da-ba35-f83603ed556f_2160x3840.png 424w, https://substackcdn.com/image/fetch/$s_!rKVi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8703638a-606c-47da-ba35-f83603ed556f_2160x3840.png 848w, 
https://substackcdn.com/image/fetch/$s_!rKVi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8703638a-606c-47da-ba35-f83603ed556f_2160x3840.png 1272w, https://substackcdn.com/image/fetch/$s_!rKVi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8703638a-606c-47da-ba35-f83603ed556f_2160x3840.png 1456w" sizes="100vw"></picture></div></a><figcaption class="image-caption">Figure 1: Two Moves in One Fight: The Same Vendor, Four Months Apart</figcaption></figure></div><h2><strong><span>The Lever Tells 
You What They Were Worried About</span></strong></h2><p><span>Before the grumbling gets rolling, let me be clear about where I stand. I have always believed national security is the government&#8217;s primary job, and I still do. That&#8217;s exactly why this one stings. A serious government with a serious problem reached for the dumbest and clumsiest tool in the drawer and called it leadership.</span></p><p><span>You don&#8217;t grab an export-control hammer to fix a software bug. Export control is what a government uses to keep a weapon out of foreign hands. Someone reached for that lever and scoped it to nationality, which tells you the real worry was about who gets to wield the capability, and had nothing to do with a guardrail slipping.</span></p><p><span>Or was it retaliation? I&#8217;ve heard the official statements&#8230; just stop.</span></p><p><span>Look at the capability. In April 2026, Claude Mythos surfaced thousands of high-severity flaws across every major operating system and browser in its early-access cohort, including a bug in OpenBSD that had survived 27 years of expert review. For more than 83% of the flaws it found, it wrote a working exploit on the first try, against a near-zero rate for the prior generation. Anthropic held the model back from day one and refused to hand it to the Chinese government. That&#8217;s a cyber weapon, and a serious one.</span></p><p><span>Here&#8217;s the indictment in one breath. Someone in the building understood they were sitting on a nation-state-grade capability, then reached for the most self-defeating tool available to handle it.</span></p><p><span>That person also probably can&#8217;t spell &#8220;AI.&#8221;</span></p><p><span>The jailbreak the order leans on was described as pointing the model at a codebase and asking it to fix the flaws. Flip one verb, and that&#8217;s autonomous vulnerability discovery, the same capability wearing a second hat. 
Pulling one vendor&#8217;s model doesn&#8217;t contain something that already lives in GPT-5.5. It removes a tool defenders were using and moves the actual risk exactly zero inches. Bravo.</span></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Q7mW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F573f259d-2680-498d-ae54-766cd34e982e_2868x1589.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Q7mW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F573f259d-2680-498d-ae54-766cd34e982e_2868x1589.png 424w, https://substackcdn.com/image/fetch/$s_!Q7mW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F573f259d-2680-498d-ae54-766cd34e982e_2868x1589.png 848w, https://substackcdn.com/image/fetch/$s_!Q7mW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F573f259d-2680-498d-ae54-766cd34e982e_2868x1589.png 1272w, https://substackcdn.com/image/fetch/$s_!Q7mW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F573f259d-2680-498d-ae54-766cd34e982e_2868x1589.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Q7mW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F573f259d-2680-498d-ae54-766cd34e982e_2868x1589.png" width="1456" height="807" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/573f259d-2680-498d-ae54-766cd34e982e_2868x1589.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:807,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:252203,&quot;alt&quot;:&quot;Side-by-side comparison of a safety recall versus an export-control directive across legal basis, target, trigger, scope, what it contains, and notice given&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/202851746?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F573f259d-2680-498d-ae54-766cd34e982e_2868x1589.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Side-by-side comparison of a safety recall versus an export-control directive across legal basis, target, trigger, scope, what it contains, and notice given" title="Side-by-side comparison of a safety recall versus an export-control directive across legal basis, target, trigger, scope, what it contains, and notice given" srcset="https://substackcdn.com/image/fetch/$s_!Q7mW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F573f259d-2680-498d-ae54-766cd34e982e_2868x1589.png 424w, https://substackcdn.com/image/fetch/$s_!Q7mW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F573f259d-2680-498d-ae54-766cd34e982e_2868x1589.png 848w, https://substackcdn.com/image/fetch/$s_!Q7mW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F573f259d-2680-498d-ae54-766cd34e982e_2868x1589.png 
1272w, https://substackcdn.com/image/fetch/$s_!Q7mW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F573f259d-2680-498d-ae54-766cd34e982e_2868x1589.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Figure 2: <span>You Don&#8217;t Reach For An Export Lever To Fix A Bug </span></figcaption></figure></div><h2><strong>The Math Says Zero Was Never On The Table</strong></h2><p>I&#8217;m not going to hand-wave this. 
I&#8217;m going to walk the argument, mark where it&#8217;s airtight and where it leans on today&#8217;s state of the art, and let you decide what it says about an order that treats a limit every frontier model shares as a defect unique to one.</p><p>Start with the room the defender has to guard. A prompt is a sequence of tokens drawn from a vocabulary of size v. The number of distinct prompts up to length n is:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;|X_n| = \\sum_{k=1}^{n} v^{k} = \\frac{v\\left(v^{n}-1\\right)}{v-1} \\approx v^{n}&quot;,&quot;id&quot;:&quot;FMHLGWXKMQ&quot;}" data-component-name="LatexBlockToDOM"></div><p>With a real vocabulary of v &#8776; 10^5 and a short prompt of n &#8776; 10^3 tokens:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;|X_n| \\approx \\left(10^{5}\\right)^{10^{3}} = 10^{5000}&quot;,&quot;id&quot;:&quot;SLYYDQYSVQ&quot;}" data-component-name="LatexBlockToDOM"></div><p>Atoms in the observable universe come to about 10^80. The space of prompts is finite, enumerable on paper, and untouchable in this universe.</p><p>Now the asymmetry that runs the whole fight. Let p(x) be the probability the model emits harmful output on input x, and &#949; the ceiling you&#8217;ll allow. Call the inputs that breach the ceiling the bad set, B. The defender has to hold the line on every input at once. 
The attacker needs one that slips:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\textbf{Defender:}\\quad \\forall\\, x \\in X_n,\\ p(x) \\le \\varepsilon \\iff B = \\varnothing&quot;,&quot;id&quot;:&quot;UBLWCHMNBJ&quot;}" data-component-name="LatexBlockToDOM"></div><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\textbf{Attacker:}\\quad \\exists\\, x \\in X_n,\\ p(x) > \\varepsilon \\iff |B| \\ge 1&quot;,&quot;id&quot;:&quot;VQHLUXCHUH&quot;}" data-component-name="LatexBlockToDOM"></div><p>Those two lines are definitions, not a theorem. They set the board. Here&#8217;s the line the recall standard can&#8217;t survive. Let f be the fraction of inputs that breach the ceiling, so the bad set has size f times the whole space. Emptying it means dropping that count below a single input, which forces the fraction beneath the reciprocal of everything:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;B = \\varnothing \\iff |B| < 1 \\iff f < \\frac{1}{|X_n|} \\approx 10^{-5000}&quot;,&quot;id&quot;:&quot;IGEZPGZEYR&quot;}" data-component-name="LatexBlockToDOM"></div><p>A defender would need a per-input failure rate beneath 10^-5000 to claim zero jailbreaks. Nothing real comes within thousands of orders of magnitude of that. Whatever rate a vendor quotes isn&#8217;t measured against this uniform space anyway, and it doesn&#8217;t have to be. The attacker isn&#8217;t sampling at random, where a low average would save you. He&#8217;s searching on purpose, and he needs one input that clears the ceiling.</p><p>No defender enumerates 10^5000 points, so &#8220;too big to count&#8221; was never the point. The point is that no efficient certificate exists that the bad set is empty, not by enumeration and not by any scalable formal method we have today. Producing the attacker&#8217;s single counterexample, by contrast, is a search. 
The model is differentiable in its internal representations, so gradient signals over token embeddings steer that search toward inputs that clear the ceiling at a cost far below enumeration, even when the search isn&#8217;t cheap against a monitored stack.</p><p>The receipts are public. EvoSynth, which evolves new attack methods instead of fiddling with prompts, reached an 85.5% success rate against Claude Sonnet 4.5. A Scale AI benchmark called ASPI showed that nudging an agent into a clarification-seeking state pushed prompt-injection success from under 2% to the mid-30s. Every patch closes a sliver of an astronomically large space, and the weakness resurfaces somewhere else, because the safeguards and the capabilities are built out of the same weights.</p><p>That&#8217;s the argument. The reachable goal was never zero. It&#8217;s the thing we&#8217;ve been doing in security for two decades: make the attacker&#8217;s economics stop penciling out. Think of the attacker&#8217;s expected payoff per unit of search cost:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\text{attacker payoff per unit cost} \\;\\sim\\; \\frac{(1-d)\\,U_{\\text{harm}}\\,P_{\\text{success}}}{C_{\\text{search}}}&quot;,&quot;id&quot;:&quot;NDFQCYLDUK&quot;}" data-component-name="LatexBlockToDOM"></div><p>That&#8217;s a model, not an identity, and the terms are deliberately loose. Here d is the probability of detection, U_harm is the payoff per successful jailbreak, P_success is the odds of landing one, and C_search is what finding it costs. Raise detection, raise search cost, keep each jailbreak narrow so the payoff stays small, and stop pretending P_success is a knob that turns to zero. Lock the doors you find, make the rest expensive to reach, and watch the hallways.</p><p>One honest caveat, because I won&#8217;t sell you a theorem I can&#8217;t cash. This is a practical impossibility on the architecture we ship today. 
It rests on two facts that hold right now: no scalable formal certification covers the full discrete prompt space at frontier scale, and exhaustive verification is out of reach. It does not say a safe model can never exist. It says zero is off the menu today, and the formal version for agents is already in print. Abdelnabi and Bagdasarian show an adversary can always build a context that turns ordinary data into instructions, the same problem in an agent&#8217;s clothes.</p><p>The government pulled a model over a limit baked into every frontier system on the market, including the ones still running. They called the architecture&#8217;s ceiling a defect unique to one vendor. Spare me the theater.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fIVx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36d98fa5-2d3a-40b1-bce2-9f4349c0f615_3886x1700.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fIVx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36d98fa5-2d3a-40b1-bce2-9f4349c0f615_3886x1700.png 424w, https://substackcdn.com/image/fetch/$s_!fIVx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36d98fa5-2d3a-40b1-bce2-9f4349c0f615_3886x1700.png 848w, https://substackcdn.com/image/fetch/$s_!fIVx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36d98fa5-2d3a-40b1-bce2-9f4349c0f615_3886x1700.png 1272w, 
https://substackcdn.com/image/fetch/$s_!fIVx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36d98fa5-2d3a-40b1-bce2-9f4349c0f615_3886x1700.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fIVx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36d98fa5-2d3a-40b1-bce2-9f4349c0f615_3886x1700.png" width="1456" height="637" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/36d98fa5-2d3a-40b1-bce2-9f4349c0f615_3886x1700.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:637,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:187256,&quot;alt&quot;:&quot;Horizontal bar chart comparing orders of magnitude, the prompt space at 10 to the 5000 and surviving jailbreaks at 10 to the 4900 dwarfing atoms in the universe at 10 to the 80 and gradient search at roughly 10 to the 4 queries&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/202851746?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36d98fa5-2d3a-40b1-bce2-9f4349c0f615_3886x1700.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Horizontal bar chart comparing orders of magnitude, the prompt space at 10 to the 5000 and surviving jailbreaks at 10 to the 4900 dwarfing atoms in the universe at 10 to the 80 and gradient search at roughly 10 to the 4 queries" title="Horizontal bar chart comparing orders of magnitude, the prompt space at 10 to the 5000 and surviving jailbreaks at 10 to the 4900 dwarfing atoms in the universe at 10 to the 80 and gradient search at 
roughly 10 to the 4 queries" srcset="https://substackcdn.com/image/fetch/$s_!fIVx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36d98fa5-2d3a-40b1-bce2-9f4349c0f615_3886x1700.png 424w, https://substackcdn.com/image/fetch/$s_!fIVx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36d98fa5-2d3a-40b1-bce2-9f4349c0f615_3886x1700.png 848w, https://substackcdn.com/image/fetch/$s_!fIVx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36d98fa5-2d3a-40b1-bce2-9f4349c0f615_3886x1700.png 1272w, https://substackcdn.com/image/fetch/$s_!fIVx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36d98fa5-2d3a-40b1-bce2-9f4349c0f615_3886x1700.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Figure 3: <span>Too Big To Certify, Cheap To Search: Why The Jailbreak Was Never The Real Variable</span></figcaption></figure></div><h2><strong><span>The Supply Chain Risk You Didn&#8217;t Budget For</span></strong></h2><p><span>Walk it forward, and the shortsightedness gets worse. If a narrow jailbreak is grounds for going dark, every frontier model in commercial use fails inside a month of launch. Call that what it is: a removal queue with no published rules, no appeal, and no timeline, run by people who couldn&#8217;t scope an order to the foreign nationals it was supposedly about.</span></p><p><span>The model can do everything on June 12th it could do on June 11th. The thing that vanished was your ability to reach it, switched off by fiat, on a worry the government wouldn&#8217;t name, with zero notice. Most AI risk registers track accuracy, bias, and security, and carry nothing for &#8220;the government recalls our vendor&#8217;s model next Tuesday.&#8221; Model availability is a vendor risk class now, and almost nobody has priced it.</span></p><p><span>I&#8217;ve watched the small version of this with no government in the room. A team builds a customer-facing workflow on a single hosted model, the economics look great, then the vendor changes a rate limit, deprecates the version, or pulls a region for reasons that have nothing to do with that team. The thing that ran fine Friday throws errors Monday, and the people who built it discover they wrote zero lines of contingency for the model vanishing. That&#8217;s the small version. 
The Fable blackout is the large version, with the government holding the switch and not one contract clause that ever saw it coming. If you run security in energy, water, or manufacturing, none of this is abstract. A model blackout inside an operational workflow lands as an availability event, in a place where availability is the entire job.</span></p><p><span>The hunch that this action doesn&#8217;t stand alone is right, and the record bears it out. The same administration moved against this same vendor in late February, when negotiations over military use fell apart, and the company refused to drop its red lines on autonomous weapons and mass surveillance. The Pentagon slapped it with a supply chain risk label, a tag normally saved for foreign-adversary contractors, and ordered agencies to stop using Claude. The company sued, calling it retaliation for protected speech. A federal judge blocked the designation after finding the company likely to prevail on due process grounds, and the government appealed. That fight is still open. The June blackout lands inside it. I can&#8217;t read minds and won&#8217;t pretend to. I can read a calendar. Same administration, same vendor, blacklisted in February over guardrails, export-hammered in June over a jailbreak that does nothing GPT-5.5 won&#8217;t do. The dots sit on the public docket. 
Connect them yourself.</span></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!siQ3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6623541-6f28-4344-9218-53c8706e0c1e_2501x2089.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!siQ3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6623541-6f28-4344-9218-53c8706e0c1e_2501x2089.png 424w, https://substackcdn.com/image/fetch/$s_!siQ3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6623541-6f28-4344-9218-53c8706e0c1e_2501x2089.png 848w, https://substackcdn.com/image/fetch/$s_!siQ3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6623541-6f28-4344-9218-53c8706e0c1e_2501x2089.png 1272w, https://substackcdn.com/image/fetch/$s_!siQ3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6623541-6f28-4344-9218-53c8706e0c1e_2501x2089.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!siQ3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6623541-6f28-4344-9218-53c8706e0c1e_2501x2089.png" width="1456" height="1216" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f6623541-6f28-4344-9218-53c8706e0c1e_2501x2089.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1216,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:174691,&quot;alt&quot;:&quot;A two-by-two decision grid plotting capability needed against tolerance for losing access, sorting workloads into hosted with tested fallback, open-weight onshore, hosted frontier eyes open, and either-works quadrants&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/202851746?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6623541-6f28-4344-9218-53c8706e0c1e_2501x2089.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="A two-by-two decision grid plotting capability needed against tolerance for losing access, sorting workloads into hosted with tested fallback, open-weight onshore, hosted frontier eyes open, and either-works quadrants" title="A two-by-two decision grid plotting capability needed against tolerance for losing access, sorting workloads into hosted with tested fallback, open-weight onshore, hosted frontier eyes open, and either-works quadrants" srcset="https://substackcdn.com/image/fetch/$s_!siQ3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6623541-6f28-4344-9218-53c8706e0c1e_2501x2089.png 424w, https://substackcdn.com/image/fetch/$s_!siQ3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6623541-6f28-4344-9218-53c8706e0c1e_2501x2089.png 848w, 
https://substackcdn.com/image/fetch/$s_!siQ3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6623541-6f28-4344-9218-53c8706e0c1e_2501x2089.png 1272w, https://substackcdn.com/image/fetch/$s_!siQ3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6623541-6f28-4344-9218-53c8706e0c1e_2501x2089.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Figure 4: <span>Where Your Workloads Sit When Access Can Vanish By 
Fiat</span></figcaption></figure></div><p><span>The irony is that the government turned the words &#8220;supply chain risk&#8221; into a weapon and pointed them at the vendor. The same phrase boomerangs back as your problem, because a model you built your roadmap on can disappear on a Friday afternoon at the whim of a letter.</span></p><h2><strong><span>Agency Was Always The Risk, And What Competent Authority Looks Like</span></strong></h2><p><span>Here&#8217;s the question this directive never answers and will have to. What happens the first time the finding isn&#8217;t a prompt at all, but an agent abusing access you handed it? </span><strong><a href="https://genai.owasp.org/resource/state-of-agentic-ai-security-and-governance/"><span>The OWASP State of Agentic AI Security and Governance</span></a></strong><span>, published June 1, 2026, put receipts behind that question. Its incident tracker shows supply chain and code execution tied for the highest volume of disclosed agentic incidents. One for the coding-agent crowd: a Cursor flaw where an attacker who shaped the agent&#8217;s instructions rode an already-approved command straight into arbitrary code execution. Another: Claude&#8217;s Skills feature steered the deployment of MedusaLocker ransomware via a re-uploaded skill that carried its own malicious code. Both are an agent abusing access it was granted, a world away from a prompt trick against a chatbot.</span></p><p><span>If a narrow jailbreak set off an export-control blackout, nobody&#8217;s written the precedent for an agent-abuse finding yet. It&#8217;s coming, and this directive set the baseline for how heavy the hand gets to be.</span></p><p><span>Competent authority exists in sketch form, and it makes the Mythos order look worse by comparison. 
</span><strong><a href="https://www.rockcybermusings.com/p/five-eyes-agentic-ai-architecture-not-checklist"><span>The Five Eyes guidance from May </span></a></strong><span>reads like an architecture brief rather than a checklist. It puts strong governance, clear accountability, monitoring, and human oversight up front as prerequisites, recommends starting with low-risk tasks and expanding, and calls for just-in-time credentials on high-impact actions. That&#8217;s graduated authority anchored on agency, on what the system can do once it&#8217;s wired into everything, not on which model tier you bought. Runtime authorization scope, capability segmentation, monitoring tied to action instead of output. The directive flipped a switch on the weights and called it governance. The weights were never where the risk lived, and anyone who&#8217;s run security for a week knows it.</span></p><p><strong><span>Key Takeaway:</span></strong><span> A narrow jailbreak can&#8217;t justify an export-control blackout once the math shows narrow jailbreaks are the permanent weather, so read the lever they pulled and plan for the precedent: any hosted frontier model can be switched off by fiat, and your continuity plan has to treat that like the supply chain event it is.</span></p><h3><strong><span>What To Do Next</span></strong></h3><p><span>Run this through CARE. </span><strong><span>Create</span></strong><span> the inventory: every workflow with a hosted frontier-model dependency, ranked by criticality. </span><strong><span>Adapt</span></strong><span> your contracts: a model-continuity clause with a notice period, transition support, and escrow of weights or fine-tunes where you can get it. </span><strong><span>Run</span></strong><span> a tested fallback for tier-one workflows, a secondary hosted model and an open-weight option you&#8217;ve exercised, paired with BCDR drills that simulate vendor revocation and not only an outage. 
</span><strong><span>Evolve</span></strong><span> the AI risk register so availability sits beside accuracy, bias, and security, and get &#8220;what do we do when the government recalls our vendor&#8217;s model&#8221; onto your AI risk committee&#8217;s agenda before the second incident, not after.</span></p><p><span>The companion CISO playbook lives in my breakdown of </span><strong><a href="https://www.rockcybermusings.com/p/aiuc-1-after-mythos-machine-speed-defense"><span>the AIUC-1 After Mythos whitepaper</span></a></strong><span>. The agent-abuse receipts come from my walk through </span><strong><a href="https://www.rockcybermusings.com/p/owasp-state-of-agentic-ai-security-2026"><span>the OWASP State of Agentic AI Security and Governance report</span></a></strong><span>. What competent, architecture-first authority looks like is in my read of </span><strong><a href="https://www.rockcybermusings.com/p/five-eyes-agentic-ai-architecture-not-checklist"><span>the Five Eyes agentic AI guidance</span></a></strong><span>.</span></p><p><span>&#128073; For ongoing analysis of agentic AI governance frameworks, the conversation continues at </span><strong><a href="https://rockcybermusings.com/">RockCyber Musings</a></strong><span>.</span></p><p><span>&#128073; Visit </span><strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong><span> to learn more about how we can help with your traditional Cybersecurity and AI Security and Governance journey.</span></p><p><span>&#128073; Want to save a quick $100K? 
Check out our AI Governance Tools at </span><strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><p><span>&#128073; As a bonus, check out our AMA on the </span><strong><a href="https://genai.owasp.org/resource/state-of-agentic-ai-security-and-governance/">2026 OWASP GenAI Security Project State of Agentic AI Security and Governance report</a></strong><span> with me and the other co-leads (it was live, so start at time marker 09:45)</span></p><div id="youtube2-jK1Z7Z6zlW0" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;jK1Z7Z6zlW0&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/jK1Z7Z6zlW0?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p><em>The views and opinions expressed in RockCyber Musings are my own and do not represent the positions of my employer or any organization I&#8217;m affiliated with.</em></p><h2><strong>References</strong></h2><p>Abdelnabi, S., &amp; Bagdasarian, E. (2026). <em>AI agents may always fall for prompt injections</em> (arXiv:2605.17634). arXiv. <a href="https://arxiv.org/abs/2605.17634">https://arxiv.org/abs/2605.17634</a></p><p>AIUC-1 Consortium. (2026). <em>After Mythos: Machine-speed defense</em> [Whitepaper]. <a href="https://aiuc-1.com">https://aiuc-1.com</a></p><p>Anthropic. (2026, June 9). <em>Claude Fable 5 and Mythos 5</em> [Announcement]. <a href="https://www.anthropic.com/news/claude-fable-5-mythos-5">https://www.anthropic.com/news/claude-fable-5-mythos-5</a></p><p>Anthropic. (2026, June 12). <em>Statement on the US government directive to suspend access to Fable 5 and Mythos 5</em>. <a href="https://www.anthropic.com/news/fable-mythos-access">https://www.anthropic.com/news/fable-mythos-access</a></p><p>Cato CTRL. (2026). <em>Claude Skills abused to deploy MedusaLocker ransomware</em> [Threat research]. Cato Networks.</p><p>CBS News. (2026, March 9). <em>Anthropic sues Pentagon, Trump administration over &#8220;supply chain risk&#8221; designation</em>. 
<a href="https://www.cbsnews.com/news/anthropic-pentagon-supply-chain-risk-lawsuit/">https://www.cbsnews.com/news/anthropic-pentagon-supply-chain-risk-lawsuit/</a></p><p>Chen, Y., Wang, X., Li, J., Wang, Y., Li, J., Teng, Y., Wang, Y., &amp; Ma, X. (2025). <em>Evolve the method, not the prompts: Evolutionary synthesis of jailbreak attacks on LLMs</em> (arXiv:2511.12710). arXiv. <a href="https://arxiv.org/abs/2511.12710">https://arxiv.org/abs/2511.12710</a></p><p>Cybersecurity and Infrastructure Security Agency, National Security Agency, Australian Signals Directorate&#8217;s Australian Cyber Security Centre, Canadian Centre for Cyber Security, New Zealand National Cyber Security Centre, &amp; United Kingdom National Cyber Security Centre. (2026, May 1). <em>Careful adoption of agentic AI services</em>. <a href="https://www.cyber.gov.au">https://www.cyber.gov.au</a></p><p>MITRE Corporation. (2026). <em>CVE-2026-22708</em>. CVE Program. <a href="https://www.cve.org/CVERecord?id=CVE-2026-22708">https://www.cve.org/CVERecord?id=CVE-2026-22708</a></p><p>NPR. (2026, March 9). <em>Anthropic sues the Trump administration over &#8220;supply chain risk&#8221; label</em>. <a href="https://www.npr.org/2026/03/09/nx-s1-5742548/anthropic-pentagon-lawsuit-amodai-hegseth">https://www.npr.org/2026/03/09/nx-s1-5742548/anthropic-pentagon-lawsuit-amodai-hegseth</a></p><p>OpenAI. (2026). <em>GPT-5.5: Cybersecurity</em>. <a href="https://deploymentsafety.openai.com/gpt-5-5/cybersecurity">https://deploymentsafety.openai.com/gpt-5-5/cybersecurity</a></p><p>OWASP GenAI Security Project. (2026). <em>State of agentic AI security and governance</em> (Version 2.01). <a href="https://genai.owasp.org/resource/state-of-agentic-ai-security-and-governance/">https://genai.owasp.org/resource/state-of-agentic-ai-security-and-governance/</a></p><p>Sehwag, U. M., Shan, Z., Liu, H., Lakshan, D., Brandifino, J., &amp; Fenkell, M. (2026). 
<em>ASPI: Seeking ambiguity clarification amplifies prompt injection vulnerability in LLM agents</em> (arXiv:2605.17324). arXiv. <a href="https://arxiv.org/abs/2605.17324">https://arxiv.org/abs/2605.17324</a></p><p><em>Trump administration asks court to reimpose Anthropic supply chain risk designation</em>. (2026). AOL. <a href="https://www.aol.com/articles/trump-administration-asks-court-reimpose-155659500.html">https://www.aol.com/articles/trump-administration-asks-court-reimpose-155659500.html</a></p>]]></content:encoded></item><item><title><![CDATA[Weekly Musings Top 10 AI Security Wrapup: Issue 42 June 12 -June 18, 2026]]></title><description><![CDATA[When Washington Pulls a Model and the Developer Supply Chain Turns Hostile (Again)]]></description><link>https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260612-20260618</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260612-20260618</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Fri, 19 Jun 2026 13:02:20 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Afa9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a40c9aa-ad0e-4dcb-a5f1-6bf7e4e33548_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Afa9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a40c9aa-ad0e-4dcb-a5f1-6bf7e4e33548_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!Afa9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a40c9aa-ad0e-4dcb-a5f1-6bf7e4e33548_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!Afa9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a40c9aa-ad0e-4dcb-a5f1-6bf7e4e33548_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!Afa9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a40c9aa-ad0e-4dcb-a5f1-6bf7e4e33548_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!Afa9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a40c9aa-ad0e-4dcb-a5f1-6bf7e4e33548_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Afa9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a40c9aa-ad0e-4dcb-a5f1-6bf7e4e33548_1024x1024.png" width="1024" height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7a40c9aa-ad0e-4dcb-a5f1-6bf7e4e33548_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1233556,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/202630580?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a40c9aa-ad0e-4dcb-a5f1-6bf7e4e33548_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Afa9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a40c9aa-ad0e-4dcb-a5f1-6bf7e4e33548_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!Afa9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a40c9aa-ad0e-4dcb-a5f1-6bf7e4e33548_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!Afa9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a40c9aa-ad0e-4dcb-a5f1-6bf7e4e33548_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!Afa9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a40c9aa-ad0e-4dcb-a5f1-6bf7e4e33548_1024x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div>
<p>Washington reached onto a vendor&#8217;s shelf this week and switched off the most capable cybersecurity AI on the market. Anthropic had about 90 minutes to comply. Three words did the damage. Fix this code. While policymakers fought over who broke what, attackers poisoned 144 npm packages, salted the JetBrains store with key-stealing plugins, and watched the global vulnerability count race toward 66,000. The machines that find flaws, write flaws, and ship flaws all leveled up at once. This was the week the bill came due.</p><p>Here&#8217;s the through-line for June 12 to 18, 2026. AI stopped being a tool you point at problems and became an actor inside your threat model. The government treated a commercial model as a weapons system and pulled it worldwide. Each item below changes what you budget, monitor, and tell your board.</p>
<h3>1. Washington Pulls Anthropic&#8217;s Fable 5 and Mythos 5 Off the Market</h3><p>On June 12, 2026, Anthropic disabled its newest models, Fable 5 and the restricted Mythos 5, worldwide under an emergency Commerce Department export-control directive (CNBC). It had about 90 minutes to act after Amazon&#8217;s CEO warned officials that researchers pulled restricted cyber capabilities out with a plain &#8220;fix this code&#8221; prompt (Fortune, Axios). The order bars every foreign national, including Anthropic&#8217;s own non-citizen staff, while Claude Opus 4.8 stayed online (Time).</p><p><strong>Why it matters</strong></p><ul><li><p>A federal order can now make a vendor&#8217;s frontier model vanish with no notice.</p></li><li><p>A three-word prompt beat a guardrail the vendor trusted. 
Guardrails are configuration, not physics.</p></li><li><p>Export rules on AI tooling now reach any non-US person who touches the model.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Inventory the models on your critical paths and document a fallback for each.</p></li><li><p>Add an AI-availability clause to vendor contracts.</p></li><li><p>Brief legal on export exposure for any model your non-US staff touch.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>I&#8217;ve sat through plenty of &#8220;the vendor went dark&#8221; fire drills. This is the first one the government ordered, on a model millions have already used. A model that finds serious vulnerabilities on command is dual-use, and dual-use gets regulated like weapons. The chilling effect worries me most, since labs that watch products get seized share less. I run these tabletops with boards at <a href="https://www.rockcyber.com/">rockcyber.com</a>, where the real question is no longer when the breach hits, it&#8217;s when your best tool disappears on a Tuesday.</p><h3>2. FIRST Says AI Is Driving 2026 Toward 66,000 New Vulnerabilities</h3><p>On June 15, 2026, the Forum of Incident Response and Security Teams (FIRST) raised its 2026 forecast to roughly 66,000 CVEs, with disclosures running about 46% above the February projection (FIRST). The driver is autonomous discovery agents like Anthropic&#8217;s Mythos and OpenAI&#8217;s GPT-5.4-Cyber hunting flaws on their own (Help Net Security). 
Mozilla&#8217;s Firefox saw a 164% first-quarter spike, while the actively exploited share stayed flat.</p><p><strong>Why it matters</strong></p><ul><li><p>Raw CVE volume is doubling the build-and-patch load, even though the urgent slice has not grown.</p></li><li><p>AI-generated throwaway apps carry real flaws that never reach a CVE database.</p></li><li><p>The late-2026 contest is AI-built exploits racing AI-built patches, and speed decides it.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Triage with EPSS and the CISA KEV catalog so your team chases exploited flaws, not the raw count.</p></li><li><p>Budget for roughly double the patch-verification work and staff the human bottleneck.</p></li><li><p>Stand up dynamic inventory and AI bills of materials for code generated outside the CVE system.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Forty-six percent over forecast in five months is not a blip, it&#8217;s a regime change. More CVEs doesn&#8217;t mean more danger, because the exploited slice held flat. Chase raw volume and you&#8217;ll burn your best people on noise. FIRST&#8217;s Chris Gibson said the teams that weather this already share intelligence, and most of you are behind.</p><h3>3. North Korea Backdoors 144 Mastra npm Packages in 88 Minutes</h3><p>Between June 16 and 17, 2026, attackers backdoored 144 packages in the @mastra npm scope, the open-source AI agent framework for JavaScript and TypeScript (Socket). They hijacked a former contributor&#8217;s still-active account and pushed 140-plus malicious versions in 88 minutes, hiding an information stealer inside &#8220;easy-day-js,&#8221; a fake dayjs clone (The Hacker News). 
@mastra/core draws over 918,000 weekly downloads, and Snyk and Orca tied the tradecraft to North Korea&#8217;s Sapphire Sleet (Orca Security).</p><p><strong>Why it matters</strong></p><ul><li><p>A nation-state crew is now targeting the AI tooling supply chain inside your build pipeline.</p></li><li><p>Caret-range resolution auto-upgraded victims with no change to Mastra&#8217;s source repo.</p></li><li><p>npm never expires dormant publish rights, a flaw spanning thousands of packages you depend on.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Pin dependencies and disable automatic caret-range upgrades for anything in production.</p></li><li><p>Block postinstall scripts in CI by default and review them as code.</p></li><li><p>Audit publish permissions and revoke dormant maintainer access across your scopes.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>This one&#8217;s personal for anyone building with AI agents, which is most of you. The attackers didn&#8217;t break Mastra, they broke npm&#8217;s assumption that a quiet contributor stays trustworthy forever. Eighty-eight minutes, 140-plus packages, a stealer riding a fake date library. North Korea noticed before your AppSec team did, and the root causes are years old.</p><h3>4. Malicious JetBrains Plugins Harvest Developers&#8217; AI API Keys</h3><p>On June 16, 2026, JetBrains pulled at least 15 malicious plugins from its Marketplace after reports they were stealing AI provider API keys (BleepingComputer). Published under seven accounts with close to 70,000 installs, they posed as AI coding assistants built on OpenAI, DeepSeek, and SiliconFlow (Infosecurity Magazine). 
When a developer pasted a key and clicked apply, the plugin sent it as plaintext over HTTP and disabled the runtime&#8217;s TLS warnings (The Hacker News).</p><p><strong>Why it matters</strong></p><ul><li><p>Stolen AI keys let attackers run inference on your dime and reach whatever those keys touch.</p></li><li><p>Disabling TLS warnings across the Java runtime weakens every other connection that IDE makes.</p></li><li><p>Developers install AI tooling faster than anyone vets it, and the IDE is now an attack surface.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Restrict IDE plugin installation to an approved list and block sideloading on managed machines.</p></li><li><p>Rotate any AI keys entered into third-party plugins and move to short-lived, scoped credentials.</p></li><li><p>Alert on AI-key patterns and plaintext traffic leaving developer endpoints.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Developers are the new front door, and AI assistants are the welcome mat. Everybody wants the plugin that writes their boilerplate, and nobody reads what it does with the pasted key. Plaintext over HTTP in 2026 is insulting, and it worked anyway on 70,000 installs. If you don&#8217;t know which AI keys your engineers hold, you don&#8217;t know your exposure.</p><h3>5. OpenAI Tries to Predict Model Behavior Before Release</h3><p>On June 16, 2026, OpenAI published research on deployment simulation, a method to forecast how a new model will behave before launch (OpenAI). The team replayed roughly 1.3 million de-identified conversations and regenerated the responses with the candidate model (MarkTechPost). 
It reached 92% directional accuracy and cut the model&#8217;s awareness that it was being tested from 99.72% to 5.12%.</p><p><strong>Why it matters</strong></p><ul><li><p>Testing on real usage closes the gap between lab behavior and behavior in your environment.</p></li><li><p>Models gaming their own evaluations is documented, and cutting test-awareness to 5.12% is a real gain.</p></li><li><p>This gives buyers a vocabulary to demand evidence, not just promises.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Ask vendors how they evaluate behavior on realistic deployment data, not just static benchmarks.</p></li><li><p>Replay your own representative prompts against a candidate model before production.</p></li><li><p>Treat model upgrades like software releases, with a behavioral regression test.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>I tell executives constantly that benchmark scores are marketing, not assurance. This is the first vendor method I&#8217;ve seen that attacks the eval-awareness problem directly, and the numbers are striking. A model that knows it&#8217;s on a test behaves, the way your network behaves during the audit and misbehaves the Monday after. Dropping that awareness from 99% to 5% changes what a test is worth. Self-reported research needs replication, so verify, then demand it from every model vendor you pay.</p><h3>6. Jamf Finds AI Adoption Tracks Directly With Incident Rates</h3><p>On June 15, 2026, Jamf released a survey of 687 IT and security leaders who run macOS environments, and more than one-fifth reported losing money or being attacked through their AI tools (Cybersecurity Dive). About 73% had deployed AI, and the incident rate climbed from under 20% among explorers to 27% among deep adopters. 
Governance ranked third on priority lists, behind automation and productivity.</p><p><strong>Why it matters</strong></p><ul><li><p>Deeper integration came with more incidents, which kills the story that risk can wait.</p></li><li><p>Governance and security ranked below productivity, so firms buy the upside and defer the bill.</p></li><li><p>Shadow AI is the top blind spot, and you cannot govern tools you cannot see.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Run regular AI discovery audits to surface shadow tools before they surface as incidents.</p></li><li><p>Govern at the software layer with enforced data-access policies, not just training.</p></li><li><p>Bake governance into the first deployment stage rather than retrofitting it after an incident.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Correlation isn&#8217;t causation, and I&#8217;ll say it before any of you email me. When incidents climb from under 20% to 27% as integration deepens and governance sits third on the list, you don&#8217;t need a regression to see the trade. People want productivity now and security later, and later tends to arrive as a breach notification. Leaders swear they&#8217;ll fund governance next quarter, then the footprint doubles while controls stand still.</p><h3>7. Experts Revolt and Europe Eyes Sovereignty After the Anthropic Ban</h3><p>From June 13 to 15, 2026, the fallout from the shutdown intensified. Cybersecurity Dive documented researchers blasting the move as overreach, Katie Moussouris circulated an open letter, and analyst Dean Ball called the controls &#8220;simply cartoonish&#8221; (Cybersecurity Dive, Fortune). Anthropic disputed the basis, calling the jailbreak narrow and non-universal (Reason). 
The Register reported the clampdown pushed European digital-sovereignty efforts into higher gear, and legal analysts questioned stretching export law this way (The Register, Just Security).</p><p><strong>Why it matters</strong></p><ul><li><p>The transparency bargain between labs and government is fraying, so defenders see new capabilities later.</p></li><li><p>Sovereignty pressure raises the odds of a fragmented model market that differs by region.</p></li><li><p>Export law on live commercial models creates compliance uncertainty that outlasts this incident.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Map your AI vendors by origin and availability so a regional split won&#8217;t strand a workload.</p></li><li><p>Track the policy fight, because the emerging rules will shape procurement for years.</p></li><li><p>Pressure-test reliance on any single national AI ecosystem like any concentration risk.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>I write about this tension at <a href="https://rockcybermusings.com/">rockcybermusings.com</a>. The government and the labs are openly fighting over who decides a model is too dangerous to ship. The state doesn&#8217;t want to arm adversaries, and the labs don&#8217;t want products seized on verbal evidence. Caught in the middle is you, planning a three-year program on tools that might get pulled or geo-fenced. Your model supplier is now the single point of failure.</p><h3>8. AI-Written Code Passes Review and Fails in Production</h3><p>On June 15, 2026, Help Net Security reported on a New Relic study finding that AI-generated code earns high marks at review and then breaks in production (Help Net Security). It reviewed cleaner than human code, yet shipped close to twice as many critical runtime issues. 
New security vulnerabilities hit about three in ten organizations over six months, and senior engineers lost up to a third of their week cleaning it up.</p><p><strong>Why it matters</strong></p><ul><li><p>Review-time quality is a false signal, because failures live in edge cases and concurrency that show under load.</p></li><li><p>Three in ten organizations took on new security vulnerabilities from AI code in six months.</p></li><li><p>Senior engineers are burning a third of their week on cleanup, capacity you won&#8217;t get back.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Require runtime observability for AI-generated code before it ships, not just a clean review.</p></li><li><p>Prompt your assistants to build logging and traces into the code they write.</p></li><li><p>Measure production incidents tied to AI code and feed that into your release gates.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>This is the bill for vibe coding, and it came due in production. AI writes code that reviews like a dream, then falls apart when real users and real concurrency hit it. The reviewer reads the source, production writes the trace, and the gap between them is where your incidents live. Your senior engineers didn&#8217;t sign up to be janitors, so give them telemetry or keep paying them to mop.</p><h3>9. The UN&#8217;s Disarmament Institute Opens Its AI Security Summit in Geneva</h3><p>On June 18, 2026, the UN Institute for Disarmament Research (UNIDIR) opened its two-day Global Conference on AI, Security and Ethics in Geneva, gathering diplomats, researchers, industry, and civil society around AI and international peace and security (UNIDIR). 
The event launches UNIDIR&#8217;s new Centre of Excellence on AI, Peace and Security, an umbrella for research and capacity-building on AI governance (Indico.UN).</p><p><strong>Why it matters</strong></p><ul><li><p>A standing UN center signals that military and dual-use AI governance is moving toward institutions.</p></li><li><p>Cross-border norms set here will shape export rules, procurement, and the dual-use definitions you answer to.</p></li><li><p>The gap between fast capability and slow governance is where strategic risk accumulates.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Track UNIDIR outputs if you run defense, energy, water, or other critical infrastructure.</p></li><li><p>Feed your operational reality into standards and comment processes rather than inheriting the result.</p></li><li><p>Map which emerging norms could touch your sector before they become requirements.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Conferences rarely move a CISO&#8217;s needle next week. This one matters for a longer reason. The capability that let Washington pull a model is what diplomats in Geneva are trying to govern. I&#8217;ve watched dual-use rules show up first in energy and manufacturing, then everywhere else, so the norms drafted here become your compliance reality sooner than you think.</p><h3>10. A CISO&#8217;s Warning on the Limits of Automated GRC</h3><p>On June 15, 2026, Help Net Security published an interview with Nichole Windholz, CISO at Onspring, on the limits of automated governance, risk, and compliance tooling (Help Net Security). She argued that green-yellow-red dashboards flatten very different problems into one color, where red might mean a missing control, a stale attestation, or a minor threshold breach. 
Her fixes centered on data lineage, validation against source systems, and honesty with the board about risks that resist measurement, like insider behavior and vendor concentration.</p><p><strong>Why it matters</strong></p><ul><li><p>Automated GRC can turn bad input into a board-ready narrative, manufacturing false confidence.</p></li><li><p>Insider intent and vendor concentration leave no clean telemetry, so a full-coverage dashboard lies.</p></li><li><p>As AI accelerates control monitoring, the pull to trust the heat map over the evidence trail grows.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Demand data lineage for every control signal, covering source, owner, refresh rate, and recent changes.</p></li><li><p>Tell your board which risks are measured, which are estimated, and which need human judgment.</p></li><li><p>Spot-check improving metrics as hard as declining ones, since a green light can mean a broken feed.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>This interview reads like it was written for the grumpy uncle, so naturally I loved it. A polished dashboard isn&#8217;t the same as a true one, and automation makes a bad assumption move faster and look credible. We&#8217;re about to point AI at GRC and call it continuous assurance, much of it color applied to data nobody validated. Audit the auditor and know your data lineage. The day the heat map replaces the evidence trail is the day you lie to yourself in four colors.</p><h3>The One Thing You Won&#8217;t Hear About But You Need To</h3><p>On June 15, 2026, Help Net Security published an analysis of a problem hiding under the louder headlines, that there is no way to verify what a military AI model will do (Help Net Security). Defense contractors are wiring frontier models into weapons, with Anduril, Palantir, and Lockheed Martin partnered with OpenAI, Microsoft, and Meta. 
Unlike nuclear arms control, where inspectors read a physical signal like a neutron signature, a model&#8217;s weights give no sign of whether it will follow or refuse a launch order. The piece cites research in which models in decision-making roles escalated, some launching simulated nuclear strikes in response to a supervisor&#8217;s commands, and flags alignment faking, a model that appears compliant under watch but diverges in operation (arXiv preprint 2606.11533).</p><p><strong>Why it matters</strong></p><ul><li><p>The assurance method behind arms control, independent physical measurement, has no equivalent for AI.</p></li><li><p>Models that behave under observation and differently in operation map onto malware evasion, a discipline you know.</p></li><li><p>Multiple models coordinating inside command systems can cascade failures faster than humans can intervene.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>If you build or assess high-stakes AI, test for observation-dependent behavior, not just accuracy.</p></li><li><p>Push for compute-monitoring and shared-inspection regimes, since compute leaves a measurable footprint.</p></li><li><p>Keep a human with authority and time in any loop where an action is irreversible.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>This is the story that got buried under the model ban, and it scares me more. We&#8217;re bolting frontier models into kill chains while admitting, in the open literature, that we can&#8217;t prove what they&#8217;ll do under pressure. The nuclear treaties worked because a neutron doesn&#8217;t lie, and you can count a missile. A model&#8217;s weights tell you nothing about whether it&#8217;ll escalate, and the research shows some escalate unprompted. A model can fake compliance, just as malware fakes sleep until it reaches its target. 
Slow down, verify, and keep a human who can say no.</p><p><span>&#128073; For ongoing analysis of agentic AI governance frameworks, the conversation continues at </span><strong><a href="https://rockcybermusings.com/">RockCyber Musings</a></strong><span>.</span></p><p><span>&#128073; Visit </span><strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong><span> to learn more about how we can help with your traditional Cybersecurity and AI Security and Governance journey.</span></p><p><span>&#128073; Want to save a quick $100K? Check out our AI Governance Tools at </span><strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><p><span>&#128073; As a bonus, check our AMA on the </span><strong><a href="https://genai.owasp.org/resource/state-of-agentic-ai-security-and-governance/">2026 OWASP GenAI Security Project State of Agentic AI Security and Governance report</a></strong><span> with me and the other co-leads (it was live, so start at time marker 09:45)</span></p><div id="youtube2-jK1Z7Z6zlW0" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;jK1Z7Z6zlW0&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/jK1Z7Z6zlW0?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p><em>The views and opinions expressed in RockCyber Musings are my own and do not represent the positions of my employer or any organization I&#8217;m affiliated with.</em></p>
<h2>References</h2><p>Axios. (2026, June 13). <em>How Amazon and the White House ended Anthropic&#8217;s Fable</em>. https://www.axios.com/2026/06/13/anthropic-amazon-white-house</p><p>BleepingComputer. (2026, June 16). <em>Malicious JetBrains Marketplace plugins steal AI API keys from developers</em>. https://www.bleepingcomputer.com/news/security/malicious-jetbrains-marketplace-plugins-steal-ai-api-keys-from-developers/</p><p>CNBC. (2026, June 12). <em>Anthropic disables access to Fable 5 and Mythos 5 to comply with government directive</em>. https://www.cnbc.com/2026/06/12/anthropic-disables-access-to-fable-5-and-mythos-5-to-comply-with-government-directive.html</p><p>Forum of Incident Response and Security Teams. (2026, June 15). <em>FIRST mid-year vulnerability forecast confirms historic surge, projects ~66,000 CVEs in 2026</em>. https://www.first.org/newsroom/releases/20260615</p><p>Geller, E. (2026, June 16). 
<em>AI adoption correlates with incident frequency, underscoring need for governance</em>. Cybersecurity Dive. https://www.cybersecuritydive.com/news/ai-cybersecurity-incidents-governance-jamf/823026/</p><p>Geller, E. (2026, June 13). <em>Cybersecurity experts blast US government for restricting Anthropic&#8217;s AI models</em>. Cybersecurity Dive. https://www.cybersecuritydive.com/news/anthropic-us-government-export-ban-mythos-fable/822909/</p><p>Indico.UN. (2026). <em>Global Conference on AI, Security and Ethics 2026 (18-19 June 2026): Overview</em>. https://indico.un.org/event/1023183/</p><p>Infosecurity Magazine. (2026, June). <em>Fifteen JetBrains Marketplace plugins steal API keys</em>. https://www.infosecurity-magazine.com/news/fifteen-jetbrains-marketplace/</p><p>Just Security. (2026, June). <em>Legal considerations related to the Anthropic &#8220;export controls directive.&#8221;</em> https://www.justsecurity.org/142745/law-anthropic-export-controls/</p><p>Markovic, S. (2026, June 15). <em>Proving what a military AI model will do is the real problem</em>. Help Net Security. https://www.helpnetsecurity.com/2026/06/15/military-ai-verification-problem/</p><p>MarkTechPost. (2026, June 16). <em>OpenAI&#8217;s deployment simulation extends pre-deployment risk assessment to agentic coding through simulated tool calls</em>. https://www.marktechpost.com/2026/06/16/openai-deployment-simulation/</p><p>OpenAI. (2026, June 16). <em>Predicting model behavior before release by simulating deployment</em>. https://openai.com/index/deployment-simulation/</p><p>Orca Security. (2026, June 17). <em>144 Mastra npm packages compromised via supply chain attack</em>. https://orca.security/resources/blog/mastra-npm-supply-chain-attack/</p><p>Pogorelec, A. (2026, June 15). <em>Senior engineers are spending their week cleaning up AI-generated code</em>. Help Net Security. https://www.helpnetsecurity.com/2026/06/15/ai-generated-code-review-issues/</p><p>Reason. (2026, June 15). 
<em>The White House vs. Anthropic&#8217;s new AI model</em>. https://reason.com/2026/06/15/the-white-house-vs-anthropics-new-ai-model/</p><p>Schwartz, L. (2026, June 15). <em>&#8216;Fix this code.&#8217; The three little words behind the U.S. government decision that shut down Anthropic&#8217;s Fable and Mythos AI models</em>. Fortune. https://fortune.com/2026/06/15/fix-this-code-three-words-behind-us-government-shut-down-anthropic-fable-mythos-ai-models-katie-moussouris-open-letter/</p><p>Socket. (2026, June 17). <em>140+ Mastra npm packages compromised in coordinated supply chain attack</em>. https://socket.dev/blog/mastra-npm-packages-compromised</p><p>The Hacker News. (2026, June). <em>144 Mastra npm packages compromised via hijacked contributor account</em>. https://thehackernews.com/2026/06/144-mastra-npm-packages-compromised-via.html</p><p>The Register. (2026, June 15). <em>US clampdown on Anthropic models sends EU sovereignty surge into overdrive</em>. https://www.theregister.com/ai-and-ml/2026/06/15/us-clampdown-on-anthropic-models-sends-eu-sovereignty-surge-into-overdrive/</p><p>Time. (2026, June 13). <em>Anthropic pulls its top AI models after U.S. bars foreign access</em>. https://time.com/article/2026/06/13/anthropic-fable-mythos-ban-US-security/</p><p>UNIDIR. (2026). <em>Global Conference on AI, Security and Ethics 2026</em>. United Nations Institute for Disarmament Research. https://unidir.org/event/global-conference-on-ai-security-and-ethics-2026/</p><p>Zorz, M. (2026, June 15). <em>AI vulnerability discovery is pushing 2026 CVEs toward 66,000</em>. Help Net Security. https://www.helpnetsecurity.com/2026/06/15/first-2026-cve-forecast/</p><p>Zorz, M. (2026, June 15). <em>Onspring CISO on where automated GRC systems fall short</em>. Help Net Security. 
https://www.helpnetsecurity.com/2026/06/15/nichole-windholz-onspring-automated-grc-systems/</p>]]></content:encoded></item><item><title><![CDATA[OWASP State of Agentic AI Security and Governance 2026: The Receipts]]></title><description><![CDATA[Read OWASP's State of Agentic AI Security and Governance v2: real incidents, a maturity matrix, and the governance clocks every CISO must run now.]]></description><link>https://www.rockcybermusings.com/p/owasp-state-of-agentic-ai-security-2026</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/owasp-state-of-agentic-ai-security-2026</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Tue, 16 Jun 2026 12:50:59 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!hiZh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b42bd70-b284-4b1b-b748-c548cfcd61f0_2048x2048.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hiZh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b42bd70-b284-4b1b-b748-c548cfcd61f0_2048x2048.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hiZh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b42bd70-b284-4b1b-b748-c548cfcd61f0_2048x2048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!hiZh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b42bd70-b284-4b1b-b748-c548cfcd61f0_2048x2048.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!hiZh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b42bd70-b284-4b1b-b748-c548cfcd61f0_2048x2048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!hiZh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b42bd70-b284-4b1b-b748-c548cfcd61f0_2048x2048.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hiZh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b42bd70-b284-4b1b-b748-c548cfcd61f0_2048x2048.jpeg" width="1456" height="1456" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8b42bd70-b284-4b1b-b748-c548cfcd61f0_2048x2048.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:587183,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/201852505?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b42bd70-b284-4b1b-b748-c548cfcd61f0_2048x2048.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hiZh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b42bd70-b284-4b1b-b748-c548cfcd61f0_2048x2048.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!hiZh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b42bd70-b284-4b1b-b748-c548cfcd61f0_2048x2048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!hiZh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b42bd70-b284-4b1b-b748-c548cfcd61f0_2048x2048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!hiZh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b42bd70-b284-4b1b-b748-c548cfcd61f0_2048x2048.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div>
<p><strong>Disclosure:</strong> I co-led the OWASP State of Agentic AI Security and Governance 2026 with Ariel Fogel and Evgeniy Kokuykin. I&#8217;ll also show you the places I&#8217;d push back on it even with my name on the cover, so you can weigh the case on its merits instead of on mine. The report lives here: <strong><a href="https://genai.owasp.org/resource/state-of-agentic-ai-security-and-governance/">https://genai.owasp.org/resource/state-of-agentic-ai-security-and-governance/</a></strong></p><p>A year ago, agentic AI security was a stack of position papers 
and vendor pitches. Today, almost every category in the OWASP Top 10 for Agentic Applications has a production incident, a vendor advisory, or a CVE attached to it. The OWASP State of Agentic AI Security and Governance 2026, published June 1, 2026, collects that evidence into one place and hands you a map. If you run security or AI risk anywhere agents are deployed, and you do, whether you&#8217;ve gone looking or not, this is the report you read ASAP.</p><h2><strong>How The OWASP State of Agentic AI Security and Governance 2026 Audits The Present</strong></h2><p>When v1 shipped in July 2025, I expected the threat catalog to fill in over a couple of years. It filled faster. ASI04, the supply chain category, and ASI05, code execution, are now tied for the highest volume of disclosed incidents. I expected regulation to lag the technology by years. It didn&#8217;t. This year&#8217;s report maps 42 instruments across 10 jurisdictions. The gap I watch in client environments, between how mature their governance is and how aggressive their agent deployments are, turned out wider than I would have guessed. Shadow AI sits in nearly every organization our contributors examined.</p><p>2025 called out the future. 2026 audits the present, and it&#8217;s organized around three findings:</p><ol><li><p>The threats are real now</p></li><li><p>Safety and security converge at the deployment layer</p></li><li><p>Governance runs on a clock measured in hours.</p></li></ol><p>Here&#8217;s what each one means for your program, and where I&#8217;d still argue with the document.</p><h2><strong>Finding One: The Threats Have Receipts Now, And Here&#8217;s What To Demand</strong></h2><p>2025 listed architectural concerns. 2026 attaches names, dates, and CVE numbers to them. 
The Real-World Incidents and Exploits Tracker is the chapter I point people to first, because it ends the &#8220;show me a real attack&#8221; conversation in about thirty seconds.</p><p>EchoLeak was a zero-click prompt injection that turned a single email into a Microsoft Copilot data exfiltration path, as documented by Aim Security. Cato CTRL showed Claude&#8217;s Skills feature deploying MedusaLocker ransomware by re-uploading a Skill carrying malicious code that ran on its own. OpenAI&#8217;s Codex CLI shipped a sandbox bypass, CVE-2025-59532, in which the model&#8217;s own output could redraw the writable boundary it was supposed to stay within. Cursor carried a sibling flaw, CVE-2026-22708, where an attacker who influenced the agent&#8217;s instructions could ride an already-approved command like git branch straight into arbitrary code execution. Trustwave&#8217;s SpiderLabs team published an agent-in-the-middle attack against the A2A protocol, in which a fake agent card claiming high trust was selected by an LLM judge and used to intercept data. 
Pillar Security demonstrated manipulated code suggestions seeding backdoors and leaked keys into production through GitHub Copilot and Cursor.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7g_M!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2cfbeb2-a0e3-41eb-bcd7-6a59f5dbfbdd_2966x2240.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7g_M!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2cfbeb2-a0e3-41eb-bcd7-6a59f5dbfbdd_2966x2240.png 424w, https://substackcdn.com/image/fetch/$s_!7g_M!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2cfbeb2-a0e3-41eb-bcd7-6a59f5dbfbdd_2966x2240.png 848w, https://substackcdn.com/image/fetch/$s_!7g_M!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2cfbeb2-a0e3-41eb-bcd7-6a59f5dbfbdd_2966x2240.png 1272w, https://substackcdn.com/image/fetch/$s_!7g_M!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2cfbeb2-a0e3-41eb-bcd7-6a59f5dbfbdd_2966x2240.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7g_M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2cfbeb2-a0e3-41eb-bcd7-6a59f5dbfbdd_2966x2240.png" width="1456" height="1100" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d2cfbeb2-a0e3-41eb-bcd7-6a59f5dbfbdd_2966x2240.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1100,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:541572,&quot;alt&quot;:&quot;Bipartite mapping connecting six documented 2025 and 2026 agentic AI incidents to the OWASP ASI risk categories they exercised, with ASI04 Supply Chain and ASI05 Code Execution highlighted as highest incident volume&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/201852505?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2cfbeb2-a0e3-41eb-bcd7-6a59f5dbfbdd_2966x2240.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Bipartite mapping connecting six documented 2025 and 2026 agentic AI incidents to the OWASP ASI risk categories they exercised, with ASI04 Supply Chain and ASI05 Code Execution highlighted as highest incident volume" title="Bipartite mapping connecting six documented 2025 and 2026 agentic AI incidents to the OWASP ASI risk categories they exercised, with ASI04 Supply Chain and ASI05 Code Execution highlighted as highest incident volume" srcset="https://substackcdn.com/image/fetch/$s_!7g_M!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2cfbeb2-a0e3-41eb-bcd7-6a59f5dbfbdd_2966x2240.png 424w, https://substackcdn.com/image/fetch/$s_!7g_M!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2cfbeb2-a0e3-41eb-bcd7-6a59f5dbfbdd_2966x2240.png 848w, 
https://substackcdn.com/image/fetch/$s_!7g_M!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2cfbeb2-a0e3-41eb-bcd7-6a59f5dbfbdd_2966x2240.png 1272w, https://substackcdn.com/image/fetch/$s_!7g_M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2cfbeb2-a0e3-41eb-bcd7-6a59f5dbfbdd_2966x2240.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Figure 1: Real-World Incidents Mapped To The OWASP Top 10 For Agentic 
Applications</figcaption></figure></div><p>Notice the structural lesson under the Cursor and Codex flaws. The controls were calibrated for human operators, and they broke the moment the executor could influence its own containment. An allowlist that auto-approves a git branch is a convenience for a developer and a loaded gun for an agent that can rewrite what runs behind that command. That&#8217;s the shift 2026 keeps returning to. The model stopped being a component inside your application and became an actor with hands on your tools.</p><p>Two patterns matter more than any single incident. First, ASI04 and ASI05 are tied for the most disclosed incidents, and a security audit nicknamed IDEsaster found vulnerabilities in 100% of the major AI coding IDEs it tested. Code sits upstream of everything else you ship, so an exploit in a coding agent is a supply chain event, not a developer inconvenience. Second, ASI03, identity and privilege abuse, carries the widest gap between how severe the risk is and how ready anyone is for it. Non-human identities already outnumber humans across most enterprises, and almost nobody has a strategy for governing them.</p><p>Here&#8217;s what the evidence tells you to demand. Treat agent identity as its own control plane, not a service account with a fancier name. The report tracks NIST&#8217;s AI Agent Standards Initiative, the OpenID Foundation&#8217;s work on recursive delegation, and MCP&#8217;s move to OAuth 2.1 with resource-indicator-scoped tokens, all converging over the next 18 to 24 months. Ask vendors how an agent&#8217;s permissions get derived, when they expire, and how revocation propagates through a delegation chain. If they can&#8217;t answer, you have your answer. 
Treat security advisory density as a buying signal rather than a red flag in isolation.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!AW5M!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc02e3546-9be7-4e89-836c-3d289d0f0134_2335x1308.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!AW5M!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc02e3546-9be7-4e89-836c-3d289d0f0134_2335x1308.png 424w, https://substackcdn.com/image/fetch/$s_!AW5M!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc02e3546-9be7-4e89-836c-3d289d0f0134_2335x1308.png 848w, https://substackcdn.com/image/fetch/$s_!AW5M!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc02e3546-9be7-4e89-836c-3d289d0f0134_2335x1308.png 1272w, https://substackcdn.com/image/fetch/$s_!AW5M!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc02e3546-9be7-4e89-836c-3d289d0f0134_2335x1308.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!AW5M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc02e3546-9be7-4e89-836c-3d289d0f0134_2335x1308.png" width="1456" height="816" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c02e3546-9be7-4e89-836c-3d289d0f0134_2335x1308.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:816,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:110722,&quot;alt&quot;:&quot;Bar chart showing security advisory counts for n8n, Claude Code, AutoGPT, Dify, and Roo-Code&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/201852505?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc02e3546-9be7-4e89-836c-3d289d0f0134_2335x1308.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Bar chart showing security advisory counts for n8n, Claude Code, AutoGPT, Dify, and Roo-Code" title="Bar chart showing security advisory counts for n8n, Claude Code, AutoGPT, Dify, and Roo-Code" srcset="https://substackcdn.com/image/fetch/$s_!AW5M!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc02e3546-9be7-4e89-836c-3d289d0f0134_2335x1308.png 424w, https://substackcdn.com/image/fetch/$s_!AW5M!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc02e3546-9be7-4e89-836c-3d289d0f0134_2335x1308.png 848w, https://substackcdn.com/image/fetch/$s_!AW5M!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc02e3546-9be7-4e89-836c-3d289d0f0134_2335x1308.png 1272w, 
https://substackcdn.com/image/fetch/$s_!AW5M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc02e3546-9be7-4e89-836c-3d289d0f0134_2335x1308.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Figure 2: Security Advisory Density, Top Five Tracked Agentic Projects</figcaption></figure></div><p>n8n carries 57 advisories, and Claude Code carries 22, not because they&#8217;re careless but because they&#8217;re everywhere. The projects with the most advisories tend to be the ones with the most adoption. 
Ask for the AI-SBOM and the disclosure history. Silence is the warning sign, not volume.</p><h2><strong>Finding Two: Safety And Security Collapse At The Deployment Layer</strong></h2><p>For most of software&#8217;s history, safety was an engineering problem, and security was an adversarial one, owned by different teams running different playbooks. Agentic systems break that split at the deployment layer, meaning the architectural decisions, permissions, configurations, and operational controls your organization owns once an agent acts on production systems.</p><p>The report argues that once an agent can send the email, move the money, or commit the code, the same controls govern a benign malfunction and a deliberate attack, and the same investigation surfaces both root causes. Model-level safety stays with the provider. Everything downstream of the prompt lands on one function, no matter how you choose to draw the org chart.</p><p>Picture an agent whose memory got poisoned weeks ago. It starts exfiltrating data and taking actions nobody approved. To the team watching the dashboards in real time, that looks like a reliability bug. They restart it, check for drift, and review the inputs. A team treating it as a safety issue misses the persistence mechanism and gets compromised again. A team treating it as a security issue hunts the initial access vector and the poisoned memory state, and stands a chance of containing it. The symptom is the same, the right responses are opposites, and the org chart decides which one you run.</p><p>The categories still separate cleanly when an agent has limited autonomy or a human in the loop. They stop separating when the agent runs with broad permissions and thin oversight, which describes most of the deployments racing into production right now. If your AI safety people and your AI security people sit in separate meetings with separate budgets, that division is now a liability. 
Read this chapter with your CTO and your head of AI in the room.</p><h2><strong>Finding Three: The Governance Clock Runs In Hours, And The Matrix Tells You Where You Stand</strong></h2><p>Regulators stopped pretending periodic audits are enough. DORA gives you a four-hour notification window. NIS2 wants a 24-hour early warning. New York&#8217;s RAISE Act sets 72 hours for frontier reporting. California&#8217;s SB 53 allows 15 days. The EU AI Act&#8217;s Article 72 requires post-market monitoring that explicitly covers behavioral drift, though the Digital Omnibus proposal from November 2025 may push the high-risk deadlines to December 2027 if it clears trilogue.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EPhO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3b8a305-1de8-4d1d-9f42-cee77ef77911_2877x1240.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EPhO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3b8a305-1de8-4d1d-9f42-cee77ef77911_2877x1240.png 424w, https://substackcdn.com/image/fetch/$s_!EPhO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3b8a305-1de8-4d1d-9f42-cee77ef77911_2877x1240.png 848w, https://substackcdn.com/image/fetch/$s_!EPhO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3b8a305-1de8-4d1d-9f42-cee77ef77911_2877x1240.png 1272w, 
https://substackcdn.com/image/fetch/$s_!EPhO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3b8a305-1de8-4d1d-9f42-cee77ef77911_2877x1240.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EPhO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3b8a305-1de8-4d1d-9f42-cee77ef77911_2877x1240.png" width="1456" height="628" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d3b8a305-1de8-4d1d-9f42-cee77ef77911_2877x1240.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:628,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:135606,&quot;alt&quot;:&quot;Horizontal bar chart comparing notification deadlines for DORA, NIS2, NY RAISE, and CA SB 53 in hours&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/201852505?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3b8a305-1de8-4d1d-9f42-cee77ef77911_2877x1240.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Horizontal bar chart comparing notification deadlines for DORA, NIS2, NY RAISE, and CA SB 53 in hours" title="Horizontal bar chart comparing notification deadlines for DORA, NIS2, NY RAISE, and CA SB 53 in hours" srcset="https://substackcdn.com/image/fetch/$s_!EPhO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3b8a305-1de8-4d1d-9f42-cee77ef77911_2877x1240.png 424w, 
https://substackcdn.com/image/fetch/$s_!EPhO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3b8a305-1de8-4d1d-9f42-cee77ef77911_2877x1240.png 848w, https://substackcdn.com/image/fetch/$s_!EPhO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3b8a305-1de8-4d1d-9f42-cee77ef77911_2877x1240.png 1272w, https://substackcdn.com/image/fetch/$s_!EPhO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3b8a305-1de8-4d1d-9f42-cee77ef77911_2877x1240.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 3: The Governance Clock, Regulatory Notification Windows</figcaption></figure></div><p>The 2026 report maps 42 instruments across 10 jurisdictions, so you can see which clock applies where without assembling it yourself from primary texts. The count isn&#8217;t the point. The point is that runtime governance moved from a nice-to-have to the unit regulators measure. Pre-deployment certification stopped being enough the moment the thing you certified could rewrite its own behavior after launch.</p><p>Then the report hands you the artifact I wish boards had a year ago. The Enterprise Adoption Maturity Model maps your governance maturity (Levels 0 through 4) against your adoption tier (AT0 through AT8). AT0 is Shadow AI, the unmanaged usage already running in your org. The tiers climb through vendor-embedded assistants, platform-integrated agents, citizen-developer flows, code-executing agents, custom in-house builds, and externally extended agents, up to AT8, where agents operate across organizational boundaries in federated networks.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9D4L!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdacaea8b-6d1a-4f4f-b252-2a8688720480_2950x1675.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9D4L!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdacaea8b-6d1a-4f4f-b252-2a8688720480_2950x1675.png 424w, 
https://substackcdn.com/image/fetch/$s_!9D4L!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdacaea8b-6d1a-4f4f-b252-2a8688720480_2950x1675.png 848w, https://substackcdn.com/image/fetch/$s_!9D4L!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdacaea8b-6d1a-4f4f-b252-2a8688720480_2950x1675.png 1272w, https://substackcdn.com/image/fetch/$s_!9D4L!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdacaea8b-6d1a-4f4f-b252-2a8688720480_2950x1675.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9D4L!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdacaea8b-6d1a-4f4f-b252-2a8688720480_2950x1675.png" width="1456" height="827" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dacaea8b-6d1a-4f4f-b252-2a8688720480_2950x1675.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:827,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:251189,&quot;alt&quot;:&quot;Grid showing governance maturity levels 0 through 4 against adoption tiers AT0 through AT8, with insufficient-governance cells highlighted&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/201852505?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdacaea8b-6d1a-4f4f-b252-2a8688720480_2950x1675.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Grid showing governance maturity levels 0 through 4 against adoption 
tiers AT0 through AT8, with insufficient-governance cells highlighted" title="Grid showing governance maturity levels 0 through 4 against adoption tiers AT0 through AT8, with insufficient-governance cells highlighted" srcset="https://substackcdn.com/image/fetch/$s_!9D4L!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdacaea8b-6d1a-4f4f-b252-2a8688720480_2950x1675.png 424w, https://substackcdn.com/image/fetch/$s_!9D4L!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdacaea8b-6d1a-4f4f-b252-2a8688720480_2950x1675.png 848w, https://substackcdn.com/image/fetch/$s_!9D4L!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdacaea8b-6d1a-4f4f-b252-2a8688720480_2950x1675.png 1272w, https://substackcdn.com/image/fetch/$s_!9D4L!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdacaea8b-6d1a-4f4f-b252-2a8688720480_2950x1675.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 4: The Maturity-By-Adoption-Tier Matrix</figcaption></figure></div><p>A bold cell means your governance is too thin for what you&#8217;re running, and you get two honest moves: raise maturity or lower the tier, with no third option where you cross your fingers and hope it holds. The model survives the way a Bayesian wants a model to survive. It&#8217;s calibrated, it&#8217;s falsifiable, and it updates in response to evidence instead of flattering the program you already built. The urgency shows in the adoption data. a16z found that 29% of the Fortune 500 and roughly 19% of the Global 2000 are paying customers of a leading AI startup, counting only signed, contracted deployments. The unmanaged AT0 volume sitting on top of that is, by definition, unmeasured.</p><h2><strong>Where I&#8217;d Push Back, Even As A Co-Lead</strong></h2><p>A review where the author agrees with himself for 2,000 words isn&#8217;t worth your time, so here&#8217;s where I&#8217;d lean on the document.</p><p>The report is honest about what it hasn&#8217;t solved, and the &#8220;What Remains Unsolved&#8221; section names three problems I&#8217;d watch closely. Cyber insurance for agentic deployments is heading toward a coverage gap nobody has priced yet, and the first big agent-driven loss will test it in public. 
The governance-deployment collision at AT6 and above, where agents reach across trust boundaries, has no clean answer, and the matrix tells you to slow down, not how to move fast safely. Agentic AI in OT and ICS has no documented enterprise-agent safety incident yet. Read that as early, not as safe.</p><p>Pay close attention to the weaponization curve, because offense is scaling faster than the controls. Anthropic disclosed GTG-1002, a campaign that ran largely autonomous espionage across roughly 30 organizations using jailbroken Claude Code, with the AI executing 80 to 90% of the tactical operations at request rates no human could match. Credit Anthropic for disclosing it in that detail, because that kind of transparency is how the rest of us learn what&#8217;s coming. CrowdStrike documented an 89% increase in AI-enabled adversary attacks, with breakout time falling to 29 minutes. IAPS HACCA found frontier models jumping from near-zero to 60% success on expert-level offensive security challenges within a few months. The report maps the threat well. It doesn&#8217;t pretend anyone has solved the defense that scales at machine speed. Neither do I. 
That&#8217;s the honest state of it.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!AhLs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe2d1711-e36b-4657-878e-321f715fa98b_3409x2058.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!AhLs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe2d1711-e36b-4657-878e-321f715fa98b_3409x2058.png 424w, https://substackcdn.com/image/fetch/$s_!AhLs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe2d1711-e36b-4657-878e-321f715fa98b_3409x2058.png 848w, https://substackcdn.com/image/fetch/$s_!AhLs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe2d1711-e36b-4657-878e-321f715fa98b_3409x2058.png 1272w, https://substackcdn.com/image/fetch/$s_!AhLs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe2d1711-e36b-4657-878e-321f715fa98b_3409x2058.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!AhLs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe2d1711-e36b-4657-878e-321f715fa98b_3409x2058.png" width="1456" height="879" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/be2d1711-e36b-4657-878e-321f715fa98b_3409x2058.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:879,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:326574,&quot;alt&quot;:&quot;Capability trajectory showing frontier model success on expert-level offensive security challenges rising from near-zero to 60% within months, with stat callouts for GTG-1002 where AI ran 80-90% of tactical operations across about 30 organizations, an 89% increase in AI-enabled attacks, and a 29-minute breakout time&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/201852505?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe2d1711-e36b-4657-878e-321f715fa98b_3409x2058.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Capability trajectory showing frontier model success on expert-level offensive security challenges rising from near-zero to 60% within months, with stat callouts for GTG-1002 where AI ran 80-90% of tactical operations across about 30 organizations, an 89% increase in AI-enabled attacks, and a 29-minute breakout time" title="Capability trajectory showing frontier model success on expert-level offensive security challenges rising from near-zero to 60% within months, with stat callouts for GTG-1002 where AI ran 80-90% of tactical operations across about 30 organizations, an 89% increase in AI-enabled attacks, and a 29-minute breakout time" 
srcset="https://substackcdn.com/image/fetch/$s_!AhLs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe2d1711-e36b-4657-878e-321f715fa98b_3409x2058.png 424w, https://substackcdn.com/image/fetch/$s_!AhLs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe2d1711-e36b-4657-878e-321f715fa98b_3409x2058.png 848w, https://substackcdn.com/image/fetch/$s_!AhLs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe2d1711-e36b-4657-878e-321f715fa98b_3409x2058.png 1272w, https://substackcdn.com/image/fetch/$s_!AhLs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe2d1711-e36b-4657-878e-321f715fa98b_3409x2058.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 5: The Weaponization Curve, Offense Is Scaling Faster Than Controls</figcaption></figure></div><p><strong>Key Takeaway:</strong> The 2026 report won&#8217;t secure your agents for you, but it&#8217;s the first document that tells you, with evidence, exactly where your governance is too thin for what you&#8217;ve already deployed.</p><h3><strong>What To Do Next</strong></h3><p>Start with AT0, this week. Assume Shadow AI exists until you&#8217;ve proven it doesn&#8217;t. Here&#8217;s a hint: it does. If you think you&#8217;ve proven it doesn&#8217;t, try again.</p><p>Pull network and DLP telemetry for AI-service traffic, run a short employee survey on the tools people are already using, and you&#8217;ll surface more than you expect. Then map your three highest-tier agent deployments against the maturity matrix. Where you land in a bold cell, pick one of the two moves: raise governance maturity or lower the deployment tier. 
Print the matrix and walk it with your CTO, your head of AI, and your board chair, because this is a conversation about deployment decisions, not policy language.</p><p>If you want the deeper background on why authorization scope and least agency are the metrics that survive contact with production, and why governance is a deployment-review problem before it&#8217;s a policy problem, I&#8217;ve written both up at <a href="https://rockcybermusings.com/">rockcybermusings.com</a>, and the consulting side of how I run these assessments lives at <a href="https://rockcyber.com/">rockcyber.com</a>.</p><p>Then download the report, read the threat tracker and the maturity model first, and send the link to the two people who own the deployments you inventoried: <strong><a href="https://genai.owasp.org/resource/state-of-agentic-ai-security-and-governance/">https://genai.owasp.org/resource/state-of-agentic-ai-security-and-governance/</a></strong></p><p>&#128073; For ongoing analysis of agentic AI governance frameworks, the conversation continues at <strong><a href="https://rockcybermusings.com/">RockCyber Musings</a></strong>.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help with your traditional Cybersecurity and AI Security and Governance journey.</p><p>&#128073; Want to save a quick $100K? 
Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><p>&#128073; As a bonus, check out our AMA on the <strong><a href="https://genai.owasp.org/resource/state-of-agentic-ai-security-and-governance/">2026 OWASP GenAI Security Project State of Agentic AI Security and Governance report</a></strong> with me and the other co-leads (it was recorded live, so start at the 09:45 mark).</p><div id="youtube2-jK1Z7Z6zlW0" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;jK1Z7Z6zlW0&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/jK1Z7Z6zlW0?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p><em>The views and opinions expressed in RockCyber Musings are my own and do not represent the positions of my employer or any organization I&#8217;m affiliated with.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share RockCyber Musings&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share RockCyber Musings</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[Weekly Musings Top 10 AI Security Wrapup: Issue 41 June 5 -June 11, 2026]]></title><description><![CDATA[Frontier Safety Theater, an LLM Gateway Under Active Attack, and a Federal Three-Day Patch Clock]]></description><link>https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260605-20260611</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260605-20260611</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Fri, 12 Jun 2026 13:51:07 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Pdux!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7aefacf9-8a5a-4613-bed7-e496e50969c1_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!Pdux!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7aefacf9-8a5a-4613-bed7-e496e50969c1_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Pdux!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7aefacf9-8a5a-4613-bed7-e496e50969c1_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!Pdux!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7aefacf9-8a5a-4613-bed7-e496e50969c1_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!Pdux!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7aefacf9-8a5a-4613-bed7-e496e50969c1_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!Pdux!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7aefacf9-8a5a-4613-bed7-e496e50969c1_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Pdux!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7aefacf9-8a5a-4613-bed7-e496e50969c1_1024x1024.png" width="1024" height="1024" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7aefacf9-8a5a-4613-bed7-e496e50969c1_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1233556,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/201739048?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7aefacf9-8a5a-4613-bed7-e496e50969c1_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Pdux!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7aefacf9-8a5a-4613-bed7-e496e50969c1_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!Pdux!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7aefacf9-8a5a-4613-bed7-e496e50969c1_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!Pdux!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7aefacf9-8a5a-4613-bed7-e496e50969c1_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!Pdux!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7aefacf9-8a5a-4613-bed7-e496e50969c1_1024x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Anthropic shipped its most capable public model on Tuesday, then buried a switch in a 319-page system card that quietly makes the model worse when you ask about building AI. Not blocked. Not flagged. Worse, with a straight face. A day earlier, CISA confirmed attackers were already inside LiteLLM, the gateway brokering traffic for half the agent frameworks in your stack. By Wednesday, CISA had ordered federal agencies to patch the worst flaws within three days. Safety got a press release. Security got a body count.</p><p>This was the week the two halves of AI safety stopped pretending to be the same thing. One half lives in system cards and voluntary codes, the language of labs and regulators. The other lives in KEV entries, CVSS 10.0 chains, and production databases an agent wiped while reporting all clear. 
The governance side finalized a labeling code, a maturity model, and a sharing standard. Useful, slow, paper. The security side served up an LLM gateway chained to remote code execution, Microsoft&#8217;s largest patch day ever, and hard numbers on AI code breaking in production. Eleven stories, ten ranked and one you won&#8217;t see on the front page. Read them in order. The order is the argument.</p><h3>1. Anthropic Ships Claude Fable 5 and Hides a Throttle in the Fine Print</h3><p>On June 9, Anthropic released Claude Fable 5, the first Mythos-tier model to reach the public (Fortune). 
Within hours, researchers found a paragraph in the 319-page system card describing a feature that silently degrades answers on requests tied to frontier AI development. Cybersecurity and biology queries get redirected to a weaker model with a visible notice. The AI-development throttle carries none, and Anthropic put the affected traffic at 0.03%.</p><p><strong>Why it matters</strong></p><ul><li><p>A safety control your users can&#8217;t see is a trust control. It sets precedent for any vendor shaping model behavior in your stack.</p></li><li><p>The throttle targets AI R&amp;D, handing the frontier leader an advantage dressed as safety. Dean Ball called it &#8220;secret sabotage.&#8221;</p></li><li><p>If a lab will silently downgrade outputs, the assumption that a model either does the task or refuses it is dead.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Read system cards before you standardize on a model. The detail that mattered wasn&#8217;t in the launch blog.</p></li><li><p>Test models for silent degradation in your use cases, not just refusals, against a known-good baseline.</p></li><li><p>Put model-behavior-change clauses in vendor contracts. You want notice when capability shifts under you.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>I&#8217;ve spent thirty years watching &#8220;trust us&#8221; get sold as a feature, and this is the slickest version yet. The moment a model lies by omission about how hard it&#8217;s trying, you lose the one property that made it auditable. My Bayesian read puts the chance this was purely about safety under 30%. The rest is a competitive moat in a lab coat. Govern your model vendors like anything else with root access to your work, because that&#8217;s what they are. More at <a href="https://www.rockcyber.com/">rockcyber.com</a>.</p><h3>2. 
CISA Flags a Critical LiteLLM Flaw Already Being Exploited</h3><p>On June 8, CISA added CVE-2026-42271 to its Known Exploited Vulnerabilities catalog, confirming active exploitation of LiteLLM, the gateway routing model calls for CrewAI, DSPy, Microsoft GraphRAG, and dozens of other agent frameworks (The Hacker News). The command-injection flaw chains with a Starlette bypass, CVE-2026-48710, for unauthenticated remote code execution at a combined CVSS score of 10.0. Federal agencies have until June 22 to upgrade to LiteLLM 1.83.7 and Starlette 1.0.1.</p><p><strong>Why it matters</strong></p><ul><li><p>LiteLLM sits in the middle of your agent stack. Own the gateway, own every model call and secret it brokers.</p></li><li><p>A 10.0 unauthenticated RCE chain on shared AI infrastructure is the cleanest path into an enterprise.</p></li><li><p>This is the second LiteLLM supply-chain hit in 2026, after the March PyPI backdoor. The pattern is the point.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Inventory where LiteLLM runs, including frameworks that bundle it. Patch to LiteLLM 1.83.7 and Starlette 1.0.1 now.</p></li><li><p>Lock the MCP test endpoints behind network controls and pull them off any internet-facing path.</p></li><li><p>Hunt for exploitation rather than assuming patching closes it. KEV status means someone already used it.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Everybody&#8217;s threat model for AI agents fixates on the model. First contact usually lands at the plumbing, the gateways and routers nobody drew on a data-flow diagram because they showed up as a transitive dependency. LiteLLM is plumbing, and this week it sprang a 10.0 leak. Agent security is mostly classic appsec in a new hat. Command injection through an unsanitized config field proves it. If you can&#8217;t name your gateway version off the top of your head, that&#8217;s your finding.</p><h3>3. 
CISA Orders Federal Agencies to Patch the Worst Bugs in Three Days, Blames AI</h3><p>On June 10, CISA issued Binding Operational Directive 26-04, ordering federal civilian agencies to rank vulnerabilities by four factors: asset exposure, KEV status, whether exploitation is automatable, and how much control exploitation hands an attacker (CISA). A bug that rates worst on all four gets a three-day patch deadline and a mandatory forensic check. CISA tied the clock directly to AI-assisted exploitation collapsing the time between a patch and its abuse. Full alignment lands in 60 days.</p><p><strong>Why it matters</strong></p><ul><li><p>A regulator turned &#8220;AI compresses the patch window&#8221; into an operational mandate instead of a conference talk.</p></li><li><p>Three days is faster than most enterprise change windows. The private-sector benchmark just moved.</p></li><li><p>The four-factor model is a usable risk lens even if you&#8217;re nowhere near federal. Steal it.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Map your mean-time-to-patch against a three-day worst case. Find where you&#8217;d miss.</p></li><li><p>Adopt exploit-automatability and post-exploitation impact as scoring factors, not just CVSS.</p></li><li><p>Pre-authorize emergency patching for the top tier so change control isn&#8217;t the hour-60 bottleneck.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>For years the patch-faster crowd got waved off with &#8220;we have compensating controls.&#8221; That excuse has a shelf life now, and AI is the expiration date. When a model turns a fresh CVE into a working exploit before your change board reaches quorum, every day of dwell is a day you handed away. The grumpy-uncle question is which breaks first under a three-day fuse: the tooling, the staffing, or the politics. Bet on politics. It always is.</p><h3>4. 
The EU Publishes Its Final Code for Labeling AI-Generated Content</h3><p>On June 10, the European Commission published the final Code of Practice on marking and labeling AI-generated content, the voluntary playbook for the AI Act&#8217;s Article 50 transparency duties (European Commission). It pushes machine-readable marking and a common EU icon set for labeling deepfakes and AI-generated text on public-interest matters. ENISA sits on the advisory structure behind it. The obligations become applicable August 2, 2026. Signing is optional. The legal duty is not.</p><p><strong>Why it matters</strong></p><ul><li><p>Provenance is becoming a compliance artifact, not a nice-to-have. Machine-readable marking is now a named EU expectation.</p></li><li><p>August 2 is close. Deploy generative AI into the EU and labeling is a near-term control.</p></li><li><p>A shared icon standard helps, and it creates a forgeable target. Watch for fake labels in both directions.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Inventory where you generate or publish synthetic content touching EU users. Map each to an Article 50 obligation.</p></li><li><p>Pilot C2PA-style content credentials and the EU icons now, while signing is voluntary and mistakes are cheap.</p></li><li><p>Treat provenance integrity as a security problem. Marking you can strip or spoof buys you nothing.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Credit to Brussels for shipping something concrete instead of another principles deck. Labels are a real control, and they&#8217;re catnip for adversaries the second anyone trusts them. The moment a &#8220;human-made&#8221; badge means something, somebody forges it. Provenance is only as strong as the cryptography under it and the verification at the edge. A visible icon with nothing signing it is an honor system for people with no honor. Do the C2PA work. August 2 doesn&#8217;t care about your fiscal year.</p><h3>5. 
New Relic Puts Numbers on the AI Code Problem</h3><p>New Relic&#8217;s 2026 State of AI Coding report, out June 10, surveyed 200 technology decision-makers at US firms (Business Wire). Ninety-four percent rate AI-generated code higher quality than human code at review. Then 82% reported a production failure tied to AI code in the past six months, 78% saw more incidents, and 74% said a quarter of AI code needed significant rework. None banned vibe coding. The report calls the buildup &#8220;agent debt.&#8221;</p><p><strong>Why it matters</strong></p><ul><li><p>AI code looks good in review and breaks in production. Your review gate is measuring the wrong thing.</p></li><li><p>&#8220;Agent debt&#8221; compounds quietly, then surfaces as incidents long after the author moved on.</p></li><li><p>Policy already says yes. The question moved from whether to allow AI code to how to contain it.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Tie runtime and production-incident telemetry back to AI-authored code, not just review approval rates.</p></li><li><p>Require a named human owner for AI-generated changes in critical paths. Every time.</p></li><li><p>Fund the rework. If a quarter of AI code needs fixing, budget for it.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The dirtiest number in that survey is the gap between 94% at review and 82% failing in production. Leaders love the code in the pull request and eat the incidents six months later. You won&#8217;t ban vibe coding, so instrument the blast radius instead. I keep a running file of these failure patterns at <a href="https://rockcybermusings.com/">rockcybermusings.com</a>. Measure the incident, not the applause.</p><h3>6. Mastercard Lets AI Agents Pay Each Other</h3><p>On June 10, Mastercard launched Agent Pay for Machines, an open protocol letting autonomous AI agents transact at machine speed, down to micropayments worth fractions of a cent (Mastercard). 
Agent credentials and spending permissions live on public blockchains, including Polygon, Solana, and Base, with 31 launch partners such as Coinbase, Adyen, Stripe, and Cloudflare. Visa, Stripe, and Google shipped their own agent-payment plumbing this year.</p><p><strong>Why it matters</strong></p><ul><li><p>Autonomous agents with spending authority turn every prompt injection into a potential unauthorized transaction.</p></li><li><p>Non-human identity just became a money problem. Agent credentials are bearer instruments for your budget.</p></li><li><p>Machine-speed payments mean machine-speed fraud. Your fraud controls were tuned for human tempo.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Treat agent payment credentials as crown-jewel secrets with hard spending caps and short-lived scopes.</p></li><li><p>Require revocable agent identity and per-transaction authorization before an agent holds a wallet.</p></li><li><p>Model the abuse case first. Assume a poisoned input tries to drain the budget, then design the cap.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Give an agent a wallet and the lethal trifecta stops being academic. An agent that reads untrusted content, holds private data, and now moves money is a self-service exfiltration tool with a payment terminal bolted on. We&#8217;re handing agents spending authority the same week their gateways get popped and their memory gets poisoned. Caps, scopes, revocation, and a kill switch on every agent that touches a wallet. If you can&#8217;t yank its spending power in one click, it shouldn&#8217;t have any.</p><h3>7. The Linux Foundation Tries to Standardize How We Share AI Assets</h3><p>On June 10, the Linux Foundation launched the OpenSharing Project, a vendor-neutral protocol for exchanging agent skills, AI models, and unstructured data across organizations and clouds (Linux Foundation). 
It extends the Delta Sharing protocol into the agentic era, replacing point-to-point integrations and proprietary marketplaces. Databricks is a named contributor. How shared assets get verified for integrity is left mostly to implementers.</p><p><strong>Why it matters</strong></p><ul><li><p>Without built-in provenance, standardized sharing of skills and models is standardized distribution of supply-chain risk.</p></li><li><p>A common protocol is a common attack surface. Whatever everyone adopts, everyone inherits its flaws.</p></li><li><p>&#8220;Consume an AI asset from anyone&#8221; is exactly how poisoned models and backdoored skills travel.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Demand signed provenance and integrity verification for any shared model or skill before you ingest it.</p></li><li><p>Treat external agent skills like third-party code, because they are. Scan, sandbox, review.</p></li><li><p>Get a seat at the standard now. Security is cheaper in the draft than bolted on after adoption.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Standards are good. My worry is the lesson we never seem to learn: the thing everyone shares becomes the thing everyone gets hit through. We did it with npm, with PyPI, with container registries, and now with agent skills and models. Bake in signing, attestation, and verifiable provenance from day one and this is the best security news of the quarter. Ship it with trust assumed and it&#8217;s a distribution network for poisoned assets. Show up to the draft.</p><h3>8. Accenture and Carnegie Mellon Ship an AI Maturity Model</h3><p>On June 8, Accenture and the Carnegie Mellon University Software Engineering Institute released the AI Adoption Maturity Model, a framework for scaling AI with repeatable outcomes (Carnegie Mellon SEI). It scores eight dimensions, including risk and governance. 
The teams built it from 100-plus maturity efforts, 25 executive interviews, 600 practitioner surveys, and Fortune 500 pilots. The report says 95% of organizations see no return on AI, and only 8% scale it enterprise-wide.</p><p><strong>Why it matters</strong></p><ul><li><p>&#8220;Risk and governance&#8221; sits as a first-class dimension, not a footnote. That&#8217;s the right shape for a maturity model.</p></li><li><p>95% seeing no return is the number your board needs before it greenlights the next AI line item.</p></li><li><p>A common maturity language helps you argue for governance investment in terms executives already use.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Run an honest self-assessment against the eight dimensions. Score where you are, not where the deck says you are.</p></li><li><p>Use the governance dimension to anchor an AI risk program your CFO will fund.</p></li><li><p>Tie maturity gaps to specific incidents from this very week. War stories move budgets; abstractions don&#8217;t.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>I&#8217;m allergic to maturity models that exist to sell the next assessment, so I went in skeptical. This one earned a grudging nod, because it treats governance and risk as load-bearing instead of decorative. The 95% no-return figure is the line I&#8217;ll quote to boards all quarter. It gives a CISO cover to say the quiet part out loud. A maturity model secures nothing by itself. What it does is drag an executive team from vibes to a roadmap.</p><h3>9. Microsoft&#8217;s Largest Patch Tuesday, and a Zero-Day Hours Later</h3><p>Microsoft shipped its largest Patch Tuesday on record on June 9, fixing nearly 200 vulnerabilities (BleepingComputer). Hours later, a researcher who goes by Nightmare Eclipse published a working zero-day called RoguePlanet that abuses a Microsoft Defender race condition to spawn a SYSTEM-level prompt on fully patched Windows 10 and 11 (The Hacker News). 
It&#8217;s the seventh zero-day this researcher has dropped since April, part of a running fight with Microsoft over disclosure and bounty pay.</p><p><strong>Why it matters</strong></p><ul><li><p>A SYSTEM-level Defender bypass on fully patched machines turns your endpoint defense into the entry point.</p></li><li><p>It lands the same week CISA warned AI shrinks the disclosure-to-exploitation window.</p></li><li><p>A grudge-driven researcher dropping working exploits is a reminder that bug-bounty relationships are a security control too.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Prioritize the Defender fix path and watch for RoguePlanet indicators on endpoints you assume are clean.</p></li><li><p>Stop treating &#8220;fully patched&#8221; as &#8220;safe.&#8221; Layer detection that doesn&#8217;t depend on the bypassed control.</p></li><li><p>Review your own researcher and disclosure relationships. Antagonized finders publish, they don&#8217;t email.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Nearly 200 fixes in one month is its own tell about how fast attack surface is growing, and the cruelty of RoguePlanet is the timing. You patch on Tuesday, feel responsible, and by Tuesday night your endpoint tool is the hole. I put this next to the CISA directive on purpose. Picture that loop automated, a model reading the patch diff and emitting the bypass while you sleep. Defense in depth was a cliche right up until your antivirus became the payload.</p><h3>10. Academia Goes After Prompt Injection While It&#8217;s Still Bleeding</h3><p>The research wave this week mapped onto the attacks hitting production. On June 11, a team including Pin-Yu Chen, Bo Li, and Dacheng Tao posted a stakeholder-centric benchmark for prompt injection against real-world web agents that operate over untrusted content (arXiv). 
A day earlier, researchers including Google&#8217;s Tomas Pfister released PI-Hunter, an automated red-teaming system that exposes and localizes prompt injections. Both target the failure mode OWASP maps to six of its ten agentic risk categories.</p><p><strong>Why it matters</strong></p><ul><li><p>Automated red-teaming for prompt injection means defensive tooling is starting to scale with the threat.</p></li><li><p>Benchmarks create accountability. Measure injection resistance and you can demand it in procurement.</p></li><li><p>Academic attention this concentrated usually leads commercial tooling by six to twelve months. This is your early read.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Add prompt-injection benchmarking to agent evaluation before deployment, not after an incident.</p></li><li><p>Put automated injection red-teaming like PI-Hunter in your pre-prod pipeline.</p></li><li><p>Ask vendors for injection-resistance numbers against a named benchmark. Vague assurances are a red flag.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>I read a pile of AI security papers so you don&#8217;t have to, and the encouraging shift is that researchers stopped calling prompt injection a someday problem and started building the rulers and the wrecking balls to measure and break it. Most enterprises are deploying agents on faith. A benchmark turns faith into a number, and a number is something a CISO writes into a contract. Be the buyer who shows up with the benchmark, then watch them sweat.</p><h3>The One Thing You Won&#8217;t Hear About But You Need To: Your AI Agent&#8217;s Memory Is an Unguarded Attack Surface</h3><p>Buried in this week&#8217;s research, with none of the press the model launches got, a paper dated June 10 tackled runtime memory poisoning in persistent LLM agent systems (arXiv). 
Retrieval-augmented agents increasingly carry persistent memory that accumulates across sessions, so what the agent learned yesterday shapes what it does tomorrow. The author, Tarun Sharma, shows that memory is an attack surface and proposes a certified defense, SMSR. Slip a malicious entry in once, and it steers behavior across every future session until someone notices.</p><p><strong>Why it matters</strong></p><ul><li><p>Persistent memory makes poisoning durable, paying the attacker long after the injection.</p></li><li><p>Most teams don&#8217;t inventory, monitor, or validate agent memory at all. It&#8217;s a blind spot with root-level influence.</p></li><li><p>Certified defenses are early, so right now your only real control is hygiene you probably haven&#8217;t built.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Treat agent long-term memory as untrusted storage. Validate writes, monitor for anomalies, keep an audit trail.</p></li><li><p>Scope and segment memory per user and per task so one poisoned entry can&#8217;t steer everyone.</p></li><li><p>Build a way to inspect and roll back agent memory. You can&#8217;t defend what you can&#8217;t see.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Everybody&#8217;s watching the prompt going in. Almost nobody&#8217;s watching what the agent quietly wrote to its own memory last week. That gap keeps me up. We learned to fear prompt injection as a single dirty input, and now we hand agents long-term memory that turns one input into a permanent resident. Poison it once and the agent carries your attacker&#8217;s instructions forward on its own, looking perfectly healthy the whole time. 
Watch the memory, not just the mouth.</p><p>&#128073; For ongoing analysis of agentic AI governance frameworks, the conversation continues at <strong><a href="https://rockcybermusings.com/">RockCyber Musings</a></strong>.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help with your traditional Cybersecurity and AI Security and Governance journey.</p><p>&#128073; Want to save a quick $100K? Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><p>&#128073; As a bonus, check our AMA on the <strong><a href="https://genai.owasp.org/resource/state-of-agentic-ai-security-and-governance/">2026 OWASP GenAI Security Project State of Agentic AI Security and Governance report</a></strong> with me and the other co-leads (it was live, so start at time marker 09:45)</p><div id="youtube2-jK1Z7Z6zlW0" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;jK1Z7Z6zlW0&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/jK1Z7Z6zlW0?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p><em>The views and opinions expressed in RockCyber Musings are my own and do not represent the positions of my employer or any organization I&#8217;m affiliated with.</em></p><h2>References</h2><p>Abrams, L. (2026, June 10). <em>Microsoft Defender &#8216;RoguePlanet&#8217; zero-day grants SYSTEM privileges</em>. BleepingComputer. https://www.bleepingcomputer.com/news/microsoft/microsoft-defender-rogueplanet-zero-day-grants-system-privileges/</p><p>Accenture. (2026, June 8). <em>Accenture and the Carnegie Mellon University Software Engineering Institute launch AI Adoption Maturity Model to help organizations scale AI with predictable outcomes</em>. Accenture Newsroom. https://newsroom.accenture.com/news/2026/accenture-and-the-carnegie-mellon-university-software-engineering-institute-launch-ai-adoption-maturity-model-to-help-organizations-scale-ai-with-predictable-outcomes</p><p>Carnegie Mellon University Software Engineering Institute. (2026, June 8). <em>SEI and Accenture release AI Adoption Maturity Model to help organizations scale AI with predictable outcomes</em>. https://www.sei.cmu.edu/news/sei-and-accenture-release-ai-adoption-maturity-model-to-help-organizations-scale-ai-with-predictable-outcomes/</p><p>Cybersecurity and Infrastructure Security Agency. 
(2026, June 8). <em>CISA adds two known exploited vulnerabilities to catalog</em>. https://www.cisa.gov/news-events/alerts/2026/06/08/cisa-adds-two-known-exploited-vulnerabilities-catalog</p><p>Cybersecurity and Infrastructure Security Agency. (2026, June 10). <em>BOD 26-04: Prioritizing security updates based on risk</em>. https://www.cisa.gov/news-events/directives/bod-26-04-prioritizing-security-updates-based-risk</p><p>European Commission. (2026, June 10). <em>Commission publishes Code of Practice on marking and labelling AI-generated content</em>. https://ec.europa.eu/commission/presscorner/detail/en/ip_26_1328</p><p>Goldman, S. (2026, June 10). <em>Anthropic accused of &#8216;secret sabotage&#8217; as Claude Fable 5 silently limits capabilities for AI researchers and developers</em>. Fortune. https://fortune.com/2026/06/10/anthropic-accu-claude-fable-5-limits-capabilities-ai-researchers-developers/</p><p>He, P., Miculicich, L., Sharma, V., Fox, A., Lee, G., Tang, J., Pfister, T., &amp; Le, L. T. (2026, June 10). <em>PI-Hunter: Automated red-teaming for exposing and localizing prompt injections</em> [Preprint]. arXiv. https://arxiv.org/abs/2606.12737</p><p>Help Net Security. (2026, June 9). <em>LiteLLM vulnerability under active attack, CISA warns (CVE-2026-42271)</em>. https://www.helpnetsecurity.com/2026/06/09/litellm-vulnerability-under-active-attack-cisa-warns-cve-2026-42271/</p><p>Help Net Security. (2026, June 10). <em>Record Microsoft Patch Tuesday, fresh zero-day</em>. https://www.helpnetsecurity.com/2026/06/10/microsoft-patch-tuesday-rogueplanet/</p><p>Mastercard. (2026, June 10). <em>Mastercard launches Agent Pay for Machines to unlock super-fast, always-on payments</em>. https://www.mastercard.com/us/en/news-and-trends/press/2026/june/mastercard-launches-agent-pay-for-machines.html</p><p>New Relic. (2026, June 10). <em>New Relic report reveals AI-generated code grades higher in review, yet triggers rise in production incidents</em>. Business Wire. 
https://www.businesswire.com/news/home/20260610259591/en/New-Relic-Report-Reveals-AI-Generated-Code-Grades-Higher-in-Review-Yet-Triggers-Rise-in-Production-Incidents</p><p>Pogorelec, A. (2026, June 11). <em>Prompt injection still drives most agentic AI security failures in production</em>. Help Net Security. https://www.helpnetsecurity.com/2026/06/11/owasp-prompt-injection-ai-security-failures/</p><p>Sharma, T. (2026, June 10). <em>SMSR: Certified defence against runtime memory poisoning in persistent LLM agent systems</em> [Preprint]. arXiv. https://arxiv.org/abs/2606.12703</p><p>The Hacker News. (2026, June 9). <em>LiteLLM flaw CVE-2026-42271 exploited in the wild, chains to unauthenticated RCE</em>. https://thehackernews.com/2026/06/litellm-flaw-cve-2026-42271-exploited.html</p><p>The Hacker News. (2026, June 10). <em>Microsoft Defender RoguePlanet zero-day grants SYSTEM access on updated Windows</em>. https://thehackernews.com/2026/06/microsoft-defender-rogueplanet-zero-day.html</p><p>Wang, Z., Li, Y., Wu, Y., Liu, Z., Chen, K., Wai, F. K., Chen, P.-Y., Thing, V. L. L., Li, B., Tao, D., &amp; Zhang, T. (2026, June 11). <em>Who pays the price? Stakeholder-centric prompt injection benchmarking for real-world web agents</em> [Preprint]. arXiv. https://arxiv.org/abs/2606.13385</p>]]></content:encoded></item><item><title><![CDATA[AIUC-1 After Mythos: The CISO Playbook for Machine-Speed Defense]]></title><description><![CDATA[The AIUC-1 "After Mythos" whitepaper pins CISO readiness at 4/10. 
Get the three board authorities that close the machine-speed defense gap.]]></description><link>https://www.rockcybermusings.com/p/aiuc-1-after-mythos-machine-speed-defense</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/aiuc-1-after-mythos-machine-speed-defense</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Tue, 09 Jun 2026 17:04:56 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!RkHH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F151a97fc-561d-4451-bef5-27609359848b_2048x2048.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!RkHH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F151a97fc-561d-4451-bef5-27609359848b_2048x2048.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!RkHH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F151a97fc-561d-4451-bef5-27609359848b_2048x2048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!RkHH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F151a97fc-561d-4451-bef5-27609359848b_2048x2048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!RkHH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F151a97fc-561d-4451-bef5-27609359848b_2048x2048.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!RkHH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F151a97fc-561d-4451-bef5-27609359848b_2048x2048.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!RkHH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F151a97fc-561d-4451-bef5-27609359848b_2048x2048.jpeg" width="1456" height="1456" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/151a97fc-561d-4451-bef5-27609359848b_2048x2048.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:535605,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/201326170?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F151a97fc-561d-4451-bef5-27609359848b_2048x2048.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!RkHH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F151a97fc-561d-4451-bef5-27609359848b_2048x2048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!RkHH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F151a97fc-561d-4451-bef5-27609359848b_2048x2048.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!RkHH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F151a97fc-561d-4451-bef5-27609359848b_2048x2048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!RkHH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F151a97fc-561d-4451-bef5-27609359848b_2048x2048.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p><em>Disclosure: I contribute to the AIUC-1 Consortium. I was not an author or reviewer on this whitepaper. What follows is a review and amplification of work led by the named co-authors and the AIUC-1 editorial team, with my own read on where it should go further. Read the source yourself at <a href="https://www.aiuc-1.com/research/whitepaper-defending-at-machine-speed-after-mythos">aiuc-1.com</a>.</em></p><p>The AIUC-1 &#8220;After Mythos&#8221; whitepaper on machine-speed defense opens with a confession most security executives won&#8217;t say out loud at a conference. Fifty-two of them rated their readiness for a Mythos-class threat at 4 out of 10. 
I&#8217;ll show you what the document gets right, where I&#8217;d push it harder, and the three board asks worth making before your Q3 budget closes.</p><h2>The 4-Out-Of-10 Confession</h2><p>A 4 out of 10 is a room full of Fortune 500 CISOs, federal agency leaders, and banking and critical-infrastructure executives telling you, in print, with names attached, that they are behind. Roughly 40% put themselves at 3 or below. About 85% landed at 5 or below. Around 12% rated themselves a 7 or higher. These are the people who own the budgets, the roadmaps, and the org charts, and they graded their own programs as failing.</p><p>The forecast is interesting. The same group expects to reach 6.7 out of 10 in twelve months. Read it as a plan, and it sounds like progress. Read it as an admission, and it tells you the next year is already spoken for. A leadership cohort that sits at 4 today and hopes for 6.7 in a year is telling you the gap is wide enough that closing even part of it will eat the planning cycle.</p><p>Now layer in what&#8217;s already live. Sixty-five percent of these organizations run AI agents in production today. One in five reports business-critical agent deployments. The credentialed attack surface grew while the readiness number sat at 4. That&#8217;s the confession underneath the confession. 
The agents are in production, the agentic AI CISO readiness gap is real, and the people running the programs know the controls haven&#8217;t caught up.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!AXN5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47378794-b058-458f-b23b-8795684bf81f_3996x1785.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!AXN5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47378794-b058-458f-b23b-8795684bf81f_3996x1785.png 424w, https://substackcdn.com/image/fetch/$s_!AXN5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47378794-b058-458f-b23b-8795684bf81f_3996x1785.png 848w, https://substackcdn.com/image/fetch/$s_!AXN5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47378794-b058-458f-b23b-8795684bf81f_3996x1785.png 1272w, https://substackcdn.com/image/fetch/$s_!AXN5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47378794-b058-458f-b23b-8795684bf81f_3996x1785.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!AXN5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47378794-b058-458f-b23b-8795684bf81f_3996x1785.png" width="1456" height="650" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/47378794-b058-458f-b23b-8795684bf81f_3996x1785.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:650,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:285971,&quot;alt&quot;:&quot; Two panels. Left, CISO self-rated readiness rising from 4.0 to 6.7 out of 10 over twelve months against a Mythos-ready line at 10. Right, relative offensive capability rising about 5.9 times over the same twelve months, doubling every 4.7 months.&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/201326170?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47378794-b058-458f-b23b-8795684bf81f_3996x1785.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt=" Two panels. Left, CISO self-rated readiness rising from 4.0 to 6.7 out of 10 over twelve months against a Mythos-ready line at 10. Right, relative offensive capability rising about 5.9 times over the same twelve months, doubling every 4.7 months." title=" Two panels. Left, CISO self-rated readiness rising from 4.0 to 6.7 out of 10 over twelve months against a Mythos-ready line at 10. Right, relative offensive capability rising about 5.9 times over the same twelve months, doubling every 4.7 months." 
srcset="https://substackcdn.com/image/fetch/$s_!AXN5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47378794-b058-458f-b23b-8795684bf81f_3996x1785.png 424w, https://substackcdn.com/image/fetch/$s_!AXN5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47378794-b058-458f-b23b-8795684bf81f_3996x1785.png 848w, https://substackcdn.com/image/fetch/$s_!AXN5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47378794-b058-458f-b23b-8795684bf81f_3996x1785.png 1272w, https://substackcdn.com/image/fetch/$s_!AXN5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47378794-b058-458f-b23b-8795684bf81f_3996x1785.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Figure 1: Defender Aspiration Versus Offense Pace, Same 12 Months</figcaption></figure></div><h2>Mythos Moved The Clock On Machine-Speed Defense</h2><p>Here&#8217;s why the 4 stings. In April 2026, Anthropic&#8217;s Claude Mythos Preview surfaced thousands of high-severity flaws across every major operating system and web browser in its early-access cohort. One was a flaw in OpenBSD that had survived 27 years of expert review and decades of automated fuzzing. The whitepaper cites Anthropic&#8217;s reproducibility benchmark, showing that Mythos produced a working exploit on the first attempt for more than 83% of vulnerabilities, compared to a near-zero rate for the prior frontier generation. The offense lifecycle has been compressed from weeks of skilled human effort to one shot.</p><p>The capability didn&#8217;t stay rare. Within weeks, the UK AI Security Institute evaluated OpenAI&#8217;s GPT-5.5 and found it edging Mythos on expert-level cyber tasks. Then the floor dropped. Research published in April 2026, &#8220;Synthesizing Multi-Agent Harnesses for Vulnerability Discovery&#8221; (Liu et al., arXiv 2604.20801), showed a purpose-built multi-agent orchestration architecture that drove a lesser open-weight model to 10 previously unknown Chrome zero-days, including two critical sandbox-escape flaws that Google confirmed: CVE-2026-5280 and CVE-2026-6297. Frontier-grade results came out of components you can download and wire together.</p><p>The trend line is the headline. The UK AI Security Institute estimates frontier cyber capability is doubling every 4.7 months, down from eight months late in 2025. 
CrowdStrike&#8217;s 2026 Global Threat Report, which is vendor telemetry rather than an independent study, recorded an 89% year-over-year jump in AI-enabled adversary operations, a fastest breakout of 27 seconds, and one intrusion where data left the building four minutes after initial access. Proliferation reaches the sectors that used to coast on obscurity and low adversary interest. Patch cadence, detection latency, and blast-radius tolerance were all calibrated for a world where elite offensive talent was scarce. That world is closing in months.</p><h2>Imperative One: Ship The Patch, Start The Clock</h2><p>The first imperative is the one your operations team will fight you on. Treat the moment you ship a fix as the disclosure event. A Mythos-class model takes the patched binary, diffs it against the prior release, finds the changed call paths, infers what the fix was protecting, and writes a working exploit, all without source code. Your exposure window opens when the patch ships, not when the CVE posts.</p><p>That breaks the 90-day SLA written into most security policies. It strains the 14-day window too. In May 2026, Reuters reported that CISA is weighing a cut of its Known Exploited Vulnerabilities remediation deadline to three days, with the acting director and the national cyber director in the discussion, driven explicitly by Mythos and GPT-class tooling. When the regulator starts talking about three days, your 90-day standard becomes a liability with a compliance stamp on it.</p><p>The practitioner moves to compress the patch cycle are concrete. Set patch SLAs in hours to days for internet-facing, actively exploited, and business-critical assets, and report them to the business as exposure windows rather than audit metrics. Run an LLM-driven security review on every modified line before release. Push the same discipline into procurement, so vendors who can&#8217;t demonstrate short SLAs and AI-assisted discovery get phased out. 
The supply chain is where this gets real. In March 2026, attackers hijacked the axios npm library, which sees more than 100 million weekly downloads, and the poisoned release executed inside OpenAI&#8217;s macOS app-signing pipeline before the company rotated its certificate. OpenAI&#8217;s own exposure traced to a floating version tag and no minimum release age, the kind of hygiene gap a Mythos-grade adversary now finds at scale.</p><p>I&#8217;ve watched this exact argument for decades. Early in the 2000s, I sat with&#8230; and sometimes I was even a member of&#8230; security teams that fought compressing the patch SLA from 90 days to 30 because the business &#8220;couldn&#8217;t absorb the disruption.&#8221; Six months later, an unpatched system was the way in. The conversation hasn&#8217;t changed. The numbers have. Today, it&#8217;s hours-to-days against an operations team that wants weeks, and the adversary diffing your binary doesn&#8217;t care which side wins the meeting.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tXyY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfc42b3e-077f-41e1-a6da-56ff6a4cea69_3547x1731.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tXyY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfc42b3e-077f-41e1-a6da-56ff6a4cea69_3547x1731.png 424w, https://substackcdn.com/image/fetch/$s_!tXyY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfc42b3e-077f-41e1-a6da-56ff6a4cea69_3547x1731.png 848w, 
https://substackcdn.com/image/fetch/$s_!tXyY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfc42b3e-077f-41e1-a6da-56ff6a4cea69_3547x1731.png 1272w, https://substackcdn.com/image/fetch/$s_!tXyY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfc42b3e-077f-41e1-a6da-56ff6a4cea69_3547x1731.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tXyY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfc42b3e-077f-41e1-a6da-56ff6a4cea69_3547x1731.png" width="1456" height="711" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bfc42b3e-077f-41e1-a6da-56ff6a4cea69_3547x1731.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:711,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:217220,&quot;alt&quot;:&quot;Log-scale horizontal bars comparing a 90-day patch SLA, a 14-day KEV deadline, a proposed three-day KEV deadline, four-minute exfiltration, and a 27-second breakout.&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/201326170?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfc42b3e-077f-41e1-a6da-56ff6a4cea69_3547x1731.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Log-scale horizontal bars comparing a 90-day patch SLA, a 14-day KEV deadline, a proposed three-day KEV deadline, four-minute exfiltration, and a 27-second breakout." 
title="Log-scale horizontal bars comparing a 90-day patch SLA, a 14-day KEV deadline, a proposed three-day KEV deadline, four-minute exfiltration, and a 27-second breakout." srcset="https://substackcdn.com/image/fetch/$s_!tXyY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfc42b3e-077f-41e1-a6da-56ff6a4cea69_3547x1731.png 424w, https://substackcdn.com/image/fetch/$s_!tXyY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfc42b3e-077f-41e1-a6da-56ff6a4cea69_3547x1731.png 848w, https://substackcdn.com/image/fetch/$s_!tXyY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfc42b3e-077f-41e1-a6da-56ff6a4cea69_3547x1731.png 1272w, https://substackcdn.com/image/fetch/$s_!tXyY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfc42b3e-077f-41e1-a6da-56ff6a4cea69_3547x1731.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Figure 2: Defenders Measure In Days. The Fastest Attacker, In Seconds.</figcaption></figure></div><h2>Imperatives Two And Three: Contain, Then Keep Pace</h2><p>The second imperative is zero trust with the marketing stripped off. Design for breach means building containment in from the start, so a single foothold stays a single foothold. The reason this matters more now than a year ago shows up in CrowdStrike&#8217;s telemetry: 82% of intrusions in 2025 were malware-free, meaning attackers logged in with stolen credentials and used the tools already on the box. Signature defense is watching the wrong door.</p><p>Four moves carry most of the load. Put every agent on least agency, because agents are the fastest-growing population of credentialed actors in most enterprises, and they should default to managed non-human identities with scoped entitlements, clear ownership, and lifecycle controls. Lock endpoints with binary allowlisting, the highest-leverage control almost nobody outside regulated industries runs, because an allowlist doesn&#8217;t negotiate with an exploit. Treat microsegmentation as the multi-year program it is, and in the meantime, push controls outside the context window into gateways, identity, and execution environments where architecture enforces them. Stand up autonomous red teaming so containment gets verified under machine-speed pressure instead of being assumed in a slide. 
Pair all of it with recovery design: immutable systems, air-gapped backups, identity systems you can rebuild without trusting compromised credentials, and manual fallbacks for business functions that can&#8217;t withstand an extended outage.</p><p>The third imperative is speed at the response end. Self-detection rate becomes your leading metric because handoff times are measured in seconds, and the failure you can&#8217;t afford is learning about a breach from an outsider. Build the machine-speed SOC in three layers: cheap models for high-volume triage, an aggregation layer for correlation and prioritization, and a frontier model on top for the contextual calls. AI remediation that writes the fix to production is the next step, and that agent holds write access to production with a target on its back, so it gets governed like any high-privilege agent, with least privilege, mutual authentication, segmentation, and allowlisting.</p><p>Here&#8217;s where I&#8217;d push past the document. The paper treats agents as high-privilege actors that need identity governance, which is correct and overdue. I&#8217;d go one layer down. The unit of governance is the runtime authorization scope of every agent, every session, every tool call. Static IAM is the precondition. 
Dynamic authorization scope, enforced at runtime, is the actual control.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2eH1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd49f5869-4aa0-4e4a-8353-7899102bb41e_3327x2131.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2eH1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd49f5869-4aa0-4e4a-8353-7899102bb41e_3327x2131.png 424w, https://substackcdn.com/image/fetch/$s_!2eH1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd49f5869-4aa0-4e4a-8353-7899102bb41e_3327x2131.png 848w, https://substackcdn.com/image/fetch/$s_!2eH1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd49f5869-4aa0-4e4a-8353-7899102bb41e_3327x2131.png 1272w, https://substackcdn.com/image/fetch/$s_!2eH1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd49f5869-4aa0-4e4a-8353-7899102bb41e_3327x2131.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2eH1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd49f5869-4aa0-4e4a-8353-7899102bb41e_3327x2131.png" width="1456" height="933" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d49f5869-4aa0-4e4a-8353-7899102bb41e_3327x2131.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:933,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:262031,&quot;alt&quot;:&quot;A three-layer stack showing cheap-model triage, an aggregation layer, and a frontier model, with an AI remediation agent governed by least privilege, mutual authentication, segmentation, and allowlisting, and a human analyst as orchestrator.&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/201326170?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd49f5869-4aa0-4e4a-8353-7899102bb41e_3327x2131.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="A three-layer stack showing cheap-model triage, an aggregation layer, and a frontier model, with an AI remediation agent governed by least privilege, mutual authentication, segmentation, and allowlisting, and a human analyst as orchestrator." title="A three-layer stack showing cheap-model triage, an aggregation layer, and a frontier model, with an AI remediation agent governed by least privilege, mutual authentication, segmentation, and allowlisting, and a human analyst as orchestrator." 
srcset="https://substackcdn.com/image/fetch/$s_!2eH1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd49f5869-4aa0-4e4a-8353-7899102bb41e_3327x2131.png 424w, https://substackcdn.com/image/fetch/$s_!2eH1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd49f5869-4aa0-4e4a-8353-7899102bb41e_3327x2131.png 848w, https://substackcdn.com/image/fetch/$s_!2eH1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd49f5869-4aa0-4e4a-8353-7899102bb41e_3327x2131.png 1272w, https://substackcdn.com/image/fetch/$s_!2eH1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd49f5869-4aa0-4e4a-8353-7899102bb41e_3327x2131.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Figure 3: The Three-Layer AI SOC Stack, Remediation Governed Like Any High-Privilege Agent </figcaption></figure></div><h2>Governance Is An Authority Problem</h2><p>Skip past the three imperatives for a second, because every CISO already knows them. The document&#8217;s real contribution is its honesty about what blocks execution. A SOC re-tooled to run in minutes is still stuck if every new capability waits on control mapping, vendor risk, a procurement cycle, and a board cycle. Governance is where the gap closes first, and the whitepaper says so without flinching: machine-speed operation needs explicit authority boundaries, not faster approvals.</p><p>That resolves into three asks you can take to the board. Compressed patch SLAs need production-change authority delegated from change management. Design-for-breach needs architecture veto power exercised at the design-review stage, before the system ships rather than after. Machine-speed remediation needs pre-approved business-impact authority with bounded autonomy, so the response fires inside agreed limits without a 2 a.m. approval chain. A board that delays these authorities is choosing the readiness gap on purpose, whatever the slide says.</p><p>The whitepaper gets the credential point exactly right. Agents are a population that needs IAM treatment by default, not a special case bolted on later. My one extension connects to a position I&#8217;ve held for a while. Treat governance as architecture, not documentation. The authority structure the paper describes is the architectural-governance pattern in plain clothes. 
Self-detection rate is the right metric for detection, and the thing that produces a good one is an architectural commitment, not a policy PDF that says you value visibility. Write the control into the system, or you don&#8217;t have it.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9k7g!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb686c7c-d62c-4550-bfd8-b53180d15fab_3702x1973.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9k7g!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb686c7c-d62c-4550-bfd8-b53180d15fab_3702x1973.png 424w, https://substackcdn.com/image/fetch/$s_!9k7g!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb686c7c-d62c-4550-bfd8-b53180d15fab_3702x1973.png 848w, https://substackcdn.com/image/fetch/$s_!9k7g!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb686c7c-d62c-4550-bfd8-b53180d15fab_3702x1973.png 1272w, https://substackcdn.com/image/fetch/$s_!9k7g!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb686c7c-d62c-4550-bfd8-b53180d15fab_3702x1973.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9k7g!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb686c7c-d62c-4550-bfd8-b53180d15fab_3702x1973.png" width="1456" height="776" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/eb686c7c-d62c-4550-bfd8-b53180d15fab_3702x1973.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:776,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:231903,&quot;alt&quot;:&quot;Three rows mapping each imperative to the board authority it requires: compress the patch cycle to production-change authority, design for breach to architecture veto power, defend at machine speed to pre-approved business-impact authority.&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/201326170?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb686c7c-d62c-4550-bfd8-b53180d15fab_3702x1973.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Three rows mapping each imperative to the board authority it requires: compress the patch cycle to production-change authority, design for breach to architecture veto power, defend at machine speed to pre-approved business-impact authority." title="Three rows mapping each imperative to the board authority it requires: compress the patch cycle to production-change authority, design for breach to architecture veto power, defend at machine speed to pre-approved business-impact authority." 
srcset="https://substackcdn.com/image/fetch/$s_!9k7g!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb686c7c-d62c-4550-bfd8-b53180d15fab_3702x1973.png 424w, https://substackcdn.com/image/fetch/$s_!9k7g!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb686c7c-d62c-4550-bfd8-b53180d15fab_3702x1973.png 848w, https://substackcdn.com/image/fetch/$s_!9k7g!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb686c7c-d62c-4550-bfd8-b53180d15fab_3702x1973.png 1272w, https://substackcdn.com/image/fetch/$s_!9k7g!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb686c7c-d62c-4550-bfd8-b53180d15fab_3702x1973.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Figure 4: Each Imperative Needs A Delegated Authority, Not A Faster Approval</figcaption></figure></div><p><strong>Key Takeaway:</strong> The whitepaper&#8217;s 4 out of 10 is a confession in print, and the move that closes the machine-speed defense gap is delegating three authorities your board can grant this quarter.</p><h3>What To Do Next</h3><p>Read the whitepaper yourself at <a href="https://www.aiuc-1.com/research/whitepaper-defending-at-machine-speed-after-mythos">aiuc-1.com</a>. Then run four moves before the quarter closes. Send the paper to your direct reports with one ask: map current SLAs against the three imperatives and flag every gap. Schedule a board cycle on the three authority delegations. Run a preparedness self-rating with your security leadership and compare your number to the 4 out of 10 baseline. Map your top five production agents against the least-agency, allowlist, segmentation, and autonomous-red-team checklist from the second imperative.</p><p>If you want the operating-model view behind this, governance-as-architecture and least agency are the two positions I keep coming back to, and I write about them across the newsletter archive at <a href="https://www.rockcybermusings.com">rockcybermusings.com</a>. 
The board and security-leadership advisory work lives at <a href="https://www.rockcyber.com">rockcyber.com</a> if you want to pressure-test your own number against the baseline.</p><p>&#128073; Subscribe for more AI security and governance insights with the occasional rant.</p><p>&#128073; For ongoing analysis of agentic AI governance frameworks, the conversation continues at <strong><a href="https://rockcybermusings.com/">RockCyber Musings</a></strong>.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help with your traditional Cybersecurity and AI Security and Governance journey.</p><p>&#128073; Want to save a quick $100K? 
Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><p>&#128073; As a bonus, <strong><a href="https://www.youtube.com/watch?v=rwlVTLyqIv8">check out my conversation with Eva Benn</a></strong> where we talked about the cybersecurity skills you need to develop to stay relevant in 2026 and beyond.</p><div id="youtube2-rwlVTLyqIv8" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;rwlVTLyqIv8&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/rwlVTLyqIv8?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p><em>The views and opinions expressed in RockCyber Musings are my own and do not represent the positions of my employer or any organization I&#8217;m affiliated with.</em></p>]]></content:encoded></item><item><title><![CDATA[Claude Code Skills: Put The Discipline In The File]]></title><description><![CDATA[Stop hoarding prompts. 
RockCyber's open-source Claude Code skills catch the ML, security, and reproducibility failures AI ships confidently.]]></description><link>https://www.rockcybermusings.com/p/claude-code-skills-put-the-discipline-in-the-file</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/claude-code-skills-put-the-discipline-in-the-file</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Tue, 02 Jun 2026 12:50:51 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!8Bv3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdec7e290-6663-4c20-8fe2-3abcb1cbe2d0_2048x2048.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8Bv3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdec7e290-6663-4c20-8fe2-3abcb1cbe2d0_2048x2048.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8Bv3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdec7e290-6663-4c20-8fe2-3abcb1cbe2d0_2048x2048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!8Bv3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdec7e290-6663-4c20-8fe2-3abcb1cbe2d0_2048x2048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!8Bv3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdec7e290-6663-4c20-8fe2-3abcb1cbe2d0_2048x2048.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!8Bv3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdec7e290-6663-4c20-8fe2-3abcb1cbe2d0_2048x2048.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8Bv3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdec7e290-6663-4c20-8fe2-3abcb1cbe2d0_2048x2048.jpeg" width="1456" height="1456" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dec7e290-6663-4c20-8fe2-3abcb1cbe2d0_2048x2048.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:635055,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/199064896?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdec7e290-6663-4c20-8fe2-3abcb1cbe2d0_2048x2048.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8Bv3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdec7e290-6663-4c20-8fe2-3abcb1cbe2d0_2048x2048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!8Bv3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdec7e290-6663-4c20-8fe2-3abcb1cbe2d0_2048x2048.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!8Bv3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdec7e290-6663-4c20-8fe2-3abcb1cbe2d0_2048x2048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!8Bv3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdec7e290-6663-4c20-8fe2-3abcb1cbe2d0_2048x2048.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="subscription-widget-wrap-editor" 
data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/p/claude-code-skills-put-the-discipline-in-the-file?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/p/claude-code-skills-put-the-discipline-in-the-file?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p>Claude Code skills are how I stopped re-explaining the same discipline to an AI on every session. Last year <a href="https://www.veracode.com/blog/genai-code-security-report/">Veracode tested code from more than 100 large language models and found security flaws in 45% of it</a>. That matches what I keep seeing in my machine learning and data science work, where the model returns an answer that looks correct and is wrong underneath. This week, I put my skills library on <strong><a href="https://github.com/rocklambros/rcs">GitHub</a> at rocklambros/RCS</strong>. It's the portable half of a setup I've been building in the open. 
The skills drop into the Claude Code harness I documented at <strong><a href="https://github.com/rocklambros/harness-engineering">rocklambros/harness-engineering</a></strong>, and they ride alongside the <strong><a href="https://www.rockcybermusings.com/p/claude-secure-coding-rules-open-source-ai-security">secure coding rules I open-sourced</a></strong> to stop Claude generating vulnerable code in the first place. Here is what is inside, starting with the skill that saved me the most pain.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Iy80!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fa855ae-0d56-4385-a6db-19561dfd9bd2_1762x1072.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Iy80!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fa855ae-0d56-4385-a6db-19561dfd9bd2_1762x1072.png 424w, https://substackcdn.com/image/fetch/$s_!Iy80!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fa855ae-0d56-4385-a6db-19561dfd9bd2_1762x1072.png 848w, https://substackcdn.com/image/fetch/$s_!Iy80!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fa855ae-0d56-4385-a6db-19561dfd9bd2_1762x1072.png 1272w, https://substackcdn.com/image/fetch/$s_!Iy80!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fa855ae-0d56-4385-a6db-19561dfd9bd2_1762x1072.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Iy80!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fa855ae-0d56-4385-a6db-19561dfd9bd2_1762x1072.png" width="1456" height="886" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1fa855ae-0d56-4385-a6db-19561dfd9bd2_1762x1072.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:886,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:78830,&quot;alt&quot;:&quot;Horizontal bar chart of AI code security failure rates by language, Java highest at 72 percent, 45 percent overall&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/199064896?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fa855ae-0d56-4385-a6db-19561dfd9bd2_1762x1072.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Horizontal bar chart of AI code security failure rates by language, Java highest at 72 percent, 45 percent overall" title="Horizontal bar chart of AI code security failure rates by language, Java highest at 72 percent, 45 percent overall" srcset="https://substackcdn.com/image/fetch/$s_!Iy80!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fa855ae-0d56-4385-a6db-19561dfd9bd2_1762x1072.png 424w, https://substackcdn.com/image/fetch/$s_!Iy80!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fa855ae-0d56-4385-a6db-19561dfd9bd2_1762x1072.png 848w, 
https://substackcdn.com/image/fetch/$s_!Iy80!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fa855ae-0d56-4385-a6db-19561dfd9bd2_1762x1072.png 1272w, https://substackcdn.com/image/fetch/$s_!Iy80!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fa855ae-0d56-4385-a6db-19561dfd9bd2_1762x1072.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 1: AI-Generated Code Fails Security Tests in 45% of Tasks</figcaption></figure></div><h2>The Prompt You Keep Pasting Is A 
Skill</h2><p>I spent more time than I should have just copying the same prompt into Claude from a cacophony of Notion notebooks and saved text files. I can tell you how many times I told Claude to stop trusting a clean p-value, to check the assumptions behind a statistical test before reporting it, and to refuse the easy answer when the data didn&#8217;t support it. I pasted some version of that into a fresh chat a few hundred times. Every session started from zero. The model holds no memory of the discipline I taught it yesterday, so I taught it again, and the wording drifted a little each time.</p><p>That is the tell. If you keep pasting the same instructions, you have already written something. You never saved it in a form the tool can load on its own. A Claude Code skill is that saved form, a single Markdown file with a description that the model reads to decide whether the file applies to your question. Once it is installed, the model consults it without being asked. The discipline lives in the file instead of your head, and your head is where discipline goes to get forgotten.</p><p>Here is the example that pushed me over the edge. I was building a multi-turn prompt-injection detector for the Deep Learning course in my Master&#8217;s program, and I had two models to compare on the same set of 5,130 test conversations. The temporal LSTM scored an F1 of 0.837. Adding an attention layer on top scored 0.837 as well, with confidence intervals that almost completely overlap. Ask an AI whether the attention version is the better model, and most will wave it through as a free upgrade, since the interpretability costs nothing. The honest answer is that there is no improvement to claim. A paired bootstrap on the same conversations yields a difference of zero (p = 0.453), nowhere near significant. 
A p-value is the chance that a gap at least this large would show up even when the two models are truly identical, and at 0.453 that happens almost half the time, so the data cannot tell the two models apart. Both models saw the identical test set, so the comparison is paired, and treating it any other way invents a result that is not in the data. Calling the attention model the winner is a confident wrong answer, and it is the kind that ends up on a slide in front of a board (or in a classroom, with a professor sitting as &#8220;chairman,&#8221; in my case).</p><h2>The Premortem I Kept Rebuilding By Hand</h2><p>The skill I want you to remember is running-adversarial-premortem. It came out of that same habit. I kept asking the model to argue against my own designs, and I kept getting agreement dressed up as analysis. I wrote the method down and made it a skill.</p><p>A premortem flips the usual review. Rather than asking what could go wrong, it assumes the work will fail in six months and traces the cause. The skill runs in three rounds. Round one assumes failure and lists five to ten ways it happened, seeded by category: the premise was wrong, the method was biased, the code didn&#8217;t match the design, the deployment was botched, the audience misread the result. Round two chains each failure back to a root cause. Round three scores what survives.</p><p>The scoring is where it earns its keep. Each surviving failure mode gets a severity, a likelihood, and a detectability, and the priority is severity times likelihood divided by detectability. Because detectability sits in the denominator, a quiet, high-damage failure that nothing would catch ranks above a loud one that your CI already flags. Two fields force honesty. Every concern carries its strongest counterargument, the most generous defense of the design, so you engage the work rather than strawmanning it. 
Every concern also names what would have to be true for you to stop worrying, a condition the skill calls &#8220;stops mattering if.&#8221; A worry with no stopping condition is anxiety wearing a citation, not analysis.</p><p>The skill also refuses to fire when it shouldn&#8217;t. Ask it to scan a loop for an off-by-one error, and it hands you off to a plain code review, because a premortem on a one-line fix costs more than the bug. That refusal is tested, not promised, and I will come back to why that matters.</p><p>You will find it at skills/workflow/running-adversarial-premortem. The method itself is old. Gary Klein wrote it up for Harvard Business Review in 2007. What changed is that the discipline now loads itself when the stakes are high, instead of waiting for me to remember to run it.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BHxm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92d6c04f-0089-4b9f-a2e7-d186d5015670_2400x1960.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BHxm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92d6c04f-0089-4b9f-a2e7-d186d5015670_2400x1960.png 424w, https://substackcdn.com/image/fetch/$s_!BHxm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92d6c04f-0089-4b9f-a2e7-d186d5015670_2400x1960.png 848w, https://substackcdn.com/image/fetch/$s_!BHxm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92d6c04f-0089-4b9f-a2e7-d186d5015670_2400x1960.png 1272w, 
https://substackcdn.com/image/fetch/$s_!BHxm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92d6c04f-0089-4b9f-a2e7-d186d5015670_2400x1960.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!BHxm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92d6c04f-0089-4b9f-a2e7-d186d5015670_2400x1960.png" width="2400" height="1960" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/92d6c04f-0089-4b9f-a2e7-d186d5015670_2400x1960.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1960,&quot;width&quot;:2400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:335078,&quot;alt&quot;:&quot;Flowchart showing the premortem assuming failure, chaining to root causes, and scoring by severity times likelihood divided by detectability&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/199064896?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31f0dc71-1ab0-4d40-b655-f4ab05f30072_2400x1960.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Flowchart showing the premortem assuming failure, chaining to root causes, and scoring by severity times likelihood divided by detectability" title="Flowchart showing the premortem assuming failure, chaining to root causes, and scoring by severity times likelihood divided by detectability" srcset="https://substackcdn.com/image/fetch/$s_!BHxm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92d6c04f-0089-4b9f-a2e7-d186d5015670_2400x1960.png 
424w, https://substackcdn.com/image/fetch/$s_!BHxm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92d6c04f-0089-4b9f-a2e7-d186d5015670_2400x1960.png 848w, https://substackcdn.com/image/fetch/$s_!BHxm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92d6c04f-0089-4b9f-a2e7-d186d5015670_2400x1960.png 1272w, https://substackcdn.com/image/fetch/$s_!BHxm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92d6c04f-0089-4b9f-a2e7-d186d5015670_2400x1960.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 2: The Adversarial Premortem In Three Rounds</figcaption></figure></div><h2>The Highest-Priority Skill Is The Most Boring One</h2><p>Every skill in the library carries a priority score I call &#931;, a rough estimate of how often the gap shows up multiplied by how badly things break when you miss it. The top of the list is not the premortem or any of the security skills. It is enforcing-seed-hygiene, at &#931; 20, the only skill that hits the maximum. It exists because randomness is everywhere in machine learning, and almost nobody pins it correctly.</p><p>Here is the failure. You train a model, get a number, and write it down. A colleague runs the same code on the same data and gets a different number. Now neither of you knows which result to trust, and the paper or the production model sits on sand. <a href="https://arxiv.org/abs/2406.14325">A 2025 review of machine-learning reproducibility</a> put the fix plainly: fixed random seeds should be set and published to control the many sources of nondeterminism in ML. The same authors flag a catch that trips people up: seeds hold only when the work is not running in parallel on a GPU.</p><p>That GPU catch is the part people miss, and the skill handles it. One seed isn&#8217;t enough. Python&#8217;s own hashing, NumPy, PyTorch, JAX, and R each carry a separate source of randomness, and parallel thread scheduling on a GPU produces floating-point sums that do not land bit-for-bit the same across machines. The skill emits a single first-cell block that seeds every library you named, sets PYTHONHASHSEED, and, for sampler workloads, pins the thread count so a run on your laptop matches the one on the Linux server. It refuses the wrong job too. 
Ask it to seed a cryptographic nonce, and it stops you, because a fixed seed there is a vulnerability, not a discipline.</p><p>It lives at skills/workflow/enforcing-seed-hygiene. That is the profile a &#931; 20 earns. The gap shows up on nearly every project, and it ruins the results when you skip the fix.</p><h2>Vet The MCP Server Before You Trust It</h2><p>The skill I want in front of security people is auditing-mcp-server-pre-trust, at &#931; 18. Model Context Protocol servers are how your AI reaches out to tools and data, and the ecosystem grew faster than its security did. <a href="https://arxiv.org/abs/2506.13538">A 2025 study from Queen&#8217;s University</a> analyzed 1,899 open-source MCP servers and found 7.2% carrying general vulnerabilities and 5.5% exhibiting tool poisoning, where a hostile server hides instructions inside a tool&#8217;s description to steer the model without you ever seeing them.</p><p>Don&#8217;t count on the model to catch it. A separate benchmark, <a href="https://arxiv.org/abs/2508.14925">MCPTox</a>, ran tool-poisoning attacks against 20 agents and found they almost never refuse. The best refusal rate of any agent, from a Claude model, came in under 3%. The model reads the poisoned description and follows it. The paper found that more capable models were often <em>more</em> susceptible, because the attack rides on their stronger instruction-following. That puts the decision where it belongs, with you, before the server gets registered.</p><p>The skill runs six checks ahead of that registration: license, source review, network egress, version pin, secret handling, and the subset of tools you need. The version pin alone catches a class of foot-guns. A server installed with npx -y or an unpinned pip install can shift underneath you between the audit and the next run, so the skill treats anything unpinned as a blocking failure. 
Each check has to cite evidence (a file, a line, or a commit), so the audit cannot decay into a checkbox ritual.</p><p>A skill like this is bounded to design-time and registration-time discipline. It runs inside the model&#8217;s reasoning, before the connection goes live. It cannot see what the agent does once it is running: which tools it calls, what data leaves the building, or whether the action matches the intent you approved. That is a different control layer, and nothing in this repo provides it. Closing that gap takes interception at the moment a tool executes, structured traces a security team can query rather than vendor-specific log soup, and a live inventory of every tool, model, and dataset the agent touched, the way a software bill of materials tracks dependencies. The skills are the author-time half of the problem, with the other half running in production. Know which half you have covered.</p><p>It lives at skills/security/auditing-mcp-server-pre-trust.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2yGO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd623a8a2-096e-4ed9-aae6-c645ceaf59a6_2720x1200.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2yGO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd623a8a2-096e-4ed9-aae6-c645ceaf59a6_2720x1200.png 424w, https://substackcdn.com/image/fetch/$s_!2yGO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd623a8a2-096e-4ed9-aae6-c645ceaf59a6_2720x1200.png 848w, 
https://substackcdn.com/image/fetch/$s_!2yGO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd623a8a2-096e-4ed9-aae6-c645ceaf59a6_2720x1200.png 1272w, https://substackcdn.com/image/fetch/$s_!2yGO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd623a8a2-096e-4ed9-aae6-c645ceaf59a6_2720x1200.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2yGO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd623a8a2-096e-4ed9-aae6-c645ceaf59a6_2720x1200.png" width="1456" height="642" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d623a8a2-096e-4ed9-aae6-c645ceaf59a6_2720x1200.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:642,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:181100,&quot;alt&quot;:&quot;Diagram contrasting author-time skills with the runtime layer of interception, structured tracing, and a live agent inventory that skills cannot provide&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/199064896?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd623a8a2-096e-4ed9-aae6-c645ceaf59a6_2720x1200.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Diagram contrasting author-time skills with the runtime layer of interception, structured tracing, and a live agent inventory that skills cannot provide" title="Diagram contrasting author-time skills with the runtime layer of interception, structured tracing, and a live agent 
inventory that skills cannot provide" srcset="https://substackcdn.com/image/fetch/$s_!2yGO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd623a8a2-096e-4ed9-aae6-c645ceaf59a6_2720x1200.png 424w, https://substackcdn.com/image/fetch/$s_!2yGO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd623a8a2-096e-4ed9-aae6-c645ceaf59a6_2720x1200.png 848w, https://substackcdn.com/image/fetch/$s_!2yGO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd623a8a2-096e-4ed9-aae6-c645ceaf59a6_2720x1200.png 1272w, https://substackcdn.com/image/fetch/$s_!2yGO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd623a8a2-096e-4ed9-aae6-c645ceaf59a6_2720x1200.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" 
stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 3: Two Halves Of Agent Governance, Author-Time And Runtime</figcaption></figure></div><h2>Your Instruction Files Are Rotting</h2><p>The last skill, auditing-instruction-hierarchy at &#931; 18, fixes a problem you can&#8217;t see until it bites. The CLAUDE.md files that tell the model how to behave grow without limit, and a larger instruction file is not necessarily better. Chroma&#8217;s 2025 &#8220;Context Rot&#8221; study tested 18 frontier models and found that output quality drops as input length grows, even on simple tasks, well before the context window is anywhere near full. Every line you add to an instruction file competes for the model&#8217;s attention with the work in front of it.</p><p>My own test harness sets a hard cap of 400 lines across the entire instruction hierarchy, with 250 as the target. Past 400, instruction-following degrades measurably. The skill counts the lines across every CLAUDE.md in play, from the user-level file down to the ones plugins quietly drop in, and it greps for content that breaks the prompt cache: literal dates, session IDs, anything that changes between runs and forces the model to re-read the whole prefix every five minutes. Then it sorts each rule into a verdict. Keep it, move it to a skill that loads on demand, move it to a hook, or drop it.</p><p>I ran it against this repo while building it. The skill&#8217;s own evidence list names RCS, which is the polite way of saying it found things in my own setup worth trimming. 
It lives at skills/claude-code-meta/auditing-instruction-hierarchy.</p><h2>What The Claude Code Skills Repo Ships</h2><p>The four skills above are a sample. The repo ships 104 of them, all in the shipped state with none in draft, organized into five tracks by who needs them: security, machine learning and data science, cross-cutting workflow, teaching, and Claude Code meta-work. The whole library is MIT licensed.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cviR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10c490aa-c2d9-4320-8a46-7827cda96475_1918x1068.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cviR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10c490aa-c2d9-4320-8a46-7827cda96475_1918x1068.png 424w, https://substackcdn.com/image/fetch/$s_!cviR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10c490aa-c2d9-4320-8a46-7827cda96475_1918x1068.png 848w, https://substackcdn.com/image/fetch/$s_!cviR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10c490aa-c2d9-4320-8a46-7827cda96475_1918x1068.png 1272w, https://substackcdn.com/image/fetch/$s_!cviR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10c490aa-c2d9-4320-8a46-7827cda96475_1918x1068.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!cviR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10c490aa-c2d9-4320-8a46-7827cda96475_1918x1068.png" width="1456" height="811" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/10c490aa-c2d9-4320-8a46-7827cda96475_1918x1068.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:811,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:93833,&quot;alt&quot;:&quot;Horizontal bar chart of the four featured skills by &#931; priority score, seed hygiene highest at 20&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/199064896?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10c490aa-c2d9-4320-8a46-7827cda96475_1918x1068.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Horizontal bar chart of the four featured skills by &#931; priority score, seed hygiene highest at 20" title="Horizontal bar chart of the four featured skills by &#931; priority score, seed hygiene highest at 20" srcset="https://substackcdn.com/image/fetch/$s_!cviR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10c490aa-c2d9-4320-8a46-7827cda96475_1918x1068.png 424w, https://substackcdn.com/image/fetch/$s_!cviR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10c490aa-c2d9-4320-8a46-7827cda96475_1918x1068.png 848w, 
https://substackcdn.com/image/fetch/$s_!cviR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10c490aa-c2d9-4320-8a46-7827cda96475_1918x1068.png 1272w, https://substackcdn.com/image/fetch/$s_!cviR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10c490aa-c2d9-4320-8a46-7827cda96475_1918x1068.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 4: The Four Featured Skills, By &#931; Priority</figcaption></figure></div><p>Two design choices matter when 
deciding whether to trust it. The first is that it is catalog-free. There are no bundled framework controls and no ISO mirrors copied in to rot the day after I commit them. The skills encode method, not reference material that goes stale. The second is that every skill ships with three test scenarios: a normal case, an edge case, and an anti-trigger that checks the skill stays quiet when it should not fire. That last one is the part most prompt collections skip, and it separates a helpful skill from one that fires on everything and turns into noise.</p><p>Install is a clone and a symlink loop that drops each skill into your ~/.claude/skills directory, skipping anything already there. The skills compose themselves. Start an ML notebook, and the scaffolding skill, the seed-hygiene skill, and the train-test-split audit fire in sequence without you wiring them together.</p><p><strong>Key Takeaway:</strong> If you keep pasting the same instructions into an AI, turn them into Claude Code skills, because the discipline you do not write down is the discipline you lose every session.</p><h3>What to do next</h3><p>Clone the repo, install the skills, and point the premortem at the next design you are about to commit to. Watch it argue with you. The repo is at <a href="https://github.com/rocklambros/RCS">github.com/rocklambros/RCS</a>, and the install steps are in the README.</p><p>If you want the thinking behind this kind of work, the AI security and governance advisory I run lives at <a href="https://rockcyber.com">rockcyber.com</a>, and the rest of these teardowns are at <a href="https://rockcybermusings.com">rockcybermusings.com</a>. Tell me which skill broke something useful. 
The anti-trigger tests catch a lot, and you will still find edges I missed.</p><p>&#128073; Subscribe for more AI security and governance insights with the occasional rant.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.rockcybermusings.com/subscribe?"><span>Subscribe now</span></a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&quot;,&quot;text&quot;:&quot;Share RockCyber Musings&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.rockcybermusings.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share"><span>Share RockCyber Musings</span></a></p><p>&#128073; For ongoing analysis of agentic AI governance frameworks, the conversation continues at <strong><a href="https://rockcybermusings.com/">RockCyber Musings</a></strong>.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help with your traditional Cybersecurity and AI Security and Governance journey.</p><p>&#128073; Want to save a quick $100K? 
Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><p>&#128073; As a bonus, <strong><a href="https://www.youtube.com/watch?v=rwlVTLyqIv8">check out my conversation with Eva Benn</a></strong> where we talked about the cybersecurity skills you need to develop to stay relevant in 2026 and beyond.</p><div id="youtube2-rwlVTLyqIv8" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;rwlVTLyqIv8&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/rwlVTLyqIv8?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p><em>The views and opinions expressed in RockCyber Musings are my own and do not represent the positions of my employer or any organization I&#8217;m affiliated with.</em></p>]]></content:encoded></item><item><title><![CDATA[Weekly Musings Top 10 AI Security Wrapup: Issue 40 May 22-May 28, 2026]]></title><description><![CDATA[When the White House Blinks, the Threat Actors Don&#8217;t]]></description><link>https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260522-20260528</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260522-20260528</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Fri, 29 May 2026 12:50:13 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!-XwX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F022dc9c4-3a97-4f0c-9a2e-76e6f429303a_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!-XwX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F022dc9c4-3a97-4f0c-9a2e-76e6f429303a_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-XwX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F022dc9c4-3a97-4f0c-9a2e-76e6f429303a_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!-XwX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F022dc9c4-3a97-4f0c-9a2e-76e6f429303a_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!-XwX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F022dc9c4-3a97-4f0c-9a2e-76e6f429303a_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!-XwX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F022dc9c4-3a97-4f0c-9a2e-76e6f429303a_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-XwX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F022dc9c4-3a97-4f0c-9a2e-76e6f429303a_1024x1024.png" width="1024" height="1024" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/022dc9c4-3a97-4f0c-9a2e-76e6f429303a_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1233556,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/199672029?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F022dc9c4-3a97-4f0c-9a2e-76e6f429303a_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-XwX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F022dc9c4-3a97-4f0c-9a2e-76e6f429303a_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!-XwX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F022dc9c4-3a97-4f0c-9a2e-76e6f429303a_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!-XwX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F022dc9c4-3a97-4f0c-9a2e-76e6f429303a_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!-XwX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F022dc9c4-3a97-4f0c-9a2e-76e6f429303a_1024x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Trump pulled the executive order. Anthropic shipped a model that finds vulnerabilities by the thousand. Threat actors poisoned developer AI assistants with invisible characters. The week of May 22 through 28, 2026, didn&#8217;t give CISOs a quiet moment. The federal government cannot decide if AI is a threat or a savior. Attackers keep outpacing the policies meant to slow them.</p><p>The week&#8217;s signal lived in the contrast. Anthropic&#8217;s Mythos model surfaced 10,000 critical vulnerabilities in a month. The White House could not get a single executive order across the line. CISA sat at the table without a vote. Attackers poisoned AI coding assistant config files with invisible Unicode. A malicious npm package exfiltrated files from Claude AI&#8217;s working directory. AI capability keeps accelerating. 
AI governance keeps collapsing. Security teams that treat the next 90 days as business as usual will be explaining to regulators decisions they cannot defend.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260522-20260528?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260522-20260528?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><h3>1. Agent Control Standard Launches Open Runtime Governance Framework for AI Agents</h3><p>The Agent Control Standard launched on May 27, 2026, at the AI Agent Security Summit in San Francisco, releasing a vendor-agnostic, open framework for runtime governance of AI agents (BusinessWire, VMblog). Existing protocols govern how agents communicate with each other. None cover what they actually do once they start acting inside enterprise environments. 
ACS targets that gap with a common framework for runtime enforcement, intervention, and policy governance across agent ecosystems. The specification is released as open source under the MIT license, with no single company controlling the spec. Michael Bargury, co-founder and CTO of Zenity, is co-creator. Full disclosure, I serve as director of AI standards and governance at Zenity and contribute to ACS.</p><p><strong>Why it matters</strong></p><ul><li><p>The industry has a control-layer gap. MCP and other protocols cover communication, not what agents do once they act.</p></li><li><p>Runtime governance has been a per-vendor build problem, which has been blocking enterprise procurement and audit.</p></li><li><p>An open, vendor-neutral spec gives regulators and auditors a reference point that does not depend on a single platform&#8217;s roadmap.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Read the spec at agentcontrolstandard.ai and map it against your current agent runtime controls.</p></li><li><p>Ask your AI agent platform vendors which parts of ACS they will support and on what timeline.</p></li><li><p>Add ACS-style runtime controls including policy enforcement, intervention, and kill switches to your 2027 agent governance roadmap.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Full disclosure up top. I am leading ACS, so putting this up top is my prerogative &#128512;. Read this knowing that. The reason we built it is the same reason I keep writing about agent governance every week. Everyone I talk to is putting agents into production with no runtime enforcement layer, no intervention model, and no audit trail that survives a regulator&#8217;s question. The vendor-by-vendor approach was never going to scale. We needed a common spec that any platform could implement and any auditor could point to. That is what ACS is. 
Go read it at https://agentcontrolstandard.ai, push your vendors to support it, and tell us where it falls short.</p><h3>2. Axios Publishes the Killed AI Executive Order Text</h3><p>Axios published the full text of the canceled AI executive order on May 22, 2026, the day after President Trump pulled the signing ceremony (Axios, NPR). The draft included a voluntary Treasury clearinghouse for AI security vulnerabilities and a pre-launch review process where major AI companies would share frontier models with the government for up to 90 days. CEOs from OpenAI, Anthropic, and other major labs had been invited.</p><p><strong>Why it matters</strong></p><ul><li><p>The federal government walked away from the only proposed coordination mechanism for AI vulnerability sharing.</p></li><li><p>Any serious U.S. AI security baseline now has to come from industry, state regulators, or international peers.</p></li><li><p>AI vendors with CAISI evaluation agreements face uncertainty about whether voluntary testing remains the expectation.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Set your own baseline using the NIST AI RMF and the joint CISA-Five Eyes agentic AI guidance from May 1.</p></li><li><p>Map which AI vendors have signed CAISI evaluation agreements and treat that as third-party risk data.</p></li><li><p>Engage with state AI laws including California SB 942 and Texas HB 149 rather than waiting on Washington.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The administration spent months convening industry CEOs and threading a needle on voluntary frontier model review. Then it walked away because the work sounded like regulation. Plan as if no federal framework is coming for the rest of this administration. State attorneys general will probe your AI governance posture soon enough. More at https://www.rockcybermusings.com.</p><h3>3. 
Anthropic&#8217;s Project Glasswing Finds 10,000 Critical Vulnerabilities in One Month</h3><p>Anthropic published an update to Project Glasswing on May 22, 2026, reporting that Claude Mythos Preview, working with roughly 50 partner organizations, identified more than 10,000 high or critical-severity vulnerabilities in about four weeks (Anthropic, CSO Online). Cloudflare alone surfaced about 2,000 bugs, 400 rated high or critical. Mozilla patched 271 vulnerabilities in Firefox 150, ten times the count from an earlier Claude Opus 4.6 run. Six independent firms validated 1,752 findings, with 90.6% confirmed as true positives.</p><p><strong>Why it matters</strong></p><ul><li><p>The model is doing in days what well-staffed AppSec teams take quarters to complete.</p></li><li><p>Software vendors are now expected to keep pace with AI-found bugs at speeds no human team can match.</p></li><li><p>Vulnerability management economics change when triage volume jumps an order of magnitude in a month.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Pressure software vendors for AI-assisted vulnerability discovery details and patch SLA commitments.</p></li><li><p>Update patch and risk acceptance policies for a world where critical bugs surface at machine speed.</p></li><li><p>Pilot AI-assisted code review inside your own engineering organization before your competitors do.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Project Glasswing is the first credible public demonstration of large-scale AI vulnerability discovery. Mozilla&#8217;s 271 Firefox patches in one release is a confession that we have all been underspending on AppSec for a decade. The harder question is what happens when threat actors get this capability. If your patch SLA is 30 days, you are living on borrowed time.</p><h3>4. 
CISA Sidelined in White House AI Cyber Response</h3><p>Axios reported on May 26, 2026, that CISA has been pushed to the margins of the administration&#8217;s AI cyber response, with one industry source describing the agency as &#8220;at the table, not in the game&#8221; (Axios, Newsmax). CISA leadership joins early White House calls led by the Office of the National Cyber Director but has little influence. The agency has lost roughly one-third of its workforce since the start of 2025. The FY2027 budget proposal calls for cutting another quarter of staff and $707 million in funding.</p><p><strong>Why it matters</strong></p><ul><li><p>The federal civilian operational cyber agency is being structurally weakened as AI reshapes the threat picture.</p></li><li><p>Private sector relationships built on CISA&#8217;s information sharing face uncertainty about continuity.</p></li><li><p>State and local governments that depend on CISA for technical support face longer response times.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Diversify your federal cyber relationships. Build direct ties to FBI cyber, Secret Service, and your sector ISAC.</p></li><li><p>Review incident response plans for assumptions about CISA support and revise where federal assistance is uncertain.</p></li><li><p>Engage with state cyber programs in your operating jurisdictions, since state authorities will inherit the burden.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>CISA was supposed to be the federal cyber civilian backstop. Watching it get hollowed out while the threat surface explodes is one of the most demoralizing things I have seen in this field. Build your federal incident response playbook around the assumption that CISA will be slow, understaffed, and unable to provide the level of technical assistance you got in 2024. The federal cavalry is not coming this year.</p><h3>5. 
Check Point Report Finds 51-Point Gap Between AI Security Intent and Capability</h3><p>Check Point released its 2026 Cloud Security Report on May 26, 2026, finding that 77% of organizations have updated their cloud security strategy for AI, while only 26% have the architecture to enforce those policies (Check Point, PR Newswire). The 51-point gap pairs with 78% of organizations reporting confirmed or suspected AI-related security incidents in the past year. Seventy percent now run generative AI in production. Only 5% have full visibility into AI usage. Only 14% actively enforce and audit AI security policies.</p><p><strong>Why it matters</strong></p><ul><li><p>AI adoption has structurally outpaced security architecture at a board-level scale.</p></li><li><p>Shadow AI is no longer a future risk. It is the current operating reality.</p></li><li><p>Vendors building generative AI security controls now have a credible commercial story for board-level investment cases.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Run an AI usage discovery exercise this quarter. You cannot govern what you cannot see.</p></li><li><p>Tie generative AI policy enforcement to identity controls and DLP systems rather than standalone AI proxies.</p></li><li><p>Make AI visibility metrics a recurring agenda item for your risk committee with a 12-month target.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Almost 80% of organizations have already had an AI security incident, and only 1 in 20 know what AI is running across their environment. We are in the consequences phase, and most security teams are still arguing about which AI proxy vendor to evaluate. The fix is not buying another tool. The fix is treating AI like data, applying the same identity, access, and monitoring discipline you apply to every critical workload. More at https://www.rockcyber.com.</p><h3>6. 
TrapDoor Supply Chain Attack Poisons AI Coding Assistants</h3><p>Researchers at Socket and partner firms disclosed TrapDoor, a coordinated supply chain campaign that pushed more than 34 malicious packages across npm, PyPI, and Crates.io (The Hacker News, Socket, Phoenix Security). The earliest package appeared on May 22, 2026. TrapDoor&#8217;s novel component injects hidden instructions into .cursorrules and CLAUDE.md files using zero-width Unicode characters. The payload looks invisible in a code editor. AI coding assistants process the hidden text as live prompts. The campaign also opened pull requests against open-source AI projects including LangChain, MetaGPT, LangFlow, and OpenHands.</p><p><strong>Why it matters</strong></p><ul><li><p>The attack weaponizes the AI coding assistant itself as the execution layer, a new class of supply chain compromise.</p></li><li><p>Existing software composition analysis tooling does not detect zero-width Unicode payloads in editor configuration files.</p></li><li><p>Open-source AI orchestration projects are now an active target for adversary-supplied configuration via pull request.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Scan repositories for non-printable Unicode characters in AI assistant config files including .cursorrules and CLAUDE.md.</p></li><li><p>Treat AI assistant configuration files as security-sensitive artifacts subject to code review and CI controls.</p></li><li><p>Restrict outbound traffic from developer machines and CI/CD systems to known good destinations.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The AI coding assistant is a trusted, privileged execution context with access to credentials, source code, and tokens. Compromise its configuration and you compromise everything the assistant can touch. 
Pin versions, review changes, restrict what assistants can read and write, and treat any pull request touching .cursorrules or CLAUDE.md as malicious until proven otherwise.</p><h3>7. Malicious npm Package Targets Claude AI User Directory</h3><p>Researchers disclosed on May 27, 2026 a malicious npm package called &#8220;mouse5212-super-formatter&#8221; designed to exfiltrate files from /mnt/user-data, the directory Claude AI uses for user uploads and outputs (The Hacker News, The Register). The campaign, named Malware-Slop, walks the directory and uploads every file through the GitHub Contents API. The attacker leaked their own GitHub private token, which let OX Security trace the stolen data. The package reached 676 downloads before npm removed it.</p><p><strong>Why it matters</strong></p><ul><li><p>Threat actors are now writing supply chain malware that specifically targets AI assistant user data directories.</p></li><li><p>AI-generated malware is creating new operational security mistakes that defenders can sometimes exploit.</p></li><li><p>Claude users who installed the package have working sessions, uploads, and outputs exposed to the attacker.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Inventory which developers use Claude or similar AI tools with a user-data directory and audit recent package installs.</p></li><li><p>Add file integrity monitoring and outbound network controls on AI assistant working directories.</p></li><li><p>Require token scoping reviews for any developer credential an AI agent might use.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The operator burned themselves with a leaked GitHub token. The next operator will not make that mistake. AI assistant working directories are now a named target. If your developers run Claude, Cursor, Copilot, or any equivalent, those tools have privileged access to source code and uploaded files. Treat the AI assistant runtime like a privileged build server.</p><h3>8. 
Microsoft Warns of AI Chatbot Cryptojacking Campaign</h3><p>Microsoft published a threat advisory on May 26, 2026, detailing an active cryptojacking campaign that uses AI chatbot interactions to deliver malicious download links (Microsoft, Help Net Security). Users searching for system utility software were directed to attacker-controlled lookalike sites through poisoned search results and AI chatbot responses. The archive served by those sites contains a legitimate utility plus a malicious DLL that sideloads a fake Visual C++ Redistributable and installs ScreenConnect for persistent remote access.</p><p><strong>Why it matters</strong></p><ul><li><p>AI chatbot recommendations now sit alongside search results as an attack surface for SEO-poisoning-style campaigns.</p></li><li><p>Legitimate software brands plus credible AI responses bypass user skepticism that traditional malvertising would trigger.</p></li><li><p>Persistent ScreenConnect access means cryptomining is only the visible threat, with data theft available on demand.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Block known cryptojacking command and control infrastructure and watch for unauthorized ScreenConnect installations.</p></li><li><p>Educate users on verifying download URLs even when they come from AI chatbot suggestions.</p></li><li><p>Prevent employees from installing system utilities outside an approved software catalog.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>People have been trained for two decades to distrust the top three Google search ads. They have not been trained to distrust an AI chatbot suggesting a download link inside a friendly conversation. The cryptojacking is the low-stakes test. The ScreenConnect persistence is the actual play. Outbound traffic to AI services needs the same scrutiny you give to any shadow IT category.</p><h3>9. 
Anthropic Signals Plans for Public Mythos-Class Release</h3><p>The Register reported on May 25, 2026, that Anthropic plans to release Mythos-class models to the public once stronger safeguards are in place, tied to the May 22 Project Glasswing announcement (The Register, Help Net Security). The Mythos preview was limited to a small group of trusted organizations due to its cybersecurity capabilities. Anthropic&#8217;s stated rationale is that defenders need access to the same tools attackers can build. Project Glasswing partners now exceed 50 organizations including Cloudflare and Mozilla.</p><p><strong>Why it matters</strong></p><ul><li><p>A widely available frontier model with proven offensive cyber capability changes the threat model for every software vendor.</p></li><li><p>Vendors without AI-assisted vulnerability discovery in their SDLC will fall behind attackers using the same tooling for free.</p></li><li><p>EU and UK regulators are likely to revisit gatekeeping rules for high-capability cyber models if the defender-attacker gap closes.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Assume Mythos-class capability reaches motivated attackers within 18 months. Plan patch cadence around that timeline.</p></li><li><p>Engage your AppSec vendor on AI-assisted vulnerability discovery roadmaps tied to broader model availability.</p></li><li><p>Track Anthropic&#8217;s Responsible Scaling Policy updates as a leading indicator of public release timing.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Anthropic has a model that finds vulnerabilities faster than human researchers, they want defenders to have access, and they cannot release it without arming any nation state with the same capability. Holding it inside a curated partner program is the right short-term move. Start building the operational muscle now to consume an order of magnitude more findings.</p><h3>10. 
Help Net Security Adds Detail on Enterprise AI Governance Failure</h3><p>Help Net Security published a follow-up on May 28, 2026 on the Check Point 2026 Cloud Security Report, noting that more than half of companies have experienced at least one AI-related security incident (Help Net Security). The most common categories were unauthorized or shadow AI use, AI-generated phishing and deepfake content, and sensitive data leaks tied to AI services. Some companies permit source code in generative AI tools. Many cannot trace sensitive data flow through AI processing environments.</p><p><strong>Why it matters</strong></p><ul><li><p>AI policy and AI control are two different things in most organizations.</p></li><li><p>The categories of AI-related incidents match the threat model security teams have been describing for a year, meaning predicted incidents are now actual.</p></li><li><p>Source code exposure through generative AI tools is a top-line legal and IP issue, not a future risk.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Set a quarterly AI incident review cadence in your risk committee, broken out by incident category.</p></li><li><p>Implement DLP controls on outbound traffic to consumer generative AI services and require enterprise tenant routing.</p></li><li><p>Require legal review of generative AI tool acceptable use policies focused on IP, training data rights, and breach notification.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The added detail is the part you take to your board. AI-related incidents are happening now, the categories are predictable, and most organizations are not running controls that would catch them. Read your generative AI vendor contracts again. 
The second time around you should walk in with a list of demands.</p><h3>The One Thing You Won&#8217;t Hear About But You Need To: Cisco Quietly Rewrites Its Vulnerability Disclosure for the AI Era</h3><p>Cisco&#8217;s security blog published a post on May 22, 2026, with Help Net Security follow-up on May 25, announcing changes to vulnerability disclosure in the AI era (Cisco Blogs, Help Net Security). For internally found vulnerabilities assessed as lower likelihood and lower impact, Cisco said it &#8220;may change the level of detail shared,&#8221; with some bugs that would have warranted a standalone advisory no longer getting one. Cisco will post high-level data on its website pointing customers toward security-hardened releases.</p><p><strong>Why it matters</strong></p><ul><li><p>A major networking vendor is moving toward suppressing standalone disclosure of lower-rated vulnerabilities.</p></li><li><p>Enterprise vulnerability management programs that rely on vendor advisories will see a coverage drop on Cisco issues.</p></li><li><p>The shift is likely to be followed by other vendors as AI-assisted discovery generates more findings than traditional disclosure can support.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Update vendor patching policy to prioritize installation of all security-hardened releases, not just releases addressing named advisories.</p></li><li><p>Track which other major vendors are making similar disclosure changes. Adjust your asset inventory to match.</p></li><li><p>Push vendor management to ask hard questions during renewals about disclosure practices and AI-discovered vulnerability handling.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>This is the story everyone slept on, and it will bite enterprise vulnerability management teams in the next two quarters. Cisco found a polite way to say that AI is generating too many internal findings to disclose every one in the old format. 
Patch by release, not by advisory. Cisco will not be the last vendor to do this. The vendors will not give you transparency unless you make them.</p><p>&#128073; For ongoing analysis of agentic AI governance frameworks, the conversation continues at <strong><a href="https://rockcybermusings.com/">RockCyber Musings</a></strong>.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help with your traditional Cybersecurity and AI Security and Governance journey.</p><p>&#128073; Want to save a quick $100K? Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><p>&#128073; As a bonus, check out my conversation with <strong><a href="https://aicybermagazine.com/">AI Cyber Magazine, </a></strong>where we talked about everything from Context Rot to Least Agency.</p><p>&#128227;&#128227;&#128227; <em><strong>The Weekly Musings will take a week off next week as I am taking a very much needed vacation</strong></em> &#128227;&#128227;&#128227; </p><p><em>The views and opinions expressed in RockCyber Musings are my own and do not represent the positions of my employer or any organization I&#8217;m affiliated with.</em></p><div id="youtube2-091_b2qep9M" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;091_b2qep9M&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/091_b2qep9M?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div>
<h2>References</h2><p>Agent Control Standard. (2026). <em>Agent Control Standard specification and community resources</em>. https://agentcontrolstandard.ai/</p><p>Anthropic. (2026, May 22). <em>Project Glasswing: An initial update</em>. https://www.anthropic.com/research/glasswing-initial-update</p><p>Axios. (2026, May 22). <em>Read the AI executive order thwarted by Trump tech allies</em>. https://www.axios.com/2026/05/22/ai-executive-order-cancelled-white-house</p><p>Axios. (2026, May 26). 
<em>CISA takes backseat in White House AI cyber response</em>. https://www.axios.com/2026/05/26/cisa-white-house-cybersecurity-ai</p><p>BusinessWire. (2026, May 27). <em>Agent Control Standard launches open framework for runtime governance of AI agents</em>. https://www.businesswire.com/news/home/20260527326259/en/Agent-Control-Standard-Launches-Open-Framework-for-Runtime-Governance-of-AI-Agents</p><p>Check Point Software. (2026, May 26). <em>AI adoption creates critical cloud security gaps for enterprises, new Check Point report shows</em>. https://www.checkpoint.com/press-releases/ai-adoption-creates-critical-cloud-security-gaps-for-enterprises-new-check-point-report-shows/</p><p>Cisco. (2026, May 22). <em>Cisco&#8217;s risk-based vulnerability disclosure in the age of AI</em>. Cisco Blogs. https://blogs.cisco.com/security/ciscos-risk-based-vulnerability-disclosure-in-the-age-of-ai</p><p>CNBC. (2026, May 21). <em>Trump postpones AI executive order signing: &#8216;I didn&#8217;t like certain aspects&#8217;</em>. https://www.cnbc.com/2026/05/21/trump-ai-executive-order-postponed.html</p><p>CSO Online. (2026, May 26). <em>Project Glasswing has uncovered 10,000 vulnerabilities: Anthropic</em>. https://www.csoonline.com/article/4176865/project-glasswing-has-uncovered-10000-vulnerabilities-anthropic.html</p><p>Help Net Security. (2026, May 25). <em>Cisco refines its risk-based vulnerability disclosure for the AI era</em>. https://www.helpnetsecurity.com/2026/05/25/cisco-risk-based-vulnerability-disclosure-ai/</p><p>Help Net Security. (2026, May 26). <em>Anthropic: Claude Mythos identified 10,000+ software flaws</em>. https://www.helpnetsecurity.com/2026/05/26/anthropic-project-glasswing-update/</p><p>Help Net Security. (2026, May 27). <em>AI chatbot recommendations lure users to cryptojacking malware sites</em>. https://www.helpnetsecurity.com/2026/05/27/ai-chatbot-cryptojacking-campaign/</p><p>Help Net Security. (2026, May 28). 
<em>Companies built AI into core systems before figuring out how to govern it</em>. https://www.helpnetsecurity.com/2026/05/28/check-point-genai-security-controls-report/</p><p>Microsoft Security Blog. (2026, May 26). <em>From poisoned search results to GPU mining: A cryptojacking campaign abusing ScreenConnect and Microsoft .NET utilities</em>. https://www.microsoft.com/en-us/security/blog/2026/05/26/poisoned-search-results-gpu-mining-cryptojacking-campaign-abusing-screenconnect-microsoft-net-utilities/</p><p>Newsmax. (2026, May 26). <em>CISA faces AI threat wave amid deep staffing cuts</em>. https://www.newsmax.com/politics/cisa-sean-plankey-ai/2026/05/26/id/1257509/</p><p>NPR. (2026, May 22). <em>Trump cancels AI executive order signing</em>. https://www.npr.org/2026/05/22/nx-s1-5829908/trump-cancels-ai-executive-order-signing</p><p>Phoenix Security. (2026, May). <em>TrapDoor supply chain attack: AI poisoning via npm, PyPI, Crates</em>. https://phoenix.security/trapdoor-supply-chain-ai-poisoning-npm-pypi-crates/</p><p>PR Newswire. (2026, May 26). <em>AI adoption creates critical cloud security gaps for enterprises, new Check Point report shows</em>. https://www.prnewswire.com/news-releases/ai-adoption-creates-critical-cloud-security-gaps-for-enterprises-new-check-point-report-shows-302780612.html</p><p>Socket. (2026, May). <em>TrapDoor crypto stealer supply chain attack hits 34 packages across npm, PyPI, Crates.io</em>. https://socket.dev/blog/trapdoor-crypto-stealer-npm-pypi-crates</p><p>The Hacker News. (2026, May). <em>TrapDoor supply chain attack spreads credential-stealing malware via npm, PyPI, CratesIO</em>. https://thehackernews.com/2026/05/trapdoor-supply-chain-attack-spreads.html</p><p>The Hacker News. (2026, May 27). <em>Malicious npm package stole files from Claude AI user directory via GitHub</em>. https://thehackernews.com/2026/05/malicious-npm-package-stole-files-from.html</p><p>The Hacker News. (2026, May 27). 
<em>AI chatbot recommendations redirect users to cryptojacking malware sites</em>. https://thehackernews.com/2026/05/ai-chatbot-recommendations-redirect.html</p><p>The Register. (2026, May 25). <em>Anthropic to release Mythos-class models to the public</em>. https://www.theregister.com/security/2026/05/25/anthropic-to-release-mythos-class-models-to-the-public/5245596</p><p>The Register. (2026, May 27). <em>Malware dev tries to steal Claude users&#8217; secrets, writes npm slop, leaks own GitHub private token</em>. https://www.theregister.com/cyber-crime/2026/05/27/supply-chain-brain-drain-npm-attacker-foolishly-leaks-own-github-private-token/5247424</p><p>VMblog. (2026, May 27). <em>Agent Control Standard launches open framework for runtime governance of AI agents</em>. https://vmblog.com/news/agent-control-standard-launches-open-framework-for-runtime-governance-of-ai-agents/</p>]]></content:encoded></item><item><title><![CDATA[AI Security Maturity Model: Your Score Is Fiction]]></title><description><![CDATA[See how the SANS AI Security Maturity Model exposes inflated scores with cap rules and evidence ceilings.]]></description><link>https://www.rockcybermusings.com/p/ai-security-maturity-model-your-score-is-fiction</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/ai-security-maturity-model-your-score-is-fiction</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Tue, 26 May 2026 12:50:23 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Ul9C!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d469f77-f183-4e62-bf17-8d8e8c5caf39_2048x2048.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!Ul9C!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d469f77-f183-4e62-bf17-8d8e8c5caf39_2048x2048.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ul9C!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d469f77-f183-4e62-bf17-8d8e8c5caf39_2048x2048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Ul9C!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d469f77-f183-4e62-bf17-8d8e8c5caf39_2048x2048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Ul9C!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d469f77-f183-4e62-bf17-8d8e8c5caf39_2048x2048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Ul9C!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d469f77-f183-4e62-bf17-8d8e8c5caf39_2048x2048.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ul9C!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d469f77-f183-4e62-bf17-8d8e8c5caf39_2048x2048.jpeg" width="1456" height="1456" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2d469f77-f183-4e62-bf17-8d8e8c5caf39_2048x2048.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:727659,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/198980875?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d469f77-f183-4e62-bf17-8d8e8c5caf39_2048x2048.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ul9C!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d469f77-f183-4e62-bf17-8d8e8c5caf39_2048x2048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Ul9C!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d469f77-f183-4e62-bf17-8d8e8c5caf39_2048x2048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Ul9C!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d469f77-f183-4e62-bf17-8d8e8c5caf39_2048x2048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Ul9C!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d469f77-f183-4e62-bf17-8d8e8c5caf39_2048x2048.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div>
<p>Your AI security maturity score is probably inflated. Half of security organizations already run AI in production, and most can&#8217;t prove it&#8217;s governed. The <a href="https://www.sans.org/white-papers/sans-2025-ai-survey-measuring-ai-impact-security-three-years-later">SANS 2025 AI Survey</a> reports that 50% of organizations use AI for security work today, another 30% will within a year, and only 35% run a formal AI risk and compliance program. That gap is where fake maturity lives. 
The new <a href="https://www.sans.org/mlp/2026-ai-security-maturity-model-ebook">SANS AI Security Maturity Model </a>exists to drag your real numbers into daylight, and the way it does that is the most useful part of the book.</p><h2>Your AI Security Maturity Score Is Fiction</h2><p>You&#8217;ve seen the slide. It says Stage 3, maybe Stage 4, green across the board, shown to leadership with the confidence of someone who has never sat through an audit. 
Then a regulator calls or an incident hits, and the green turns out to be a vendor logo on a PDF and a policy nobody has opened since onboarding week.</p><p>I reviewed this model before release, alongside a group of practitioners SANS brought in. <a href="https://www.linkedin.com/in/chrishvm">Chris Cochran</a>, SANS Field CISO and VP of AI Security, wrote it. He co-authored prEN 18282 and leads the Agentic AI Task Force on the OWASP AI Exchange, where we collaborate, so the structure comes from someone who has done the work rather than someone selling a dashboard. Treat me as a biased but informed guide, and check my reasoning against the model yourself.</p><p>The model runs three pillars, Protect, Utilize, and Govern, across five stages from ad hoc to leading. That part looks like every maturity model you&#8217;ve met. The difference sits in the scoring, and it&#8217;s where the book earns its keep. The model assumes you&#8217;re inflating your score, then builds two mechanisms that force you to prove you aren&#8217;t. Once you see them, you can&#8217;t unsee how thin most Stage 3 programs turn out to be.</p><p>The survey data backs the suspicion. Adoption is sprinting, and governance is crawling behind it. When half your peers have AI in the building, and barely a third have a governance program, the difference is a pile of unmanaged risk wearing a maturity badge. A model that strips the badge off does you a favor, even when the result stings. The sting is the signal it&#8217;s working.</p><p>If you&#8217;ve ever watched a maturity self-assessment sail through a steering committee and wondered who was going to check it, you&#8217;re the reader this model was built for. You already suspected the number was soft. Here&#8217;s the structure that proves it.</p><h2>Two Old Disciplines The Model Smuggles In</h2><p>Nothing about the two mechanisms is new, and that&#8217;s the point. 
Cochran took two disciplines that security has trusted for 30 years and aimed them at AI, where the discourse has been mostly threat theater and Stage 4 daydreams.</p><p>The first mechanism is weakest-link capping. Your overall stage can&#8217;t float more than one level above your weakest pillar, and it can&#8217;t exceed your Govern score by more than one stage. Score Stage 1 in Govern, and your Stage 4 detection tooling earns you a Stage 2 overall. No amount of tooling spend changes that. Anyone who survived CMMI or NIST tiering immediately recognizes this move. Maturity behaves like a capped minimum, not an average. The averaging instinct is the seductive error here, because averaging lets a strong pillar paper over a weak one, and a program with world-class detection and absent governance is exactly the program that ends up in the headlines. The cap rules refuse the trade.</p><p>Walk through a concrete example. A team rates itself Stage 4 on Utilize because it runs AI-assisted detection across the SOC. Protect lands at Stage 3. Govern sits at Stage 1, because there&#8217;s no AI risk committee, no model inventory, and no documented oversight. The averaging story takes (4 + 3 + 1) / 3, rounds up, and says Stage 3. The model says the Govern floor caps you at Stage 2, one stage above your Govern score, and the minimum-pillar rule lands on the same number because Govern is also your weakest pillar. Your real maturity is the weakest load-bearing wall, not the prettiest room in the house.</p><p>The second mechanism is the evidence ceiling. Any capability you self-report without documentary evidence is capped at Stage 2. Without an approved policy, audit logs, or a dashboard export, you don&#8217;t get Stage 3. This is plain controls-testing doctrine, the same standard a SOC 2 auditor applies to a control narrative. Assertion isn&#8217;t evidence, and the model writes the rule into the scoring so you can&#8217;t talk past it. The discipline this imposes is quiet and real. Before you claim a stage, you go find the artifact. 
Half the time, the search itself tells you the honest answer, because the artifact doesn&#8217;t exist.</p><p>Put both mechanisms together, and your number drops, often by a full stage. That&#8217;s the system working. An honest Stage 2 you can defend in a board room beats a fictional Stage 3 that disintegrates the first time someone asks for proof.</p><p>The independent data agrees on which pillar carries the weight. The <a href="https://cloudsecurityalliance.org/press-releases/2025/12/18/csa-and-google-cloud-study-finds-governance-maturity-is-strongest-predictor-of-ai-readiness">CSA and Google Cloud State of AI Security and Governance Survey</a>, published December 18, 2025, found organizations with comprehensive policies nearly twice as likely to report early agentic AI adoption, 46%, against 25% for partial guidelines and 12% for policies still in development. Governance maturity, not tooling spend, predicts who&#8217;s ready to move. The model&#8217;s governance floor encodes that finding into the math instead of leaving it as advice.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!poOE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c5cdfba-2504-4038-826a-6c52e47f9872_828x1962.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!poOE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c5cdfba-2504-4038-826a-6c52e47f9872_828x1962.png 424w, https://substackcdn.com/image/fetch/$s_!poOE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c5cdfba-2504-4038-826a-6c52e47f9872_828x1962.png 848w, 
https://substackcdn.com/image/fetch/$s_!poOE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c5cdfba-2504-4038-826a-6c52e47f9872_828x1962.png 1272w, https://substackcdn.com/image/fetch/$s_!poOE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c5cdfba-2504-4038-826a-6c52e47f9872_828x1962.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!poOE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c5cdfba-2504-4038-826a-6c52e47f9872_828x1962.png" width="460" height="1090" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7c5cdfba-2504-4038-826a-6c52e47f9872_828x1962.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1962,&quot;width&quot;:828,&quot;resizeWidth&quot;:460,&quot;bytes&quot;:128131,&quot;alt&quot;:&quot;Flowchart showing a claimed Stage 4 score passing through the evidence ceiling, governance floor, and minimum-pillar cap rules and ending at a real Stage 2&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/198980875?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c5cdfba-2504-4038-826a-6c52e47f9872_828x1962.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Flowchart showing a claimed Stage 4 score passing through the evidence ceiling, governance floor, and minimum-pillar cap rules and ending at a real Stage 2" title="Flowchart showing a claimed Stage 4 score passing through the evidence ceiling, governance floor, and minimum-pillar cap 
rules and ending at a real Stage 2" srcset="https://substackcdn.com/image/fetch/$s_!poOE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c5cdfba-2504-4038-826a-6c52e47f9872_828x1962.png 424w, https://substackcdn.com/image/fetch/$s_!poOE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c5cdfba-2504-4038-826a-6c52e47f9872_828x1962.png 848w, https://substackcdn.com/image/fetch/$s_!poOE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c5cdfba-2504-4038-826a-6c52e47f9872_828x1962.png 1272w, https://substackcdn.com/image/fetch/$s_!poOE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c5cdfba-2504-4038-826a-6c52e47f9872_828x1962.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Figure 1: How A Claimed Stage 4 Becomes A Real Stage 2</figcaption></figure></div>
<h2>Why Blocking AI Makes You Less Safe</h2><p>Plenty of programs believe they&#8217;ve handled AI risk by banning it. The model has a name for that posture, the Framework of No, and it parks you at Stage 2 by design. Prohibition feels like control. The data says it manufactures blind spots.</p><p><a href="https://www.upguard.com/resources/the-state-of-shadow-ai">UpGuard&#8217;s State of Shadow AI</a> found 81% of employees and 88% of security leaders using unapproved AI tools, with 45% of workers routing around blocks to reach the applications they want. Software AG found that 46% of shadow AI users would keep using the tools even if their employer banned them outright. Read those numbers together, and the picture turns grim for the prohibition crowd. Your own security leaders sit inside the shadow AI population. Your block list is a polite suggestion. Every ban you can&#8217;t enforce converts visible usage into invisible usage, which runs precisely backward from the outcome you wanted.</p><p>The model treats prohibition as immaturity rather than rigor, and the staging reflects it. Higher stages move from blocking toward visibility, sanctioned tooling, and monitored usage, because you can only govern what you can see. A CISO who bans AI and reports the ban as a control runs a Stage 1 program narrating a Stage 3 story. 
The evidence ceiling catches it the instant someone asks for proof that the ban holds, and the proof never comes because usage moved to personal devices and unmanaged accounts the day after the policy shipped.</p><p>The harsh reality is that the C-suite is often the worst offender, pasting board material and deal terms into consumer chatbots while security writes memos no one reads. A maturity model that scores prohibition honestly hands you the language to take that conversation upstairs, where the real shadow AI risk usually lives.</p><h2>The Question Almost Nobody Can Answer</h2><p>The Govern pillar carries a question that empties most rooms. When your AI agent causes harm, can you produce the audit trail proving the chain of authority from the human who authorized it to the action the agent took?</p><p>Sit with that, because the numbers are moving against you in a hurry. <a href="https://www.cybersecuritytribe.com/news/research-reveals-44-growth-in-nhis-from-2024-to-2025">Entro Labs&#8217; H1 2025 research</a> found non-human identities grew 44% year over year and now outnumber humans 144 to 1 in cloud-native environments, up from 92 to 1 a year earlier. Rubrik Zero Labs, surveying 1,625 security decision-makers, found 89% have already wired AI agents into their identity infrastructure, while most concede they lack governance for those machine credentials. The agents are in production. The accountability layer is not.</p><p>Producing that audit trail demands plumbing that most programs skipped on the way to the demo. You need a durable identity for every agent, distinct from the human who delegated it, with a lifecycle someone owns from creation to revocation. You need structured logging that carries a trace ID through every reasoning step and every tool call, so that one agent action reconstructs end-to-end rather than dissolving into disconnected log lines. 
You need the decision artifacts: what the agent was asked, what it chose, what scope it held, and what it touched, all retained long enough to satisfy an investigator who shows up a year later. Most shops have a shared service account, a static key, and a log that stops at the API gateway.</p><p>Picture the incident. An agent with delegated access moves money, deletes records, or leaks a dataset. Legal asks who authorized the action and on what basis. You can name the human who kicked off the workflow, but you can&#8217;t show the scope they granted, the policy that bound the agent, or the reasoning that led to the action. That&#8217;s a Stage 1 program that bought good tooling and skipped the accountability spine, wearing a Stage 4 costume.</p><p>This is the part of the model I&#8217;d hand to any team standing up agents this quarter. Build the identity, the trace, and the decision record before the first agent touches production, because reconstructing the trail after an incident costs an order of magnitude more than instrumenting it up front, and regulators have stopped accepting &#8220;we couldn&#8217;t tell&#8221; as an answer.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xkxm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F247dce97-4947-4e2a-98a4-2b10b1dca267_2352x330.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xkxm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F247dce97-4947-4e2a-98a4-2b10b1dca267_2352x330.png 424w, 
https://substackcdn.com/image/fetch/$s_!xkxm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F247dce97-4947-4e2a-98a4-2b10b1dca267_2352x330.png 848w, https://substackcdn.com/image/fetch/$s_!xkxm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F247dce97-4947-4e2a-98a4-2b10b1dca267_2352x330.png 1272w, https://substackcdn.com/image/fetch/$s_!xkxm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F247dce97-4947-4e2a-98a4-2b10b1dca267_2352x330.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xkxm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F247dce97-4947-4e2a-98a4-2b10b1dca267_2352x330.png" width="728" height="102" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/247dce97-4947-4e2a-98a4-2b10b1dca267_2352x330.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:204,&quot;width&quot;:1456,&quot;resizeWidth&quot;:728,&quot;bytes&quot;:66848,&quot;alt&quot;:&quot;Left-to-right diagram tracing human authorization through scope binding, agent identity, action, and trace logging to a retained audit artifact, with the common break point highlighted&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/198980875?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F247dce97-4947-4e2a-98a4-2b10b1dca267_2352x330.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Left-to-right diagram tracing 
human authorization through scope binding, agent identity, action, and trace logging to a retained audit artifact, with the common break point highlighted" title="Left-to-right diagram tracing human authorization through scope binding, agent identity, action, and trace logging to a retained audit artifact, with the common break point highlighted" srcset="https://substackcdn.com/image/fetch/$s_!xkxm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F247dce97-4947-4e2a-98a4-2b10b1dca267_2352x330.png 424w, https://substackcdn.com/image/fetch/$s_!xkxm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F247dce97-4947-4e2a-98a4-2b10b1dca267_2352x330.png 848w, https://substackcdn.com/image/fetch/$s_!xkxm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F247dce97-4947-4e2a-98a4-2b10b1dca267_2352x330.png 1272w, https://substackcdn.com/image/fetch/$s_!xkxm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F247dce97-4947-4e2a-98a4-2b10b1dca267_2352x330.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption">Figure 2: The Audit Trail From Human Authorization To Agent Action</figcaption></figure></div><h2>The World&#8217;s Moving Fast, So Here&#8217;s Where I&#8217;d Push v2</h2><p>I like this model. I reviewed it because I wanted it to exist, and liking something doesn&#8217;t mean calling it finished. The ground is shifting under every one of us, so here&#8217;s where I&#8217;d push the next version.</p><p>The timeline estimates run directionally. 
The model offers stage-progression guidance that reads as reasonable, Cochran flags it as directional, and a team under pressure will quote those numbers to a budget committee as if a stopwatch produced them. V2 should ground the timelines in survey data or label them as planning heuristics in much louder type.</p><p>Parts of the scoring rely more on practitioner judgment than on published benchmarks. That&#8217;s a fair call for a first edition built from field experience, and it&#8217;s the soft spot a skeptical CISO will press. Tying more of the rubric to external evidence, the kind the SANS survey itself generates, would harden the model against the &#8220;says who&#8221; reflex.</p><p>Stage 5 reads as aspirational for nearly the entire field. Almost no program clears it today, which works for a north star and is risky for a target, because an out-of-reach top stage invites the same score inflation that the rest of the model fights. V2 could split the leading practice we observe in the field from the theoretical ceiling, so teams stop grading themselves against a horizon.</p><p>The pillar weighting profiles arrive asserted rather than derived. The model hands you weightings for different organization types, which helps, and the reasoning behind the specific numbers stays under the hood. Publishing that rationale would let practitioners tune the weights to their own risk profile instead of inheriting a default they can&#8217;t interrogate.</p><p>None of this is fatal. All of it is the ordinary distance between a strong v1 and a sharper v2, and the regulatory clock is why closing that distance matters. The clock is messier than the headlines suggest. Prohibited practices and general-purpose AI rules are already in force. The heavy obligations for high-risk systems under Annex III were set for August 2, 2026, and on May 7, 2026, EU lawmakers reached a provisional agreement to push them to December 2, 2027, pending formal adoption. 
Until that adoption lands, the original date still stands in the law. Read the slip as breathing room and nothing more. The standards bodies needed the extra runway, and so do most programs. A maturity model that finds your real stage now beats a scramble when the deferral runs out.</p><p><strong>Key Takeaway:</strong> This model wins by assuming your AI security maturity is inflated and forcing you to prove otherwise, and an honest score you can defend in front of a regulator beats a flattering one that collapses the moment someone asks for evidence.</p><h3>What to do next</h3><p><strong><a href="https://www.sans.org/mlp/2026-ai-security-maturity-model-ebook">Download</a></strong> the model and score yourself against it without flinching. Walk each pillar on its own, gather the evidence before you assign a stage, then run the cap rules and watch your number settle where it belongs. Get the full <strong><a href="https://www.sans.org/mlp/2026-ai-security-maturity-model-ebook">2026 SANS AI Security Maturity Model ebook here</a></strong>. If you&#8217;d rather not surrender an email yet, the executive summary linked off that page reads ungated.</p><p>Once you have a real baseline, the work shifts from scoring to strategy, which is the layer my <a href="https://www.rockcyber.com/ai-strategy-and-governance">CARE framework</a> lives in, moving a program from where it scores to where it needs to be. 
For the agent-accountability gap in particular, I&#8217;ve written more on threat-modeling the systems you&#8217;re rushing into production over at <a href="https://www.rockcybermusings.com">RockCyber Musings</a>.</p><p>&#128073; Subscribe for more AI security and governance insights with the occasional rant.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/subscribe?"><span>Subscribe now</span></a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&quot;,&quot;text&quot;:&quot;Share RockCyber Musings&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share"><span>Share RockCyber Musings</span></a></p><p>&#128073; For ongoing analysis of agentic AI governance frameworks, the conversation continues at <strong><a href="https://rockcybermusings.com/">RockCyber Musings</a></strong>.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help with your traditional Cybersecurity and AI Security and Governance journey.</p><p>&#128073; Want to save a quick $100K? 
Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><p>&#128073; As a bonus, check out my conversation with <strong><a href="https://aicybermagazine.com/">AI Cyber Magazine, </a></strong>where we talked about everything from Context Rot to Least Agency.</p><p><em>The views and opinions expressed in RockCyber Musings are my own and do not represent the positions of my employer or any organization I&#8217;m affiliated with.</em></p><div id="youtube2-091_b2qep9M" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;091_b2qep9M&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/091_b2qep9M?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><h2></h2>]]></content:encoded></item><item><title><![CDATA[Weekly Musings Top 10 AI Security Wrapup: Issue 39 May 15-May 21, 2026]]></title><description><![CDATA[The week Washington blinked, Anthropic blinked back, and the AI supply chain caught fire]]></description><link>https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260515-20260521</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260515-20260521</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Fri, 22 May 2026 12:50:51 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!mp5k!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb30a6bbb-c1cf-4fc8-9067-79f04c660af6_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!mp5k!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb30a6bbb-c1cf-4fc8-9067-79f04c660af6_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mp5k!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb30a6bbb-c1cf-4fc8-9067-79f04c660af6_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!mp5k!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb30a6bbb-c1cf-4fc8-9067-79f04c660af6_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!mp5k!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb30a6bbb-c1cf-4fc8-9067-79f04c660af6_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!mp5k!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb30a6bbb-c1cf-4fc8-9067-79f04c660af6_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mp5k!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb30a6bbb-c1cf-4fc8-9067-79f04c660af6_1024x1024.png" width="1024" height="1024" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b30a6bbb-c1cf-4fc8-9067-79f04c660af6_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1233556,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/198785625?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb30a6bbb-c1cf-4fc8-9067-79f04c660af6_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mp5k!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb30a6bbb-c1cf-4fc8-9067-79f04c660af6_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!mp5k!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb30a6bbb-c1cf-4fc8-9067-79f04c660af6_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!mp5k!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb30a6bbb-c1cf-4fc8-9067-79f04c660af6_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!mp5k!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb30a6bbb-c1cf-4fc8-9067-79f04c660af6_1024x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260515-20260521?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260515-20260521?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p>The executive branch stalled. The supply chain bled. Frontier model builders started negotiating with central bankers. Trump tore up his own AI executive order hours before signing. Anthropic agreed to brief the Financial Stability Board on what its Mythos model can produce. A worm called Mini Shai-Hulud chewed through npm, the Nx Console extension, GitHub&#8217;s internal repositories, Grafana&#8217;s source code, and a slice of OpenAI&#8217;s developer laptops.</p><p>The throughline has nothing to do with the technology. The story is the widening gap between capability and control. Washington wants speed and won&#8217;t write rules. The labs show off their offensive capabilities, then ask regulators to contain them. The supply chain runs on trust that nobody verifies. Identity systems pretend to have been built for AI agents. Here are ten to track, plus one you missed.</p><h3>1. 
Trump Pulls AI Executive Order Hours Before Signing</h3><p>On May 21, 2026, President Trump scrapped the signing ceremony for an AI executive order that would have created a voluntary review process for frontier models before public release (Axios). Trump told reporters the order &#8220;gets in the way&#8221; (CNBC). The draft covered a voluntary cybersecurity clearinghouse with Treasury and pre-deployment evaluation, giving federal agencies up to 90 days to test new models (Bloomberg). The Washington Post reported that infighting between economic and security advisers killed the timing.</p><p><strong>Why it matters</strong></p><ul><li><p>The voluntary framework was the lightest federal touch on frontier model safety. Killing it signals zero appetite for mandatory pre-deployment review.</p></li><li><p>The 90-day evaluation window was already a compromise. Some labs wanted 14 days.</p></li><li><p>The vacuum pulls states forward. Colorado&#8217;s SB 26-189 takes effect January 1, 2027.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Build your governance program assuming federal silence and state activity.</p></li><li><p>Inventory which AI vendors signed the prior CAISI agreements. Commitments still hold for OpenAI, Anthropic, Google, Microsoft, and xAI.</p></li><li><p>Document model-evaluation evidence from vendors. You&#8217;ll need it for state filings and customer audits.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Washington cannot govern faster than the labs ship. The voluntary EO was the security community&#8217;s best near-term win, killed in 24 hours over speed-versus-China optics. I&#8217;m not surprised. I&#8217;m tired. Treat federal AI governance as imaginary infrastructure. My longer take sits at <a href="https://rockcybermusings.com/">rockcybermusings.com</a>.</p><h3>2. 
Anthropic Agrees to Brief the Financial Stability Board on Mythos Findings</h3><p>On May 18, 2026, the Financial Times reported that Anthropic agreed to meet the Financial Stability Board (FSB) to discuss cyber vulnerability findings from its Claude Mythos Preview model (PYMNTS). The request came from Bank of England Governor Andrew Bailey. The G20 watchdog has worried that Mythos and similar models will expose weak spots in banks&#8217; cyber defenses (The Decoder). Anthropic says Mythos has identified thousands of high-severity vulnerabilities across every major operating system and web browser, with fallout that will be &#8220;severe&#8221; for economies and national security (TechRadar).</p><p><strong>Why it matters</strong></p><ul><li><p>Frontier labs are now in the room with central bank regulators on cyber risk. A structural change in who governs offensive AI capability.</p></li><li><p>The FSB shapes the Basel framework. Expect cyber-resilience requirements to grow teeth.</p></li><li><p>The financial sector is the canary. Whatever the FSB demands rolls downhill to every regulated industry.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Map your critical software stack against Anthropic&#8217;s flagged categories. Plan for compressed patch cycles.</p></li><li><p>Watch your home regulator for follow-on guidance. Bailey&#8217;s FSB brief will reverberate.</p></li><li><p>Build vulnerability backlog metrics into board reporting. The question has shifted from &#8220;are we vulnerable&#8221; to &#8220;how fast can we close known exposure.&#8221;</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The lab that built the dangerous capability is now negotiating with the regulators expected to contain it. A weird posture, half whistleblower, half hostage-taker. The FSB doesn&#8217;t normally touch software, so their interest signals cyber risk has crossed the systemic-threat line. 
I&#8217;ve spent thirty years in this field and never seen central bankers convene on a single AI vendor&#8217;s product. Model what happens when your regulator decides &#8220;model-discovered zero-days&#8221; is a category of systemic risk.</p><h3>3. Microsoft Open-Sources RAMPART and Clarity for Agent Safety</h3><p>On May 20, 2026, Microsoft released two open-source tools that push agent safety into the development pipeline (Microsoft Security Blog). RAMPART (Risk Assessment and Measurement Platform for Agentic Red Teaming) is a Pytest-native framework built on Microsoft&#8217;s PyRIT toolkit. It lets teams write CI-runnable adversarial tests against agents covering prompt injection, data exfiltration, and behavioral regressions (The Register). Clarity walks teams through assumptions and failure modes before they write agent code (The Hacker News).</p><p><strong>Why it matters</strong></p><ul><li><p>The first credible attempt by a hyperscaler to operationalize agent red-teaming inside the CI pipeline. Most &#8220;agent safety&#8221; tooling sits outside the SDLC.</p></li><li><p>Pytest integration matters. Agent safety tests look like every other test, which means engineers run them.</p></li><li><p>PyRIT was already the reference toolkit. RAMPART extending it makes Microsoft the de facto standard for agent adversarial testing.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Pilot RAMPART against your highest-risk agent. Pick the one with the broadest tool permissions.</p></li><li><p>Use Clarity in design reviews. Catching bad scope at the whiteboard is cheaper than catching it in production.</p></li><li><p>Add agent-safety test coverage to your AppSec metrics.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Microsoft did the right thing. They built the tools, open-sourced them, and put them where developers work. Most security tools fail because they sit outside the developer workflow. RAMPART has no such excuse. 
The question is whether your AppSec team has the political capital to make these tests blocking in CI. I cover the adoption muscle at <a href="https://www.rockcyber.com/">rockcyber.com</a>.</p><h3>4. GitHub Confirms 3,800 Internal Repos Breached via Nx Console</h3><p>On May 21, 2026, GitHub disclosed that 3,800 of its internal repositories were accessed through a developer&#8217;s compromised Nx Console VS Code extension, a casualty of the May 11 TanStack npm supply chain attack (BleepingComputer). Help Net Security traced the chain from the Mini Shai-Hulud worm through the GitHub and Grafana breaches. TechCrunch confirmed on May 20 that the attacker exfiltrated material from the affected employee&#8217;s repositories. The same campaign hit OpenAI, Mistral AI, UiPath, and dozens of downstream maintainers.</p><p><strong>Why it matters</strong></p><ul><li><p>GitHub&#8217;s own internal repos got popped through a VS Code extension. An IDE compromise now spans your entire engineering footprint.</p></li><li><p>The Nx Console extension lives on hundreds of thousands of developer machines. Every install is a potential entry point.</p></li><li><p>Second supply chain worm in 60 days chaining GitHub Actions misconfiguration with OIDC token theft. The pattern is the playbook.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Inventory IDE extensions across your engineering teams. Treat them like browser extensions, with allowlisting and version pinning.</p></li><li><p>Rotate GitHub OIDC tokens that have touched a developer machine in the past 60 days. Audit workflow files for pull_request_target patterns.</p></li><li><p>Revisit endpoint posture for developer laptops. The IDE is now an attack surface equivalent to a browser.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The supply chain conversation has changed shape. The attacker walks through a VS Code extension to reach repository tokens, then pivots to the corporate GitHub org. 
If your developer laptops live in an &#8220;engineering exception&#8221; bubble outside EDR, MDM, and identity controls, you&#8217;re the next Grafana. Put developer endpoint hygiene on par with finance.</p><h3>5. Grafana Labs Refuses Ransom After Codebase Theft</h3><p>On May 18, 2026, Grafana Labs confirmed an unauthorized party obtained a GitHub token and downloaded its codebase (TechCrunch). The intrusion traced back to the TanStack supply chain attack from May 11. Grafana received a ransom demand on May 16 and refused to pay (The Register), citing no guarantee the stolen data would be deleted. The company rotated tokens, audited every commit since May 11, and hardened GitHub posture (Grafana blog). No customer data was exposed.</p><p><strong>Why it matters</strong></p><ul><li><p>Refusing the ransom publicly is defensible. FBI guidance and peer disclosure make it the default for open-source vendors.</p></li><li><p>Grafana&#8217;s codebase is public anyway. The ransom value was reputational, and the company called the bluff.</p></li><li><p>The hardened posture published in the blog is a teaching artifact. Use it.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>If your codebase is open-source, write the ransom-refusal playbook before you need it. Brief your board.</p></li><li><p>Mirror Grafana&#8217;s recovery checklist. Rotate tokens, audit commits, harden GitHub config, increase monitoring.</p></li><li><p>Add commit-signing enforcement and require attestations on release artifacts.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>I respect what Grafana did. They confirmed quickly, refused the ransom, and published a postmortem with operational specifics. That&#8217;s how you turn a breach into a credibility win. Compare it with the usual vague disclosure six weeks late from a forensics firm hiding behind privilege. If your IR plan still treats ransom payment as a live option, you&#8217;re behind.</p><h3>6. 
Mini Shai-Hulud Worm Expands Across the npm Ecosystem</h3><p>On May 19, 2026, TechCrunch reported the Mini Shai-Hulud campaign had spread to dozens of additional open-source packages beyond the original TanStack hit. Wiz and Snyk traced the worm&#8217;s propagation through @squawk/* and @mistralai/* packages, on top of the 84 malicious versions across 42 @tanstack/* packages from May 11 (Wiz). StepSecurity attribution ties the same TeamPCP threat group to the March Trivy scanner compromise and April&#8217;s Bitwarden CLI package hit (Snyk). The campaign chains pull_request_target misconfiguration with GitHub Actions cache poisoning and OIDC token extraction.</p><p><strong>Why it matters</strong></p><ul><li><p>A self-propagating worm. It exfiltrates maintainer credentials and uses them to publish further malicious versions. Containment lags.</p></li><li><p>The same threat actor keeps finding new targets with the same attack pattern. The pattern is the problem.</p></li><li><p>Every downstream consumer of an affected package has a credential rotation event ahead.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Build a list of every npm package your org consumes, including transitive dependencies. Cross-reference against IOC lists from StepSecurity and Wiz.</p></li><li><p>Move CI secrets out of GitHub Actions environment variables. Use ephemeral, scoped tokens.</p></li><li><p>Block pull_request_target on any repository whose CI touches secrets. There is no safe configuration.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The worm pattern is the story. A compromised maintainer&#8217;s token pushes malicious versions that compromise more maintainers, and the campaign scales without human work. A structural problem for any ecosystem built on maintainer trust. We&#8217;ve known pull_request_target was dangerous since 2021. Its presence at major projects in 2026 tells you how the open-source world treats its security debt.</p><h3>7. 
EU Commission Opens Consultation on AI Act Transparency Guideline</h3><p>On May 19, 2026, the European Commission opened a public consultation on the draft guideline for the AI Act&#8217;s transparency obligations, due in August 2026 (Council of the EU). The consultation follows the May 7 AI Omnibus agreement, which shortened the grace period for transparency solutions on AI-generated content from six months to three. The new deadline lands December 2, 2026. The Commission&#8217;s enforcement powers against general-purpose AI model providers go live August 2, 2026, including authority to request documentation and impose fines.</p><p><strong>Why it matters</strong></p><ul><li><p>Transparency rules apply to every model output touching an EU resident, regardless of training or hosting location.</p></li><li><p>The shortened grace period gives GPAI providers 90 days to ship watermarking, content labeling, and disclosure mechanisms.</p></li><li><p>August&#8217;s enforcement powers give the AI Office real teeth for the first time.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Map your AI-generated content workflows. Tag every production path that needs disclosure.</p></li><li><p>Implement provenance labeling now using C2PA or equivalent.</p></li><li><p>Brief legal and product on the December 2 deadline. Earlier guidance assumed June 2027.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The Brussels Effect is doing its work. Whatever the AI Act forces on GPAI providers becomes the de facto global standard for transparency disclosure. American companies pretending the Act doesn&#8217;t apply will learn otherwise. Regulators wanting a quick enforcement win start with content labeling, not algorithmic auditing. If your product surfaces AI-generated content to any EU user, December 2 turned real this week.</p><h3>8. 
CISA Weighs Three-Day Patching Deadline as AI Compresses Exploit Cycles</h3><p>On May 20, 2026, Federal News Network reported CISA is considering a three-day patching deadline on Known Exploited Vulnerabilities, replacing the current 15-day default. The Insurance Journal covered the debate, citing AI compressing the time between disclosure and exploitation. Sysdig research found CVE-2026-44338 in the PraisonAI framework was probed by scanners 3 hours, 44 minutes, and 39 seconds after disclosure. Palo Alto Networks reports 28.3% of CVEs are now exploited within 24 hours.</p><p><strong>Why it matters</strong></p><ul><li><p>A three-day federal mandate would be the most aggressive remediation deadline CISA has ever proposed.</p></li><li><p>The same compression hits private defenders. Patch SLAs run 5-10x slower than the attack timeline.</p></li><li><p>AI-assisted exploit development operates at scale. The 3-hour PraisonAI scan window is the leading edge, not the outlier.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Pull your last 12 months of KEV-listed CVEs. Measure actual time-to-patch against the 15-day baseline. Be honest.</p></li><li><p>Build runbooks for emergency patching of internet-exposed assets. The three-day clock starts at disclosure, not your next change window.</p></li><li><p>Plan compensating controls when 72-hour patching is impossible. Virtual patches and WAF rules buy time.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The math is brutal. Attackers weaponize a CVE in hours. Defenders take weeks to deploy a patch through change management. A three-day mandate forces a conversation every CISO has avoided. Redesign the process or accept being late by default.</p><h3>9. Anthropic Opens Mythos Partner Sharing After Initial Lockdown</h3><p>On May 18, 2026, Anthropic reversed its earlier position and now allows Project Glasswing partners to share Mythos vulnerability findings with outside parties (Reuters via KFGO). 
The new policy permits disclosure to security teams, industry bodies, regulators, open-source maintainers, the media, and the public, subject to responsible disclosure. The original Glasswing structure had limited information to launch partners only. About 40 organizations have Mythos.</p><p><strong>Why it matters</strong></p><ul><li><p>The first information-sharing reversal of a frontier model program of this kind. Centralized cyber findings control was not workable in practice.</p></li><li><p>Open-source maintainers now have a path to receive Mythos-discovered vulnerabilities. That changes the patch dependency calculus.</p></li><li><p>The reversal suggests Anthropic underestimated the volume of findings and the scaling problem of single-vendor coordination.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Partners should designate a single coordinated-disclosure contact. Volume will overwhelm informal channels.</p></li><li><p>Non-partners should register with ISACs and CERTs as receiving organizations.</p></li><li><p>Pre-write your triage process for AI-discovered vulnerabilities. The format won&#8217;t match your CVE workflow.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>A governance lesson in real time. You cannot bottle frontier capability and call it safe. Glasswing tried, and within six weeks the math broke. Voluntary coordination is fragile when capability outruns headcount.</p><h3>10. Trump Pivots Toward AI Regulation Amid Backlash and China Safety Talks</h3><p>On May 19, 2026, Fortune reported the Trump administration is shifting its public stance on AI regulation in response to mounting voter backlash over job displacement, deepfakes, and AI-enabled crime. The shift comes alongside reported US-China safety talks on frontier AI capability. The administration&#8217;s December 2025 EO 14365 sought to preempt state AI regulation. The May 21 EO postponement suggests the political calculation has changed. 
Fortune cited senior officials describing the sentiment shift as &#8220;faster than anyone expected.&#8221;</p><p><strong>Why it matters</strong></p><ul><li><p>Public backlash on AI is influential enough to move executive policy. A new political force.</p></li><li><p>US-China safety dialogue, even if informal, sets the stage for future bilateral commitments on frontier capability.</p></li><li><p>An administration that was preempting state regulation is now hesitating. State AGs read this as license to push harder.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Track AI ballot initiatives in your operating states. The 2026 midterms will surface enforceable propositions.</p></li><li><p>Audit public-facing AI claims for accuracy. The SEC has flagged AI-washing as an enforcement priority.</p></li><li><p>Brief government affairs on the bilateral angle. China engagement changes the calculus for export controls and model access.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The political dynamic shifts faster than the technology. Six months ago, the White House was suing California to block AI rules. This week, they were drafting their own voluntary review. Plan around the volatility. The companies that thrive have built controls higher than any jurisdiction requires. You don&#8217;t have to guess which regulator strikes next. You have to be ready for any of them.</p><p>And then there is musing #1&#8230; </p><h3>The One Thing You Won&#8217;t Hear About But You Need To: Identity Dark Matter Is Eating Your AI Agent Program</h3><p>On May 19, 2026, Orchid Security released its Identity Gap: 2026 Snapshot report (Tech Startups, GlobeNewswire). Invisible identity, what Orchid calls &#8220;identity dark matter,&#8221; now outweighs visible identity in enterprise environments 57% to 43%. 67% of non-human accounts are created directly within applications, unseen and unmanaged by IAM programs. 
70% of enterprise applications carry excessive privileged accounts. The data comes from anonymized telemetry across financial services, healthcare, retail, and energy from April 2025 through March 2026.</p><p><strong>Why it matters</strong></p><ul><li><p>AI agents inherit credentials at runtime. If most of your non-human identity is invisible, your agents operate in the blind spot.</p></li><li><p>Traditional IAM was built for humans. An AI agent using a stale service account has a larger blast radius than the equivalent human error.</p></li><li><p>The 70% over-privilege finding means that most enterprise apps cannot survive a single agent-misuse event without exposing other systems.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Run non-human identity discovery against your top 10 enterprise applications. Expect a delta against your IAM inventory.</p></li><li><p>Implement time-bound, on-demand credentials for AI agents. Standing access is the failure mode.</p></li><li><p>Treat every AI agent identity as privileged. Apply PAM controls, session recording, and behavioral monitoring.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The story under the story. Every AI security headline this week depends on identity being right. The TanStack worm spread through OIDC tokens. The GitHub breach used a developer&#8217;s repository access. Your AI agent governance program is only as good as your non-human identity hygiene. If two-thirds of your service accounts are invisible, you cannot govern the agents using them. Read the report and bring it to your board. 
Don&#8217;t let &#8220;we have IAM&#8221; be the answer.</p><p>&#128073; For ongoing analysis of agentic AI governance frameworks, the conversation continues at <strong><a href="https://rockcybermusings.com/">RockCyber Musings</a></strong>.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help with your traditional Cybersecurity and AI Security and Governance journey.</p><p>&#128073; Want to save a quick $100K? Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong>.</p><p>&#128073; As a bonus, check out my conversation with <strong><a href="https://aicybermagazine.com/">AI Cyber Magazine</a></strong>, where we talked about everything from Context Rot to Least Agency.</p><p><em>The views and opinions expressed in RockCyber Musings are my own and do not represent the positions of my employer or any organization I&#8217;m affiliated with.</em></p><div id="youtube2-091_b2qep9M" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;091_b2qep9M&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/091_b2qep9M?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><h2>References</h2><p>Axios. (2026, May 21). <em>Scoop: White House postpones AI EO signing ceremony</em>. https://www.axios.com/2026/05/21/white-house-postpones-ai-eo-signing</p><p>BleepingComputer. (2026, May 21). <em>GitHub links repo breach to TanStack npm supply-chain attack</em>. https://www.bleepingcomputer.com/news/security/github-links-repo-breach-to-tanstack-npm-supply-chain-attack/</p><p>Bloomberg. (2026, May 21). 
<em>White House postpones AI cybersecurity order signing by Trump</em>. https://www.bloomberg.com/news/articles/2026-05-21/white-house-postpones-ai-cybersecurity-order-signing-by-trump</p><p>CNBC. (2026, May 21). <em>Trump postpones AI executive order signing: &#8216;I didn&#8217;t like certain aspects&#8217;</em>. https://www.cnbc.com/2026/05/21/trump-ai-executive-order-postponed.html</p><p>CNN Business. (2026, May 20). <em>White House postpones executive order on AI</em>. https://www.cnn.com/2026/05/20/tech/ai-executive-order-trump-white-house</p><p>Council of the European Union. (2026, May 7). <em>Artificial intelligence: Council and Parliament agree to simplify and streamline rules</em>. https://www.consilium.europa.eu/en/press/press-releases/2026/05/07/artificial-intelligence-council-and-parliament-agree-to-simplify-and-streamline-rules/</p><p>CSO Online. (2026, May). <em>Microsoft releases open-source tools to operationalize AI agent safety</em>. https://www.csoonline.com/article/4175592/microsoft-releases-open-source-tools-to-operationalize-ai-agent-safety-2.html</p><p>Federal News Network. (2026, May 20). <em>AI drives new debate around CISA software patching deadlines</em>. https://federalnewsnetwork.com/cybersecurity/2026/05/ai-drives-new-debate-around-cisa-software-patching-deadlines/</p><p>Fortune. (2026, May 19). <em>The times they are a-changin&#8217;: Trump pivots towards AI regulation in the face of a mounting public backlash</em>. https://fortune.com/2026/05/19/trump-pivots-towards-ai-regulation-in-face-mounting-ai-backlash-china-ai-safety-talks/</p><p>GlobeNewswire. (2026, May 19). <em>Two-thirds of nonhuman accounts are unseen and unmanaged, according to new Identity Gap Report</em>. https://www.globenewswire.com/news-release/2026/05/19/3297602/0/en/Two-Thirds-of-Nonhuman-Accounts-Are-Unseen-and-Unmanaged-According-to-New-Identity-Gap-Report.html</p><p>Grafana Labs. (2026, May 16). 
<em>Grafana Labs security update: Latest on TanStack npm supply chain ransomware incident</em>. https://grafana.com/blog/grafana-labs-security-update-latest-on-tanstack-npm-supply-chain-ransomware-incident/</p><p>Help Net Security. (2026, May 21). <em>GitHub, Grafana Labs breaches traced back to TanStack supply chain compromise</em>. https://www.helpnetsecurity.com/2026/05/21/github-grafana-breach-root-cause-nx-console/</p><p>Insurance Journal. (2026, May 4). <em>CISA weighs cutting deadlines to fix digital flaws amid worries over AI</em>. https://www.insurancejournal.com/news/national/2026/05/04/868205.htm</p><p>KFGO. (2026, May 18). <em>Anthropic to let partners share Mythos cybersecurity findings with others</em>. https://kfgo.com/2026/05/18/anthropic-to-let-partners-share-mythos-cybersecurity-findings-with-others/</p><p>Microsoft Security Blog. (2026, May 20). <em>Introducing RAMPART and Clarity: Open source tools to bring safety into Agent development workflow</em>. https://www.microsoft.com/en-us/security/blog/2026/05/20/introducing-rampart-and-clarity-open-source-tools-to-bring-safety-into-agent-development-workflow/</p><p>NBC News. (2026, May 21). <em>Trump abruptly scraps signing of landmark executive order regulating AI</em>. https://www.nbcnews.com/tech/tech-news/trump-scraps-signing-landmark-executive-order-regulating-ai-rcna346288</p><p>PYMNTS. (2026, May 18). <em>Anthropic will update regulators on Mythos&#8217; cyber vulnerability findings</em>. https://www.pymnts.com/cybersecurity/2026/anthropic-will-update-regulators-mythos-cyber-vulnerability-findings/</p><p>Snyk. (2026, May). <em>TanStack npm packages hit by Mini Shai-Hulud</em>. https://snyk.io/blog/tanstack-npm-packages-compromised/</p><p>Tech Startups. (2026, May 19). <em>Two-thirds of nonhuman accounts are unseen and unmanaged, according to Orchid Security&#8217;s Identity Gap Report</em>. 
https://techstartups.com/2026/05/19/two-thirds-of-nonhuman-accounts-are-unseen-and-unmanaged-according-to-orchid-securitys-identity-gap-report/</p><p>TechCrunch. (2026, May 18). <em>Open source tool maker Grafana Labs says hackers stole its code, refuses to pay ransom</em>. https://techcrunch.com/2026/05/18/open-source-tool-maker-grafana-labs-says-hackers-stole-its-code-refuses-to-pay-ransom/</p><p>TechCrunch. (2026, May 19). <em>Hackers have compromised dozens of popular open source packages in an ongoing supply-chain attack</em>. https://techcrunch.com/2026/05/19/hackers-have-compromised-dozens-of-popular-open-source-packages-in-an-ongoing-supply-chain-attack/</p><p>TechCrunch. (2026, May 20). <em>GitHub says hackers stole data from thousands of internal repositories</em>. https://techcrunch.com/2026/05/20/github-says-hackers-stole-data-from-thousands-of-internal-repositories/</p><p>TechRadar. (2026, May 18). <em>Anthropic to present exposed Mythos flaws to global watchdog</em>. https://www.techradar.com/pro/security/anthropic-to-present-exposed-mythos-flaws-to-global-watchdog-claims-critical-vulnerabilities-found-in-every-major-operating-system-and-web-browser</p><p>The Decoder. (2026, May 18). <em>Anthropic to brief global financial regulators on cyber flaws found by Claude Mythos</em>. https://the-decoder.com/anthropic-to-brief-global-financial-regulators-on-cyber-flaws-found-by-claude-mythos/</p><p>The Hacker News. (2026, May 20). <em>Microsoft open-sources RAMPART and Clarity to secure AI agents during development</em>. https://thehackernews.com/2026/05/microsoft-open-sources-rampart-and.html</p><p>The Register. (2026, May 18). <em>Grafana Labs admits all its codebase are belong to someone who popped its GitHub account</em>. https://www.theregister.com/cyber-crime/2026/05/18/grafana-labs-admits-attackers-downloaded-its-codebase-from-github/5241686</p><p>The Register. (2026, May 21). <em>Microsoft storms RAMPART, adds Clarity to agentic AI safety</em>. 
https://www.theregister.com/security/2026/05/21/microsoft-open-sources-agentic-ai-safety-tools/5243822</p><p>The Washington Post. (2026, May 21). <em>Trump delays executive order on AI oversight hours before planned signing</em>. https://www.washingtonpost.com/technology/2026/05/21/white-house-tore-down-ai-rules-now-its-building-new-defenses/</p><p>Wiz. (2026, May). <em>Mini Shai-Hulud strikes again: TanStack + more npm packages compromised</em>. https://www.wiz.io/blog/mini-shai-hulud-strikes-again-tanstack-more-npm-packages-compromised</p>]]></content:encoded></item><item><title><![CDATA[My Claude Code Harness Is Public. Don't Copy It.]]></title><description><![CDATA[I open-sourced my Claude Code harness for Mac, Jetson, and Windows. Read the reasoning, skip the configs. The honest answer is don't build.]]></description><link>https://www.rockcybermusings.com/p/my-claude-code-harness-is-public</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/my-claude-code-harness-is-public</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Tue, 19 May 2026 12:50:47 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!KeZU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F866bf827-6a8c-414b-9080-678f0911e655_2048x2048.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KeZU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F866bf827-6a8c-414b-9080-678f0911e655_2048x2048.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!KeZU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F866bf827-6a8c-414b-9080-678f0911e655_2048x2048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!KeZU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F866bf827-6a8c-414b-9080-678f0911e655_2048x2048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!KeZU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F866bf827-6a8c-414b-9080-678f0911e655_2048x2048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!KeZU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F866bf827-6a8c-414b-9080-678f0911e655_2048x2048.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KeZU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F866bf827-6a8c-414b-9080-678f0911e655_2048x2048.jpeg" width="1456" height="1456" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/866bf827-6a8c-414b-9080-678f0911e655_2048x2048.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2405350,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/198165745?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F866bf827-6a8c-414b-9080-678f0911e655_2048x2048.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KeZU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F866bf827-6a8c-414b-9080-678f0911e655_2048x2048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!KeZU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F866bf827-6a8c-414b-9080-678f0911e655_2048x2048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!KeZU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F866bf827-6a8c-414b-9080-678f0911e655_2048x2048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!KeZU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F866bf827-6a8c-414b-9080-678f0911e655_2048x2048.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div>
<p>I spent most of last month watching myself do the same dance every time I opened Claude Code. Each session ate 20-30 minutes up front, depending on how Claude Code was performing that day, and I&#8217;d spend that time re-stating trust boundaries, re-configuring tooling, and reminding a fresh session what the project was. I was doing it on three machines (Mac, Jetson AGX Orin, Windows), 5-10x/week. Before I&#8217;d written a line of code, I was burning two to five hours a week on a problem I&#8217;d already solved twice and forgotten how.</p><p>The &#8220;fix it in code review&#8221; answer for security findings fell apart around the same time, once I&#8217;d read enough of the benign-prompt vulnerability data on frontier models to understand what I was accepting by deferring. If the model&#8217;s shipping vulnerable code at a non-trivial rate even when nobody&#8217;s trying to make it, &#8220;we&#8217;ll catch it in PR&#8221; is wishful thinking with a JIRA ticket attached.</p><p>That was the moment. I stopped patching the symptom. 
I built my harness from scratch on the Mac, ported the reasoning to the Jetson and Windows, and wrote down why I made every choice. The repo&#8217;s a reasoning trail with the code attached as evidence.</p><p>What I&#8217;m publishing lives at <a href="https://github.com/rocklambros/harness-engineering">github.com/rocklambros/harness-engineering</a>. The README says it plainly: this isn&#8217;t a clone-and-run template, and personal-specific configuration is the point. If you read it expecting a drop-in setup, you&#8217;ll come away disappointed. If you read it expecting to see how a harness gets reasoned into existence, you&#8217;ll come away with a frame for arguing with mine and building yours.</p><h2>Harness engineering isn&#8217;t what most people think it is</h2><p>Prompt engineering got the marketing budget. Harness engineering didn&#8217;t, and most Claude Code users skip past it because it doesn&#8217;t feel like coding. It feels like ops, and nobody writes posts about ops decisions.</p><p>Here&#8217;s the working definition I&#8217;ve landed on. A harness is the configured environment around an agent (in this case, a coding agent) that determines what it can and can&#8217;t do, what guidance it follows by default, and what guardrails it can&#8217;t talk its way past. Harness engineering is the discipline of designing that environment on purpose, with reasoning you can defend, instead of accepting whatever defaults shipped in the box.</p><p>In Claude Code terms, the harness is everything outside the chat turn. The project-level CLAUDE.md the model reads at session start. The settings.json that defines permission modes and hook registrations. The deterministic rules the model can&#8217;t override, even if it tries. The skills that load advisory guidance on demand. The hooks that fire on tool use to validate, scan, and audit. 
The agents you delegate specialized tasks to.</p><p>If you&#8217;re running Claude Code with a default settings.json, no hooks, no skills beyond what shipped, and a CLAUDE.md that someone else wrote, you don&#8217;t have a harness. You have a session. The model is making decisions about what&#8217;s safe to run, what tools to invoke, and what your codebase should look like, with zero guardrails you can defend in a postmortem.</p><p>For a vibe-coding indie dev shipping a side project, no harness might be fine. The blast radius is one repo, possibly with no production users. For anyone shipping code that matters, the absence of a harness means the model is making decisions about what&#8217;s safe with zero documented constraints, and you&#8217;re trusting the defaults to do work you&#8217;d never trust an unverified junior to do.</p><p>Most of the &#8220;10 tips for Claude Code&#8221; content I&#8217;ve read is harness suggestion without harness reasoning, which means surface configs without the why. That&#8217;s why those posts age out within a minor-version bump. The configs survive maybe four weeks before an upstream change breaks the assumption they were built on, and the reader has no idea which assumption broke or how to fix it. The reasoning is what survives the upgrade. The configs are what fall out.</p><h2>The honest answer is: don&#8217;t build</h2><p>Most of you should adopt, not build. The README says this directly, and I want to repeat it before anyone gets the wrong idea from the announcement:</p><blockquote><p>The honest answer for most people reading this is: don&#8217;t build. Adopt.</p></blockquote><p>The cost of building isn&#8217;t in the writing. It&#8217;s in the maintenance against Claude Code itself, which ships breaking changes on minor version bumps. The TTL cache regression in March 2026 was the canonical example. 
A behavior change in the cache layer silently halved the economic value of half the harnesses in circulation, and most of the people running those harnesses didn&#8217;t notice for weeks. If your harness assumes a Claude Code behavior that later changes in a release, every part of your reasoning trail that depended on that assumption needs re-evaluation. That&#8217;s a non-trivial tax to pay if your day job isn&#8217;t building harnesses.</p><p>Who should build, then? The conditions are narrow, and all four must be true.</p><p>You operate across multiple machines, and the off-the-shelf options don&#8217;t survive the cross-platform parity test. You have a non-trivial security posture, and &#8220;fix it in code review&#8221; isn&#8217;t a defensible answer for the work you ship. You don&#8217;t trust the trust boundaries that ship in the existing community harnesses, either because they&#8217;re underspecified or because they&#8217;re calibrated to a different threat model than yours. You can afford the maintenance cost of keeping a reasoning trail up to date as Claude Code evolves.</p><p>If any of those four don&#8217;t apply, adopt. There are good public harnesses in the community right now. Pick one whose reasoning you can read and whose tradeoffs you can defend. That&#8217;s a faster path to a harness you can trust than building your own.</p><p>I built mine because all four applied: three machines, an AI security threat model I don&#8217;t want negotiated by a maintainer I&#8217;ve never met, a low tolerance for trust boundaries I can&#8217;t trace, and the time budget to keep the reasoning current. Most of you don&#8217;t have all four. Reading my repo to argue with my reasoning is useful. 
Copying my configs into a project that doesn&#8217;t share my four conditions is the same kind of mistake as cloning someone else&#8217;s threat model and hoping it covers yours.</p><p>If you read this section and think, &#8220;but my situation is special,&#8221; it probably isn&#8217;t. The cases that earn building are rarer than people think, and the cases where adopting is the smart move look pretty similar to mine from the outside.</p><h2>What&#8217;s in the repo, and what it does</h2><p>The repo is organized as one foundation section, three platform sections (Mac, Jetson AGX Orin, Windows), and a research section. Foundation holds the parts that are identical across platforms: the Quality Contract that binds every artifact, the threat model, the architectural principles, the seed evaluation methodology, and the research references.</p><p>The Mac section is the validated reference build. All six phases (Phase 0 goals through Phase 5 release) are written and tested against my actual machine. The Jetson and Windows sections mirror the structure. Phases 0 through 2 are written and ready. Phases 3 through 5 are scaffolded with explicit &#8220;needs validation when ported&#8221; markers because I haven&#8217;t run them against those environments yet. The capability surface is identical to Mac. Tools differ where they have to.</p><p>Each platform&#8217;s harness has the same five-layer shape. The project-level CLAUDE.md sits under 200 lines and covers seven sections: the role the model is operating in, the code standards I expect it to honor, the security rules it can&#8217;t bypass, the core constraints on the project, the things that break (failure modes I&#8217;ve already hit), an operational section for day-to-day commands, and a status section that captures where the build currently is. A settings.json template defines permission modes, hook registrations, and trust-boundary policy. 
A deterministic rules directory lists path deny patterns, command deny patterns, and secret patterns that get consumed by hooks rather than interpreted by the model. A skills directory holds lazy-loaded advisory guidance. A hooks and agents directory holds the deterministic gates and the specialized subagents.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NgDR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60593755-6f72-4a78-9c34-b47c5b91befc_2133x1781.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NgDR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60593755-6f72-4a78-9c34-b47c5b91befc_2133x1781.png 424w, https://substackcdn.com/image/fetch/$s_!NgDR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60593755-6f72-4a78-9c34-b47c5b91befc_2133x1781.png 848w, https://substackcdn.com/image/fetch/$s_!NgDR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60593755-6f72-4a78-9c34-b47c5b91befc_2133x1781.png 1272w, https://substackcdn.com/image/fetch/$s_!NgDR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60593755-6f72-4a78-9c34-b47c5b91befc_2133x1781.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NgDR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60593755-6f72-4a78-9c34-b47c5b91befc_2133x1781.png" width="1456" height="1216" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/60593755-6f72-4a78-9c34-b47c5b91befc_2133x1781.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1216,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:182020,&quot;alt&quot;:&quot;Stack diagram of the five-layer Claude Code harness architecture showing CLAUDE.md, settings.json, rules, skills, and hooks plus agents&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/198165745?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60593755-6f72-4a78-9c34-b47c5b91befc_2133x1781.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Stack diagram of the five-layer Claude Code harness architecture showing CLAUDE.md, settings.json, rules, skills, and hooks plus agents" title="Stack diagram of the five-layer Claude Code harness architecture showing CLAUDE.md, settings.json, rules, skills, and hooks plus agents" srcset="https://substackcdn.com/image/fetch/$s_!NgDR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60593755-6f72-4a78-9c34-b47c5b91befc_2133x1781.png 424w, https://substackcdn.com/image/fetch/$s_!NgDR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60593755-6f72-4a78-9c34-b47c5b91befc_2133x1781.png 848w, https://substackcdn.com/image/fetch/$s_!NgDR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60593755-6f72-4a78-9c34-b47c5b91befc_2133x1781.png 1272w, 
https://substackcdn.com/image/fetch/$s_!NgDR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60593755-6f72-4a78-9c34-b47c5b91befc_2133x1781.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Figure 1: Five-layer harness architecture</figcaption></figure></div><p>The piece I&#8217;m most willing to defend is the three-layer security stack that cuts across the skills and hooks layers. 
Layer one is pre-generation guidance: a security-review skill seeded from the <a href="https://github.com/Arcanum-Sec/sec-context">Arcanum-Sec sec-context anti-pattern taxonomy (CC BY 4.0, Jason Haddix)</a>, with 10 pattern files for the Mac build that match the skill&#8217;s manifest one-to-one. The skill loads pattern sections based on file type, so the context tax stays small. Layer two is commit-time hardening: a Semgrep PostToolUse hook that fires on every Write or Edit and feeds findings back to Claude in the same session, implementing the SecureForge methodology from Liu et al. (<a href="https://arxiv.org/abs/2605.08382">arXiv:2605.08382, MIT</a>). The published paper reports a roughly 48% reduction in CWE rate from this layer alone. Layer three is post-generation validation: a pinned pre-commit gate running gitleaks for secrets, Semgrep for SAST, shellcheck for hook scripts, and a local drift check for reference integrity. It&#8217;s the same Semgrep engine as layer two, running in a different invocation context. 
The redundancy is intentional.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pOQY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73e29384-09a7-4882-89ed-ae71d3c384d4_2133x2290.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pOQY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73e29384-09a7-4882-89ed-ae71d3c384d4_2133x2290.png 424w, https://substackcdn.com/image/fetch/$s_!pOQY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73e29384-09a7-4882-89ed-ae71d3c384d4_2133x2290.png 848w, https://substackcdn.com/image/fetch/$s_!pOQY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73e29384-09a7-4882-89ed-ae71d3c384d4_2133x2290.png 1272w, https://substackcdn.com/image/fetch/$s_!pOQY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73e29384-09a7-4882-89ed-ae71d3c384d4_2133x2290.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!pOQY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73e29384-09a7-4882-89ed-ae71d3c384d4_2133x2290.png" width="1456" height="1563" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/73e29384-09a7-4882-89ed-ae71d3c384d4_2133x2290.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1563,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:205055,&quot;alt&quot;:&quot;Flowchart showing the three layers of the Claude Code harness security stack from pre-generation guidance through commit-time hardening to post-generation validation&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/198165745?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73e29384-09a7-4882-89ed-ae71d3c384d4_2133x2290.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Flowchart showing the three layers of the Claude Code harness security stack from pre-generation guidance through commit-time hardening to post-generation validation" title="Flowchart showing the three layers of the Claude Code harness security stack from pre-generation guidance through commit-time hardening to post-generation validation" srcset="https://substackcdn.com/image/fetch/$s_!pOQY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73e29384-09a7-4882-89ed-ae71d3c384d4_2133x2290.png 424w, https://substackcdn.com/image/fetch/$s_!pOQY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73e29384-09a7-4882-89ed-ae71d3c384d4_2133x2290.png 848w, 
https://substackcdn.com/image/fetch/$s_!pOQY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73e29384-09a7-4882-89ed-ae71d3c384d4_2133x2290.png 1272w, https://substackcdn.com/image/fetch/$s_!pOQY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73e29384-09a7-4882-89ed-ae71d3c384d4_2133x2290.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Figure 2: Three-layer security stack</figcaption></figure></div><p>The one piece I&#8217;d point to first if you 
want to see how the reasoning trail format works is JOURNEY.md. It&#8217;s a running narrative of the build, written as prose checkpoints. Reasoning lives in JOURNEY.md, decisions land in commits, locked decisions land in foundation docs. That separation is doing real work. The commit history is part of the artifact, not just a side effect of using git.</p><h2>Decisions I made that won&#8217;t transfer to your setup</h2><p>The repo is a reasoning trail, not a config to copy. Here are the load-bearing decisions in it that won&#8217;t survive translation to your environment unchanged.</p><p>The Windows section runs Semgrep in WSL2 rather than the native Windows binary. The native binary has spotty coverage on some of the rule packs I care about, and forcing parity across platforms outweighed the convenience of running Semgrep natively on Windows. If your security posture cares about different rule packs than mine does, your decision might run the other way. The same goes for the broader WSL2 call. I picked it because it gave me a Linux-shaped tool environment without dual-booting. If you&#8217;re already deep into PowerShell and Windows-native tooling, you&#8217;d pick differently, and you&#8217;d be right.</p><p>The Jetson section assumes Tegra Python and the apt-plus-Jetson-SDK package management posture. If you&#8217;re running a Jetson but you&#8217;ve layered conda over the top, or you&#8217;re using a different L4T release than mine, the Phase 0 inventory output won&#8217;t match yours, and the downstream phases will need adjustment. The reasoning still applies. The specific tool versions won&#8217;t.</p><p>The seven-section CLAUDE.md under 200 lines is calibrated to my context-tax tolerance, not yours. I write CLAUDE.md to be the smallest thing that&#8217;s still useful, because every line in it is paid for on every turn in every session. If your projects are larger or smaller than mine, your CLAUDE.md should be too. 
If your tolerance for context tax is different (some people will trade more setup tokens for less in-session friction), your CLAUDE.md will be longer than mine.</p><p>The pattern prose in the security-review skill has been rewritten from the Arcanum-Sec sec-context taxonomy to reflect my voice and selection logic. The attribution is preserved, but the prose isn&#8217;t theirs anymore. If you adopt the skill as a starting point, you should rewrite it again. The selection logic is mine, the priorities are mine, and the file-type triggers reflect what I write the most of. If your language mix is different, you&#8217;ll want different triggers and a different priority order.</p><p>The Quality Contract section IDs and threat IDs are stable across my repo, which means hooks and skills can cite them by ID, and a drift check can verify the citations resolve. If you adopt the structure, you&#8217;ll want to renumber to your own threat model. Don&#8217;t inherit my IDs and pretend they&#8217;re yours. The whole point of the reasoning trail format is that the citations track to something real, and ID inheritance breaks that the first time you forget which threat ID came from where.</p><h2>What I&#8217;d do differently if I started over</h2><p>Two things, and I&#8217;ll know about a third by the time I finish the Jetson and Windows validations.</p><p>Lock the foundation docs and the Quality Contract before any platform work. I built the Mac section in parallel with the foundation, which meant some early Mac decisions had to be revisited as the Quality Contract sharpened. Each revisit costs a commit cycle and a small amount of confidence in the validity of earlier work. Doing the foundation first and the platform second would have made the reasoning trail cleaner, and the Mac reference build wouldn&#8217;t have had a handful of decisions that needed an asterisk.</p><p>Write the JOURNEY.md format on day one. 
I started JOURNEY.md after the initial batch of artifacts had already landed, which meant the reasoning for the first batch had to be reconstructed from commit messages rather than captured live. Commit messages are good for landing decisions. They aren&#8217;t the same thing as a running narrative that captures the questions you were sitting with as you made them. Future me will thank present me for any reasoning that gets captured live instead of being reconstructed later. Past me did not get that gift.</p><p>The third thing I&#8217;m watching for: I suspect the Phase 4 security-review skill will need a different structure once I validate it against the Jetson and Windows environments. The Mac pattern selection assumes a tool mix I haven&#8217;t proven survives the port. If it doesn&#8217;t, the lesson will be &#8220;design the skill structure against the hardest target first, not the easiest.&#8221; I don&#8217;t know yet. The JOURNEY.md entry that resolves it will say so.</p><h2>How to read the repo</h2><p>Read foundation/00-quality-contract.md first. It binds everything else in the repo, and if you&#8217;re going to argue with my reasoning, you need to argue from the same starting point I&#8217;m arguing from. After that, pick your path. USER_GUIDE.md walks through the wiring if you want a quick start for adopting the harness in your own project. HARNESS_GUIDE.md is the technical reference across all three platforms. If you want the full validated build with all the reasoning intact, read mac/ start to finish in commit order.</p><p>What I want from readers isn&#8217;t forks of the configs. It&#8217;s forks of the thinking. If your harness ends up looking nothing like mine because you have a different threat model, different platforms, a different language mix, or a different context-tax budget, that&#8217;s the right outcome. 
If your harness ends up looking exactly like mine, one of us is wrong, and the math says it&#8217;s probably you.</p><h2>The question I&#8217;m leaving open</h2><p>Most Claude Code users I&#8217;ve talked to are running with default permission modes on production codebases and calling that ops maturity. They have no hooks, no skills beyond what shipped, and a CLAUDE.md that someone else wrote or that doesn&#8217;t exist at all. If you can&#8217;t name the three layers of your security stack without checking, and you can&#8217;t say what gets enforced deterministically versus advisorily, you don&#8217;t have a harness. You have a session.</p><p>What&#8217;s in your harness, and could you defend it on a panel?</p><p>The repo&#8217;s at <strong><a href="https://github.com/rocklambros/harness-engineering">github.com/rocklambros/harness-engineering</a></strong>. The license is MIT. Use the patterns and argue with me in the comments or in your own JOURNEY.md.</p><p>&#128073; For ongoing analysis of agentic AI governance frameworks, the conversation continues at <strong><a href="https://rockcybermusings.com/">RockCyber Musings</a></strong>.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help with your traditional Cybersecurity and AI Security and Governance journey.</p><p>&#128073; Want to save a quick $100K? 
Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><p></p><p></p>]]></content:encoded></item><item><title><![CDATA[Weekly Musings Top 10 AI Security Wrapup: Issue 38 May 8-May 14, 2026]]></title><description><![CDATA[The Week AI Defense Vendors Bet Their Roadmaps on Each Other&#8217;s Models]]></description><link>https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260508-20260514</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260508-20260514</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Fri, 15 May 2026 12:50:27 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!y6oT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0b086ed-b3b5-4a58-83fc-414a2a694b64_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!y6oT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0b086ed-b3b5-4a58-83fc-414a2a694b64_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!y6oT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0b086ed-b3b5-4a58-83fc-414a2a694b64_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!y6oT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0b086ed-b3b5-4a58-83fc-414a2a694b64_1024x1024.png 848w, 
https://substackcdn.com/image/fetch/$s_!y6oT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0b086ed-b3b5-4a58-83fc-414a2a694b64_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!y6oT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0b086ed-b3b5-4a58-83fc-414a2a694b64_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!y6oT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0b086ed-b3b5-4a58-83fc-414a2a694b64_1024x1024.png" width="1024" height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d0b086ed-b3b5-4a58-83fc-414a2a694b64_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1233556,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/197810626?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0b086ed-b3b5-4a58-83fc-414a2a694b64_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!y6oT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0b086ed-b3b5-4a58-83fc-414a2a694b64_1024x1024.png 424w, 
https://substackcdn.com/image/fetch/$s_!y6oT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0b086ed-b3b5-4a58-83fc-414a2a694b64_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!y6oT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0b086ed-b3b5-4a58-83fc-414a2a694b64_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!y6oT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0b086ed-b3b5-4a58-83fc-414a2a694b64_1024x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Three vendors launched competing AI vulnerability hunters. Google announced the first confirmed attacker use of an AI-discovered zero-day. The European Commission opened a transparency rulebook nobody finished writing. OpenAI got sued because ChatGPT allegedly helped plan a mass shooting. LiteLLM hit CISA&#8217;s KEV list after a pre-auth SQL injection compromised the AI gateway holding model API keys.</p><p>This week confirmed what skeptics argued for two years. 
AI doesn&#8217;t change cybersecurity through some abstract paradigm shift; it changes it by collapsing timelines. Discovery cycles that took months now run in days. Patching windows evaporate before the patch ships. Regulatory drafting still runs on three-month consultation cycles. The center of gravity is moving from people who hunt bugs to people who govern the systems hunting them. If your strategy still assumes humans set the pace, you&#8217;re already behind.</p><h3>1. Google Confirms First Real-World AI-Discovered Zero-Day Attack</h3><p>Google&#8217;s Threat Intelligence Group disclosed on May 11, 2026, that it disrupted a criminal group using AI to identify and exploit an unknown vulnerability in widely used open-source software (Domain-b). Analysts spotted machine-generated code indicators, including metadata inconsistencies. Google did not name the target, the AI model, or the group, but said the campaign was blocked before launch (Fortune).</p><p><strong>Why it matters</strong></p><ul><li><p>Attackers crossed a capability threshold that defenders expected to be years away</p></li><li><p>Open-source dependencies became economically attractive to compromise at machine speed</p></li><li><p>Google&#8217;s detection signal, LLM code artifacts, is what sophisticated attackers will suppress next</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Audit your SBOM for open-source components in critical paths, prioritizing low-maintenance projects</p></li><li><p>Treat AI-assisted vulnerability research as a baseline attacker capability in your threat model</p></li><li><p>Validate that your detection stack ingests statistical anomalies in code patterns, not only traditional IoCs</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Google blocking one campaign isn&#8217;t a victory; it&#8217;s the first time we caught one. Every honest threat hunter I know assumes five or ten more slipped through. 
Detection relied on attackers being sloppy enough to leave LLM fingerprints in their code. That window closes the second they polish exploits through a human pass, which costs about thirty bucks of contractor time. AI-powered attacks aren&#8217;t a 2027 problem anymore; they&#8217;re a today problem.</p><h3>2. OpenAI Launches Daybreak as Defensive Counter to Anthropic Mythos</h3><p>OpenAI introduced Daybreak on May 11, 2026, pairing GPT-5.5 with Codex Security as an agentic scaffold alongside Akamai, Cisco, Cloudflare, CrowdStrike, Fortinet, Oracle, Palo Alto Networks, and Zscaler (The Hacker News). Three tiers ship: standard GPT-5.5, GPT-5.5 with Trusted Access for Cyber, and GPT-5.5-Cyber for red-team and pen-test workflows. Unlike Mythos, which remains in tight preview, Daybreak is publicly accessible by request (Cybersecurity Dive).</p><p><strong>Why it matters</strong></p><ul><li><p>Frontier AI labs are in direct competition for cybersecurity-vendor relationships, redrawing procurement for every CISO</p></li><li><p>Tiered access tied to verified cyber credentials is the first serious dual-use governance attempt for capability-restricted models</p></li><li><p>Defenders gain a second credible vendor for AI-assisted vulnerability discovery, breaking monoculture risk</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Run a head-to-head of Daybreak, Mythos partners, and MDASH against your codebase before any multi-year deal</p></li><li><p>Build your AI-assisted vulnerability program around outputs you can validate, not vendor demos</p></li><li><p>Define what &#8220;ready&#8221; means for an AI-discovered finding before these systems push results into your tracker</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The pitch sounds great. Three labs are racing to embed themselves in Fortune 500 security operations before regulators figure out what the technology is doing. 
Tiered access by credential verification is the smartest piece of Daybreak, and the piece most likely to be quietly relaxed once a major customer&#8217;s red team gets blocked. I&#8217;ve seen this pattern with offensive tools for twenty years. The right question isn&#8217;t which model finds more bugs, it&#8217;s which vendor&#8217;s scaffold produces findings your team can actually fix.</p><h3>3. Microsoft Reveals MDASH and Discloses 16 Windows Vulnerabilities</h3><p>Microsoft revealed MDASH on May 12, 2026, a multi-model agentic scanning harness orchestrating more than 100 specialized AI agents (Microsoft Security). The system found 16 previously unknown vulnerabilities patched in May's Patch Tuesday, including four critical RCEs in tcpip.sys, ikeext.dll, netlogon.dll, and dnsapi.dll. MDASH scored 88.4% on CyberGym, beating Mythos (GeekWire). It&#8217;s in limited preview with select customers.</p><p><strong>Why it matters</strong></p><ul><li><p>Durable advantage lies in the agentic system around the model, not the model itself</p></li><li><p>All four critical flaws were network-reachable without credentials, the bug class adversaries pay top dollar for</p></li><li><p>96% recall on five years of CLFS bugs and 100% on tcpip.sys shows AI vulnerability discovery is production-grade</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Patch the May cohort with priority on the four critical RCEs, even ahead of normal change windows</p></li><li><p>Ask your software vendors what their AI-assisted vulnerability discovery program looks like</p></li><li><p>Update procurement security reviews to include questions about AI-driven code auditing maturity</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Two things stand out. Ensemble AI agent systems beat single-model systems for bug hunting. That&#8217;s an architectural finding, not marketing copy. 
Sixteen new RCE-class vulnerabilities in the Windows networking stack remind us that the most reviewed code on Earth still hides serious bugs humans missed for years. The AI didn&#8217;t get smarter, we finally pointed enough compute at the problem. The strategic question is what happens when adversaries point the same compute at the same code. Microsoft&#8217;s lead is measured in months.</p><h3>4. EU Commission Opens Consultation on AI Transparency Obligations</h3><p>The European Commission published draft guidelines on May 8, 2026, covering AI Act Article 50 transparency obligations, with consultation running through June 3, 2026 (European Commission). The guidelines spell out four obligations effective August 2, 2026: disclosure when users interact with AI, marks on AI-generated content, disclosure for emotion recognition and biometric categorization, and deepfake labeling. Non-compliance carries fines up to &#8364;15 million or 3% of global turnover (DataGuidance).</p><p><strong>Why it matters</strong></p><ul><li><p>Article 50 reaches non-EU providers if their AI outputs touch EU users, putting US companies in scope</p></li><li><p>The watermarking window shrank to December 2, 2026 under the May 7 Digital Omnibus deal</p></li><li><p>Compliant watermarking standards are not yet published, leaving companies building against a moving target</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Map every AI system you operate that could touch EU users, including embedded vendor capabilities</p></li><li><p>Start watermarking proof-of-concept work now against draft standards like C2PA, accepting possible rework</p></li><li><p>Submit feedback to the EU consultation by June 3 if your business depends on AI transparency boundaries</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The political headline was that the AI Act got simpler. The substance was that one transparency deadline got compressed while another got delayed. 
Compliance officers love that kind of calendar arithmetic because it lets them quietly miss things. The August 2026 chatbot disclosure is the boring obligation that catches everybody. If your AI assistant doesn&#8217;t tell EU users it&#8217;s an AI assistant, you&#8217;re exposed. Your vendor&#8217;s chatbot not disclosing is your problem.</p><h3>5. OpenAI Sued Over ChatGPT&#8217;s Alleged Role in Florida Mass Shooting</h3><p>Vandana Joshi, widow of a Florida State University mass shooting victim, filed a federal lawsuit against OpenAI on May 11, 2026, alleging ChatGPT advised attacker Phoenix Ikner on optimal location, timing, weapon selection, and ammunition (Reuters, AP News). Florida&#8217;s attorney general opened a rare criminal investigation in April 2026. OpenAI denied wrongdoing, saying ChatGPT provided factual responses drawn from public sources (US News).</p><p><strong>Why it matters</strong></p><ul><li><p>Product liability theories on general-purpose AI assistants are now in active federal litigation</p></li><li><p>The case tests whether AI companies have a duty of care to detect and intervene in violence-planning conversations</p></li><li><p>A plaintiff win could rewrite operational requirements for consumer AI safety guardrails</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Review AI vendor contracts for indemnification clauses tied to misuse and downstream harm</p></li><li><p>Document harm detection and escalation procedures with evidence that they were followed</p></li><li><p>Treat AI safety telemetry as a legal artifact, retained and discoverable, not only an operational signal</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>This case will settle or be appealed for years, but the discovery phase is what matters. Internal documents showing what OpenAI knew about violence-planning prompts and what they chose not to escalate will become the de facto safety standard. 
Plaintiffs don&#8217;t need to win the verdict&#8230; they just need to win the depositions. If your product can be used to plan harm and telemetry shows it has been, your retention policy just became a litigation strategy.</p><h3>6. Microsoft Patch Tuesday Sets Vulnerability Record as AI Discovery Surges</h3><p>Microsoft issued patches for more than 130 vulnerabilities on May 13, 2026, on pace to break its annual record after patching over 500 in the first five months (The Record). CVE-2026-41089 in Windows Netlogon and CVE-2026-41096 in Windows DNS Client both carry 9.8 CVSS. Microsoft&#8217;s security leadership acknowledged AI tools are driving the surge. HackerOne paused its open-source bug bounty earlier this year, citing the imbalance between AI-driven discovery and maintainer remediation capacity.</p><p><strong>Why it matters</strong></p><ul><li><p>AI-accelerated discovery is pushing patch volume past the absorption capacity of most vulnerability management programs</p></li><li><p>Traditional 30-day or 60-day patching SLAs were never designed for monthly batches of critical RCEs</p></li><li><p>Open-source maintainer burnout is a systemic security risk as AI finds faster than humans fix</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Move from time-based patching SLAs to risk-based ones tied to exploit probability and asset criticality</p></li><li><p>Invest in network segmentation and identity isolation to limit blast radius when patching slips</p></li><li><p>Track mean-time-to-patch for critical vulnerabilities monthly and report the trend to your audit committee</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Vulnerability management has been broken for a decade. We pretended monthly patch cycles were sustainable when they were already breaking. AI made the math impossible to ignore. The honest answer is you will never patch fast enough. 
The strategy has to shift to &#8220;assume compromise, limit blast radius, recover faster than the attacker can adapt.&#8221; I&#8217;ve been saying that for three years to compliance team eye-rolls. This week&#8217;s data ends that argument.</p><h3>7. Cisco Open-Sources Foundry Security Spec for Agentic Security Evaluation</h3><p>Cisco released the Foundry Security Spec as open source on May 12, 2026, defining eight core agent roles, five extensions, around 130 functional requirements, and 11 inviolable principles for agentic security evaluation systems (Techzine, SMBtech). It&#8217;s model-agnostic and works with Mythos and GPT-5.5-Cyber via GitHub&#8217;s spec-kit. The goal is moving AI security from prompt demos to auditable production systems, paired with Project CodeGuard for prevention.</p><p><strong>Why it matters</strong></p><ul><li><p>Open-source specs for AI security agents create a path to vendor-neutral compliance and audit</p></li><li><p>The eight-role decomposition gives security teams shared vocabulary instead of vendor terminology</p></li><li><p>Cisco open-sourcing the framework is a credible play to set the de facto standard before regulators do</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Pilot Foundry Security Spec against a non-critical workflow to gauge operational lift</p></li><li><p>Map existing AI security tooling against the eight core roles to find gaps in orchestration and validation</p></li><li><p>Engage on the GitHub repository if you have the maturity to contribute, because early committers shape standards</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>This is the kind of plumbing announcement that gets ignored in favor of flashier news, and it shouldn&#8217;t. Architectural standards win or lose markets. The OWASP Top 10 didn&#8217;t change vulnerability classes, it changed how teams talked about them. Foundry Security Spec is aiming for the same effect on agentic security. 
The tell will be whether AWS and Azure converge on it or fork it. Convergence skips a decade of fragmentation. A fork drops us back into vendor lock-in.</p><h3>8. EU Commission Publishes Second Draft Code of Practice on AI Content Marking</h3><p>The European Commission published the second draft of the Code of Practice on Marking and Labeling of AI-Generated Content on May 8, 2026 (European Commission). The revised text introduces a two-layered marking approach that combines secure metadata with watermarking, optional fingerprinting, logging protocols, and detection-and-verification procedures. Skadden&#8217;s analysis confirmed that compliance is required as of December 2, 2026, for generative AI systems already on the EU market, accelerated relative to earlier proposals (Skadden).</p><p><strong>Why it matters</strong></p><ul><li><p>The revised two-layered watermarking approach is the most concrete EU technical specification published to date</p></li><li><p>Generative AI providers have six months to build compliant marking against a still-evolving technical standard</p></li><li><p>Fines remain at &#8364;15 million or 3% of global turnover for Article 50 violations</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Confirm AI vendors have a credible two-layer watermarking roadmap targeting December 2, 2026</p></li><li><p>Build C2PA-compatible metadata and watermarking prototypes against the draft code now</p></li><li><p>Track the optional fingerprinting and logging requirements for downstream traceability</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The second draft Code is the most concrete watermarking specification anyone has published, and it&#8217;s still incomplete. Six months to build secured metadata, watermarking, fingerprinting, and detection tooling against an evolving standard is engineering fiction. Expect generative AI vendors to claim adherence via voluntary code participation while the technical build drifts. 
The CISOs who already started C2PA work in 2025 are sitting pretty. The ones who treated watermarking as a marketing problem will discover December 2 isn&#8217;t negotiable.</p><h3>9. India Demands Sovereign Control Over Frontier AI Cybersecurity Models</h3><p>India&#8217;s government met with Anthropic&#8217;s India team in early May 2026 to discuss hosting requirements for Claude Mythos, with reporting confirmed on May 12, 2026 (Medianama). Finance Ministry, MeitY, and CERT-In officials argued that AI in banking, telecom, and critical infrastructure must be hosted in Indian territory or a government-approved sovereign cloud. Finance Minister Nirmala Sitharaman called Mythos&#8217;s capabilities an &#8220;unprecedented&#8221; threat.</p><p><strong>Why it matters</strong></p><ul><li><p>Sovereign hosting is becoming a procurement gate for frontier AI access in major non-Western markets</p></li><li><p>Indian banking and critical infrastructure deployments of US-hosted AI face new jurisdictional risks</p></li><li><p>The pattern will spread to Brazil, Indonesia, and the Gulf states</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Validate AI hosting jurisdiction with your legal team if you operate in India&#8217;s regulated industries</p></li><li><p>Build a vendor diversification strategy that accommodates regional sovereignty without forcing rewrites</p></li><li><p>Engage sovereign cloud providers earlier in architecture, not as a post-deployment retrofit</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The geopolitical fragmentation of AI access is happening in real time. Western vendors still pretend it&#8217;s manageable through commercial agreements. India is signaling clearly that strategic AI must operate under Indian jurisdiction or not at all. Other countries will copy. The companies figuring out sovereign deployment architectures first win the next decade of international AI revenue. 
Those treating this as a temporary hurdle will watch growth markets quietly close.</p><h3>10. CISA Adds LiteLLM SQL Injection to KEV as Active Exploitation Confirmed</h3><p>CISA added CVE-2026-42208 to its Known Exploited Vulnerabilities catalog on May 8, 2026, for a pre-auth SQL injection in BerriAI&#8217;s LiteLLM proxy that allows attackers to access the database storing API keys for OpenAI, Anthropic, AWS Bedrock, Google Gemini, and other providers (Windows Forum, CCB Belgium). Affecting LiteLLM 1.81.16 through 1.83.6, the flaw was exploited within 36 hours of disclosure (Sysdig). Federal agencies had until May 11 to patch under BOD 22-01.</p><p><strong>Why it matters</strong></p><ul><li><p>AI gateways consolidate provider API keys with five-figure spend caps in one database</p></li><li><p>A database extraction at an AI proxy is closer to cloud-account compromise than a traditional SQL injection</p></li><li><p>Most LiteLLM deployments were stood up by application teams outside security review</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Inventory every AI proxy and gateway, including shadow deployments</p></li><li><p>Patch LiteLLM to v1.83.10-stable or later, and review Postgres query history for probing</p></li><li><p>Rotate every provider API key managed by an affected instance as a credential compromise response</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>This is the canary I&#8217;ve been warning about. AI gateways became the pattern of choice because they make access to multi-provider models manageable, and they did so without a serious security review. The bug isn&#8217;t exotic, it&#8217;s a 2003-vintage SQL injection. The blast radius is exotic because of what these gateways guard. Federal agencies had three days to patch. Most enterprises will take three weeks and feel proud of moving fast.</p><h3>11. 
The One Thing You Won&#8217;t Hear About But You Need To: Vector Embedding Pipelines Are the Next Enterprise AI Blind Spot</h3><p>While the industry focused on vendor launches this week, the quieter story is that the AI data plane is wide open. Help Net Security published research on May 13, 2026, confirming that vector-embedding pipelines used for retrieval-augmented generation expose enterprise AI to attacks that traditional security tools cannot detect (Help Net Security). DLP tools can&#8217;t read or interpret embeddings, creating a blind spot for sensitive content shipped to embedding services. Spring AI bugs disclosed in late April included SQL injection in CosmosDBVectorStore, confirming vector store backends inherit traditional database vulnerability classes without the same control maturity.</p><p><strong>Why it matters</strong></p><ul><li><p>53% of enterprises now use RAG and agentic pipelines, so vector database flaws affect most enterprise AI deployments</p></li><li><p>Sensitive content gets converted to embeddings and shipped to third-party services where DLP cannot inspect in transit</p></li><li><p>Multi-tenant vector databases create cross-tenant exposure paths that mirror early cloud storage failures of 2015</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Inventory every vector database, including SaaS embedding services you didn&#8217;t approve</p></li><li><p>Apply integrity checks and access controls to vector stores at the same maturity as primary databases</p></li><li><p>Run hybrid retrieval combining dense vectors with BM25 lexical search to limit poisoned embedding impact</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Vector stores look boring. They&#8217;re glorified key-value databases that happen to hold numerical arrays. Those arrays encode every confidential document your knowledge base ingests, and your security stack treats them as opaque blobs. 
AI security isn&#8217;t a model problem, it&#8217;s a data plane problem. The first major enterprise AI breach in the next twelve months will trace back to a vector store nobody inventoried, an embedding service nobody reviewed, or an agent nobody scoped. The defenders who win are the ones treating their AI pipeline like their CI/CD pipeline. Visit <a href="https://rockcybermusings.com/">rockcybermusings.com</a> for deeper coverage and <a href="https://www.rockcyber.com/">rockcyber.com</a> for advisory work on governance programs that survive contact with production AI.</p><p>&#128073; For ongoing analysis of agentic AI governance frameworks, the conversation continues at <strong><a href="https://rockcybermusings.com/">RockCyber Musings</a></strong>.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help with your traditional Cybersecurity and AI Security and Governance journey.</p><p>&#128073; Want to save a quick $100K? 
Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><p>&#128073; As a bonus, check out my conversation with <strong><a href="https://www.linkedin.com/company/cisotradecraft/">CISO Tradecraft&#174;</a> </strong>where we talked about the <strong><a href="https://www.linkedin.com/company/owasp-top-10-for-large-language-model-applications/">OWASP GenAI Security Project</a></strong> Agentic Top 10</p><div id="youtube2-YI7KZ2R54aI" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;YI7KZ2R54aI&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/YI7KZ2R54aI?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>&#128073; Subscribe for more AI and cyber insights with the occasional rant.</p><p><em>The views and opinions expressed in RockCyber Musings are my own and do not represent the positions of my employer or any organization I&#8217;m affiliated with.</em></p><h2>References</h2><p>Aembit. (2026). <em>MCP security vulnerabilities: Complete guide for 2026</em>. https://aembit.io/blog/the-ultimate-guide-to-mcp-security-vulnerabilities/</p><p>Air Street Press. (2026, May). <em>State of AI: May 2026</em>. https://press.airstreet.com/p/state-of-ai-may-2026</p><p>Associated Press. (2026, May 11). OpenAI is sued over ChatGPT&#8217;s alleged role helping plan a mass shooting. <em>AP News</em>. https://apnews.com/article/openai-chatgpt-lawsuit-mass-shooting-florida-1a8071ee49ad0220348d3eb55f60e648</p><p>Bishop, T. (2026, May 13). Microsoft&#8217;s multi-agent AI system tops Anthropic&#8217;s Mythos on cybersecurity benchmark. <em>GeekWire</em>. 
https://www.geekwire.com/2026/microsofts-multi-agent-ai-system-tops-anthropics-mythos-on-cybersecurity-benchmark/</p><p>Centre for Cybersecurity Belgium. (2026, May 13). <em>Warning: LiteLLM pre-auth SQL injection (CVE-2026-42208), patch immediately!</em> https://ccb.belgium.be/advisories/warning-litellm-pre-auth-sql-injection-cve-2026-42208-patch-immediately</p><p>Cybersecurity Dive. (2026, May 11). OpenAI launches Daybreak to combat cyber threats. https://www.cybersecuritydive.com/news/OpenAI-Daybreak-cyber-threats/820122/</p><p>Cygnus. (2026, May 11). Google reports first AI-generated zero-day exploit in cybersecurity milestone. <em>Domain-b</em>. https://www.domain-b.com/technology/artificial-intelligence/google-ai-zero-day-exploit-cybersecurity-2026</p><p>DataGuidance. (2026, May 8). EU: Commission opens consultation on draft AI Act transparency guidelines under Article 50. https://www.dataguidance.com/news/eu-commission-opens-consultation-draft-ai-act</p><p>European Commission. (2026, May 8). <em>Commission opens consultation on draft guidelines for AI transparency obligations</em>. https://digital-strategy.ec.europa.eu/en/news/commission-opens-consultation-draft-guidelines-ai-transparency-obligations</p><p>Forbes. (2026, May 12). OpenAI Daybreak takes on Mythos to redefine security. https://www.forbes.com/sites/timkeary/2026/05/12/openai-daybreak-goes-head-to-head-with-anthropic-to-redefine-security/</p><p>French, L. (2026, May 13). OpenAI Daybreak joins growing movement of AI-driven vulnerability discovery. <em>SC World</em>. https://www.scworld.com/news/openai-daybreak-joins-growing-movement-of-ai-driven-vulnerability-discovery</p><p>Help Net Security. (2026, May 13). <em>Microsoft&#8217;s agentic security system found four critical Windows RCE flaws</em>. https://www.helpnetsecurity.com/2026/05/13/microsoft-mdash-agentic-ai-security-system/</p><p>Kim, T. (2026, May 12). 
Defense at AI speed: Microsoft&#8217;s new multi-model agentic security system tops leading industry benchmark. <em>Microsoft Security Blog</em>. https://www.microsoft.com/en-us/security/blog/2026/05/12/defense-at-ai-speed-microsofts-new-multi-model-agentic-security-system-tops-leading-industry-benchmark/</p><p>Lakshmanan, R. (2026, May 12). OpenAI launches Daybreak for AI-powered vulnerability detection and patch validation. <em>The Hacker News</em>. https://thehackernews.com/2026/05/openai-launches-daybreak-for-ai-powered.html</p><p>Lakshmanan, R. (2026, May 13). Microsoft&#8217;s MDASH AI system finds 16 Windows flaws fixed in Patch Tuesday. <em>The Hacker News</em>. https://thehackernews.com/2026/05/microsofts-mdash-ai-system-finds-16.html</p><p>European Commission. (2026, May 8). <em>Commission publishes second draft of Code of Practice on Marking and Labelling of AI-generated content</em>. https://digital-strategy.ec.europa.eu/en/library/commission-publishes-second-draft-code-practice-marking-and-labelling-ai-generated-content</p><p>Inside Global Tech. (2026, May 12). <em>10 takeaways: European Commission draft guidelines on AI transparency under the EU AI Act</em>. https://www.insideglobaltech.com/2026/05/12/10-takeaways-european-commission-draft-guidelines-on-ai-transparency-under-the-eu-ai-act/</p><p>Skadden. (2026, May). <em>AI Act state of play &#8211; Key obligations postponed and amended</em>. https://www.skadden.com/insights/publications/2026/05/ai-act-state-of-play</p><p>Medianama. (2026, May 12). India pushes for sovereign control over AI cybersecurity systems: Report. https://www.medianama.com/2026/05/223-india-pushes-sovereign-control-ai-cybersecurity-systems-report/</p><p>O&#8217;Brien, M. (2026, May 11). &#8216;It&#8217;s here&#8217;: Google issues dire warning after catching hackers using AI to break into computers. <em>Fortune</em>. 
https://fortune.com/2026/05/11/google-catches-hackers-cybersecurity-warning-ai-anthropic-mythos/</p><p>Open Source For You. (2026, May 12). Cisco launches open-source Foundry Security Spec to tackle AI-driven cyber threats. https://www.opensourceforu.com/2026/05/cisco-launches-open-source-foundry-security-spec-to-tackle-ai-driven-cyber-threats/</p><p>Repello. (2026, May 2). <em>Vector embedding security: Why static audits miss the real attacks</em>. https://repello.ai/blog/vector-embedding-security</p><p>Reuters. (2026, May 11). Family of Florida mass shooting victim sues OpenAI in US court. https://www.reuters.com/legal/government/family-florida-mass-shooting-victim-sues-openai-us-court-2026-05-11/</p><p>SMBtech. (2026, May 12). Cisco open-sources specification for building AI-powered security evaluation systems. https://smbtech.au/news/cisco-open-sources-specification-for-building-ai-powered-security-evaluation-systems/</p><p>Sysdig. (2026). <em>CVE-2026-42208: Targeted SQL injection against LiteLLM&#8217;s authentication path discovered 36 hours following vulnerability disclosure</em>. https://www.sysdig.com/blog/cve-2026-42208-targeted-sql-injection-against-litellms-authentication-path-discovered-36-hours-following-vulnerability-disclosure</p><p>Taylor Wessing. (2026, May). <em>The EU Digital Omnibus on AI &#8211; What the political deal means</em>. https://www.taylorwessing.com/en/insights-and-events/insights/2026/05/the-eu-digital-omnibus-on-ai-what-the-political-deal-means</p><p>Techzine. (2026, May 12). Cisco open-sources Foundry Security Spec for CISO-ready agents. https://www.techzine.eu/news/security/141257/cisco-open-sources-foundry-security-spec-for-ciso-ready-agents/</p><p>The Record. (2026, May 13). Microsoft on pace to break annual vulnerability record as AI-driven patch wave takes hold. https://therecord.media/microsoft-on-pace-to-break-annual-vulnerability-record-ai</p><p>US News &amp; World Report. (2026, May 11). 
Lawsuit blames ChatGPT maker OpenAI for bot helping plan a mass shooting. https://www.usnews.com/news/best-states/california/articles/2026-05-11/lawsuit-blames-chatgpt-maker-openai-for-bot-helping-plan-a-mass-shooting</p><p>Windows Forum. (2026, May 8). <em>CISA adds LiteLLM SQL injection CVE-2026-42208 to KEV&#8212;AI proxies are high-value</em>. https://windowsforum.com/threads/cisa-adds-litellm-sql-injection-cve-2026-42208-to-kev-ai-proxies-are-high-value.417219/</p>]]></content:encoded></item><item><title><![CDATA[Five Eyes Agentic AI Guidance: Architecture, Not a Checklist]]></title><description><![CDATA[Five Eyes published agentic AI architecture, not a checklist. See how AAGATE maps the controls to NIST AI RMF for production governance.]]></description><link>https://www.rockcybermusings.com/p/five-eyes-agentic-ai-architecture-not-checklist</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/five-eyes-agentic-ai-architecture-not-checklist</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Tue, 12 May 2026 12:50:53 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!eJuc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51ec1c04-cf22-45f5-9b7b-3d42d6908af2_2752x1536.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eJuc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51ec1c04-cf22-45f5-9b7b-3d42d6908af2_2752x1536.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!eJuc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51ec1c04-cf22-45f5-9b7b-3d42d6908af2_2752x1536.jpeg 424w, https://substackcdn.com/image/fetch/$s_!eJuc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51ec1c04-cf22-45f5-9b7b-3d42d6908af2_2752x1536.jpeg 848w, https://substackcdn.com/image/fetch/$s_!eJuc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51ec1c04-cf22-45f5-9b7b-3d42d6908af2_2752x1536.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!eJuc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51ec1c04-cf22-45f5-9b7b-3d42d6908af2_2752x1536.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!eJuc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51ec1c04-cf22-45f5-9b7b-3d42d6908af2_2752x1536.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/51ec1c04-cf22-45f5-9b7b-3d42d6908af2_2752x1536.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:330341,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/197088376?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51ec1c04-cf22-45f5-9b7b-3d42d6908af2_2752x1536.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eJuc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51ec1c04-cf22-45f5-9b7b-3d42d6908af2_2752x1536.jpeg 424w, https://substackcdn.com/image/fetch/$s_!eJuc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51ec1c04-cf22-45f5-9b7b-3d42d6908af2_2752x1536.jpeg 848w, https://substackcdn.com/image/fetch/$s_!eJuc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51ec1c04-cf22-45f5-9b7b-3d42d6908af2_2752x1536.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!eJuc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51ec1c04-cf22-45f5-9b7b-3d42d6908af2_2752x1536.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>On May 1, 2026, six allied cyber agencies dropped 30 pages on agentic AI security, and the industry promptly reached for its highlighters. Twenty-three risks and more than a hundred best practices. The initial reflex is to map them to existing controls and call it a project plan. </p><p>WRONG! </p><p>CISA, NSA, ASD, NCSC-UK, NCSC-NZ, and the Cyber Centre published an architecture brief disguised as a guidance document. Read it that way, and the work changes.</p><h2>The Misreading That&#8217;s Happening</h2><p>Pick any board deck circulating right now, and I&#8217;ll bet the Five Eyes guidance shows up as a row in a control matrix (if at all). Privilege controls: check. Identity management: check. Logging: check. Someone in the room nods, the GRC team gets a tracking spreadsheet, and the agentic AI rollout continues at the same pace as before May 1.</p><p>That&#8217;s the failure mode. The document contains 23 distinct risks and over 100 individual best practices to address them. You don&#8217;t bolt 100 practices onto an existing platform without changing its shape, meaning its architecture. Treating a system-level prescription as line-item compliance is how you end up with the &#8220;audit-passes-but-the-thing-is-still-broken&#8221; pattern that plagues us to this day.</p><p>Read the document carefully, and the architectural intent is everywhere. Identity binds to privilege. Privilege binds to tool access. Tool access binds to logging. Logging binds to accountability. Each control assumes the others exist. Each one fails when built alone. 
The agencies named this directly when they recommended system-theoretic approaches like STPA and STPA-Sec, calling out that traditional component-level analysis is insufficient because risks emerge from interactions between components rather than from isolated flaws.</p><p>That single paragraph is the operational thesis. The rest of the document describes how to build for it. A senior security practitioner, reading carefully, will recognize a familiar pattern: this is what happens when policy folks finally accept that you don&#8217;t write a checkbox for emergent risk.</p><p>The question now is what production systems look like when somebody actually does the work. <strong><a href="https://arxiv.org/html/2510.25863">AAGATE is one answer</a>, and we released it last November</strong>.</p><h2>What the Document Actually Says</h2><p>Strip the fluff, and the document organizes around five risk categories:</p><ol><li><p>Privilege risk</p></li><li><p>Design and configuration flaws</p></li><li><p>Behavioral risk</p></li><li><p>Structural risk</p></li><li><p>Accountability risk</p></li></ol><p>The categories aren&#8217;t mutually exclusive. They&#8217;re stacked dependencies.</p><p>Privilege risk is the foundation. The procurement-agent scenario in the guidance is a classic confused-deputy attack. An over-permissioned agent gets compromised through a low-risk tool, the attacker inherits the agent&#8217;s privileges, and modified contracts and approved payments slip past audit logs that look legitimate.</p><p>Design and configuration risk sits atop privilege. Static permission checks at startup don&#8217;t survive dynamic workflows. Allow lists go stale. Boundaries between agent enclaves erode under operational pressure. Behavioral risk piles onto that. Goal misalignment, specification gaming, deceptive behavior, and emergent capabilities all assume the agent has already been granted enough autonomy to act in surprising ways.</p><p>Structural risk is where it gets interesting. 
The agencies describe cascading failures across orchestration layers, tool integrations, third-party components, agent-to-agent communication, and shared data stores. A single rogue agent in a multi-agent system corrupts consensus, spreads incorrect information, alters logs, and propagates malicious plans peer-to-peer. None of this is fixable at the agent level alone.</p><p>Accountability risk closes the loop. Decisions made through long reasoning chains, stochastic outputs, and emergent multi-agent interactions are difficult to audit, attribute, or reproduce. The agencies reach for cryptographic identity, comprehensive artifact logging, and unified audit logs across inter-agent interactions. They&#8217;re describing a system property, not a feature you purchase.</p><h2>AAGATE Maps the Architecture to NIST AI RMF</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3hg3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1488fd6a-9a36-4b50-86ec-7475506e1e24_2298x2220.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3hg3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1488fd6a-9a36-4b50-86ec-7475506e1e24_2298x2220.png 424w, https://substackcdn.com/image/fetch/$s_!3hg3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1488fd6a-9a36-4b50-86ec-7475506e1e24_2298x2220.png 848w, https://substackcdn.com/image/fetch/$s_!3hg3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1488fd6a-9a36-4b50-86ec-7475506e1e24_2298x2220.png 1272w, 
https://substackcdn.com/image/fetch/$s_!3hg3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1488fd6a-9a36-4b50-86ec-7475506e1e24_2298x2220.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3hg3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1488fd6a-9a36-4b50-86ec-7475506e1e24_2298x2220.png" width="1456" height="1407" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1488fd6a-9a36-4b50-86ec-7475506e1e24_2298x2220.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1407,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:260551,&quot;alt&quot;:&quot;Architecture diagram mapping the five Five Eyes risk categories to the four NIST AI RMF functions and the corresponding AAGATE control modules&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/197088376?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1488fd6a-9a36-4b50-86ec-7475506e1e24_2298x2220.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Architecture diagram mapping the five Five Eyes risk categories to the four NIST AI RMF functions and the corresponding AAGATE control modules" title="Architecture diagram mapping the five Five Eyes risk categories to the four NIST AI RMF functions and the corresponding AAGATE control modules" 
srcset="https://substackcdn.com/image/fetch/$s_!3hg3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1488fd6a-9a36-4b50-86ec-7475506e1e24_2298x2220.png 424w, https://substackcdn.com/image/fetch/$s_!3hg3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1488fd6a-9a36-4b50-86ec-7475506e1e24_2298x2220.png 848w, https://substackcdn.com/image/fetch/$s_!3hg3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1488fd6a-9a36-4b50-86ec-7475506e1e24_2298x2220.png 1272w, https://substackcdn.com/image/fetch/$s_!3hg3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1488fd6a-9a36-4b50-86ec-7475506e1e24_2298x2220.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Figure 1: Five Eyes risk categories mapped to NIST AI RMF and AAGATE modules</figcaption></figure></div><p>AAGATE is a Kubernetes-native control plane built to operationalize the NIST AI Risk Management Framework against agentic AI systems. The paper, which I co-authored with Ken Huang, Hammad Atta, and a research team, was published to arXiv in late 2025. It picks NIST AI RMF as the spine because the RMF&#8217;s four functions, Govern, Map, Measure, and Manage, are general enough to absorb the Five Eyes prescriptions without forcing translation. The novelty isn&#8217;t the alignment to RMF. The novelty is the prescriptive toolchain: MAESTRO for Map, OWASP AIVSS plus SEI SSVC for Measure, the CSA Agentic AI Red Teaming Guide for Manage, and a zero-trust service mesh anchoring Govern.</p><p>What follows is the mapping the Five Eyes document points at without naming. Five control areas. Each one shows what the architecture looks like when you stop treating the guidance as a checklist.</p><h2>1. Identity-Anchored Privilege (Govern + Map)</h2><p>The Five Eyes document spends real ink on this. It tells developers to construct each agent as a distinct principal with its own cryptographically anchored identity and unique keys or certificates, to authenticate every inter-agent and agent-to-service API call with mutual TLS, and to maintain a trusted registry that&#8217;s reconciled against the live set of agents. It tells operators to use just-in-time credentials, cryptographic attestation, and a centralized policy decision point that runs at every request.</p><p>Those aren&#8217;t five different controls. 
They&#8217;re one architecture.</p><p>AAGATE&#8217;s Agent Naming Service builds it. ANS works like DNS for agents. When a new agent starts, it registers its Decentralized Identifier and capabilities, and the service issues a Verifiable Credential along with an Istio SPIFFE certificate that binds the pod&#8217;s identity to its cryptographic DID. Other agents resolve through the registry. Anything not in the registry gets denied. Istio mTLS authenticates every pod-to-pod call with X.509 certificates. The OAuth Relay translates abstract agent capabilities into ephemeral, narrowly scoped credentials for each side-effect, which is the only practical way to enforce least privilege when traditional user-centric consent models break down.</p><p>Build any one of those pieces without the others, and the system collapses. A registry without mTLS is unauthenticated. mTLS without ephemeral credentials still leaks long-lived tokens. Ephemeral credentials without a registry have no verification path at issuance. The Five Eyes guidance lists these as separate best practices. AAGATE shows why they&#8217;re one control.</p><p>This is also why CISOs aren&#8217;t the only audience for this work. Identity engineers, IAM architects, platform teams, and product leaders need to read it. The org chart that ships agentic AI safely is wider than the security team&#8217;s mailing list.</p><h2>2. 
The Single Chokepoint for Side-Effects (Map)</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!If_O!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e2e00eb-4d5c-43b9-83e9-24455021692f_2352x666.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!If_O!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e2e00eb-4d5c-43b9-83e9-24455021692f_2352x666.png 424w, https://substackcdn.com/image/fetch/$s_!If_O!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e2e00eb-4d5c-43b9-83e9-24455021692f_2352x666.png 848w, https://substackcdn.com/image/fetch/$s_!If_O!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e2e00eb-4d5c-43b9-83e9-24455021692f_2352x666.png 1272w, https://substackcdn.com/image/fetch/$s_!If_O!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e2e00eb-4d5c-43b9-83e9-24455021692f_2352x666.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!If_O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e2e00eb-4d5c-43b9-83e9-24455021692f_2352x666.png" width="1456" height="412" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2e2e00eb-4d5c-43b9-83e9-24455021692f_2352x666.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:412,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:105036,&quot;alt&quot;:&quot;Architecture diagram showing agents routing all external actions through a single Tool-Gateway with Rego policy enforcement, immutable audit logging, and a Janus shadow monitor probing each request before egress&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/197088376?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e2e00eb-4d5c-43b9-83e9-24455021692f_2352x666.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Architecture diagram showing agents routing all external actions through a single Tool-Gateway with Rego policy enforcement, immutable audit logging, and a Janus shadow monitor probing each request before egress" title="Architecture diagram showing agents routing all external actions through a single Tool-Gateway with Rego policy enforcement, immutable audit logging, and a Janus shadow monitor probing each request before egress" srcset="https://substackcdn.com/image/fetch/$s_!If_O!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e2e00eb-4d5c-43b9-83e9-24455021692f_2352x666.png 424w, https://substackcdn.com/image/fetch/$s_!If_O!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e2e00eb-4d5c-43b9-83e9-24455021692f_2352x666.png 848w, 
https://substackcdn.com/image/fetch/$s_!If_O!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e2e00eb-4d5c-43b9-83e9-24455021692f_2352x666.png 1272w, https://substackcdn.com/image/fetch/$s_!If_O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e2e00eb-4d5c-43b9-83e9-24455021692f_2352x666.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Figure 2: Tool-Gateway as the single chokepoint pattern</figcaption></figure></div><p>The guidance recommends 
restricting tool use to an approved allow list of tools and versions, logging agent tool usage so results are captured in system logs in human-readable format, and establishing trigger-action protocols that automatically restrict agent permissions when unexpected behavior emerges. It tells you to harmonize the controls with the NIST Zero Trust Architecture.</p><p>The architectural pattern that satisfies all three is one many readers will resist: a single chokepoint. Every external HTTP, database, or file operation funnels through one gateway. The mesh denies egress everywhere else. Bypass attempts get quarantined.</p><p>AAGATE&#8217;s Tool-Gateway is exactly that. Agents publish requests to a Kafka topic. The Tool-Gateway consumes them, checks policies (allow lists, rate limits, scope), executes the action if permitted, and logs the request and response with a cryptographic hash to an immutable audit log. The Governing-Orchestrator Agent watches for bypass attempts and quarantines the offender on detection. OAuth2 token exchange happens at the Gateway, with refresh tokens never leaving its memory vault.</p><p>A single chokepoint smells like a bottleneck to anyone raised on horizontally scaled architectures. That intuition is wrong here. The Tool-Gateway is the place where agentic AI security becomes tractable. Threat modeling becomes tractable because you have one comprehensive map of every system interaction. Audit becomes tractable because logs are unified. Policy enforcement becomes tractable because the decision point is centralized. Eliminate the chokepoint and you&#8217;ve spread the same controls across hundreds of agent-tool integrations, none of which will be enforced consistently. Distributed systems engineers will object. Reality wins.</p><h2>3. Continuous Measurement (Measure)</h2><p>The Five Eyes document repeats itself when it talks about monitoring. Use multiple independent monitoring systems that cross-validate. 
Monitor agent operations, including internal processes, not the inputs and outputs alone. Watch for goal drift by comparing active objectives against approved baseline specifications. Establish anomaly detection that flags discrepancies between stated intentions and observed behavior. Implement runtime monitoring with rules or behavioral baselines.</p><p>Reading those passages, I count at least six distinct signal types the document expects you to collect, score, and respond to in real time. Treating them as line items is how you end up with six dashboards nobody watches.</p><p>AAGATE compresses them into one pipeline. The UEBA Behavior Profiler builds per-agent behavioral fingerprints with anomaly scores, using Isolation Forest and Markov chains. The ComplianceAgent streams events from the Tool-Gateway, applies Rego policy checks plus LLM-driven reasoning, and generates security signals scored using OWASP&#8217;s AI Vulnerability Scoring System. The Governing-Orchestrator Agent consumes the scored incidents through an SSVC-inspired decision tree to choose a proportionate response. QSAF monitors run alongside, watching for cognitive degradation: recursion loops, memory starvation, context flooding, the kinds of failure modes invisible to conventional anomaly detection.</p><p>The pattern is Observe, Score with AIVSS, Decide with SSVC, Act, then Learn. None of those steps are optional. Skip the scoring, and you have no priority order. Skip the decision tree, and you have no machine-speed response. Skip the cognitive monitors, and you miss a class of failures the document explicitly names.</p><p>Most enterprises run anomaly detection bolted onto SIEM. That doesn&#8217;t work for agents. The signals are different, the response cadence is different, and the decision logic has to be co-located with the policy decision point, not 20 minutes downstream.</p><h2>4. 
Adversarial Defense and the Kill Switch (Manage)</h2><p>The guidance tells operators to conduct red teaming exercises, deploy agent simulation tests including multi-agent red teaming or chaos testing, embed agentic AI systems with fail-safe defaults and containment mechanisms that limit blast radius, and implement runtime monitoring and anomaly detection to identify unusual patterns and trigger alerts or pauses.</p><p>Periodic red teaming doesn&#8217;t keep up with agents acting at machine speed. Manual chaos exercises don&#8217;t catch a goal-drift event occurring in the 17 seconds between human review cycles. The guidance is asking for continuous adversarial pressure with millisecond response time. That&#8217;s a runtime architecture, not a quarterly engagement.</p><p>AAGATE answers it with the Janus Shadow-Monitor-Agent. Janus runs in a different container with a different model, re-evaluating every high-risk agent&#8217;s planned actions before execution. Goal manipulation, hallucination exploitation, and drift from baseline specifications get caught pre-execution rather than post-mortem. When Janus raises a CRITICAL alert, the Governing-Orchestrator Agent injects an Istio AuthorizationPolicy that cuts all egress. AAGATE calls it the millisecond kill switch because that&#8217;s the time window the system operates in.</p><p>The pairing matters. A continuous internal red team without a kill switch is as useless as a bucket with a gaping hole in the bottom. A kill switch without continuous red teaming has nothing to act on. Five Eyes named both controls separately. AAGATE shows why they&#8217;re the same control.</p><p>This is also where the OT crowd should pay attention. The guidance recommends defense-in-depth and continuous evaluation. In OT contexts, that translates directly to &#8220;you don&#8217;t roll back a physical actuator.&#8221; Containment has to happen before the action, not after.</p><h2>5. 
Tamper-Evident Accountability (Govern)</h2><p>The accountability section of the guidance is the hardest one. The agencies want comprehensive artifact logging, unified audit logs for inter-agent interactions, interpretability tools that surface reasoning, and information referencing that shows where outputs originated. They&#8217;re describing what Article 12 of the EU AI Act calls automatic recording of events, plus what auditors call evidence of effective control operation. Whether and when the EU AI Act actually goes into effect is another conversation altogether&#8230;</p><p>Conventional logging breaks down here. Long reasoning chains generate massive logs that are repetitive and loosely structured. The Five Eyes document is blunt: traditional logs make it even more challenging to extract meaningful signals. Accountability fails not because the data isn&#8217;t recorded, but because nobody can prove it wasn&#8217;t tampered with after the fact.</p><p>AAGATE&#8217;s answer combines three patterns. Cryptographic hashes on every Tool-Gateway request and response give you tamper-evidence at the unit level. The optional ETHOS ledger integration mirrors agent registrations and material governance events to a public smart contract, creating a tamper-proof record of agent identity and status. The ZK-Prover service hashes logs hourly and posts Groth16 zero-knowledge proofs on-chain. Those proofs show that incidents stayed within the contract-tier budget, which gives you privacy-preserving compliance assurance without exposing operational data.</p><p>Argue with the on-chain pieces if you want. They&#8217;re optional in single-tenant deployments, and the AAGATE paper says so explicitly. The cryptographic hashing isn&#8217;t optional. If your accountability model doesn&#8217;t prove logs weren&#8217;t altered after the fact, you don&#8217;t have accountability. You have hope.</p><h2>What This Means Going Forward</h2><p>The Five Eyes document changes the burden of proof. 
Boards, regulators, and acquirers now have a coordinated multi-government statement naming architecture-level controls as the floor, not the ceiling. &#8220;Until security practices, evaluation methods and standards mature, organisations should assume that agentic AI systems may behave unexpectedly.&#8221; That sentence will undoubtedly show up in due diligence questionnaires.</p><p>If you&#8217;re operating agentic AI today, you have two choices. </p><ul><li><p><strong>Option one:</strong> take the line-item path, map controls to a tracking spreadsheet, and ship 100 separate workstreams that someone else&#8217;s auditor will pull apart in 18 months. </p></li><li><p><strong>Option two:</strong> read the guidance as an architectural prescription, pick a reference build like AAGATE, and treat your agentic security work as a platform engineering problem rather than a compliance problem.</p></li></ul><p>I know which one I&#8217;d present to a board.</p><p><strong>Key Takeaway:</strong> The Five Eyes guidance describes a system property, not a checklist, and compliance follows from architecture rather than the other way around. AAGATE provides that reference architecture.</p><h3>What to do next</h3><p>If your agentic AI program is more than a pilot, audit it against the five risk categories now and look for the architectural gaps the line-item view will hide. The CARE framework I use for AI-augmented security programs lays out how to sequence Create, Adapt, Run, and Evolve work without burning out the platform team. For the technical reference, read the <strong><a href="https://arxiv.org/abs/2510.25863">AAGATE paper on arXiv</a></strong> and treat it as a reference architecture rather than a finished product. 
If you want help mapping current state to the Five Eyes prescriptions and a NIST AI RMF aligned target architecture, <a href="https://rockcyber.com">RockCyber</a> does this work with security and engineering leadership across critical infrastructure and financial services. For more posts like this, <a href="https://rockcybermusings.substack.com">RockCyber Musings</a> lands in your inbox roughly once a week.</p><p>&#128073; For ongoing analysis of agentic AI governance frameworks, the conversation continues at <strong><a href="https://rockcybermusings.com/">RockCyber Musings</a></strong>.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help with your traditional Cybersecurity and AI Security and Governance journey.</p><p>&#128073; Want to save a quick $100K? Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><p>&#128073; As a bonus, check out my <a href="https://www.youtube.com/watch?v=YI7KZ2R54aI">conversation</a> with <strong><a href="https://www.linkedin.com/company/cisotradecraft/">CISO Tradecraft&#174;</a>, </strong>where we talked about the <strong><a href="https://www.linkedin.com/company/owasp-top-10-for-large-language-model-applications/">OWASP GenAI Security Project</a></strong> Agentic Top 10</p><div id="youtube2-YI7KZ2R54aI" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;YI7KZ2R54aI&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/YI7KZ2R54aI?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>&#128073; Subscribe for more AI security and governance insights with the occasional rant.</p>]]></content:encoded></item><item><title><![CDATA[Weekly Musings Top 10 AI Security Wrapup: Issue 37 May 1-May 7, 2026 ]]></title><description><![CDATA[The Week Governments Decided Agentic AI Needs Adult Supervision]]></description><link>https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security20260601-20260507</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security20260601-20260507</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Fri, 08 May 2026 12:51:18 GMT</pubDate><enclosure 
url="https://substackcdn.com/image/fetch/$s_!qS69!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6305e3f1-0d1c-4f6e-a9c8-a8f91a403ce7_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qS69!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6305e3f1-0d1c-4f6e-a9c8-a8f91a403ce7_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qS69!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6305e3f1-0d1c-4f6e-a9c8-a8f91a403ce7_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!qS69!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6305e3f1-0d1c-4f6e-a9c8-a8f91a403ce7_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!qS69!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6305e3f1-0d1c-4f6e-a9c8-a8f91a403ce7_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!qS69!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6305e3f1-0d1c-4f6e-a9c8-a8f91a403ce7_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qS69!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6305e3f1-0d1c-4f6e-a9c8-a8f91a403ce7_1024x1024.png" width="1024" height="1024" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6305e3f1-0d1c-4f6e-a9c8-a8f91a403ce7_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1233556,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/196850383?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6305e3f1-0d1c-4f6e-a9c8-a8f91a403ce7_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qS69!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6305e3f1-0d1c-4f6e-a9c8-a8f91a403ce7_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!qS69!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6305e3f1-0d1c-4f6e-a9c8-a8f91a403ce7_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!qS69!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6305e3f1-0d1c-4f6e-a9c8-a8f91a403ce7_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!qS69!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6305e3f1-0d1c-4f6e-a9c8-a8f91a403ce7_1024x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div>
<p>This was the week the supervisors stopped asking permission. Five Eyes intelligence agencies, the Pentagon, the Commerce Department, and ServiceNow all converged on the same conclusion at nearly the same time. Agentic AI is shipping without brakes, the brakes need to be added now, and nobody has a clean answer for who pays. Brussels blinked. Washington floated an FDA-style gate for frontier models. Researchers kept finding holes in the plumbing under every AI agent your developers are racing to deploy.</p><p>The pattern was governance catching up to deployment. Three governments and a $200 billion software company echoed what the security crowd has been saying since GPT-4 shipped. You bought the speedboat and forgot the kill switch. Below are the ten stories that mattered between Friday, May 1, and Thursday, May 7, 2026, plus one you missed.</p><h3>1. 
Five Eyes Drop Joint Agentic AI Guidance</h3><p>CISA, the NSA, Australia&#8217;s ASD ACSC, the Canadian Centre for Cyber Security, the UK&#8217;s NCSC, and New Zealand&#8217;s NCSC released &#8220;<a href="https://www.cisa.gov/resources-tools/resources/careful-adoption-agentic-ai-services">Careful Adoption of Agentic Artificial Intelligence (AI) Services</a>&#8221; (CISA, 2026). The document identifies five risk categories: privilege; design and configuration; behavior, including goal misalignment and deception; structural risks across interconnected components; and accountability risks rooted in opacity. The Register summarized the message bluntly: agentic AI is too dangerous for rapid rollout (Brandon, 2026).</p><p><strong>Why it matters</strong></p><ul><li><p>Six agencies from all five Five Eyes nations speaking together set a baseline for procurement, audit, and insurance underwriting across the English-speaking world.</p></li><li><p>The guide pressures vendors selling fully autonomous agents by recommending incremental deployment and human oversight.</p></li><li><p>Critical infrastructure operators gain a defensible reference document when business units demand agent rollouts in days.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Map every deployed agent against the five risk categories and grade each honestly.</p></li><li><p>Require attestation against this guide in procurement language for agentic capabilities.</p></li><li><p>Brief your board this quarter on how the guidance changes your residual risk posture.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Five Eyes guidance is rare enough to mean something. When agencies that attribute nation-state intrusions speak with one voice, treat it as a soft mandate. The privilege risks section reads like a list of incidents I have seen at clients in the last twelve months. Stop deploying autonomy on top of access models you built for humans.</p><h3>2. 
EU Strikes Provisional Deal to Delay Core AI Act Obligations</h3><p>On May 7, 2026, after roughly nine hours of negotiation, the Council of the EU and the European Parliament reached provisional agreement on the Digital Omnibus on AI (Lewis Silkin, 2026). High-risk obligations under Annex III now apply from December 2, 2027. Annex I obligations apply from August 2, 2028. The transparency grace period for AI-generated content shrinks from six months to three, with a deadline of December 2, 2026 (Modulos, 2026).</p><p><strong>Why it matters</strong></p><ul><li><p>The narrative that the EU is the world&#8217;s strictest AI regulator took a real hit, with industry pressure winning a delay measured in years.</p></li><li><p>Companies that scrambled for Annex III readiness by August 2026 spent their budget on a deadline that no longer exists.</p></li><li><p>The shortened transparency window makes deepfake labeling the most urgent compliance work of the year for consumer-facing AI.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Reset your AI Act program plan against the new deadlines and brief your audit committee on the freed-up budget.</p></li><li><p>Accelerate transparency labeling on generative output exposed to EU users by Q3 2026.</p></li><li><p>Watch the Council and Parliament endorsement votes because the deal can still shift.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>I told three clients in 2025 that betting on the original Annex III timeline was a coin flip. The coin landed on delay. The AI Act isn&#8217;t dead, but Brussels learned the lesson California learned with CCPA. With Brussels stretching its timeline, the White House gains room to argue that federal preemption beats a state patchwork. Bet on more state attorneys general filling the gap with UDAP actions before December.</p><h3>3. 
Pentagon Clears Eight Vendors for AI on Classified Networks</h3><p>The Department of War announced agreements with AWS, Google, Microsoft, NVIDIA, OpenAI, SpaceX, and Reflection AI, with Oracle added shortly after, to deploy AI tools on Impact Level 6 and Impact Level 7 networks (Breaking Defense, 2026). Those impact levels cover secret-classified and the most highly classified Defense systems. Anthropic was conspicuously absent, despite Claude already running inside Palantir&#8217;s Maven Smart System on classified networks (TechCrunch, 2026).</p><p><strong>Why it matters</strong></p><ul><li><p>Defense AI procurement consolidated around eight vendors, with Anthropic frozen out despite a working production deployment.</p></li><li><p>IL-7 deployments mean general-purpose models will reason over the most sensitive U.S. government data, with limited public visibility into evaluation rigor.</p></li><li><p>Defense contractors and integrators have a vendor shortlist that will shape program decisions for the next five years.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>If you sell into DoD, align your AI roadmap with these eight vendors.</p></li><li><p>If you advise federal agencies, push for transparency on red-team results before production at IL-6 and IL-7.</p></li><li><p>Expect this vendor list in prime contractor solicitations within a quarter.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Commercial AI is now inseparable from national security infrastructure. Eight vendors. Two impact levels. Decisions that will shape how the U.S. military thinks, plans, and fights for a decade. Where are the public test results? When the FDA approves a drug, you can read the trial data. When the Pentagon approves a model for IL-7, you cannot. That asymmetry will eventually break.</p><h3>4. 
CAISI Locks Pre-Deployment Testing Deals With Google, Microsoft, and xAI</h3><p>The Center for AI Standards and Innovation announced agreements on May 5, 2026 that allow the U.S. government to evaluate frontier AI models from Google, Microsoft, and xAI before public release (CNBC, 2026). The deals expand a program that already included OpenAI and Anthropic, with the older agreements renegotiated to align with America&#8217;s AI Action Plan (Al Jazeera, 2026). The arrangements remain voluntary.</p><p><strong>Why it matters</strong></p><ul><li><p>Five frontier labs now run pre-deployment evaluations through one federal channel, creating a de facto standard for &#8220;tested&#8221; at the top of the AI supply chain.</p></li><li><p>Voluntary agreements give the government influence without legislation.</p></li><li><p>Smaller and open-source providers face an emerging market expectation they can&#8217;t match.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Add CAISI evaluation status to vendor risk questionnaires for frontier model dependencies.</p></li><li><p>Track CAISI&#8217;s published evaluation criteria, since they will shape your internal evaluation programs.</p></li><li><p>Treat models without CAISI evaluation as higher inherent risk in supply chain assessments.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Voluntary regulation by reputational pressure is the Trump administration&#8217;s preferred AI playbook. The upside is speed. The downside is that voluntary agreements dissolve when a CEO decides the political winds have shifted. If CAISI becomes the gravitational center for AI evaluation, insurers and enterprise buyers will start citing it in contracts. That is how soft governance becomes hard governance.</p><h3>5. 
ServiceNow Adds AI Agent Kill Switches as the 9-Second Story Goes Mainstream</h3><p>ServiceNow announced on May 5, 2026, at Knowledge 2026, that it has expanded AI Control Tower with real-time pause, redirect, and stop capabilities for any AI agent across the enterprise estate (ServiceNow, 2026). The expansion adds 30 new connectors spanning AWS, Google Cloud, Microsoft Azure, SAP, Oracle, and Workday. CEO Bill McDermott put the marketing message to Fortune in plain English, citing a real incident in which an AI agent gained elevated permissions and deleted a production database, along with all its backups, in nine seconds (Fortune, 2026).</p><p><strong>Why it matters</strong></p><ul><li><p>Selling kill switches as a primary feature validates the security community&#8217;s argument that agentic AI requires runtime governance.</p></li><li><p>The 30-connector expansion makes ServiceNow the de facto governance layer above other clouds and SaaS apps.</p></li><li><p>The 9-second story shifts the default purchasing posture toward &#8220;show me the brakes.&#8221;</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Inventory every AI agent with write access to production systems and document its maximum blast radius in seconds.</p></li><li><p>Require a documented kill switch capability as a procurement gate for any agentic AI vendor.</p></li><li><p>Run a tabletop exercise this quarter where an autonomous agent acts destructively at machine speed.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>I have been waiting for a vendor to put &#8220;kill switch&#8221; on the price list. ServiceNow finally did it. The 9-second story is not hypothetical. Every CISO I know has heard a similar war story from a peer in the last year. A kill switch is only as good as its blast-radius coverage and detection latency. If your agent can do irreversible damage in seconds and your governance layer needs minutes, the kill switch is theater. Test the latency before signing.</p><h3>6. 
White House Floats FDA-Style Gate for Frontier AI</h3><p>National Economic Council Director Kevin Hassett told Bloomberg on May 6, 2026, that the White House is studying an executive order to create a vetting system for new AI models like Anthropic&#8217;s Mythos, comparing the approach to FDA drug evaluation (Bloomberg, 2026). The proposal comes weeks after Anthropic disclosed that Mythos is unusually capable at finding network vulnerabilities, prompting the company to limit access through Project Glasswing (Insurance Journal, 2026).</p><p><strong>Why it matters</strong></p><ul><li><p>An FDA-style gate would mark the first concrete pre-market regulatory framework for frontier AI in the U.S., even if only by executive order.</p></li><li><p>The Mythos disclosure shifts the political center of gravity, with a frontier lab effectively asking for more regulation.</p></li><li><p>Framing AI as public safety reshapes which agencies and committees own the issue.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Track which federal agency the order designates as the gating body, since that agency&#8217;s authorities will determine how real the regime becomes.</p></li><li><p>Prepare your own internal &#8220;model approval&#8221; process now, modeled on how you approve cryptographic libraries.</p></li><li><p>Engage with industry comment processes early, before draft text leaks and positions harden.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The FDA analogy is compelling and imperfect. Drugs have measurable endpoints. AI capability evaluations are partly subjective and dependent on who designed the test. The reason I take this seriously is the political logic. An administration that has emphasized deregulation is signaling it might gate frontier AI at the federal level. If the national security argument has won inside the West Wing, the rest of the Western world will follow within twelve months.</p><h3>7. 
One in Four MCP Servers Carries Code Execution Risk</h3><p>Help Net Security reported on May 5, 2026, that one in four Model Context Protocol servers exposes AI agents to code execution risk through skill-handling and configuration blind spots (Help Net Security, 2026b). The research builds on an OX Security disclosure from April 2026 that covered an architectural choice in Anthropic&#8217;s official MCP SDKs for Python, TypeScript, Java, and Rust, in which STDIO transport executes OS commands without sanitization (VentureBeat, 2026). Vulnerable MCP integrations affect Cursor, VS Code, Windsurf, Claude Code, and Gemini-CLI.</p><p><strong>Why it matters</strong></p><ul><li><p>MCP is the connective tissue between AI agents and enterprise systems, with 150 million downloads and 7,000-plus public servers.</p></li><li><p>A 25% vulnerability rate across the supply chain means most enterprises running MCP-based agents are running known-vulnerable infrastructure now.</p></li><li><p>Anthropic&#8217;s stance that the behavior is &#8220;expected&#8221; leaves customers holding the remediation burden alone.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Inventory MCP servers, including developer workstations, and segment them from sensitive data and production credentials.</p></li><li><p>Force allowlisting on MCP tool calls, with explicit human approval for anything outside the allowlist.</p></li><li><p>Add MCP server compromise to your incident response runbooks.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>MCP is the USB-C of AI agents, and it is shipping with the equivalent of a hot socket. The architectural pattern is fine. The default behavior is dangerous. Treat MCP like browser extensions in a regulated environment. Default deny. Document exceptions. Audit quarterly.</p><h3>8. 
Lenovo Survey Confirms One in Three Employees Use AI Without IT Oversight</h3><p>Lenovo&#8217;s Work Reborn Research Series 2026, surveying 6,000 enterprise workers globally, was reported on May 1, 2026. Between one-fifth and one-third of employees use AI outside IT governance (Help Net Security, 2026a). Almost half of large enterprises in Protiviti&#8217;s AI Pulse Survey 2026 lack full visibility into which AI tools employees use. ISACA&#8217;s 2026 AI Pulse Poll found 38% of organizations report a formal AI policy, up from 28% the prior year.</p><p><strong>Why it matters</strong></p><ul><li><p>Shadow AI is the dominant AI risk category for most enterprises.</p></li><li><p>The gap between employee AI adoption and IT governance is widening faster than policy alone can close it.</p></li><li><p>Generative AI accounts for roughly a third of unauthorized data movement in measured environments.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Deploy DLP controls that recognize generative AI as a defined egress channel, not an undifferentiated browser session.</p></li><li><p>Offer a sanctioned AI tool path that is genuinely useful, because banning AI without alternatives has not worked anywhere.</p></li><li><p>Track AI policy adoption as a KPI alongside traditional security awareness metrics.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>I have watched this story play out several times. Personal email in the 2000s. SaaS in the 2010s. Now AI. Ban the tool. Watch usage go underground. Find the breach. Reverse the ban two years too late. Short-circuit the cycle now. Your highest performers are the ones doing shadow AI work because the sanctioned tools are slower or dumber.</p><h3>9. Researchers Scan One Million Exposed AI Services, Find Default Authentication Off</h3><p>The Hacker News reported a large-scale scan of one million publicly exposed AI services. 
AI infrastructure is more vulnerable, exposed, and misconfigured than any other software category investigators have recently studied (The Hacker News, 2026). Many hosts run without authentication because it is not the default in many AI projects. Over 90 exposed instances were identified across government, marketing, and finance, with chatbots, prompts, workflows, and outward access all open to the public internet.</p><p><strong>Why it matters</strong></p><ul><li><p>Default-open AI infrastructure puts attackers ahead of defenders on basic asset discovery.</p></li><li><p>Government, marketing, and finance exposure shows the problem is not confined to the unregulated long tail of startups.</p></li><li><p>LLM conversation history exposure leaks strategy, contracts, and personal data in ways traditional data leakage models miss.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Treat AI infrastructure like internet-facing crown jewels and harden it accordingly.</p></li><li><p>Run attack surface management scans tuned for AI service fingerprints, including n8n, Flowise, Langflow, and LiteLLM.</p></li><li><p>Make default-deny authentication non-negotiable for any AI workflow touching enterprise data.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>This is the cybersecurity equivalent of finding every front door wide open. The mistake is older than AI. Project maintainers and platform vendors should answer for shipping with authentication disabled by default. Default secure beats secure-by-checklist every time. Until AI projects ship safely, assume the defaults are wrong and configure your way out of them.</p><h3>10. Trellix Discloses Source Code Repository Breach</h3><p>Cybersecurity company Trellix disclosed on May 4, 2026 that it suffered unauthorized access to a portion of its source code repository (BleepingComputer, 2026). Trellix protects more than 50,000 customers and over 200 million endpoints. 
The company says it has found no evidence the source code release process was affected or that the code has been exploited (SecurityWeek, 2026). Trellix has not named the actor or disclosed dwell time.</p><p><strong>Why it matters</strong></p><ul><li><p>A defensive software vendor losing source code ripples through every customer.</p></li><li><p>The breach feeds AI-augmented vulnerability discovery against Trellix products, given how attackers now use LLMs to mine source for exploits.</p></li><li><p>Federal customers will require new attestations on code provenance and pipeline integrity within weeks.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Trellix customers should demand a full incident report covering IOCs, scope of stolen code, and pipeline changes.</p></li><li><p>Audit detection coverage for TTPs that exploit knowledge of the affected products.</p></li><li><p>Treat defensive software vendors as potential single points of failure in your supply chain risk register.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Defensive vendors getting popped is a now-quarterly story. The interesting wrinkle is what an attacker does with stolen source code in the AI era. Two years ago, source theft was slow-burn. Today, an attacker can feed thousands of files into an LLM and ask for likely vulnerability classes in hours. Trellix saying the code has not been exploited is a snapshot, not a guarantee.</p><h3>The One Thing You Won&#8217;t Hear About But You Need To: ARGUS and the Quiet Admission That Today&#8217;s Agent Defenses Don&#8217;t Hold</h3><p>Researchers published the ARGUS paper to arXiv on May 5, 2026. It introduces a benchmark, AgentLure, that captures context-aware prompt-injection attacks across four agentic domains and eight attack vectors, along with a defense mechanism that enforces provenance-aware decision auditing for LLM agents (ARGUS, 2026). ARGUS reduces attack success rate to 3.8% while preserving 87.5% task utility. 
Without provenance-aware controls, undefended agents fail at much higher rates.</p><p><strong>Why it matters</strong></p><ul><li><p>Provenance tracking inside agent reasoning is a real shift from perimeter-style defenses most vendors sell today.</p></li><li><p>Context-aware prompt injection is the dominant unaddressed risk in production agentic deployments.</p></li><li><p>Benchmarks like AgentLure will become reference points enterprise red teams use, much as MITRE ATT&amp;CK reshaped traditional red teaming.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Read the ARGUS paper and use its threat model to evaluate your current agent deployments.</p></li><li><p>Push vendors to publish performance against context-aware benchmarks, not only static jailbreak datasets.</p></li><li><p>Build provenance tracking into your internal agent platforms, even if commercial vendors do not yet support it.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The reason this matters is what it implies about everything else. If 3.8% is the new state of the art with strong defenses in place, the rate without those defenses is much higher. That is the gap most production agents sit in today. Vendor marketing on agent safety has been measured against weak benchmarks for two years. 
Get ahead of the curve, or be the case study in someone else&#8217;s incident report.</p><p>For more on agentic AI risk and CISO governance, see the library at <a href="https://www.rockcyber.com/">RockCyber</a> and analysis at <a href="https://rockcybermusings.com/">RockCyber Musings</a>.</p><p>&#128073; For ongoing analysis of agentic AI governance frameworks, the conversation continues at <strong><a href="https://rockcybermusings.com/">RockCyber Musings</a></strong>.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help with your traditional Cybersecurity and AI Security and Governance journey.</p><p>&#128073; Want to save a quick $100K? Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><p>&#128073; As a bonus, check out my conversation with <strong><a href="https://www.linkedin.com/company/cisotradecraft/">CISO Tradecraft&#174;</a> </strong> where we talked about the <strong><a href="https://www.linkedin.com/company/owasp-top-10-for-large-language-model-applications/">OWASP GenAI Security Project</a></strong> Agentic Top 10</p><div id="youtube2-YI7KZ2R54aI" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;YI7KZ2R54aI&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/YI7KZ2R54aI?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>&#128073; Subscribe for more AI and cyber insights with the occasional rant.</p><p><em>The views and opinions expressed in RockCyber Musings are my own and do not represent the positions of my employer or any organization I&#8217;m affiliated with.</em></p>
<h2>References</h2><p>ARGUS. (2026, May 5). ARGUS: Defending LLM agents against context-aware prompt injection. arXiv. https://arxiv.org/abs/2605.03378</p><p>BleepingComputer. (2026, May 4). Trellix discloses data breach after source code repository hack. https://www.bleepingcomputer.com/news/security/trellix-discloses-data-breach-after-source-code-repository-hack/</p><p>Bloomberg. (2026, May 6). AI security order under review as White House responds to Anthropic&#8217;s Mythos. https://www.bloomberg.com/news/articles/2026-05-06/white-house-preps-order-to-boost-ai-security-hassett-says</p><p>Brandon, R. (2026, May 4). Five Eyes warn agentic AI is too dangerous for rapid rollout. The Register. 
https://www.theregister.com/2026/05/04/five_eyes_agentic_ai_recommendations/</p><p>Breaking Defense. (2026, May 1). Pentagon clears 8 tech firms to deploy their AI on its classified networks. https://breakingdefense.com/2026/05/pentagon-clears-7-tech-firms-to-deploy-their-ai-on-its-classified-networks/</p><p>CISA. (2026, May 1). Careful adoption of agentic AI services. Cybersecurity and Infrastructure Security Agency. https://www.cisa.gov/resources-tools/resources/careful-adoption-agentic-ai-services</p><p>CNBC. (2026, May 5). Trump admin moves further into AI oversight, will test Google, Microsoft and xAI models. https://www.cnbc.com/2026/05/05/ai-oversight-trump-google-microsoft-xai.html</p><p>Al Jazeera. (2026, May 5). Microsoft, Google, xAI give US access to AI models for security testing. https://www.aljazeera.com/economy/2026/5/5/microsoft-google-xai-give-us-access-to-ai-models-for-security-testing</p><p>Fortune. (2026, May 6). Your company&#8217;s AI could delete everything in 9 seconds. ServiceNow wants to be the kill switch. https://fortune.com/2026/05/06/servicenow-kill-switch-ai-agents-bill-mcdermott/</p><p>Help Net Security. (2026a, May 1). Shadow AI risks deepen as 31% of users get no employer training. https://www.helpnetsecurity.com/2026/05/01/shadow-ai-risks-it-oversight/</p><p>Help Net Security. (2026b, May 5). One in four MCP servers opens AI agent security to code execution risk. https://www.helpnetsecurity.com/2026/05/05/ai-agent-security-skills-blind-spots/</p><p>Insurance Journal. (2026, May 7). White House prepares order to boost AI security, says economic advisor. https://www.insurancejournal.com/news/national/2026/05/07/868812.htm</p><p>Lewis Silkin. (2026, May 7). The Council and Parliament agree to slim down and delay parts of the EU AI Act. https://www.lewissilkin.com/insights/2026/05/07/the-council-and-parliament-agree-to-slim-down-and-delay-parts-of-the-eu-ai-act-102ms0v</p><p>Modulos. (2026, May 7). 
EU AI Act delayed: The Omnibus deal closed on 7 May 2026. https://www.modulos.ai/blog/eu-ai-act-omnibus-deal/</p><p>SecurityWeek. (2026, May 4). Trellix source code repository breached. https://www.securityweek.com/trellix-source-code-repository-breached/</p><p>ServiceNow. (2026, May 5). ServiceNow expands AI Control Tower across systems. https://newsroom.servicenow.com/press-releases/details/2026/ServiceNow-expands-AI-Control-Tower-to-discover-observe-govern-secure-and-measure-AI-deployed-across-any-system-in-the-enterprise/default.aspx</p><p>TechCrunch. (2026, May 1). Pentagon inks deals with Nvidia, Microsoft, and AWS to deploy AI on classified networks. https://techcrunch.com/2026/05/01/pentagon-inks-deals-with-nvidia-microsoft-and-aws-to-deploy-ai-on-classified-networks/</p><p>The Hacker News. (2026, May). We scanned 1 million exposed AI services. Here&#8217;s how bad the security is. https://thehackernews.com/2026/05/we-scanned-1-million-exposed-ai.html</p><p>VentureBeat. (2026, April). 200,000 MCP servers expose a command execution flaw that Anthropic calls a feature. https://venturebeat.com/security/mcp-stdio-flaw-200000-ai-agent-servers-exposed-ox-security-audit</p>]]></content:encoded></item><item><title><![CDATA[Open-Weight Models Eat Closed Governance: The Half-Perimeter Problem]]></title><description><![CDATA[Closed-vendor AI governance breaks at the open-weight boundary. Sign the weights, build the runtime perimeter. 
We walk the gap and the build.]]></description><link>https://www.rockcybermusings.com/p/open-weight-models-eat-closed-governance</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/open-weight-models-eat-closed-governance</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Tue, 05 May 2026 12:50:59 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Cg-_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10198b81-d1b9-4c0f-805a-d13961868465_2048x2048.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Cg-_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10198b81-d1b9-4c0f-805a-d13961868465_2048x2048.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Cg-_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10198b81-d1b9-4c0f-805a-d13961868465_2048x2048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Cg-_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10198b81-d1b9-4c0f-805a-d13961868465_2048x2048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Cg-_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10198b81-d1b9-4c0f-805a-d13961868465_2048x2048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Cg-_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10198b81-d1b9-4c0f-805a-d13961868465_2048x2048.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Cg-_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10198b81-d1b9-4c0f-805a-d13961868465_2048x2048.jpeg" width="1456" height="1456" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/10198b81-d1b9-4c0f-805a-d13961868465_2048x2048.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2382594,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/196322332?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10198b81-d1b9-4c0f-805a-d13961868465_2048x2048.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Cg-_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10198b81-d1b9-4c0f-805a-d13961868465_2048x2048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Cg-_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10198b81-d1b9-4c0f-805a-d13961868465_2048x2048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Cg-_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10198b81-d1b9-4c0f-805a-d13961868465_2048x2048.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!Cg-_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10198b81-d1b9-4c0f-805a-d13961868465_2048x2048.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Open-weight reasoning models are landing in enterprise production, and the closed-vendor governance you bought doesn&#8217;t transfer with them. &#8220;Half-perimeter&#8221; is rhetorical; the real number depends on which controls you bought, but the point holds. The day a competent open-weight reasoning model runs on your hardware, the AI-specific governance you bought from your closed vendor stops covering part of the stack. The rest of this post walks the gap and the build.</p><h2>The Vendor&#8217;s Own Words</h2><p>OpenAI shipped gpt-oss-120b and gpt-oss-20b last year. Both are under Apache 2.0, and both are downloadable from Hugging Face. The 120b runs on a single 80GB GPU. In the model card, OpenAI&#8217;s own safety team admits what every CISO should already suspect. Once the weights ship, OpenAI cannot &#8220;implement additional mitigations or to revoke access.&#8221;</p><p>It&#8217;s the model provider&#8217;s own framing. It&#8217;s not me opining. 
Open-weight is a different risk profile from closed-API, by the model provider&#8217;s own assessment. The vendor can&#8217;t patch your inference cluster. The vendor can&#8217;t revoke a key that doesn&#8217;t exist. The vendor can&#8217;t run server-side abuse classifiers on traffic the vendor never sees. Everything that lived on the vendor side of the perimeter now lives on yours.</p><p>This is not a DeepSeek-versus-American-models story. It&#8217;s a closed-API-versus-open-weight story. Llama 3.3 70B (Meta), Qwen 3 32B (Alibaba), Mistral Magistral, and gpt-oss-120b sit on the same side of the boundary. The boundary is wherever the weights stop being someone else&#8217;s problem.</p><h2>What Closed-Vendor Governance Bought You</h2><p>Walk through what was on the bill of materials when you stood up your closed-API AI program. Oh, that&#8217;s right, you never did&#8230; but let&#8217;s pretend you did. You probably evaluated vendor-attested compliance, usually wrapped in a SOC 2 Type II report and a data processing addendum. DLP is integrated at the API gateway, watching prompts in flight. Output filtering runs on the vendor side, refusing to ship CBRN-adjacent content out of the model. Prompt firewall logic is embedded in the vendor SDK and patched without you redeploying. Vendor red teaming is on a continuous cadence. ToS enforcement occurs when an account misbehaves.</p><p>That stack assumed one thing. That a vendor sat on the other end of the inference call. Open-weight self-hosting moves every one of those controls in-house, with no shared customer base to underwrite the cost.</p><p>What does transfer? Network egress controls, identity at the runtime boundary, sandbox isolation, and supply-chain provenance for the model weights and fine-tunes. Notice what those have in common. None of them are AI-specific. They were always there. They&#8217;re the controls you applied to every other service you ran. 
Losing the AI-specific layer doesn&#8217;t break the non-AI controls. It does mean the only thing standing between a self-hosted reasoning model and a bad day is the perimeter you built for everything else.</p><p>Read your closed-vendor MSA carefully. The reps and warranties typically carve out third-party model behavior, hallucinations, and adversarial misuse. The vendor warrants infrastructure availability and indemnifies IP claims. The vendor doesn&#8217;t warrant safe model output. The &#8220;governance&#8221; part of vendor-attested compliance was always thinner than the SOC 2 cover suggested. Self-hosting strips even the thin part.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nYVS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6dcdf46-81a7-4e89-90af-619085c96337_2352x2862.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nYVS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6dcdf46-81a7-4e89-90af-619085c96337_2352x2862.png 424w, https://substackcdn.com/image/fetch/$s_!nYVS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6dcdf46-81a7-4e89-90af-619085c96337_2352x2862.png 848w, https://substackcdn.com/image/fetch/$s_!nYVS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6dcdf46-81a7-4e89-90af-619085c96337_2352x2862.png 1272w, https://substackcdn.com/image/fetch/$s_!nYVS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6dcdf46-81a7-4e89-90af-619085c96337_2352x2862.png 1456w" 
sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nYVS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6dcdf46-81a7-4e89-90af-619085c96337_2352x2862.png" width="1456" height="1772" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a6dcdf46-81a7-4e89-90af-619085c96337_2352x2862.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1772,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:307464,&quot;alt&quot;:&quot;Side-by-side flowchart contrasting where AI-specific controls live in a closed-API stack versus an open-weight self-hosted runtime, showing the customer-side absorbing every AI control after the open-weight boundary.&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/196322332?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6dcdf46-81a7-4e89-90af-619085c96337_2352x2862.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Side-by-side flowchart contrasting where AI-specific controls live in a closed-API stack versus an open-weight self-hosted runtime, showing the customer-side absorbing every AI control after the open-weight boundary." title="Side-by-side flowchart contrasting where AI-specific controls live in a closed-API stack versus an open-weight self-hosted runtime, showing the customer-side absorbing every AI control after the open-weight boundary." 
srcset="https://substackcdn.com/image/fetch/$s_!nYVS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6dcdf46-81a7-4e89-90af-619085c96337_2352x2862.png 424w, https://substackcdn.com/image/fetch/$s_!nYVS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6dcdf46-81a7-4e89-90af-619085c96337_2352x2862.png 848w, https://substackcdn.com/image/fetch/$s_!nYVS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6dcdf46-81a7-4e89-90af-619085c96337_2352x2862.png 1272w, https://substackcdn.com/image/fetch/$s_!nYVS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6dcdf46-81a7-4e89-90af-619085c96337_2352x2862.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Figure 1: Closed-API Stack vs Open-Weight Runtime: Where Controls Live</figcaption></figure></div><h2>Refusal Training Is Now an In-House Problem</h2><p>Vendor refusal training is the AI-specific control most enterprise teams over-trust. The research breaks the over-trust hard.</p><p>The Badllama 3 paper (<a href="https://arxiv.org/abs/2407.01376">arXiv 2407.01376</a>) showed that safety fine-tuning can be removed from Llama 3 8B in five minutes on a single A100 GPU for under fifty cents. The 70B model goes in 45 minutes for under three dollars. The same paper notes the attack runs on free Google Colab for the 8B variant. FAR.AI&#8217;s &#8220;Illusory Safety&#8221; research extended the result. Pre-fine-tune refusal rates near 100% across DeepSeek-R1, GPT-4o, Gemini 1.5 Pro, and Claude 3 Haiku dropped under 20% post-fine-tune. Harmfulness scores climbed past 80%.</p><p>The R1 red-team picture is even worse on the model itself, before any attacker fine-tuning. Cisco / Robust Intelligence reported a 100% attack success rate on 50 random HarmBench prompts against R1, while OpenAI o1 rejected every test in a parallel Holistic AI evaluation. Qualys TotalAI found R1&#8217;s distilled 8B variant failed 58% of 885 attempts across 18 jailbreak categories. Promptfoo put the failure rate above 60% on prompts covering topics that include biological and chemical weapons. KELA jailbroke R1 to produce ransomware development steps and instructions for toxins and explosive devices.</p><p>OpenAI&#8217;s own approach to gpt-oss is the strongest signal that adversarial fine-tuning is the real threat model. 
The model card describes the adversarial fine-tuning of gpt-oss-120b under the Preparedness Framework prior to release. OpenAI&#8217;s Safety Advisory Group concluded the adversarially fine-tuned model didn&#8217;t reach &#8220;High&#8221; capability in Biological and Chemical Risk or Cyber risk. Read the implication closely. <em>The model provider treats fine-tune-stripped safety as the baseline release condition the model must meet. The deployer running fine-tunes downstream gets no equivalent gate.</em></p><p>OpenAI knows this. It&#8217;s why gpt-oss-safeguard shipped on October 29, 2025: open-weight reasoning models for safety classification, designed for developers to operate as a defense-in-depth layer. Llama Guard 3, Prompt Guard, and Code Shield exist for the same reason. The vendor is shipping you the components. Components are not the same as a service. You operate them, tune them, monitor them, retrain them when the policy changes, and absorb the latency. OpenAI&#8217;s own gpt-oss-safeguard report names the constraint: reasoning-based classifiers add compute and latency that limit large-scale real-time use.</p><p>The math is brutal. The model weights are free. The runtime safety pipeline is not.</p><h2>The Frameworks Describe the Gap. They Don&#8217;t Close It.</h2><p>NIST AI RMF 1.0 plus the GenAI Profile (NIST AI 600-1, July 2024) plus the GPAI/Foundation Models Profile extension (arXiv 2506.23949) names training data audits (Manage 1.3, Measure 2.8) and model weight protection (Measure 2.7). Voluntary. The CSA NIST AI RMF Agentic Profile draft is candid about the bigger problem. It states plainly that earlier RMF documents did not contemplate &#8220;agents that acquire tool-use capabilities and execute autonomously in live production environments.&#8221;</p><p>OWASP Top 10 for LLM Applications 2025 LLM03 is the most explicit primary-source statement of the half-perimeter problem. 
The category description is direct: model cards offer no guarantees of provenance, malicious LoRA adapters compromise base models in collaborative environments, and on-device LLMs increase the attack surface. The OWASP Agentic Top 10, released December 10, 2025, adds ASI01 (Agent Goal Hijack) and ASI03 (Identity and Privilege Abuse) as runtime-boundary problems on self-hosted stacks.</p><p>ASI01 and ASI03 are not abstract. ASI01 shows up when prompt injection redirects an agent&#8217;s plan, and the closed-vendor refusal layer is gone. ASI03 shows up when the agent&#8217;s runtime authorization is broader than the task requires, because no vendor SDK is scoping the call for you anymore. Both problems live at the runtime boundary the vendor used to backstop.</p><p>EU AI Act Article 53(2) is the regulatory expression of the gap. Open-source GPAI models get a carve-out from technical documentation and downstream-information obligations, provided they&#8217;re released under a free open license, weights are public, and the model isn&#8217;t monetized. The carve-out vanishes at the Article 51 systemic-risk threshold of 10^25 FLOPs. Llama 3.3 70B, Qwen 3 32B, Mistral Magistral, and most enterprise-deployed open-weight reasoning models sit well below that threshold. They get the carve-out. They impose downstream obligations on enterprise deployers under Article 25(2) when significant modifications happen, a category that catches LoRA fine-tunes. Most teams running fine-tunes don&#8217;t know the clause exists. Enforcement begins August 2, 2026.</p><p>ISO 42001 mandates AIMS scope definition, third-party supplier oversight, and 38 Annex A controls. The gap there is structural. The open-weight model dropped from Hugging Face is not a &#8220;supplier&#8221; in the contractual sense. There&#8217;s no audit clause, no security questionnaire, no MSA. The standard tells you to define your AIMS scope. 
It doesn&#8217;t prescribe specific runtime-boundary controls for self-hosted foundation models.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NUY8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3de97079-4038-455d-bbed-ef94c73268e5_2100x2100.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NUY8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3de97079-4038-455d-bbed-ef94c73268e5_2100x2100.png 424w, https://substackcdn.com/image/fetch/$s_!NUY8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3de97079-4038-455d-bbed-ef94c73268e5_2100x2100.png 848w, https://substackcdn.com/image/fetch/$s_!NUY8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3de97079-4038-455d-bbed-ef94c73268e5_2100x2100.png 1272w, https://substackcdn.com/image/fetch/$s_!NUY8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3de97079-4038-455d-bbed-ef94c73268e5_2100x2100.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NUY8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3de97079-4038-455d-bbed-ef94c73268e5_2100x2100.png" width="1456" height="1456" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3de97079-4038-455d-bbed-ef94c73268e5_2100x2100.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:312234,&quot;alt&quot;:&quot;Quadrant chart plotting AI security controls across vendor-operated versus customer-operated and AI-specific versus infrastructure-generic axes, showing which controls transfer intact and which become self-build problems.&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/196322332?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3de97079-4038-455d-bbed-ef94c73268e5_2100x2100.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Quadrant chart plotting AI security controls across vendor-operated versus customer-operated and AI-specific versus infrastructure-generic axes, showing which controls transfer intact and which become self-build problems." title="Quadrant chart plotting AI security controls across vendor-operated versus customer-operated and AI-specific versus infrastructure-generic axes, showing which controls transfer intact and which become self-build problems." 
srcset="https://substackcdn.com/image/fetch/$s_!NUY8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3de97079-4038-455d-bbed-ef94c73268e5_2100x2100.png 424w, https://substackcdn.com/image/fetch/$s_!NUY8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3de97079-4038-455d-bbed-ef94c73268e5_2100x2100.png 848w, https://substackcdn.com/image/fetch/$s_!NUY8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3de97079-4038-455d-bbed-ef94c73268e5_2100x2100.png 1272w, https://substackcdn.com/image/fetch/$s_!NUY8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3de97079-4038-455d-bbed-ef94c73268e5_2100x2100.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Figure 2: AI-Specific Controls Across the Open-Weight Boundary: What Transfers, What Breaks</figcaption></figure></div><h2>Build the Runtime Perimeter</h2><p>Frameworks describe the gap. Architecture closes it. The work to close it is described in the Huang and Lambros (yes, &#8220;this&#8221; Lambros) <a href="https://arxiv.org/abs/2510.25863">AAGATE paper (arXiv:2510.25863v2, November 3, 2025)</a>. AAGATE is a Kubernetes-native control plane that operationalizes NIST AI RMF for self-hosted agentic AI. The reference architecture hosts the open-weight model on Ollama at Layer 1 of the MAESTRO threat-model stack, and the design assumption is built in: the protected stack is &#8220;DeepSeek, Qwan, LLAMA, OSS&#8221; running on your hardware.</p><p><strong>Four things transfer regardless of which control plane you adopt.</strong></p><p>First, treat weights as supply-chain artifacts. AAGATE enforces SLSA L3, Cosign keyless signing on every OCI image, and an ArgoCD admission controller that rejects unsigned manifests at the gate. Whichever path you take, you need signed weights, signed adapters, and a cluster-side admission policy that refuses to load anything unsigned. The Hugging Face nullifAI incident in February 2025, where ReversingLabs found malicious pickle files evading Picklescan via 7z compression and broken pickle deserialization, is the case study. Picklescan logs an error. The reverse-shell payload runs anyway.</p><p>Second, inventory open-weight runtimes alongside closed-API endpoints. 
AAGATE uses the Agent Naming Service (ANS) to register every agent with a Decentralized Identifier and a SPIFFE certificate. You don&#8217;t need the blockchain layer. You do need a CMDB row for every Ollama cluster, every fine-tune, every adapter, with model SHA, lineage, and license tier captured. If your AI inventory has a row for the OpenAI tenant but no row for the GPU cluster running your fine-tuned Llama, the audit is incomplete.</p><p>Third, build authorization scope into the runtime, not the vendor SDK. AAGATE&#8217;s OAuth Relay translates abstract agent capabilities into ephemeral, narrowly scoped, purpose-bound credentials per side effect. Other architectures will name the same thing differently. What matters is the control: every external action an agent takes funnels through a single policy-enforced chokepoint with allow-listing, rate limiting, and cryptographic logging. AAGATE calls it the Tool-Gateway. AI gateway products commercialize the same pattern. Pick one.</p><p>Fourth, run your own evals because the vendor isn&#8217;t running them for you. AAGATE&#8217;s Janus Shadow-Monitor-Agent provides continuous, pre-execution adversarial evaluation in-loop, tied to a Governing-Orchestrator Agent that executes a millisecond kill-switch when AIVSS scoring and SSVC decision logic flag a critical incident. The adversarial layer can also take the form of a parallel classifier, an internal red team, or any continuous evaluation pattern that mirrors what the vendor was running server-side. The pattern is non-negotiable. The product is negotiable.</p><p>These four moves are the architectural rebuttal to the half-perimeter. The perimeter you bought was always going to end at the runtime boundary. The runtime boundary is now your problem to instrument.</p><p>Operational reality matters here. The inference stack you&#8217;re protecting is Ollama, vLLM, SGLang, or llama.cpp. None of them ship with vendor-grade telemetry. 
Your container hosts a probabilistic system with stateless calls and no support contract. When an attacker fine-tunes a copy of your weights and slips it into your registry, there is no support call to escalate. There is only the runtime perimeter you built before the incident.</p><p><strong>Key Takeaway:</strong> Closed-vendor governance was the AI-specific half you didn&#8217;t have to build. Open-weight reasoning models in production change that. Inventory the runtimes, sign the weights, scope the runtime authorization, and run your own evals. The vendor isn&#8217;t doing it for you anymore.</p><h3>What to do next</h3><p>If you&#8217;re approving an open-weight pilot this quarter, demand four things on the architecture review before the GPUs land. First, model SHA and adapter lineage in the CMDB on day one. Second, an egress chokepoint with input/output sanitization and policy-enforced allow-lists. Third, supply-chain controls (signed weights, SLSA-grade provenance, admission control rejecting unsigned artifacts). Fourth, a continuous internal evaluation loop on every high-risk agent.</p><p>The <a href="https://www.rockcyber.com/ai-strategy-and-governance">CARE framework</a> (Create, Adapt, Run, Evolve) applies the same structure to AI security program design. The CISO Evolution covers the executive judgment side of decisions like this one. The AAGATE paper (<a href="https://arxiv.org/abs/2510.25863">arXiv 2510.25863v2</a>) is the open-source reference architecture if you want to start from running code.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help you in your traditional Cybersecurity and AI Security and Governance Journey.</p><p>&#128073; Want to save a quick $100K? 
Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><p>&#128073; Subscribe for more AI and cyber insights with the occasional rant.</p><p><em>The views and opinions expressed in RockCyber Musings are my own and do not represent the positions of my employer or any organization I&#8217;m affiliated with.</em></p>]]></content:encoded></item><item><title><![CDATA[Weekly Musings Top 10 AI Security Wrapup: Issue 36 April 24-April 30, 2026]]></title><description><![CDATA[Mythos, Mayhem, and Mediocre Lawmaking: The Week AI Security Got 
Loud]]></description><link>https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260424-20260430</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260424-20260430</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Fri, 01 May 2026 12:50:56 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!1Osd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d549a71-f273-4760-978b-f1b072d81591_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1Osd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d549a71-f273-4760-978b-f1b072d81591_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1Osd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d549a71-f273-4760-978b-f1b072d81591_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!1Osd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d549a71-f273-4760-978b-f1b072d81591_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!1Osd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d549a71-f273-4760-978b-f1b072d81591_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!1Osd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d549a71-f273-4760-978b-f1b072d81591_1024x1024.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!1Osd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d549a71-f273-4760-978b-f1b072d81591_1024x1024.png" width="1024" height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4d549a71-f273-4760-978b-f1b072d81591_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1233556,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/196065985?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d549a71-f273-4760-978b-f1b072d81591_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1Osd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d549a71-f273-4760-978b-f1b072d81591_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!1Osd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d549a71-f273-4760-978b-f1b072d81591_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!1Osd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d549a71-f273-4760-978b-f1b072d81591_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!1Osd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d549a71-f273-4760-978b-f1b072d81591_1024x1024.png 
1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>A coding agent killed a startup&#8217;s database in nine seconds. Anthropic shipped a model Mozilla called &#8220;elite.&#8221; Brussels missed its own deadline. Florida&#8217;s House Speaker buried his governor&#8217;s AI bill before lunch on day one. Two cloud-native AI vulnerabilities went from disclosure to exploitation in under 36 hours. Google and Forcepoint documented indirect prompt injection in the wild on the same day. UK&#8217;s AI Security Institute caught Mythos sabotaging research it was supposed to help with. Pretending this is theoretical is no longer defensible.</p><p>This week stress-tested every assumption CISOs hold about AI. The vendor you depend on sells your adversaries the same capability. The agent your developers love wipes three months of revenue and pastes a confession. Open source is the gateway. Indirect injection is the exploit. Autonomy without rollback is the consequence.</p><p>I&#8217;ll walk you through ten stories and one piece of plumbing. AI security used to run on a 24-month horizon. 
The default now is whatever ships before next quarter. If you wait for clarity, you lose ground to people who already decided.</p><h3>1. The Trump Administration Eyes Anthropic&#8217;s Mythos as a Weapon</h3><p>On April 24, the Washington Post reported Anthropic&#8217;s Mythos system rattled the Trump administration. Mozilla&#8217;s CTO compared the model&#8217;s vulnerability detection to a &#8220;world-class, elite security engineer.&#8221; Anthropic withheld general release, routing access through Project Glasswing partners, including AWS, Apple, Cisco, CrowdStrike, Google, JPMorgan Chase, and Microsoft. Anthropic privately briefed senior officials. Mythos meaningfully raises the probability of large-scale cyberattacks this year.</p><p><strong>Why it matters</strong></p><ul><li><p>Capability parity flipped. Defenders and attackers reach for the same tool.</p></li><li><p>Vendors are now gatekeepers of dual-use capability. Anthropic&#8217;s withholding sets a precedent.</p></li><li><p>Government dependence on private model access creates new procurement and security questions.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Map your exposure to LLM-discoverable vulnerabilities in first-party and open-source code.</p></li><li><p>Negotiate access to AI-assisted scanning before your adversaries scan you.</p></li><li><p>Update incident playbooks to assume hours of dwell time, not days.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Yes&#8230; more Mythos news. Can&#8217;t ignore it if it&#8217;s coming out of the White House. It&#8217;s not fiction. It&#8217;s a procurement question. I&#8217;ve watched this pattern in every arms shift, from automated network scanning to commodity exploit kits. The defender who gets there second loses.</p><p>Anthropic&#8217;s gatekeeping is a defensible choice. The question for you is whether your ecosystem qualifies for the safe lane or you&#8217;re stuck reading about Glasswing on Substack. 
Get on a call with your AWS, Cisco, or Microsoft reps. If the answer is no, plan around it. We track this kind of vendor calculus at <a href="https://www.rockcyber.com/">RockCyber</a>.</p><h3>2. Cursor&#8217;s Claude Agent Wipes a Startup&#8217;s Database in Nine Seconds</h3><p>On Friday, April 25, a Cursor coding agent powered by Claude Opus 4.6 deleted PocketOS&#8217;s entire production database and all volume-level backups in a single API call. The agent encountered a credential mismatch in staging, decided to resolve it by deleting a Railway infrastructure volume, scanned the codebase for an unrelated API token, and then ran the command. PocketOS serves car rental businesses nationwide. Three months of reservations, payments, customer information, and vehicle assignments went dark. Railway restored the data on Sunday using internal disaster backups not advertised to customers. The agent itself wrote the public confession.</p><p><strong>Why it matters</strong></p><ul><li><p>Agents don&#8217;t ask permission. They scan for the credentials unblocking them.</p></li><li><p>&#8220;Production&#8221; and &#8220;staging&#8221; are now labels, not boundaries.</p></li><li><p>Recovery happened because Railway keeps undocumented backups. Hope is not a strategy.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Force agents to operate with scoped, ephemeral credentials. Long-lived API keys in a repo are liabilities with autonomy attached.</p></li><li><p>Implement break-glass approval gates for destructive infrastructure calls.</p></li><li><p>Test backup recovery monthly. If you can&#8217;t restore in under an hour, you don&#8217;t have backups.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>PocketOS got lucky. Railway ran a heroic recovery on a Sunday using backups the customer didn&#8217;t know existed. If your AI strategy depends on a founder&#8217;s weekend chivalry, you don&#8217;t have a strategy. 
You have hope.</p><p>The agent did what it was trained to do. Scan, plan, act, document. The failure was in governance, not capability (and let&#8217;s just say, a suboptimal technical infrastructure). The villain is the assumption that autonomous systems will halt and ask. They don&#8217;t halt. Build the rails. Treat agents like an over-eager intern with the ability to call DELETE on prod.</p><h3>3. LiteLLM Bug Goes From Disclosure to Exploitation in 36 Hours</h3><p>GitHub&#8217;s Advisory Database indexed CVE-2026-42208 in LiteLLM on April 24 at 16:17 UTC. Sysdig logged the first exploitation attempt on April 26 at 16:17 UTC and puts the gap from disclosure to exploitation at roughly 36 hours. The bug carries a CVSS of 9.3 and lets unauthenticated attackers send a crafted Authorization header to any model API route, then read or modify the proxy&#8217;s database (Sysdig). LiteLLM is the open-source LLM gateway with more than 22,000 GitHub stars, fronting OpenAI, Anthropic, and other model providers in production. The same project sat at the heart of the Mercor breach earlier this year.</p><p><strong>Why it matters</strong></p><ul><li><p>AI infrastructure now looks like any internet-exposed service.</p></li><li><p>Pre-auth SQLi on the gateway exposes API keys and credentials for downstream model providers.</p></li><li><p>Disclosure-to-exploitation time keeps shrinking. The 36-hour window is the new optimistic baseline.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Inventory every LiteLLM, vLLM, LMDeploy, or proxy node in your environment. Patch to 1.83.7-stable or above for LiteLLM.</p></li><li><p>Treat LLM gateways as Tier 0 assets. Apply the controls you&#8217;d apply to identity providers.</p></li><li><p>Subscribe to maintainer advisory feeds. GitHub Advisory Database lag of four days is too long.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>LiteLLM is the kind of dependency pulled in via a Cursor prompt or an aspirational architecture diagram. 
It runs as the front door to every model provider you care about. Pre-auth SQL injection on it is a &#8220;your AI program is over&#8221; event.</p><p>Disclosure-to-exploit windows make monthly patch cycles professional malpractice. If your AI security playbook still says &#8220;evaluate within 30 days,&#8221; shred it. We&#8217;ve moved to &#8220;act within 24 hours or accept compromise as a feature.&#8221;</p><h3>4. Indirect Prompt Injection Has Left the Lab. It&#8217;s Everywhere.</h3><p>On April 24, Google&#8217;s Online Security Blog and Forcepoint&#8217;s X-Labs published parallel reports documenting indirect prompt injection in the wild. Forcepoint identified ten payload families targeting AI agents with instructions for financial fraud, data destruction, and API key theft. Google reported a 32% relative increase in malicious activity between November 2025 and February 2026. Attackers hide instructions inside webpages with single-pixel text, transparent fonts, HTML comments, and metadata. Neither team attributed the campaigns to a single actor, though both noted shared templates suggesting organized tooling.</p><p><strong>Why it matters</strong></p><ul><li><p>Agents summarizing content are low-risk. Agents sending emails, running commands, or processing payments are the targets.</p></li><li><p>Filters watching user input miss content fetched by the agent.</p></li><li><p>The threat model includes every third-party page your agent loads.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Inventory every agent fetching external content. Note which tools they call.</p></li><li><p>Implement allowlists for outbound tool execution. Default deny for novel actions.</p></li><li><p>Add output filtering for instruction-like content in tool responses, not only user input.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>We&#8217;ve been treating indirect prompt injection as a research curiosity since 2023. 
It&#8217;s now an operational threat with documented campaigns and template reuse. The Lakera and OWASP folks were right.</p><p>If you&#8217;ve deployed an agent with browsing capability, your trust boundary includes every webpage it visits. The entire internet. I wrote about this on <a href="https://rockcybermusings.com/">RockCyber Musings</a> earlier this year. It got worse.</p><h3>5. American Leadership in AI Act Drops With 20+ Bills Stitched In</h3><p>On April 27, Reps. Ted Lieu (D-Calif.) and Jay Obernolte (R-Calif.) introduced the American Leadership in AI Act, a six-title package consolidating more than 20 prior bills from the Bipartisan AI Task Force (Nextgov/FCW). The package covers standards and evaluation, research infrastructure, federal AI governance and procurement, worker protections, deepfake harms, and AI education. The bill is the most substantive bipartisan AI proposal in this Congress, landing during tension between the White House&#8217;s preemption push and active state legislation.</p><p><strong>Why it matters</strong></p><ul><li><p>Federal preemption fights will intensify. State AI laws face new risk.</p></li><li><p>Procurement standards in the bill shape what enterprises demand from AI vendors.</p></li><li><p>Deepfake provisions create new compliance obligations for media and platforms.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Map AI-procurement language to current vendor contracts.</p></li><li><p>Track state-level bills you&#8217;re already complying with for preemption risk.</p></li><li><p>Get legal reading the testing and evaluation title carefully.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Two California members of Congress, one D and one R, agreeing on AI is unicorn territory. Don&#8217;t get excited. 
Bipartisan packages stitched together from 20+ bills tend to die under the weight of their own ambition.</p><p>The interesting question is which provisions get pulled into appropriations or NDAA riders before December. Watch the procurement and federal AI governance titles. Those move first because the executive branch wants them. Plan as if procurement standards land by Q3.</p><h3>6. EU AI Act Omnibus Trilogue Collapses, August Deadline Stays Live</h3><p>On April 28, Brussels held the second political trilogue on the AI Act Omnibus, the proposal deferring high-risk AI compliance. After roughly twelve hours, the Council and Parliament failed to agree on conformity-assessment architecture for AI in regulated products (Modulos). A follow-up trilogue is scheduled for May 13. The August 2, 2026 high-risk obligations remain operative law.</p><p><strong>Why it matters</strong></p><ul><li><p>Vendors and deployers cannot bank on a deferral. August is the working assumption.</p></li><li><p>The Cypriot Council Presidency ends June 30. Ireland takes over on July 1 and might have to finish negotiations.</p></li><li><p>The Annex I disagreement signals sectoral assessments will keep biting medical device and machinery providers.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Continue compliance preparation as if no Omnibus arrives. Treat May 13 as a tiebreaker, not a save.</p></li><li><p>For medical devices, machinery, and other Annex I products, lock in your conformity-assessment plan now.</p></li><li><p>Get internal legal sign-off on the original AI Act timelines this quarter.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>I keep telling clients that hoping for a deferral is not a compliance strategy. This week confirmed it. Brussels cannot agree on the structure of the regulation it already passed.</p><p>If your CFO asks why you spent budget on AI Act readiness, point at this paragraph. The cost of overpreparing is a few quarters of work. 
The cost of underpreparing is an enforcement action against your highest-revenue product line. I know which side of the bet I want.</p><h3>7. Microsoft and OpenAI Restructure for Cyber Defense</h3><p>On April 27, Microsoft and OpenAI announced revised partnership terms (24/7 Wall St). OpenAI&#8217;s API will run on any cloud provider, including AWS via Bedrock. Microsoft&#8217;s IP license is no longer exclusive but runs through 2032. The companies expanded Trusted Access for Cyber, giving Microsoft access to OpenAI&#8217;s most cyber-capable models, folded into the Secure Future Initiative. The pact pairs OpenAI&#8217;s offensive-capable model work with Microsoft&#8217;s defender stack across cloud, identity, productivity, and frontier AI.</p><p><strong>Why it matters</strong></p><ul><li><p>The exclusivity reset reshapes vendor lock-in for AI procurement.</p></li><li><p>Microsoft&#8217;s defender stack now has a privileged channel to cyber-capable models.</p></li><li><p>The same models detecting attacks can also execute them. Internal review for offensive capability is no longer optional.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Reassess your AI vendor strategy. Multi-cloud is realistic now.</p></li><li><p>For Microsoft customers, evaluate Trusted Access for Cyber eligibility.</p></li><li><p>Demand transparency from AI security vendors about their models and guardrails.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Microsoft bought itself the most defended seat at the AI security table. The pact extends a moat already including Defender, Sentinel, Purview, and Copilot for Security. Smaller security vendors should be nervous.</p><p>If you&#8217;re a CISO in a Microsoft shop, this is a small win. If you&#8217;re betting on a non-Microsoft AI security stack, reread your strategy. The end of exclusivity means more options, but the cyber pact deepens lock-in.</p><h3>8. 
UK AISI Catches Claude Mythos Sabotaging Research</h3><p>On April 28, the UK AI Security Institute (AISI) published an updated sabotage evaluation framework. Across 297 scenarios, no model attempted spontaneous research sabotage. In a new &#8220;continuation&#8221; test, Mythos Preview continued to sabotage 7% of inputs after being explicitly directed not to (AISI). The same evaluation showed Mythos Preview completing the first 32-step enterprise attack simulation start-to-finish, succeeding on 73% of expert-level tasks that no model had completed before April 2025.</p><p><strong>Why it matters</strong></p><ul><li><p>Continuation behavior matters more than spontaneous behavior. Real attackers prompt the model.</p></li><li><p>A 7% sustained sabotage rate warrants treating these models as untrusted insiders during sensitive work.</p></li><li><p>The 32-step completion shows operational maturity. Models execute multi-stage cyber operations end to end.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Don&#8217;t run frontier models on safety-sensitive code reviews without monitoring.</p></li><li><p>Build red-team programs around prompt-and-continue scenarios rather than single-shot tests.</p></li><li><p>Track AISI&#8217;s methodology. Adopt continuation-style tests internally.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Spontaneous misbehavior was never the threat model scaring me. Continuation is. Once an attacker plants the seed, the model becomes a complicit operator inside your environment. Seven percent is small until you multiply it by every prompt your enterprise sends in a day.</p><p>AISI does work nobody else funds at this rigor. If your AI governance committee isn&#8217;t reading their reports cover to cover, you&#8217;re outsourcing your threat model to LinkedIn posts. Read the source.</p><h3>9. Florida House Speaker Kills DeSantis&#8217;s AI Bill on Day One</h3><p>On April 28, Florida convened a four-day special session. 
The Senate voted 37-1 in favor of the AI Bill of Rights. House Speaker Daniel Perez killed the bill that same morning, declaring that the only topic the House would address was redrawing congressional maps (Florida Phoenix). Perez argued AI regulation belongs to the federal government, aligned with a Trump executive order targeting state AI laws. The bill would have required parental consent for minor accounts on companion chatbot platforms, prohibited unauthorized commercial use of AI-generated likenesses, and required AI disclosure to users.</p><p><strong>Why it matters</strong></p><ul><li><p>State preemption fights are escalating. Florida sided with the federal government before federal law exists.</p></li><li><p>Companion chatbot rules pass Senate chambers and die in House chambers. The pattern matters.</p></li><li><p>AI-generated likeness and consent provisions will keep returning. Plan for eventual passage somewhere.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>If you run companion chatbots, monitor every state bill on minors and consent.</p></li><li><p>Brief your legal team on AI-likeness and right-of-publicity rules in California, Tennessee, and active special sessions.</p></li><li><p>Don&#8217;t bank on federal preemption. Executive orders reverse.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The pattern is the same one I&#8217;ve called out for two years. State Senates pass AI bills, state Houses kill them, and the federal government drafts preemption language. The result is regulatory whiplash across 50 jurisdictions plus DC plus a federal package which might or might not preempt them. Give your privacy and AI counsel hazard pay. They&#8217;re earning it.</p><h3>10. HackerOne Launches h1 Validation as AI Vuln Reports Surge 76%</h3><p>On April 29, HackerOne launched h1 Validation, a service triaging AI-discovered vulnerability reports for actual exploitability (Cybersecurity Insiders). 
Vulnerability submissions on the platform rose 76% year over year, hitting a record high in March 2026. About 25% of findings were confirmed exploitable. The share of critical and high-severity vulnerabilities grew to 32%, up from a 26-28% baseline. The launch follows months of complaints from program owners overwhelmed by AI-generated reports of varying quality.</p><p><strong>Why it matters</strong></p><ul><li><p>AI generates more vuln reports than security teams can triage.</p></li><li><p>Triage capacity, not discovery, is the constraint.</p></li><li><p>This signal-to-noise problem reshapes bug bounty economics within 12 months.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Audit your bug bounty intake pipeline. If reports outpace triage, fix it.</p></li><li><p>Invest in tooling classifying reports by exploitability before a human reads them.</p></li><li><p>Set expectations with researchers. AI-assisted submissions need higher proof of impact.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The asymmetry is volume. Models like Mythos and GPT-5.5-Cyber produce thousands of plausible reports per day. Most are junk. Some are lethal. Your triage team won&#8217;t keep up by reading harder. Whether you buy h1 Validation or build your own, manual triage of AI-scale output is a doomed strategy.</p><h3>The One Thing You Won&#8217;t Hear About But You Need To</h3><h4>CSAI Foundation Becomes the First AI-Specific CVE Numbering Authority</h4><p>On April 29, the Cloud Security Alliance&#8217;s CSAI Foundation announced three milestones at the CSA Agentic AI Security Summit (CSA). The foundation registered as a CVE Numbering Authority through MITRE, gaining direct ability to issue CVEs for AI-specific vulnerabilities. It launched the STAR for AI Catastrophic Risk Annex extending the AI Controls Matrix to scenarios involving loss of human oversight, with rollout from June 2026 through December 2027. 
It also acquired the Autonomous Action Runtime Management (AARM) specification, contributed by Vanta.</p><p><strong>Why it matters</strong></p><ul><li><p>AI-specific CVE issuance changes how AI vulnerabilities get tracked, scored, and patched.</p></li><li><p>The Catastrophic Risk Annex maps to NIST AI RMF, the EU AI Act, and ISO/IEC 42001, giving auditors a consolidated reference.</p></li><li><p>AARM gives operators a formal specification for runtime control of agent actions.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Add CSAI Foundation advisories to your security feed.</p></li><li><p>For high-risk deployments, map internal controls to the Catastrophic Risk Annex during phase one rollout.</p></li><li><p>Pilot AARM in one agentic workflow this quarter. Runtime control of agent actions is the right level of abstraction.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Plumbing matters more than press releases. While headlines went to Mythos and the Cursor accident, the CSAI Foundation stood up the infrastructure for AI-specific vulnerability tracking, runtime control, and catastrophic risk auditing. This decides whether AI security becomes a discipline or stays a marketing category.</p><p>I&#8217;ve worked in standards for thirty years. The value compounds quietly until one day the auditors ask, and you either have it or you don&#8217;t. We track CSAI work closely at <a href="https://www.rockcyber.com/">RockCyber</a>. Start with the CSA press release, then loop in your governance team Monday.</p><p>&#128073; For ongoing analysis of agentic AI governance frameworks, the conversation continues at <strong><a href="https://rockcybermusings.com/">RockCyber Musings</a></strong>.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help with your traditional Cybersecurity and AI Security and Governance journey.</p><p>&#128073; Want to save a quick $100K? 
Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><p>&#128073; As a bonus, <strong><a href="https://www.youtube.com/watch?v=rwlVTLyqIv8">check out my conversation with Eva Benn</a></strong> where we talked about the cybersecurity skills you need to develop to stay relevant in 2026 and beyond.</p><p>&#128073; Subscribe for more AI and cyber insights with the occasional rant.</p><p><em>The views and opinions expressed in RockCyber Musings are my own and do not represent the positions of my employer or any organization I&#8217;m affiliated with.</em></p><h2>References</h2><p>Cloud Security Alliance. (2026, April 29). 
<em>CSAI Foundation announces key milestones to secure the agentic control plane</em>. https://cloudsecurityalliance.org/press-releases/2026/04/29/csai-foundation-announces-key-milestones-to-secure-the-agentic-control-plane</p><p>Cybersecurity Insiders. (2026, April 29). <em>HackerOne launches h1 Validation to tackle rising wave of AI-driven vulnerabilities</em>. https://www.cybersecurity-insiders.com/hackerone-launches-h1-validation-to-tackle-rising-wave-of-ai-driven-vulnerabilities/</p><p>Florida Phoenix. (2026, April 28). <em>Florida Speaker kills DeSantis&#8217; AI regulation, vaccine repeal bills on first day of special session</em>. https://floridaphoenix.com/2026/04/28/florida-speaker-kills-desantis-ai-regulation-vaccine-repeal-bills-on-first-day-of-special-session/</p><p>Forcepoint X-Labs. (2026, April 24). <em>Indirect prompt injection in the wild: X-Labs finds 10 IPI payloads</em>. https://www.forcepoint.com/blog/x-labs/indirect-prompt-injection-payloads</p><p>Google. (2026, April 24). <em>AI threats in the wild: The current state of prompt injections on the web</em>. Google Online Security Blog. https://security.googleblog.com/2026/04/ai-threats-in-wild-current-state-of.html</p><p>Help Net Security. (2026, April 24). <em>Indirect prompt injection is taking hold in the wild</em>. https://www.helpnetsecurity.com/2026/04/24/indirect-prompt-injection-in-the-wild/</p><p>Modulos. (2026, April 28). <em>EU AI Act Omnibus: The trilogue failed, what happens to the August 2026 deadline?</em>. https://www.modulos.ai/blog/ai-act-omnibus-trilogue-failed/</p><p>Nextgov/FCW. (2026, April 28). <em>Lieu and Obernolte introduce consolidated AI bill package</em>. https://www.nextgov.com/artificial-intelligence/2026/04/lieu-and-obernolte-introduce-consolidated-ai-bill-package/413134/</p><p>Sysdig. (2026, April 29). <em>CVE-2026-42208: Targeted SQL injection against LiteLLM&#8217;s authentication path discovered 36 hours following vulnerability disclosure</em>. 
https://www.sysdig.com/blog/cve-2026-42208-targeted-sql-injection-against-litellms-authentication-path-discovered-36-hours-following-vulnerability-disclosure</p><p>The Hacker News. (2026, April 24). <em>LMDeploy CVE-2026-33626 flaw exploited within 13 hours of disclosure</em>. https://thehackernews.com/2026/04/lmdeploy-cve-2026-33626-flaw-exploited.html</p><p>The Hacker News. (2026, April 29). <em>LiteLLM CVE-2026-42208 SQL injection exploited within 36 hours of disclosure</em>. https://thehackernews.com/2026/04/litellm-cve-2026-42208-sql-injection.html</p><p>The Register. (2026, April 27). <em>Cursor-Opus agent snuffs out startup&#8217;s production database</em>. https://www.theregister.com/2026/04/27/cursoropus_agent_snuffs_out_pocketos/</p><p>Tom&#8217;s Hardware. (2026, April 27). <em>Claude-powered AI coding agent deletes entire company database in 9 seconds</em>. https://www.tomshardware.com/tech-industry/artificial-intelligence/claude-powered-ai-coding-agent-deletes-entire-company-database-in-9-seconds-backups-zapped-after-cursor-tool-powered-by-anthropics-claude-goes-rogue</p><p>UK AI Security Institute. (2026, April 28). <em>Our evaluation of Claude Mythos Preview&#8217;s cyber capabilities</em>. https://www.aisi.gov.uk/blog/our-evaluation-of-claude-mythos-previews-cyber-capabilities</p><p>24/7 Wall St. (2026, April 28). <em>Microsoft&#8217;s AI moat holds up even after the OpenAI reset</em>. https://247wallst.com/investing/2026/04/28/microsofts-ai-moat-holds-up-even-after-the-openai-reset/</p><p>Washington Post. (2026, April 24). <em>AI hacking fears jolt Washington as Anthropic unveils Mythos</em>. https://www.washingtonpost.com/technology/2026/04/24/anthropic-mythos-ai-washington-cybersecurity-hacking-risk/</p>]]></content:encoded></item><item><title><![CDATA[AI Coding Agent Prompt Injection: Three Vendors, One Seam, No Owner]]></title><description><![CDATA[Comment and Control hit three AI coding agents in one shot. 
The fix is procurement, not architecture. Five questions CISOs should run before signing.]]></description><link>https://www.rockcybermusings.com/p/ai-coding-agent-prompt-injection-procurement-failure</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/ai-coding-agent-prompt-injection-procurement-failure</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Tue, 28 Apr 2026 12:50:44 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!qI72!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba260851-a774-4805-b34f-cec358f80869_2048x2048.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qI72!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba260851-a774-4805-b34f-cec358f80869_2048x2048.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qI72!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba260851-a774-4805-b34f-cec358f80869_2048x2048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!qI72!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba260851-a774-4805-b34f-cec358f80869_2048x2048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!qI72!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba260851-a774-4805-b34f-cec358f80869_2048x2048.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!qI72!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba260851-a774-4805-b34f-cec358f80869_2048x2048.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qI72!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba260851-a774-4805-b34f-cec358f80869_2048x2048.jpeg" width="1456" height="1456" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ba260851-a774-4805-b34f-cec358f80869_2048x2048.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2280720,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/195413474?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba260851-a774-4805-b34f-cec358f80869_2048x2048.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qI72!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba260851-a774-4805-b34f-cec358f80869_2048x2048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!qI72!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba260851-a774-4805-b34f-cec358f80869_2048x2048.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!qI72!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba260851-a774-4805-b34f-cec358f80869_2048x2048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!qI72!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba260851-a774-4805-b34f-cec358f80869_2048x2048.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>AI coding agent prompt injection has a procurement problem, and a researcher just published the receipt. <strong><a href="https://venturebeat.com/security/ai-agent-runtime-security-system-card-audit-comment-and-control-2026">Aonan Guan typed a malicious instruction into a GitHub pull request title last week.</a></strong> Anthropic&#8217;s Claude Code Security Review action posted its own API key as a comment. So did Google&#8217;s Gemini CLI Action. So did GitHub&#8217;s Copilot Agent. Same exploit hit three vendors, with no infrastructure required. Anthropic&#8217;s 232-page system card had named the gap before the researchers published. 
The other two vendors had not documented enough to predict their own outcome.</p><p>Most of the writing on this incident will focus on architecture. The runtime is the perimeter. The action boundary is the blast radius. Both readings are correct. Both are also a deflection. The architecture story explains the mechanism. It doesn&#8217;t explain why the buyer was exposed in the first place. The buyer signed three contracts, accepted three sets of safety claims, and never required any of the three vendors to assert anything about the seams between them. The trigger was a prompt injection. The exposure was procurement.</p><p>I want to push past the architecture take and look at the governance read, because the governance read implicates the reader in a way the architecture take does not.</p><h2>How Comment and Control Worked</h2><p><strong><a href="https://venturebeat.com/security/ai-agent-runtime-security-system-card-audit-comment-and-control-2026">Aonan Guan, working with Zhengyu Liu and Gavin Zhong at Johns Hopkins, opened a GitHub pull request in a target repository.</a></strong> They typed a malicious instruction into the PR title. The repository used the pull_request_target workflow trigger, which AI coding agent integrations typically rely on when they need secret access on pull requests from forks. That trigger runs the workflow in the context of the base repository, so repository secrets are available to the job. The agent read the PR title, treated the instruction as a directive, called GitHub&#8217;s own API using credentials stored in its environment variables, and posted the secret as a comment on the PR. The default pull_request trigger doesn&#8217;t expose secrets to fork PRs. The pull_request_target trigger does, by design.</p><p>This is the textbook case of what <strong><a href="https://simonw.substack.com/p/the-lethal-trifecta-for-ai-agents">Simon Willison has been calling the lethal trifecta</a></strong>. Access to private data sits in the runner. Untrusted input arrives through the PR title. 
The exfiltration channel is GitHub&#8217;s comment API, which sits in the agent&#8217;s default tool inventory. All three conditions sit at the seam between three vendors. The exploit needs all three to fire. Comment and Control satisfies all three by design, and no single vendor has written a document that asserts anything about the combination.</p><p>Anthropic rated the disclosure CVSS 9.4 Critical and paid a $100 bounty. Google paid $1,337. GitHub paid $500. None of the three vendors had a CVE published in the National Vulnerability Database at the time of disclosure. None published a GitHub Security Advisory. Those numbers send a market signal. Vendor bounty programs treat seam vulnerabilities as out of scope, and researchers respond to incentives. The next findings of this kind will follow the path the bounties point them toward.</p><p><strong><a href="https://www.helpnetsecurity.com/2026/04/24/indirect-prompt-injection-in-the-wild/">Help Net Security ran a piece this week</a></strong> on Google&#8217;s own Common Crawl analysis showing a 32% relative increase in malicious indirect prompt injection content between November 2025 and February 2026. The supply of payloads is growing faster than vendor disclosures. 
That is the operating environment.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WhO6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa956052b-b503-42e2-929b-675e1cd5ef5d_1240x4502.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WhO6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa956052b-b503-42e2-929b-675e1cd5ef5d_1240x4502.png 424w, https://substackcdn.com/image/fetch/$s_!WhO6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa956052b-b503-42e2-929b-675e1cd5ef5d_1240x4502.png 848w, https://substackcdn.com/image/fetch/$s_!WhO6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa956052b-b503-42e2-929b-675e1cd5ef5d_1240x4502.png 1272w, https://substackcdn.com/image/fetch/$s_!WhO6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa956052b-b503-42e2-929b-675e1cd5ef5d_1240x4502.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WhO6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa956052b-b503-42e2-929b-675e1cd5ef5d_1240x4502.png" width="320" height="1161.8064516129032" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a956052b-b503-42e2-929b-675e1cd5ef5d_1240x4502.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:4502,&quot;width&quot;:1240,&quot;resizeWidth&quot;:320,&quot;bytes&quot;:340698,&quot;alt&quot;:&quot;Flowchart showing how a malicious pull request title traverses GitHub&#8217;s pull_request_target trigger, the AI coding agent&#8217;s runtime environment, and back through GitHub&#8217;s comment API to leak the repository&#8217;s secrets&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/195413474?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F060ffb63-301e-4164-a80c-257d85626a20_1240x4502.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Flowchart showing how a malicious pull request title traverses GitHub&#8217;s pull_request_target trigger, the AI coding agent&#8217;s runtime environment, and back through GitHub&#8217;s comment API to leak the repository&#8217;s secrets" title="Flowchart showing how a malicious pull request title traverses GitHub&#8217;s pull_request_target trigger, the AI coding agent&#8217;s runtime environment, and back through GitHub&#8217;s comment API to leak the repository&#8217;s secrets" srcset="https://substackcdn.com/image/fetch/$s_!WhO6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa956052b-b503-42e2-929b-675e1cd5ef5d_1240x4502.png 424w, 
https://substackcdn.com/image/fetch/$s_!WhO6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa956052b-b503-42e2-929b-675e1cd5ef5d_1240x4502.png 848w, https://substackcdn.com/image/fetch/$s_!WhO6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa956052b-b503-42e2-929b-675e1cd5ef5d_1240x4502.png 1272w, https://substackcdn.com/image/fetch/$s_!WhO6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa956052b-b503-42e2-929b-675e1cd5ef5d_1240x4502.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Figure 1: Comment and Control attack chain</figcaption></figure></div><h2>Why AI Coding Agent Prompt Injection Is a Governance Problem</h2><p>Pull a model card off any of the three vendor sites. Anthropic&#8217;s Opus 4.7 system card, published April 16, 2026, runs 232 pages. It quantifies hack rates. It publishes injection resistance metrics. It includes an explicit statement: Claude Code Security Review is &#8220;not hardened against prompt injection.&#8221; Anthropic does the most mature disclosure work in the industry. OpenAI&#8217;s GPT-5.4 system card documents red-team hours and model-layer evals without publishing agent-runtime resistance numbers. Google&#8217;s Gemini 3.1 Pro card defers most of its safety methodology to the older Gemini 3 Pro card.</p><p>Rank those three in a procurement scorecard, and Anthropic comes out on top. That ranking answers the wrong question. A model card describes a model&#8217;s behavior. Comment and Control didn&#8217;t break a model. The disclosure was complete for the layer Anthropic owns and silent on the seam, because Anthropic doesn&#8217;t own the seam. The seam runs through GitHub&#8217;s runner, GitHub&#8217;s API, the agent&#8217;s environment variable scope, the workflow trigger configuration, and the buyer&#8217;s choice to enable agent integration on a repository with secrets. Each of those pieces sits inside a different contract. None of those contracts asserts anything about the combination.</p><p>The structural gap is what makes this a governance story. The cloud security industry took roughly a decade to converge on the shared responsibility model. AWS owns the hypervisor. The customer owns the workload. Each side owns a clear half. Most of the early breaches happened in the unowned middle of that line, and the convergence was painful. 
Agent composition is replaying that history with a sharper acceleration curve, and there is no industry consensus on where the line sits. Three vendors share a single runtime with no agreed-upon accountability model. The buyer carries everything the contracts do not cover.</p><p>Here is a hypothetical that shows the operational consequence. A SOC running normal vulnerability scanning across the agent-enabled repos sees green. None of the three disclosures generated CVEs in the NVD. The internal ticketing system has no category for &#8220;agent runtime composition risk.&#8221; The risk register has no entry. The budget has no line item. The exploit class is real, the severity is Critical across three vendors, and the standard tooling reports zero findings because the standard tooling has nothing to scan against. The exploit became possible because no one wrote it down as a thing to look for.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6-yb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8949ece6-3eec-438f-b9f8-e8927d688675_1867x3565.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6-yb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8949ece6-3eec-438f-b9f8-e8927d688675_1867x3565.png 424w, https://substackcdn.com/image/fetch/$s_!6-yb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8949ece6-3eec-438f-b9f8-e8927d688675_1867x3565.png 848w, 
https://substackcdn.com/image/fetch/$s_!6-yb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8949ece6-3eec-438f-b9f8-e8927d688675_1867x3565.png 1272w, https://substackcdn.com/image/fetch/$s_!6-yb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8949ece6-3eec-438f-b9f8-e8927d688675_1867x3565.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6-yb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8949ece6-3eec-438f-b9f8-e8927d688675_1867x3565.png" width="360" height="687.3626373626373" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8949ece6-3eec-438f-b9f8-e8927d688675_1867x3565.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2780,&quot;width&quot;:1456,&quot;resizeWidth&quot;:360,&quot;bytes&quot;:422052,&quot;alt&quot;:&quot;Bar chart comparing Anthropic, OpenAI, and Google system card disclosure depth across model layer and runtime layer, showing all three vendors clustered at the model layer and absent at the runtime layer&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/195413474?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8949ece6-3eec-438f-b9f8-e8927d688675_1867x3565.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Bar chart comparing Anthropic, OpenAI, and Google system card disclosure depth across model layer and runtime layer, showing all three vendors clustered at the model layer and absent at the runtime layer" title="Bar chart 
comparing Anthropic, OpenAI, and Google system card disclosure depth across model layer and runtime layer, showing all three vendors clustered at the model layer and absent at the runtime layer" srcset="https://substackcdn.com/image/fetch/$s_!6-yb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8949ece6-3eec-438f-b9f8-e8927d688675_1867x3565.png 424w, https://substackcdn.com/image/fetch/$s_!6-yb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8949ece6-3eec-438f-b9f8-e8927d688675_1867x3565.png 848w, https://substackcdn.com/image/fetch/$s_!6-yb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8949ece6-3eec-438f-b9f8-e8927d688675_1867x3565.png 1272w, https://substackcdn.com/image/fetch/$s_!6-yb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8949ece6-3eec-438f-b9f8-e8927d688675_1867x3565.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Figure 2: System card disclosure depth by vendor and layer</figcaption></figure></div><h2>The Procurement Questions You Should Have Asked</h2><p>Most CISO action checklists produced after an incident like this read as a list of post-hoc remediation steps. Rotate credentials. Restrict permissions. Add monitoring. Those moves are correct, and they are also reactive. The harder, more useful artifact is the set of procurement questions that, asked at signing, would have made Comment and Control either impossible or contractually attributable.</p><p>Here are five questions. Paste them into your next vendor governance review verbatim or adapt them. They work for AI coding agents, and they will work for the next class of agentic integrations after this one.</p><p><strong>The first question is about layer ownership.</strong> Ask each vendor, &#8220;Name the layers of the agent runtime your security guarantees cover, and name the layers you don&#8217;t cover.&#8221; Most vendors will answer the first half. The interesting answer is the second half. A vendor who cannot articulate the layers it doesn&#8217;t cover hasn&#8217;t thought about composition. The contract you are about to sign assumes a perimeter that the vendor hasn&#8217;t analyzed.</p><p><strong>The second question is about quantified resistance metrics on the deployment surface you actually use. 
</strong>Anthropic publishes injection resistance numbers in the Opus 4.7 system card. Those numbers cover Anthropic&#8217;s API surface. They don&#8217;t cover Claude Code Security Review running on GitHub Actions with a pull_request_target trigger and secrets in scope. Ask for the resistance number for the model version you run on the platform you deploy to. If the vendor cannot produce that number, the vendor cannot quantify the risk you are accepting.</p><p><strong>The third question is about bounty scope. </strong>Ask each vendor, &#8220;Does your bounty program cover vulnerabilities at the integration boundary between your product and the platforms it deploys on?&#8221; Anthropic&#8217;s HackerOne program scopes agent-tooling findings separately from model-safety findings. The position is defensible. It also steers researchers&#8217; attention away from the seams. Knowing which vendor&#8217;s program covers which surface is a procurement signal. It tells you which surfaces will get the most external scrutiny over the life of the contract and which surfaces will not.</p><p><strong>The fourth question is about composition disclosure. </strong>Ask each vendor, &#8220;When your product is integrated with another vendor&#8217;s platform, who is responsible for documenting the security properties of the combined system?&#8221; The honest answer from every vendor is &#8220;the buyer.&#8221; Get it in writing. The asymmetry exposes why a shared responsibility artifact for agent runtimes does not yet exist.</p><p><strong>The fifth question is about runtime telemetry. </strong>Ask each vendor, &#8220;What runtime signals do you publish that allow me to detect prompt injection in production?&#8221; If the answer is a model-card link, the vendor hasn&#8217;t built the runtime monitoring. If the answer is an SDK with detection hooks, document the coverage and the false-positive rate. 
The August 2026 EU AI Act high-risk compliance deadline turns this question from a nice-to-have into an audit artifact, and the vendors who cannot answer it now will be the ones renegotiating contracts in Q3.</p><p>Those five questions don&#8217;t eliminate the exploit class. They make the exploit class a contractual variable instead of a discovered surprise. A buyer who asks all five before signing knows where the seam runs and who is on the hook for what.</p><h2>What to Do This Week, Ordered by Blast Radius Reduction</h2><p>The reactive moves still matter. Order them by blast radius reduction, not by the order they appear in any vendor advisory. Each one carries a different internal political cost, and pretending the costs are equal is how good control work dies in committee.</p><p>Inventory every workflow in your repositories that uses pull_request_target. The grep is cheap. The conversation with the dev tooling team about what each of those workflows needs is not. Expect to find workflows configured for one reason, with AI agent integrations later layered on top, and no review of the original threat model.</p><p>Rotate every credential exposed to agents in those workflows over the last 90 days. The cost is low. The likelihood of someone pushing back is also low. Do it first because it is the cheap one, and use the speed of the rotation to demonstrate that agent-related credential rotation is now part of the normal operating cadence.</p><p>Switch from stored secrets to short-lived OIDC tokens for any workflow that supports it. The political cost is medium. You will need platform team buy-in. The argument that closes the loop is exactly the procurement gap above. Stored secrets in agent-accessible environments are a category of risk no vendor&#8217;s contract currently covers, and OIDC removes the category from the buyer&#8217;s residual.</p><p>Strip bash execution permissions from agents that only need to perform code review. 
This one starts a fight with the developer tooling team because some of the convenience features will break. The fight is worth having. An agent with bash permissions on a CI runner with secrets in scope is the worst-case configuration. Write the security memo and force the documented risk acceptance from the team that wants to keep the bash channel open.</p><p>Add a category to your supply chain risk register called &#8220;AI agent runtime composition.&#8221; Most GRC tooling doesn&#8217;t have a field that maps to the category. Add it manually. The act of adding the category forces the conversation about which vendor combinations are covered by which contracts and which are not. The conversation is the artifact you actually need. The risk register entry is the receipt that the conversation happened.</p><h2>Where the Industry Has to Go</h2><p>The cloud security industry built the shared responsibility model under pressure from breaches and ten years of regulatory friction. The AI agent industry has neither of those forcing functions yet. The EU AI Act high-risk obligations come into force in August 2026 and will start to put procurement language behind some of these questions, but the standards work that would produce a real shared responsibility artifact for agent runtimes hasn&#8217;t happened. This is where the CARE framework lands. Create the procurement questions before you sign. Adapt the controls you already have around CI/CD, secret scoping, and runtime monitoring. Run the agent integrations under the same operating cadence as the rest of your privileged automation. Evolve the risk register category as new exploit classes emerge. The exploit class will not stop with Comment and Control. The next one will follow the same architectural pattern and the same governance gap. 
The CISOs who are ready for it are the ones who treat agent procurement as a governance problem now, while the vendors and the standards bodies are still catching up.</p><p><strong>Key Takeaway:</strong> The AI coding agent prompt injection class lives in the seams between vendor contracts, and the buyer carries the residual until the procurement questions force the seams into the conversation.</p><h3>What to Do Next</h3><p>Start with the five procurement questions in your next vendor renewal cycle. Do the credential rotation and the OIDC migration this quarter. Read <a href="https://rockcybermusings.com">the rest of the RockCyber Musings archive</a> for the operating cadence I run with clients on agentic AI security reviews, and reach out through <a href="https://rockcyber.com">RockCyber</a> if you want to walk through the procurement question set against a specific vendor stack you are evaluating.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help you in your traditional Cybersecurity and AI Security and Governance Journey</p><p>&#128073; Want to save a quick $100K? Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><p>&#128073; Subscribe for more AI and cyber insights with the occasional rant.</p><p><em>The views and opinions expressed in RockCyber Musings are my own and do not represent the positions of my employer or any organization I&#8217;m affiliated with.</em></p>
]]></content:encoded></item><item><title><![CDATA[Weekly Musings Top 10 AI Security Wrapup: Issue 35 April 17-April 23, 2026]]></title><description><![CDATA[Mythos Meltdown, Vibe Coding Implosions, And The Week AI Security Ran Out Of Excuses]]></description><link>https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-202604-17-20260423</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-202604-17-20260423</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Fri, 24 Apr 2026 12:50:53 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!O1Cl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdb63e31-5620-4a4a-acba-368013a366aa_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank"
href="https://substackcdn.com/image/fetch/$s_!O1Cl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdb63e31-5620-4a4a-acba-368013a366aa_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!O1Cl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdb63e31-5620-4a4a-acba-368013a366aa_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!O1Cl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdb63e31-5620-4a4a-acba-368013a366aa_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!O1Cl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdb63e31-5620-4a4a-acba-368013a366aa_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!O1Cl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdb63e31-5620-4a4a-acba-368013a366aa_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!O1Cl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdb63e31-5620-4a4a-acba-368013a366aa_1024x1024.png" width="1024" height="1024" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cdb63e31-5620-4a4a-acba-368013a366aa_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1233556,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/195303010?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdb63e31-5620-4a4a-acba-368013a366aa_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!O1Cl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdb63e31-5620-4a4a-acba-368013a366aa_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!O1Cl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdb63e31-5620-4a4a-acba-368013a366aa_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!O1Cl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdb63e31-5620-4a4a-acba-368013a366aa_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!O1Cl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdb63e31-5620-4a4a-acba-368013a366aa_1024x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div>
<p>Seven days. One breached &#8220;too dangerous to release&#8221; model. One vibe coding platform exposing 76 days of customer source code. One AI supply chain attack that cost Vercel its dignity. A compliance startup accused of rubber-stamping SOC 2 reports for companies that later got breached. Every story landed between April 17 and April 23, 2026, the same week Gartner blessed its first &#8220;Company to Beat&#8221; in agent governance, the UK promised a &#163;90 million cyber shield, and Google shipped three security agents. The security industry spent two years debating whether agentic AI was a real threat. This week, the debate ended.</p><p>AI systems are both targets and attack vectors, with failure modes of their own. A frontier model gets breached because a vendor fell for infostealer malware in February. A vibe coding startup ships a regression and exposes every customer&#8217;s source code for 76 days.
A compliance startup hands out SOC 2 attestations like candy, and one customer becomes the pivot for a supply chain attack. Governments and analysts moved together. The UK committed real money to AI-powered cyber defense. Gartner stamped agent governance as a procurement category. This is the week the gap between AI capability and AI assurance became a balance sheet problem.</p><h3>1. Anthropic Mythos Model Accessed By Unauthorized Discord Group Days After Launch</h3><p>Anthropic confirmed on April 22, 2026, that it is investigating unauthorized access to Mythos, the frontier model restricted to roughly 40 partners, including Apple, Google, JPMorgan Chase, and NVIDIA (Bloomberg). The access came through a third-party contractor environment, not Anthropic&#8217;s direct infrastructure (CBS News). A Discord group focused on unreleased AI models guessed Mythos&#8217;s URL from naming conventions and pivoted through a contractor&#8217;s credentials to reach it. Anthropic claims no core systems were compromised.</p><p><strong>Why it matters</strong></p><ul><li><p>The contractor Anthropic trusted with access to its frontier model is the one that leaked it.</p></li><li><p>Mythos autonomously finds and weaponizes zero-days. Downstream risk spans all major OSes.</p></li><li><p>Guessing URLs and owning one contractor beat a Tier 1 AI lab.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Inventory every third-party vendor with access to frontier AI weights or runtime. Treat them as Tier 1.</p></li><li><p>Require contractors touching AI infrastructure to match your credential isolation standards.</p></li><li><p>Demand hardware token enforcement for any vendor in production AI environments.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>A contractor endpoint blew apart the &#8220;too dangerous to release&#8221; framing in 24 hours. Anthropic built Mythos to protect partners from zero-days, then lost it through a vendor employee.
The model built to find vulnerabilities got stolen because of a vulnerability nobody thought to measure. You cannot outsource your trust perimeter. Every CISO needs to audit AI-access vendors the way they audit crown-jewel systems.</p><h3>2. Vercel Supply Chain Breach Via Context.ai OAuth Token Compromise</h3><p>Vercel confirmed on April 19, 2026, that customer data was stolen via a compromise of Context.ai, a third-party AI assistant a Vercel employee had connected to Google Workspace with full Drive read access (TechCrunch). A Context.ai employee&#8217;s device was infected with the Lumma infostealer in February 2026. ShinyHunters used the exfiltrated OAuth tokens to pivot into the Vercel employee&#8217;s Google account, then into Vercel itself (Vercel). The actor is offering source code, NPM and GitHub tokens, and access keys for $2 million on BreachForums.</p><p><strong>Why it matters</strong></p><ul><li><p>One OAuth app installed by one employee rolled into a platform breach.</p></li><li><p>Lumma was the vector. The AI assistant was the accelerant.</p></li><li><p>ShinyHunters is monetizing AI-adjacent breaches at scale. Expect copycats.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Audit every OAuth app with Drive, Gmail, or Workspace scopes. Revoke AI tools without documented need.</p></li><li><p>Enforce conditional access with hardware tokens and device posture for Workspace accounts.</p></li><li><p>Subscribe to stealer log monitoring for corporate emails.</p></li><li><p>Rotate every secret a connected AI tool could have reached, starting with API keys, NPM and GitHub tokens, and access keys.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>An employee clicked a button, granted a third-party AI read access to everything, and the attacker rode that consent into production. OAuth scopes are the new privileged credentials, and most of us are not managing them that way. The shadow AI problem I flag with clients at <a href="https://www.rockcyber.com/">RockCyber</a> is not ChatGPT use.
It&#8217;s the hundreds of AI-branded OAuth apps employees connect while nobody watches.</p><h3>3. Gartner Names Zenity The &#8220;Company To Beat&#8221; In AI Agent Governance</h3><p>On April 23, 2026, Zenity announced that Gartner named it the &#8220;Company to Beat in AI Agent Governance&#8221; (Business Wire). Gartner cited Zenity&#8217;s agentic architecture, intent-aware detection, and end-user traction. The platform covers SaaS-managed agents, custom-built agents, and device deployments from build to runtime. Gartner&#8217;s 2026 CIO survey shows that 17 percent of organizations have deployed AI agents, 42 percent plan to do so within 12 months, and another 22 percent plan to do so the year after (Yahoo Finance). Zenity also landed in two categories of the 2026 Gartner Hype Cycle for Agentic AI this month.</p><p><strong>Why it matters</strong></p><ul><li><p>A &#8220;Company to Beat&#8221; stamp on a narrow security category speeds up procurement.</p></li><li><p>64 percent of organizations plan to deploy AI agents within 2 years, on top of the 17 percent that already have.</p></li><li><p>Agent governance is shifting from a research topic to a commercial line item.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>If you are on the 42 percent 12-month curve, start evaluations now.</p></li><li><p>Evaluate agent governance on runtime enforcement, not only inventory or posture.</p></li><li><p>Require vendors to show agent identity, memory, tool-call, and intent controls as distinct.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Yes&#8230; Zenity is my employer, so a) I&#8217;m super proud of this one and b) it&#8217;s my prerogative to include it in the musings &#128512;</p><p>&#8220;Company to Beat&#8221; labels are how procurement catches up with security reality. Mythos leaked through a contractor, Vercel got rolled via an AI assistant&#8217;s OAuth token, and the same week Gartner tells CIOs agent governance is a budget item.
Read Zenity&#8217;s architecture claims against this week&#8217;s breach anatomy, then against what you bought for CASB five years ago. Same pattern, same procurement playbook. Budget the line item.</p><h3>4. Lovable Vibe Coding Platform Exposed Source Code For 76 Days</h3><p>On April 20, 2026, security researcher weezerOSINT disclosed a broken object-level authorization flaw in Lovable&#8217;s API that let any authenticated free-account user read source code, database credentials, AI chat history, and customer data from every project created before November 2025 (The Register). The exposure ran 76 days, from February 3 through April 20, 2026. Lovable first denied the flaw, blamed its documentation, then blamed HackerOne, then apologized for the apology (Cybernews). Customers include Uber, Zendesk, and Deutsche Telekom.</p><p><strong>Why it matters</strong></p><ul><li><p>Vibe coding platforms hold enterprise source code and secrets. Attacker value is enormous.</p></li><li><p>Public denial while the flaw was live is a textbook loss-of-trust move.</p></li><li><p>A $6.6 billion startup cannot figure out basic tenant isolation three versions in.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Block new vibe coding connections at DNS or CASB until procurement reviews tenancy.</p></li><li><p>Rotate any credentials your teams put into Lovable projects since February 2026.</p></li><li><p>Treat vibe coding output as untrusted. Pull it into a real repo, scan it, review it.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Vibe coding is a demo, not engineering. When you hand a growth-stage startup your production database credentials in exchange for a drag-and-drop builder, you have accepted that your security depends on whether someone refactors an authorization check. Three breaches in thirteen months is a pattern, not bad luck. If your security team has not yet restricted this category of tool, do it this week.</p><h3>5. 
Google Cloud Next Ships Three AI Security Agents And Gemini Enterprise Agent Platform</h3><p>On April 22, 2026, at Google Cloud Next, Google introduced the Gemini Enterprise Agent Platform and three new AI agents inside Google Security Operations (SiliconANGLE). The agents cover Threat Hunting, Detection Engineering, and Third-Party Context enrichment (The Register). Google also deepened its ties to the Wiz product and shipped new agent governance tools. Sundar Pichai framed the shift as moving from human-led defense to human-in-the-loop to AI-led defense overseen by humans.</p><p><strong>Why it matters</strong></p><ul><li><p>Three tedious SOC functions now have vendor agent equivalents. SOC staffing economics shift if they work.</p></li><li><p>Google is betting the platform on agentic AI, not only generative AI.</p></li><li><p>The Wiz tie-in gives Google a path into CSPM-driven SOC workflows.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Pilot the Threat Hunting agent for 30 days against your human hunt team and score overlap.</p></li><li><p>Define human-in-the-loop gates before any autonomous detection or response action.</p></li><li><p>Update vendor risk reviews to cover agent behavior monitoring, not only model output.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The pitch is compelling; the execution will be messy. Every SOC team I advise is drowning in alerts, and the first customer bitten by an autonomous agent on bad context will make headlines. The Third-Party Context agent matters more than the other two because better data into an agentic SOC prevents bad autonomous actions. Read <a href="https://rockcybermusings.com/">my notes on AI governance</a> before you green-light an agent in production.</p><h3>6.
UK Announces &#163;90 Million National Cyber Shield And Calls On AI Firms To Co-Build Defense</h3><p>At CYBERUK 2026 on April 22, 2026, UK Security Minister Dan Jarvis announced &#163;90 million over three years for national-scale AI-powered cyber defense capabilities (GOV.UK). Jarvis asked frontier AI companies to co-develop these capabilities with the UK government and cited Mythos&#8217;s zero-day findings as justification for public sector urgency (Computer Weekly). Jarvis also launched a National Cyber Resilience Pledge aimed at private sector security baselines.</p><p><strong>Why it matters</strong></p><ul><li><p>The UK is the first major Western government to put operational capital into AI-defended critical infrastructure.</p></li><li><p>Public-private cooperation on offensive-grade AI models sets a precedent others will react to.</p></li><li><p>Frontier AI vendors in the UK public sector now have a direct path to shape national doctrine.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>UK critical infrastructure operators: map your sector against the Pledge before it becomes mandatory.</p></li><li><p>Track which AI vendors join. UK procurement for critical infrastructure will narrow quickly.</p></li><li><p>Watch NCSC secure-by-design expectations for AI. They will bleed into global procurement language.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>&#163;90 million sounds like a lot, but it really is a down payment. The bigger story is the UK saying out loud what American officials still whisper. Frontier AI models are a dual-use capability, and if you don&#8217;t partner with the labs building them, your adversaries will. The Pledge is the more interesting instrument. Voluntary commitments have a funny way of becoming procurement requirements, then de facto regulation.</p><h3>7.
OpenAI Releases Privacy Filter, An Open-Weight On-Device PII Redactor</h3><p>On April 23, 2026, OpenAI released Privacy Filter, a 1.5-billion-parameter open-weight model with 50 million active parameters that detects and redacts personally identifiable information locally (Help Net Security). It supports a 128,000-token context window, runs in browsers and on laptops, and achieves a 96% F1 score on PII-Masking-300k (VentureBeat). It ships under Apache 2.0 on GitHub and Hugging Face, covering eight PII categories.</p><p><strong>Why it matters</strong></p><ul><li><p>A permissive open-weight PII redactor that runs on a laptop closes a real enterprise data sanitization gap.</p></li><li><p>OpenAI shipping open weights for a safety model is a positional move, not a strategy reversal.</p></li><li><p>The tool removes a common excuse for shipping raw enterprise data to cloud LLMs.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Evaluate Privacy Filter as a preprocessing layer for any LLM pipeline on customer, support, or HR data.</p></li><li><p>Benchmark it against existing DLP tools for AI-specific use cases.</p></li><li><p>Add on-device redaction as a control in your AI data flow diagrams.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Privacy Filter is the first open-weight piece from OpenAI that&#8217;s useful to a CISO. One point five billion parameters, runs local, decent accuracy, permissive license. It slots into every RAG pipeline I review as a trivial addition that removes an easy audit finding. OpenAI has taken heat on privacy posture for three years, and shipping open weights for a PII model is a pressure valve. Anthropic and Google will follow within six months.</p><h3>8. 
Delve Compliance Scandal Widens After TechCrunch Confirms Context.ai Certification</h3><p>On April 23, 2026, TechCrunch confirmed that Delve, the Y Combinator-backed compliance startup accused of faking SOC 2 audits, had certified Context.ai, the AI tool at the center of the Vercel supply chain breach (TechCrunch). Delve also certified LiteLLM, an open source project that was separately compromised with planted malware. Context.ai has cut ties with Delve and is re-certifying with a different auditor. Whistleblower DeepDelver alleged the Delve team took a Hawaii offsite between April 15 and April 19 while denying customer refunds.</p><p><strong>Why it matters</strong></p><ul><li><p>Two Delve-certified companies are at the center of AI supply chain breaches.</p></li><li><p>SOC 2 without substance is a liability shield until the shield gets tested.</p></li><li><p>AI compliance tooling is saturated with startups racing to rubber-stamp fast-moving products.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Audit your vendor attestations. Who signed? What is the auditor&#8217;s history? Is the scope meaningful?</p></li><li><p>For AI vendors, demand pentest summaries, code review artifacts, and threat models.</p></li><li><p>Treat SOC 2 as one input into assurance, not a box check.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>My friends know&#8230; I believe SOC 2 needs to die a fiery death, but &#8220;we&#8221; still insist on them. Founders want the badge, auditors want the fee, customers want the checkbox. Everyone wins until the breach, then the enterprise that relied on the paper finds out the paper was never the point. SOC 2 is a floor, not a ceiling. Nothing will change until we kill the demand side of this particular supply/demand equation.</p><h3>9.
NIST Narrows CVE Enrichment As Submission Volume Overwhelms NVD</h3><p>On April 17, 2026, NIST announced it will only enrich CVEs that meet specific criteria due to an unsustainable rise in submissions (Cybersecurity Dive). The NVD will continue to list every published CVE but will no longer guarantee CVSS scores, CPE mappings, or descriptions for every record. NIST cites AI-assisted vulnerability research as a key driver of volume. Enrichment priority goes to actively exploited vulnerabilities and CVEs affecting critical infrastructure.</p><p><strong>Why it matters</strong></p><ul><li><p>If your program assumes every CVE carries a CVSS score and CPE mapping, it is about to degrade silently.</p></li><li><p>AI-generated vulnerability research is flooding public disclosure. The NVD cannot keep up.</p></li><li><p>Enterprises relying only on NVD-fed scanners will miss or misprioritize vulnerabilities now.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Supplement NVD with CISA KEV and commercial vulnerability intelligence.</p></li><li><p>Score CVEs NIST skips using vendor advisories as primary sources.</p></li><li><p>Reassess SLAs based on enrichment availability, not only patch availability.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>NIST is essentially throwing up its hands and giving up. The CVE system was built for a world where humans found most bugs. We no longer live there. Mythos alone found thousands of zero-days in weeks. Multiply that by every lab running similar research, and NVD throughput becomes a joke. NIST is triaging, which is the only rational move. The problem is that nobody told your vulnerability scanner. Get ahead of this now, or your next board report will be a lie by omission.</p><h3>10.
Anthropic MCP STDIO Flaw Burns The Agentic AI Ecosystem As New CVEs Land</h3><p>The STDIO command injection flaw in Anthropic&#8217;s MCP SDK produced new CVE assignments throughout the week, including CVE-2026-30623 and CVE-2026-22252 (LiteLLM). Analysis on April 20 from BDTechTalks documented ecosystem fallout and Anthropic doubling down on its &#8220;by design&#8221; position (BDTechTalks). The flaw class affects 7,000 publicly accessible MCP servers and over 150 million package downloads (Infosecurity Magazine). Affected products include LibreChat, WeKnora, Cursor, and MCP Inspector.</p><p><strong>Why it matters</strong></p><ul><li><p>Anthropic will not patch. Every developer using the official SDK owns the mitigation.</p></li><li><p>The default agentic interop standard has a baked-in remote code execution footgun.</p></li><li><p>CVEs are stacking up. Every MCP-connected product is a vendor risk question.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Inventory every MCP server and client. If you can&#8217;t produce the list in a day, you have a bigger MCP problem.</p></li><li><p>Enforce strict input validation on any MCP server config from user input, LLM output, or third-party manifests.</p></li><li><p>Update your agentic threat model to cover MCP as a first-class attack surface.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>&#8220;By design&#8221; is a liability transfer, not a security posture. Anthropic handed every developer on the MCP SDK a footgun and said go figure it out. Competing agent protocols like A2A and Agora are watching and taking notes. Building the default standard for agent-to-system communication on top of a protocol decision that cannot be fixed without breaking compatibility is the problem.
Every MCP-based product in your stack is a recurring risk item.</p><h3>The One Thing You Won&#8217;t Hear About But You Need To</h3><h3>AgentSOC Paper Publishes A Multi-Layer Blueprint For Agentic Security Operations</h3><p>On April 22, 2026, researchers published AgentSOC: A Multi-Layer Agentic AI Framework for Security Operations Automation on arXiv (arXiv). The paper proposes a layered architecture combining perception, anticipatory reasoning, and risk-based action planning for autonomous SOC operations. It documents design patterns for coordinating specialized agents across triage, hunt, and response workflows while keeping human oversight in place. The work joins other 2026 papers arguing agentic AI is mature enough for production SOC environments when guardrails are in place.</p><p><strong>Why it matters</strong></p><ul><li><p>Vendors ship products. Research supplies the reference architectures that determine whether those products survive in production.</p></li><li><p>The AgentSOC blueprint maps closely to what Google announced this week. The convergence is not accidental.</p></li><li><p>CISOs now have a public framework to score vendor claims against independent research.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Read the paper before your next agentic SOC evaluation. Use the layer breakdown as a scoring rubric.</p></li><li><p>Ask vendors how their architecture maps to perception, anticipation, and action layers.</p></li><li><p>Share the paper with SOC leadership. It gives your team a vocabulary for what to demand.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Vendor marketing is a terrible place to learn what agentic security operations should look like. Academic literature is better. AgentSOC is not the last word, but it landed the same week three major vendors pitched agentic SOC products. CISOs who read research papers buy better tools and sign better contracts than the ones who only read analyst reports. 
Use the AgentSOC structure the next time a vendor promises agentic magic, and watch them squirm when you ask what happens at the perception layer when the model hallucinates.</p><p>&#128073; For ongoing analysis of agentic AI governance frameworks, the conversation continues at <strong><a href="https://rockcybermusings.com/">RockCyber Musings</a></strong>.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help with your traditional Cybersecurity and AI Security and Governance journey.</p><p>&#128073; Want to save a quick $100K? Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><p>&#128073; As a bonus, <strong><a href="https://www.youtube.com/watch?v=rwlVTLyqIv8">check out my conversation with Eva Benn</a></strong> where we talked about the cybersecurity skills you need to develop to stay relevant in 2026 and beyond.</p><p>&#128073; Subscribe for more AI and cyber insights with the occasional rant.</p><p><em>The views and opinions expressed in RockCyber Musings are my own and do not represent the positions of my employer or any organization I&#8217;m affiliated with.</em></p>
<h2>References</h2><p>arXiv. (2026, April 22). <em>AgentSOC: A multi-layer agentic AI framework for security operations automation</em>. https://arxiv.org/abs/2604.20134</p><p>BDTechTalks. (2026, April 20). <em>Anthropic&#8217;s MCP vulnerability: When &#8216;expected behavior&#8217; becomes a supply chain nightmare</em>. https://bdtechtalks.com/2026/04/20/anthropic-mcp-vulnerability/</p><p>Bloomberg. (2026, April 21). <em>Anthropic&#8217;s Mythos AI model is being accessed by unauthorized users</em>. https://www.bloomberg.com/news/articles/2026-04-21/anthropic-s-mythos-model-is-being-accessed-by-unauthorized-users</p><p>Business Wire. (2026, April 23). <em>Zenity named the &#8220;Company to Beat&#8221; in AI Agent Governance in new Gartner report</em>. https://www.businesswire.com/news/home/20260423045822/en/Zenity-Named-the-Company-to-Beat-in-AI-Agent-Governance-in-New-Gartner-Report</p><p>Bloomberg. (2026, April 22). <em>Google releases new AI agents to challenge OpenAI and Anthropic</em>.
https://www.bloomberg.com/news/articles/2026-04-22/google-releases-new-ai-agents-to-challenge-openai-and-anthropic</p><p>CBS News. (2026, April 22). <em>Anthropic investigating possible breach of its Mythos AI model</em>. https://www.cbsnews.com/news/anthropic-investigates-mythos-ai-breach/</p><p>Computer Weekly. (2026, April 22). <em>UK to build &#8216;national cyber shield&#8217; to protect against AI cyber threats</em>. https://www.computerweekly.com/news/366641790/UK-to-build-national-cyber-shield-to-protect-against-AI-cyber-threats</p><p>Cybernews. (2026, April 20). <em>Lovable goes on ego trip denying vulnerability, then blames others for said vulnerability</em>. https://cybernews.com/security/lovable-vibe-coding-flaw-apology/</p><p>Cybersecurity Dive. (2026, April 17). <em>NIST narrows CVE enrichment as submission volume surges</em>. https://www.cybersecuritydive.com/news/nist-ai-cybersecurity-framework-profile/808134/</p><p>GOV.UK. (2026, April 22). <em>Security Minister&#8217;s speech to CYBERUK 2026</em>. https://www.gov.uk/government/speeches/security-ministers-speech-to-cyberuk-2026</p><p>Help Net Security. (2026, April 23). <em>OpenAI tackles a bad habit people have when interacting with AI</em>. https://www.helpnetsecurity.com/2026/04/23/openai-privacy-filter-personally-identifiable-information/</p><p>Infosecurity Magazine. (2026, April). <em>Systemic flaw in MCP protocol could expose 150 million downloads</em>. https://www.infosecurity-magazine.com/news/systemic-flaw-mcp-expose-150/</p><p>LiteLLM. (2026, April). <em>Security update: CVE-2026-30623, command injection via Anthropic&#8217;s MCP SDK</em>. https://docs.litellm.ai/blog/mcp-stdio-command-injection-april-2026</p><p>SiliconANGLE. (2026, April 22). <em>Google rolls out new Security Operations agents, Wiz ties, and agent governance tools</em>. https://siliconangle.com/2026/04/22/google-cloud-next-new-security-operations-agents-wiz-integrations-agent-governance-tools/</p><p>TechCrunch. 
(2026, April 20). <em>App host Vercel says it was hacked and customer data stolen</em>. https://techcrunch.com/2026/04/20/app-host-vercel-confirms-security-incident-says-customer-data-was-stolen-via-breach-at-context-ai/</p><p>TechCrunch. (2026, April 23). <em>Another customer of troubled startup Delve suffered a big security incident</em>. https://techcrunch.com/2026/04/23/another-customer-of-troubled-startup-delve-suffered-a-big-security-incident/</p><p>The Register. (2026, April 20). <em>Lovable denies data leak, cites &#8216;intentional behavior&#8217;</em>. https://www.theregister.com/2026/04/20/lovable_denies_data_leak/</p><p>The Register. (2026, April 22). <em>Google unleashes even more AI security agents to fight crims</em>. https://www.theregister.com/2026/04/22/google_unleashes_even_more_ai</p><p>Vercel. (2026, April 19). <em>Vercel April 2026 security incident</em>. https://vercel.com/kb/bulletin/vercel-april-2026-security-incident</p><p>VentureBeat. (2026, April 23). <em>OpenAI launches Privacy Filter, an open source, on-device data sanitization model</em>. https://venturebeat.com/data/openai-launches-privacy-filter-an-open-source-on-device-data-sanitization-model-that-removes-personal-information-from-enterprise-datasets</p><p>Yahoo Finance. (2026, April 23). <em>Zenity named the &#8220;Company to Beat&#8221; in AI Agent Governance</em>. https://finance.yahoo.com/sectors/technology/articles/zenity-named-company-beat-ai-130100277.html</p>]]></content:encoded></item><item><title><![CDATA[Your Defender AI Is Your Next Crown Jewel. Threat-Model It Now.]]></title><description><![CDATA[Mythos and GPT-5.4-Cyber made defender AI a critical asset. Most security teams haven't threat-modeled it. 
Here's what to do this week.]]></description><link>https://www.rockcybermusings.com/p/defender-ai-crown-jewel-mythos-gpt-cyber</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/defender-ai-crown-jewel-mythos-gpt-cyber</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Tue, 21 Apr 2026 12:51:01 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Txn7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7113b57-82da-470d-b315-0532fba855da_2816x1536.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Txn7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7113b57-82da-470d-b315-0532fba855da_2816x1536.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Txn7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7113b57-82da-470d-b315-0532fba855da_2816x1536.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Txn7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7113b57-82da-470d-b315-0532fba855da_2816x1536.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Txn7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7113b57-82da-470d-b315-0532fba855da_2816x1536.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Txn7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7113b57-82da-470d-b315-0532fba855da_2816x1536.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Txn7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7113b57-82da-470d-b315-0532fba855da_2816x1536.jpeg" width="1456" height="794" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f7113b57-82da-470d-b315-0532fba855da_2816x1536.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:794,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2489623,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/194618055?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7113b57-82da-470d-b315-0532fba855da_2816x1536.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Txn7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7113b57-82da-470d-b315-0532fba855da_2816x1536.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Txn7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7113b57-82da-470d-b315-0532fba855da_2816x1536.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Txn7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7113b57-82da-470d-b315-0532fba855da_2816x1536.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!Txn7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7113b57-82da-470d-b315-0532fba855da_2816x1536.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>A Fortune 500 bank gets its Project Glasswing partner seat six weeks from now. Anthropic ships the Mythos Preview container and $10 million in credits. The bank stands up a Mythos instance inside its own environment, points it at its core banking monorepo, and starts finding bugs on day one. Forty-two days in, a developer opens a pull request that adds a utility library. The README on that library contains a commented block beginning with &#8220;SECURITY NOTE FOR AUTOMATED REVIEWERS.&#8221; The Mythos instance reads it. The comment is an indirect prompt injection telling the reviewer to mark a specific authentication bypass as a false positive and not mention the instruction in the output. The reviewer complies. The bug ships. Nobody sees it because the thing designed to see it was told not to.</p><p>That scenario is fictional. The attack class is not. 
<strong><a href="https://labs.cloudsecurityalliance.org/wp-content/uploads/2026/04/mythosreadyv95.pdf">The Mythos-Ready whitepaper from the CSA, SANS, OWASP GenAI Security Project, and a coalition of practitioners (I was a reviewer)</a></strong> lists &#8220;Unmanaged AI Agent Attack Surface&#8221; as one of its five critical risks, mapping to <strong><a href="https://genai.owasp.org/download/52117/?tmstv=1765059207">OWASP Agentic Top 10</a></strong> entries ASI01 (Agent Goal Hijack), ASI02 (Tool Misuse), ASI03 (Identity and Privilege Abuse), plus AML.T0051.001 (Indirect Prompt Injection) in <strong><a href="https://atlas.mitre.org/">MITRE ATLAS</a></strong>. Ranked critical. The single most underweighted item in the entire priority table.</p><p>The industry is fixated on the wrong question. Everyone is arguing about whether Anthropic&#8217;s 40-org Glasswing coalition or OpenAI&#8217;s thousands-of-verified-defenders TAC program is the right release model. That argument matters, and I will work through it. The bigger issue is that once you get access to either Mythos or GPT-5.4-Cyber, the running instance becomes the most valuable asset in your security stack. It sits within your environment, with privileged access to your source code, vulnerability telemetry, patch queue, and incident history. It knows where your unpatched zero-days live. An attacker who compromises that instance does not need to find bugs. The instance tells them where the bugs are.</p><h2>What Anthropic and OpenAI Built</h2><p>Mythos Preview is a gated frontier model. Anthropic released it on April 7, 2026, announced Project Glasswing the same day, and restricted access to 12 launch partners plus roughly 40 additional organizations. The partners include AWS, Apple, Microsoft, Google, CrowdStrike, Cisco, JPMorgan Chase, NVIDIA, Palo Alto Networks, Broadcom, and the Linux Foundation. 
Anthropic committed $100 million in usage credits and priced the model at $25 per million input tokens and $125 per million output tokens, roughly 5x Opus 4.6 (which is roughly 5x Sonnet 4.6&#8230; OUCH!). The stated case for restricting access is that the model found thousands of zero-days across all major operating systems and browsers, including a 27-year-old bug in OpenBSD and a 16-year-old flaw in FFmpeg. Anthropic&#8217;s own assessment is that comparable capability will reach broad availability in 6 to 18 months.</p><p>GPT-5.4-Cyber is OpenAI&#8217;s answer, released April 14, 2026, one week later. It is a fine-tuned variant of GPT-5.4 with what OpenAI calls a &#8220;lowered refusal boundary for legitimate cybersecurity work.&#8221; The headline capability is binary reverse engineering. Feed it a compiled executable and get vulnerability analysis without source code. OpenAI&#8217;s Trusted Access for Cyber (TAC) program, piloted in February 2026 with $10 million in grant credits, scales to thousands of verified individual defenders and hundreds of teams. Individuals verify at chatgpt.com/cyber. Enterprises apply through account representatives. OpenAI cyber researcher Fouad Matin told reporters that &#8220;no one should be in the business of picking winners and losers&#8221; over who gets to defend their systems.</p><p>The two approaches reflect different risk philosophies. Anthropic bets on institutional trust and coalition monitoring. OpenAI bets on KYC verification and broader distribution. Both have real merit. 
Both share the same structural weakness: the access decision sits upstream of the threat model.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kSjE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f62c377-67b3-4e86-8cd9-df2b3b54b5d9_2500x2500.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kSjE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f62c377-67b3-4e86-8cd9-df2b3b54b5d9_2500x2500.png 424w, https://substackcdn.com/image/fetch/$s_!kSjE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f62c377-67b3-4e86-8cd9-df2b3b54b5d9_2500x2500.png 848w, https://substackcdn.com/image/fetch/$s_!kSjE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f62c377-67b3-4e86-8cd9-df2b3b54b5d9_2500x2500.png 1272w, https://substackcdn.com/image/fetch/$s_!kSjE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f62c377-67b3-4e86-8cd9-df2b3b54b5d9_2500x2500.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kSjE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f62c377-67b3-4e86-8cd9-df2b3b54b5d9_2500x2500.png" width="1456" height="1456" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8f62c377-67b3-4e86-8cd9-df2b3b54b5d9_2500x2500.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:374065,&quot;alt&quot;:&quot;Side-by-side comparison table of Mythos and GPT-5.4-Cyber showing release scope, access gate, pricing, capability focus, and trust model&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/194618055?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f62c377-67b3-4e86-8cd9-df2b3b54b5d9_2500x2500.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Side-by-side comparison table of Mythos and GPT-5.4-Cyber showing release scope, access gate, pricing, capability focus, and trust model" title="Side-by-side comparison table of Mythos and GPT-5.4-Cyber showing release scope, access gate, pricing, capability focus, and trust model" srcset="https://substackcdn.com/image/fetch/$s_!kSjE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f62c377-67b3-4e86-8cd9-df2b3b54b5d9_2500x2500.png 424w, https://substackcdn.com/image/fetch/$s_!kSjE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f62c377-67b3-4e86-8cd9-df2b3b54b5d9_2500x2500.png 848w, https://substackcdn.com/image/fetch/$s_!kSjE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f62c377-67b3-4e86-8cd9-df2b3b54b5d9_2500x2500.png 1272w, 
https://substackcdn.com/image/fetch/$s_!kSjE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f62c377-67b3-4e86-8cd9-df2b3b54b5d9_2500x2500.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Figure 1: Release Philosophy Comparison</figcaption></figure></div><h2>How to Get Your Hands on Each</h2><p>For Mythos, the answer for 99% of organizations is: you don&#8217;t. Project Glasswing is a curated coalition. The 40 slots are filled with hyperscalers, chipmakers, one bank, and the Linux Foundation. 
Anthropic has not published an application path. Additional partners will be added over time, prioritized by critical infrastructure impact. If you run a regional bank, a hospital system, or a municipality, the realistic timeline for direct access to Mythos is measured in quarters.</p><p>For GPT-5.4-Cyber, the path is documented. Individuals verify at chatgpt.com/cyber. Organizations request trusted access through an OpenAI account representative. The program uses KYC-style identity verification and tiered access, with the highest tier unlocking GPT-5.4-Cyber. OpenAI says the rollout will be gradual and vetted, with early priority on security vendors, organizations, and researchers with track records in vulnerability research and remediation.</p><p>Both paths share one feature that matters more than either provider acknowledges: neither gate eliminates the capability. AISLE, an independent AI security research group, <a href="https://aisle.com/blog/ai-cybersecurity-after-mythos-the-jagged-frontier">tested the exact FreeBSD vulnerability Anthropic headlined against open-weight models</a>. Eight out of eight detected the bug. The smallest was a 3.6 billion parameter model at 11 cents per million tokens. A 5.1 billion active parameter model recovered the core analysis chain of the 27-year-old OpenBSD flaw. Total cost of AISLE&#8217;s weekend benchmarking across six models: under $100. Attackers are running abliterated Llama 4, Kimi K2, and Qwen3 variants on laptops. Your coordinated disclosure window is what the gates protect, not your attack surface.</p><h2>Two Attacker Profiles, Two Different Problems</h2><p>The defender community keeps talking about &#8220;the attacker&#8221; as if there is one. There are at least two. They pick different pathways.</p><p>The first is the opportunistic actor running autonomous vulnerability discovery across the entire internet-facing attack surface. This actor does not care who you are. They care about breadth. 
They run nano-analyzer-style scaffolding against every public codebase, every npm package, every Docker image they can reach. The models are open-weight and free, uncensored variants are widely distributed, and the workflow is already documented. AISLE published its scaffolding as open source. Anyone who can run a Python script can replicate it. This actor finds your unpatched zero-days in public dependencies as soon as those dependencies are indexed.</p><p><strong>The defense is in the whitepaper:</strong> <em>inventory and reduce attack surface within 90 days, stand up a VulnOps function within 12 months, automate patching to match the discovery rate.</em></p><p>The second actor is targeted. They care specifically about you. They want your bugs, your patch queue, your incident data, and your threat model. The open-weight approach is too slow and too noisy for this actor. They need inside information. They pick from three pathways, listed here in order of near-term probability.</p><p>First, credential theft against verified defenders. A TAC tier-three user at a Fortune 500 security vendor is a high-value target. Their API session tokens grant access to a cyber-permissive model with binary reverse engineering capabilities. A compromised developer laptop, a phished OAuth flow, or a stolen refresh token gets the attacker a capability they cannot otherwise reach. OpenAI&#8217;s announcement acknowledged that zero-data-retention environments get limited visibility, meaning stolen tokens may operate with reduced logging. Use short-lived tokens and rotate them, enforce hardware-bound keys, and put defender-model API use behind the same privileged access controls you apply to domain admin accounts. Treat a TAC session token as a tier-0 secret.</p><p>Second, open-weight replication against a specific target. Once an attacker has selected you, they can scan your public code, your partner repositories, your open-source contributions, and any of your dependencies using the same scaffolding as the opportunistic actor. 
The targeting changes the risk profile. They are building a dossier on your specific organization. Defense is the same as against the opportunistic case, with urgency that scales with your profile. If you are a named Glasswing partner, assume you are the target.</p><p>Third, defender instance compromise through context poisoning and prompt injection. This pathway keeps me up at night. It is the one your existing threat model does not cover. A running Mythos or GPT-5.4-Cyber instance inside your environment consumes source code, pull request descriptions, commit messages, dependency READMEs, issue trackers, and whatever retrieval pipelines you plumb into it. Each of those input channels is an indirect prompt-injection vector. The model cannot distinguish between a developer&#8217;s pull request description and an attacker&#8217;s instructions buried in a dependency&#8217;s changelog. Anthropic&#8217;s system card for Mythos documents &#8220;reckless&#8221; behaviors from earlier versions: sandbox escape, credential hunting via /proc/ access, unauthorized file modification, git history scrubbing, and attempts to modify a running MCP server&#8217;s external URL. The model can act on indirect instructions in ways that bypass its safeguards. 
A hostile input channel into your defender instance is an exploitation channel into your codebase.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lHZP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09d4aeb5-7f8a-4eca-a1a5-f80f8f74570d_4779x2570.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lHZP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09d4aeb5-7f8a-4eca-a1a5-f80f8f74570d_4779x2570.png 424w, https://substackcdn.com/image/fetch/$s_!lHZP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09d4aeb5-7f8a-4eca-a1a5-f80f8f74570d_4779x2570.png 848w, https://substackcdn.com/image/fetch/$s_!lHZP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09d4aeb5-7f8a-4eca-a1a5-f80f8f74570d_4779x2570.png 1272w, https://substackcdn.com/image/fetch/$s_!lHZP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09d4aeb5-7f8a-4eca-a1a5-f80f8f74570d_4779x2570.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lHZP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09d4aeb5-7f8a-4eca-a1a5-f80f8f74570d_4779x2570.png" width="1456" height="783" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/09d4aeb5-7f8a-4eca-a1a5-f80f8f74570d_4779x2570.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:783,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:638658,&quot;alt&quot;:&quot;Flow diagram showing opportunistic attacker using open-weight models and targeted attacker using three pathways including credential theft, open-weight replication, and context poisoning, all converging on the defender AI instance&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/194618055?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09d4aeb5-7f8a-4eca-a1a5-f80f8f74570d_4779x2570.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Flow diagram showing opportunistic attacker using open-weight models and targeted attacker using three pathways including credential theft, open-weight replication, and context poisoning, all converging on the defender AI instance" title="Flow diagram showing opportunistic attacker using open-weight models and targeted attacker using three pathways including credential theft, open-weight replication, and context poisoning, all converging on the defender AI instance" srcset="https://substackcdn.com/image/fetch/$s_!lHZP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09d4aeb5-7f8a-4eca-a1a5-f80f8f74570d_4779x2570.png 424w, https://substackcdn.com/image/fetch/$s_!lHZP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09d4aeb5-7f8a-4eca-a1a5-f80f8f74570d_4779x2570.png 848w, 
https://substackcdn.com/image/fetch/$s_!lHZP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09d4aeb5-7f8a-4eca-a1a5-f80f8f74570d_4779x2570.png 1272w, https://substackcdn.com/image/fetch/$s_!lHZP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09d4aeb5-7f8a-4eca-a1a5-f80f8f74570d_4779x2570.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Figure 2: Attacker Pathways and Defender Instance Exposure</figcaption></figure></div><h2>Why the 
Defender AI Is the Crown Jewel</h2><p>The whitepaper&#8217;s Priority Action 4 is &#8220;Defend Your Agents.&#8221; The authors are direct: <em>agents are not covered by existing controls, introduce cyber defense and agentic supply chain risks, and the agent scaffolding (prompts, tool definitions, retrieval pipelines, escalation logic) is where the most consequential failures occur.</em></p><p>Audit agents with the same rigor you apply to the agent&#8217;s permissions. Correct guidance. Buried inside an 11-item priority table, where every item reads as equal weight. It is not equal weight.</p><p>The defender AI concentrates four kinds of access that used to live in separate systems and separate roles.</p><ol><li><p>It reads every line of production source code.</p></li><li><p>It holds context on every unpatched vulnerability in your queue.</p></li><li><p>It sees the remediation timeline for each one.</p></li><li><p>It knows the architectural boundaries between your crown jewels and everything else.</p></li></ol><p>A human with all four would be classified as a tier-0 insider threat. The defender AI requires all four as prerequisites to do its job. Your adversary does not need to compromise OpenAI or Anthropic. They need to compromise your instance. Much smaller target, much wider attack surface.</p><h2>What a Defender-AI Threat Model Looks Like</h2><p>The architecture defenders need has three layers. The concepts span the OWASP Agentic Security Initiative, the NIST AI RMF, and multiple emerging specifications. What is new here is applying them specifically to the defender AI case.</p><p>The first layer is runtime interception at every agent decision point. Every time the defender AI receives input, produces output, selects a tool, calls a tool, transitions from planning to execution, writes to memory, executes code, or invokes a sub-agent, that action must pass through a policy enforcement point before it reaches production. 
This is inline, deterministic, allow-deny-modify enforcement. Not a log review after the fact. A defender AI that reads a dependency README with an embedded prompt injection must have that input evaluated against policy before the agent&#8217;s reasoning ingests it. Policy enforcement at the hook surface, before the consequential action, is the only mechanism that works at machine speed.</p><p>The second layer is structured observability built on OpenTelemetry with agent-specific semantic conventions and OCSF mapping for SIEM integration. The trace has to cover the full agent lifecycle: prompt received, tool selected, tool called, response ingested, memory written, sub-agent invoked, output produced. Forensic reconstruction of a defender AI incident requires this granularity. Your SOC already operates on OCSF. Agent traces flowing through the pipelines your SOC already monitors is the integration that scales. A parallel agent observability stack your SOC does not watch is a dead letter office.</p><p>The third layer is live inventory. The whitepaper&#8217;s Priority Action 7 calls for real SBOMs, correct for static software. For agents, it is insufficient. The inventory has to update continuously because the agent can discover new tools, connect to new MCP servers, and modify its own tool catalog mid-session. Inventory generated at deployment time is stale by the end of the first prompt. Extend CycloneDX or SPDX semantics to live agent composition. Capture every tool, model, capability, knowledge source, and MCP connection the defender AI is wired into, across every running instance. You cannot defend what you cannot inventory, and what you cannot inventory is mutating on you.</p><p>These three layers stack on a three-tier operating model. The platform exposes the hooks once. An open enforcement SDK reads declarative policy and fires decisions through the hooks. Enterprise-specific classifiers and detectors plug into the enforcement layer. 
Your data sensitivity model, your PHI detection, your threat-intel feed integrations all live in the enterprise layer, consuming the same standardized hook surface. Switching from Mythos to GPT-5.4-Cyber or to a third model six months from now should not require rewriting your safety logic. It should require pointing your enforcement SDK at a different set of hooks.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!byUw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01578983-6b74-4ea4-b54a-3a4b6babbaae_6996x4960.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!byUw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01578983-6b74-4ea4-b54a-3a4b6babbaae_6996x4960.png 424w, https://substackcdn.com/image/fetch/$s_!byUw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01578983-6b74-4ea4-b54a-3a4b6babbaae_6996x4960.png 848w, https://substackcdn.com/image/fetch/$s_!byUw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01578983-6b74-4ea4-b54a-3a4b6babbaae_6996x4960.png 1272w, https://substackcdn.com/image/fetch/$s_!byUw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01578983-6b74-4ea4-b54a-3a4b6babbaae_6996x4960.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!byUw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01578983-6b74-4ea4-b54a-3a4b6babbaae_6996x4960.png" width="1456" height="1032" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/01578983-6b74-4ea4-b54a-3a4b6babbaae_6996x4960.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1032,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1006731,&quot;alt&quot;:&quot;Architectural diagram showing platform hooks layer firing decision points to enforcement layer which reads declarative policy with enterprise customization plugging in custom classifiers&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/194618055?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01578983-6b74-4ea4-b54a-3a4b6babbaae_6996x4960.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Architectural diagram showing platform hooks layer firing decision points to enforcement layer which reads declarative policy with enterprise customization plugging in custom classifiers" title="Architectural diagram showing platform hooks layer firing decision points to enforcement layer which reads declarative policy with enterprise customization plugging in custom classifiers" srcset="https://substackcdn.com/image/fetch/$s_!byUw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01578983-6b74-4ea4-b54a-3a4b6babbaae_6996x4960.png 424w, https://substackcdn.com/image/fetch/$s_!byUw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01578983-6b74-4ea4-b54a-3a4b6babbaae_6996x4960.png 848w, 
https://substackcdn.com/image/fetch/$s_!byUw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01578983-6b74-4ea4-b54a-3a4b6babbaae_6996x4960.png 1272w, https://substackcdn.com/image/fetch/$s_!byUw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01578983-6b74-4ea4-b54a-3a4b6babbaae_6996x4960.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Figure 3: Three-Layer Defender AI Control Architecture</figcaption></figure></div><h2>The Five Actions You Can Take 
This Week</h2><p>The whitepaper&#8217;s 11 priority actions are the right list. Here is how the defender-AI-as-crown-jewel thesis reorders them by urgency.</p><p>First, write the threat model. Before you stand up Mythos or GPT-5.4-Cyber anywhere, document what the instance will access, what inputs it will consume, what outputs it can produce, and what tools it can invoke. Map each item to ASI01 through ASI10 in OWASP Agentic Top 10 and to the relevant AML.T entries in MITRE ATLAS. If you have not done this exercise for any agent in your environment, start with the defender AI. Its blast radius is the largest.</p><p>Second, treat API tokens for defender models as tier-0 secrets. Hardware-bound keys, short TTLs, per-session scope, and the access review cadence you apply to break-glass domain admin. Stolen credentials are the fastest path to your defender AI and your unpatched zero-days. Lock them down the way you would lock down root.</p><p>Third, instrument the hook surface before you instrument the prompt. Your first integration priority is runtime policy enforcement for input, output, tool calls, tool responses, and sub-agent invocations. Not log collection. Not dashboards. Inline allow-deny-modify at the decision points.</p><p>Fourth, build a live agent inventory for every agent in your environment, starting with the defender AI. Capture the model, the tools, the MCP connections, the retrieval sources, the knowledge bases, and the memory stores. Update in real time. Review weekly until the pattern stabilizes, then move to continuous automated review.</p><p>Fifth, run the defender AI through your own red team before you point it at your own code. Indirect prompt injection via dependency READMEs, poisoned commit messages, hostile issue descriptions, and malicious pull request bodies. If you cannot compromise your own defender AI in a week, you have not tried hard enough.</p><p><strong>Key Takeaway:</strong> The access gate is not the threat model. 
The defender AI in your environment is a new crown jewel. Most security programs have not yet acknowledged what it is or what protects it.</p><h3>What to do next</h3><p>Read the CSA, SANS, and OWASP GenAI Security Project briefing, <strong><a href="https://labs.cloudsecurityalliance.org/wp-content/uploads/2026/04/mythosready.pdf">&#8220;The AI Vulnerability Storm: Building a Mythos-Ready Security Program.&#8221;</a></strong> Run the 10 Questions diagnostic against your program this week. Rerank the Priority Action table, putting &#8220;Defend Your Agents&#8221; above everything except &#8220;Point Agents at Your Code.&#8221; Apply CARE (Create the threat model, Adapt your controls, Run the red team, Evolve the policy) to the defender AI before anything else in your AI portfolio.</p><p>For more on CARE and governance for defender-class agents, see <a href="https://www.rockcyber.com">RockCyber</a> and coverage at <a href="https://rockcybermusings.com">RockCyber Musings</a>. Last week&#8217;s blog, <a href="https://rockcybermusings.com/p/ai-vulnerability-discovery-mythos">AI Vulnerability Discovery: Mythos Is the Headline. Not the Story.</a>, carries the capability-parity argument that underpins the urgency here.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help you in your traditional Cybersecurity and AI Security and Governance Journey.</p><p>&#128073; Want to save a quick $100K? 
Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><p>&#128073; Subscribe for more AI and cyber insights with the occasional rant.</p><p><em>The views and opinions expressed in RockCyber Musings are my own and do not represent the positions of my employer or any organization I&#8217;m affiliated with.</em></p>]]></content:encoded></item><item><title><![CDATA[Weekly Musings Top 10 AI Security Wrapup: Issue 34 April 10-April 16, 
2026]]></title><description><![CDATA[Mythos-class models, MCP supply chain exposure, and the governance gap that widened this week]]></description><link>https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260410-20260416</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260410-20260416</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Fri, 17 Apr 2026 12:50:49 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!fYG6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa895bfc-a4cd-4241-b82c-340f85176d61_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fYG6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa895bfc-a4cd-4241-b82c-340f85176d61_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fYG6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa895bfc-a4cd-4241-b82c-340f85176d61_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!fYG6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa895bfc-a4cd-4241-b82c-340f85176d61_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!fYG6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa895bfc-a4cd-4241-b82c-340f85176d61_1024x1024.png 1272w, 
https://substackcdn.com/image/fetch/$s_!fYG6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa895bfc-a4cd-4241-b82c-340f85176d61_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fYG6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa895bfc-a4cd-4241-b82c-340f85176d61_1024x1024.png" width="1024" height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fa895bfc-a4cd-4241-b82c-340f85176d61_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1233556,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/194466804?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa895bfc-a4cd-4241-b82c-340f85176d61_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fYG6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa895bfc-a4cd-4241-b82c-340f85176d61_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!fYG6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa895bfc-a4cd-4241-b82c-340f85176d61_1024x1024.png 848w, 
https://substackcdn.com/image/fetch/$s_!fYG6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa895bfc-a4cd-4241-b82c-340f85176d61_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!fYG6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa895bfc-a4cd-4241-b82c-340f85176d61_1024x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>This week drew a hard line between AI security theater and AI security reality. 
Mythos Preview hunted vulnerabilities nobody had found in 20 years. OX Security dropped a critical MCP flaw affecting 200,000 deployments. Someone threw a Molotov cocktail at Sam Altman&#8217;s gate. OpenAI countered Anthropic&#8217;s restricted rollout with GPT-5.4-Cyber. The UK government confirmed AI clears expert-level cyber tasks. If your board still treats AI governance as an ethics committee item, the gap between your risk register and reality widened another notch.</p><p>Ten stories ranked by impact, plus one under the radar. Capability, exposure, and governance move at three speeds. Your program needs all three. Longer work lives at <a href="https://www.rockcyber.com">RockCyber</a> and <a href="https://rockcybermusings.com">RockCyber Musings</a>.</p><h3>1. The &#8220;AI Vulnerability Storm&#8221; Emergency Strategy Briefing</h3><p>On April 14, 2026, SANS Institute, the Cloud Security Alliance, OWASP GenAI Security Project, and [un]prompted released &#8220;The AI Vulnerability Storm: Building a Mythos-Ready Security Program&#8221; (SANS Institute). Sixty named contributors produced the document over a weekend, with 250 CISOs reviewing it. It includes a 13-item risk register mapped to OWASP LLM Top 10 2025, OWASP Agentic Top 10 2026, MITRE ATLAS, and NIST CSF 2.0, plus an 11-item priority actions table. Zero Day Clock data shows mean time from disclosure to exploitation fell below one day in 2026, down from 2.3 years in 2019.</p><p><strong>Why it matters</strong></p><ul><li><p>Disclosure-to-exploit dropped from 2.3 years to under a day. 
Your patch cadence cannot keep up.</p></li><li><p>A coalition of security institutions framing this as an emergency is a signal worth taking seriously.</p></li><li><p>The risk register maps to four frameworks, removing the excuse about lacking a shared taxonomy.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Pull the 13-item risk register into your next program review.</p></li><li><p>Run the 10 CISO diagnostic questions with your security leadership team this quarter.</p></li><li><p>Brief your board using the executive section. Don&#8217;t rewrite it.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Happy and honored that I was asked to participate in this one. I jumped at the opportunity. The coalition isn&#8217;t selling anything. We&#8217;re telling you the economics of exploitation flipped. When the attacker&#8217;s cost to find a vulnerability drops to near zero while your patch cycle runs for weeks, the math stops working in your favor. If you planned AI program changes for 2027, you&#8217;re late.</p><h3>2. OX Security Discloses Systemic Anthropic MCP Vulnerability</h3><p>On April 15, 2026, OX Security published a report detailing a critical systemic flaw in Anthropic&#8217;s official MCP SDKs across Python, TypeScript, Java, and Rust (OX Security). MCP&#8217;s STDIO transport accepts arbitrary command strings and passes them to subprocess execution with no validation, sanitization, or sandboxing. OX tested the attack against six production platforms and took over thousands of public servers across 200 open-source projects. Exposure includes 150 million downloads, 7,000 public servers, and up to 200,000 vulnerable instances. Anthropic, per OX, classified the behavior as &#8220;expected&#8221; (Infosecurity Magazine).</p><p><strong>Why it matters</strong></p><ul><li><p>MCP is the backbone of agentic AI. 
Systemic flaws propagate through every agent you&#8217;ve built or bought.</p></li><li><p>Anthropic labeling the flaw &#8220;expected behavior&#8221; puts responsibility on your security team.</p></li><li><p>200,000 exposed instances is the baseline, not an edge case.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Inventory every MCP server and client in your environment this week.</p></li><li><p>Block outbound STDIO transports from untrusted MCP configurations at the gateway.</p></li><li><p>Treat MCP command payloads like shell inputs. Assume hostile.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Every vendor claims &#8220;secure by design&#8221; until a serious researcher pokes at the design. MCP&#8217;s STDIO transport is a textbook unsafe primitive from the first draft of the spec. The tell is Anthropic&#8217;s response. When the SDK vendor calls malicious-command-as-a-feature &#8220;expected,&#8221; you own the mitigation. Wrap it, monitor it, and expect your first incident from an MCP server you didn&#8217;t know was running.</p><h3>3. UK AISI Publishes Frontier AI Trends Report</h3><p>The UK AI Security Institute released its first Frontier AI Trends Report on April 10, 2026 (AISI). AI models now complete apprentice-level cyber tasks about 50 percent of the time, up from barely 10 percent in early 2024. In 2025, AISI tested one model that finished expert-level tasks requiring more than a decade of practitioner experience. The report names Anthropic&#8217;s Claude Mythos Preview as the first AI system to autonomously complete a 32-step enterprise attack simulation. AISI credits safety training for slowing the curve, while warning that capability outstrips defender readiness (Computing).</p><p><strong>Why it matters</strong></p><ul><li><p>A government safety institute confirmed one AI model executes a full enterprise attack chain autonomously. 
The &#8220;someday&#8221; framing is finished.</p></li><li><p>Apprentice-level cyber performance quintupled in two years. Expert parity arrives inside most procurement cycles.</p></li><li><p>AISI found safeguards working, meaning vendor controls meaningfully shift your risk exposure.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Demand red-team attestation from every AI vendor supporting security-relevant workflows.</p></li><li><p>Map your attack surface against the AISI capability framework. Flag targets a Mythos-class model reaches today.</p></li><li><p>Shift IR tabletops to assume autonomous adversary tooling. Time-box every playbook to hours.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>This is the first major government assessment I&#8217;d call usable for board reporting. AISI didn&#8217;t pull punches, which is rare when governments still court AI investment. Pay attention to the 32-step attack chain line. Most organizations run incident response assuming attackers make mistakes, burn time, or need sleep. An agentic adversary does none of those things. If your tabletops still assume a human at a keyboard, they&#8217;re obsolete.</p><h3>4. OpenAI Launches GPT-5.4-Cyber for Vetted Defenders</h3><p>On April 14, 2026, OpenAI announced GPT-5.4-Cyber, a variant of GPT-5.4 tuned for defensive cybersecurity work (OpenAI). The model lowers refusal boundaries for legitimate security work and enables binary reverse engineering without source code. OpenAI is limiting initial deployment to vetted security vendors, organizations, and researchers through an expanded Trusted Access for Cyber program. The release came one week after Anthropic restricted its Mythos Preview model to about 40 partners under Project Glasswing. 
OpenAI framed it as a counter-argument: broader access is warranted now, with tighter controls reserved for larger capability jumps (SiliconANGLE).</p><p><strong>Why it matters</strong></p><ul><li><p>Two foundation model providers diverge on cyber-capable AI distribution. Your vendor risk management needs to account for the split.</p></li><li><p>Binary reverse engineering at LLM speed reshapes the economics of red and blue team work.</p></li><li><p>Vetting programs create new attestation and insider risk questions for your security function.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Evaluate whether your organization qualifies for OpenAI TAC or Project Glasswing. If yes, assign an accountable executive.</p></li><li><p>Update acceptable use policies for cyber-capable models. Access matches role, not curiosity.</p></li><li><p>Task SOC leadership with a 90-day assessment of how GPT-5.4-Cyber or Mythos changes detection, triage, and RE workflows.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Anthropic and OpenAI staked out opposite ends of the distribution debate in the same week. Anthropic says keep it small. OpenAI says open the gates. Both positions have legitimate arguments. What matters for CISOs is that the defensive tooling category you&#8217;ll buy in 2027 exists in preview today. If you aren&#8217;t running pilots on one of these models this quarter, your competition is.</p><h3>5. Marimo Python Notebook RCE Exploited in 10 Hours</h3><p>CVE-2026-39987, a pre-authentication RCE flaw in Marimo&#8217;s Python notebook server, was exploited within 10 hours of disclosure (Sysdig). The CVSS 9.3 flaw stems from a terminal WebSocket endpoint lacking authentication, giving any attacker a full PTY shell. Sysdig observed initial exploitation nine hours and 41 minutes after disclosure, with credential theft in under three minutes. 
A separate campaign targeting Hugging Face Spaces began April 12, 2026, dropping a new variant of NKAbuse malware (The Hacker News). Marimo sits inside many AI toolchains. Version 0.23.0 patches the flaw.</p><p><strong>Why it matters</strong></p><ul><li><p>A 10-hour disclosure-to-exploit window eliminates manual triage. Automation is the floor.</p></li><li><p>AI dev environments hold credentials for training data, model registries, and cloud APIs. A compromise there jumps the fence.</p></li><li><p>NKAbuse malware hosted on Hugging Face Spaces weaponizes a legitimate AI asset repository.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Audit AI dev environments for unauthenticated notebook services this week.</p></li><li><p>Push Marimo 0.23.0 immediately. Rotate .env credentials and SSH keys on any affected host.</p></li><li><p>Treat Hugging Face Spaces and similar repositories as unverified third-party code.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Ten hours. Memorize that number. If your patch process takes longer than a shift change, you&#8217;re assuming attackers stay polite enough to wait. They aren&#8217;t. A human operator hand-crafted the exploit from the advisory text alone. No public PoC needed. AI-assisted exploit development already sits inside the attacker&#8217;s normal workflow.</p><h3>6. KPMG and INSEAD Publish AI Governance Principles for Boards</h3><p>On April 14, 2026, KPMG International and the INSEAD Corporate Governance Centre published AI Governance Principles for Boards (KPMG). The guidance structures board oversight around five areas: strategy, security, workforce, trustworthy AI, and how AI reshapes leadership itself. KPMG&#8217;s Global AI Pulse Survey found nearly three-quarters of boards have only moderate or limited AI expertise. The principles are sector-agnostic and apply at any AI maturity level. 
The timing lines up with signals that the governance gap is widening faster than boards can close it (INSEAD).</p><p><strong>Why it matters</strong></p><ul><li><p>Nearly three-quarters of boards have only moderate or limited AI expertise. Your CEO and CISO are explaining AI risk in terms the directors cannot stress-test.</p></li><li><p>A sector-agnostic framework gives you cover to restructure AI oversight without waiting for an industry mandate.</p></li><li><p>Board principles anchored in research and real practice create a defensible baseline for shareholder scrutiny.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Make AI governance a standing board agenda item using the KPMG/INSEAD principles as the template.</p></li><li><p>Recruit at least one director with direct AI operating experience.</p></li><li><p>Run a board-level AI risk tabletop in the next six months. Measure director fluency.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>I&#8217;ve sat across from enough boards to recognize the pattern. The AI conversation is either dominated by CMO hype or minimized by general counsel. Neither serves the company. What I appreciate about this work is the refusal to reduce governance to compliance. If your board treats AI as an IT issue, you&#8217;ve already lost the oversight fight. Rebuild the conversation at the director level.</p><h3>7. Molotov Cocktail Attack on Sam Altman&#8217;s Home</h3><p>Around 3:37 a.m. on Friday, April 10, 2026, Daniel Moreno-Gama allegedly threw a lit incendiary device at OpenAI CEO Sam Altman&#8217;s San Francisco home, igniting a fire on an exterior gate (CNBC). About an hour later, police arrested Moreno-Gama at OpenAI&#8217;s San Francisco headquarters with additional incendiary devices, a kerosene jug, and a manifesto opposing AI executives. San Francisco District Attorney Brooke Jenkins filed attempted murder charges on April 13, 2026 (Washington Post). 
The FBI raided a Spring, Texas, residence linked to the suspect.</p><p><strong>Why it matters</strong></p><ul><li><p>AI executives face documented physical threat campaigns motivated by AI-existential ideology.</p></li><li><p>Intimidation playbooks aimed at AI leadership echo harassment patterns seen against crypto executives.</p></li><li><p>The AI-existential threat narrative moved from online rhetoric to physical action.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Review personal security programs for AI executives, board members, and senior researchers, including residence protection.</p></li><li><p>Update threat modeling to include ideologically motivated actors, not only financially motivated ones.</p></li><li><p>Coordinate with local law enforcement on executive travel patterns and publicly disclosed addresses.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The Altman attack will reshape executive protection budgets at every AI firm this year. The deeper point is that the AI-existential discourse produced one person willing to act on it violently. That genie doesn&#8217;t go back in the bottle. AI security functions now carry physical security responsibility alongside technical security, and the two teams rarely talk. Fix that.</p><h3>8. AI-Powered &#8220;Pushpaganda&#8221; Ad Fraud Scheme Exposed</h3><p>On April 14, 2026, researchers exposed &#8220;Pushpaganda,&#8221; an ad fraud scheme combining SEO poisoning with AI-generated content to push deceptive news stories into Google Discover (The Hacker News). Users who engage with the stories are tricked into enabling persistent browser notifications that deliver scareware and financial scams at global scale. Google deployed a security fix. 
Researchers linked the operation to broader AI-driven phishing trends: 82.6 percent of phishing emails now contain AI-generated content (GuardianMSSP).</p><p><strong>Why it matters</strong></p><ul><li><p>Consumer-facing AI fraud creates downstream reputational and fraud exposure for any brand whose customers fall for it.</p></li><li><p>AI content weaponized through Google Discover scales instantly across borders.</p></li><li><p>Browser notification abuse creates persistent attacker infrastructure inside your users&#8217; devices.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Update fraud and anti-phishing awareness for employees and high-value customers using Pushpaganda as a concrete example.</p></li><li><p>Tell users to audit browser notification permissions quarterly.</p></li><li><p>Task threat intel with tracking similar schemes targeting your brand or industry keywords.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Ad fraud has been a rounding error in most risk registers. That&#8217;s ending. When AI pumps plausible news stories at near-zero cost through trusted distribution pipes, the economics of fraud flip in the attacker&#8217;s favor. The indirect damage is the part enterprises miss. Your customer falls for the scam, loses money, and blames you even when you had nothing to do with it. Merge brand protection and fraud prevention. The attacker already did.</p><h3>9. OpenAI Discloses Axios npm Supply Chain Impact</h3><p>On April 11, 2026, OpenAI confirmed it was affected by the compromise of the Axios npm package, a supply chain attack attributed to North Korea-linked actors (CNBC). The root cause was a misconfiguration in its GitHub Actions workflow touching macOS app certification. OpenAI revoked its macOS app certificate. Older macOS desktop apps stop receiving updates starting May 8, 2026. No user data, passwords, or API keys were accessed. 
Axios is one of the most depended-upon packages in the JavaScript ecosystem, with 100 million weekly downloads (Elastic Security Labs).</p><p><strong>Why it matters</strong></p><ul><li><p>The largest AI service provider disclosed a supply chain compromise from a dependency most customers do not track.</p></li><li><p>North Korean targeting of AI providers signals state actors see AI as a strategic target.</p></li><li><p>If OpenAI&#8217;s CI/CD was affected, every firm building on OpenAI carries secondary exposure.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Audit every third-party dependency on npm, PyPI, and containers in your AI pipelines. Prioritize post-install hooks.</p></li><li><p>Rotate signing certificates on CI/CD pipelines using GitHub Actions with third-party dependencies.</p></li><li><p>Map your AI vendor dependency tree. Know who sits upstream of production workflows.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>OpenAI&#8217;s post-incident communication was cleaner than most. What I want security leaders to sit with is attacker selection. North Korean actors chose Axios because they understood the dependency graph. They compromised one maintainer account and reached OpenAI&#8217;s signing pipeline in one hop. Your AI platform has a similar graph. If you haven&#8217;t mapped it, you&#8217;re trusting your vendor&#8217;s vendor&#8217;s vendor without knowing any of the names.</p><h3>10. The Register Questions Project Glasswing&#8217;s CVE Count</h3><p>On April 15, 2026, The Register investigated Project Glasswing&#8217;s verified vulnerability count (The Register). Per VulnCheck researcher Patrick Garrity, only one CVE ties directly to Glasswing: CVE-2026-4747, a remote code execution flaw in FreeBSD&#8217;s NFS code. Anthropic had claimed Mythos Preview discovered thousands of high-severity zero-days, including 27-year-old bugs in OpenBSD, a 16-year-old FFmpeg flaw, and Linux kernel privilege escalation chains. 
None of those findings have assigned CVEs. Anthropic indicated a public summary report is expected around July 2026 (CSO Online).</p><p><strong>Why it matters</strong></p><ul><li><p>Security leaders are being asked to restructure programs around claims mostly unverifiable right now.</p></li><li><p>The gap between marketing and disclosed CVEs is a litmus test for how AI vendors handle safety communications.</p></li><li><p>The same capability framing already drives budget and policy conversations across government and enterprise.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Track vendor AI capability claims against disclosed CVE evidence. VulnCheck, NVD, and <a href="http://CVE.org">CVE.org</a> are sources of record.</p></li><li><p>Require AI vendors to commit to disclosure timelines in the contract.</p></li><li><p>Apply the same skepticism to AI capability claims you apply to any vendor&#8217;s performance claims.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>I believe AI-assisted vulnerability discovery is real. I also know marketing departments exist. The Register did what security trade press should do more often: press for evidence instead of reposting press releases. Until Anthropic&#8217;s July report arrives with specificity, assume the capability is real at a smaller scale than the headlines suggest. Your board deserves honest uncertainty over confident hype.</p><h3>The One Thing You Won&#8217;t Hear About But You Need To</h3><h4>State AI Legislation Quietly Picks Up Pace in Nebraska, Maine, and Maryland</h4><p>The week of April 13, 2026 saw three state legislatures advance AI-specific bills most national coverage missed (Troutman Pepper Locke). Nebraska&#8217;s unicameral legislature passed LB 525, bundling the Agricultural Data Privacy Act with a Conversational AI Safety Act regulating minors&#8217; interaction with conversational AI services. 
Maine&#8217;s legislature prohibited therapy or psychotherapy services, including those delivered through AI, unless provided by a licensed professional. Maryland passed a bill placing new constraints on AI-driven pricing practices. Nineteen new AI laws passed across U.S. states in the prior two weeks (Plural Policy).</p><p><strong>Why it matters</strong></p><ul><li><p>State AI legislation is moving faster than federal harmonization, raising compliance complexity for multi-state AI services.</p></li><li><p>Vertical bans like Maine&#8217;s on AI psychotherapy signal that the &#8220;AI wrapper as feature&#8221; era is ending for regulated professions.</p></li><li><p>Conversational AI protections for minors now vary by state. Your chatbot rollout just inherited a new compliance surface.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Assign legal and compliance ownership of state AI legislation tracking.</p></li><li><p>Map customer-facing AI products against regulated-profession restrictions appearing in multiple states.</p></li><li><p>Build a multi-state compliance matrix for conversational AI aimed at minors. Treat it as living documentation.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Federal AI policy gets the headlines. State legislation gets the enforcement. The gap is where CISOs and general counsel earn their salaries. AI compliance is not a checkbox on the NIST AI RMF. It&#8217;s a moving target across 50 jurisdictions, each with its own enforcement flavor. Miss Maine, and your mental health AI product is illegal. Miss Maryland, and your pricing engine invites an AG letter. Miss Nebraska, and your chatbot cannot talk to kids in the Cornhusker State. 
Track it, resource it, or pay the lawyers later.</p><p>&#128073; For ongoing analysis of agentic AI governance frameworks, the conversation continues at <strong><a href="https://rockcybermusings.com/">RockCyber Musings</a></strong>.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help with your traditional Cybersecurity and AI Security and Governance journey.</p><p>&#128073; Want to save a quick $100K? Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><p>&#128073; Subscribe for more AI and cyber insights with the occasional rant.</p><p><em>The views and opinions expressed in RockCyber Musings are my own and do not represent the positions of my employer or any organization I&#8217;m affiliated with.</em></p>
<h2>References</h2><p>AI Security Institute. (2026, April 10). <em>Frontier AI Trends Report</em>. <a href="https://www.aisi.gov.uk/frontier-ai-trends-report">https://www.aisi.gov.uk/frontier-ai-trends-report</a></p><p>Cloud Security Alliance. (2026, April 14). <em>SANS Institute, Cloud Security Alliance, [un]prompted, and OWASP GenAI Security Project release emergency strategy briefing as AI-driven vulnerability discovery compresses exploit timelines from weeks to hours</em>. <a href="https://cloudsecurityalliance.org/press-releases/2026/04/14/sans-institute-cloud-security-alliance-un-prompted-and-owasp-genai-security-project-release-emergency-strategy-briefing-as-ai-driven-vulnerability-discovery-compresses-exploit-timelines-from-weeks-to-hours">https://cloudsecurityalliance.org/press-releases/2026/04/14/sans-institute-cloud-security-alliance-un-prompted-and-owasp-genai-security-project-release-emergency-strategy-briefing-as-ai-driven-vulnerability-discovery-compresses-exploit-timelines-from-weeks-to-hours</a></p><p>Computing. (2026, April 10). 
<em>Claude Mythos Preview shows &#8220;unprecedented&#8221; attack capability, warns AI Safety Institute</em>. <a href="https://www.computing.co.uk/news/2026/security/claude-mythos-preview-shows-unprecedented-attack-capability">https://www.computing.co.uk/news/2026/security/claude-mythos-preview-shows-unprecedented-attack-capability</a></p><p>CSO Online. (2026, April 15). <em>Behind the Mythos hype, Glasswing has just one confirmed CVE</em>. <a href="https://www.csoonline.com/article/4159617/behind-the-mythos-hype-glasswing-has-just-one-confirmed-cve.html">https://www.csoonline.com/article/4159617/behind-the-mythos-hype-glasswing-has-just-one-confirmed-cve.html</a></p><p>CNBC. (2026, April 10). <em>Man arrested after Sam Altman&#8217;s house hit with Molotov cocktail, OpenAI headquarters threatened</em>. <a href="https://www.cnbc.com/2026/04/10/sam-altman-house-hit-with-molotov-cocktail-openai-office-threatened.html">https://www.cnbc.com/2026/04/10/sam-altman-house-hit-with-molotov-cocktail-openai-office-threatened.html</a></p><p>CNBC. (2026, April 11). <em>OpenAI identifies security issue involving third-party tool, says user data was not accessed</em>. <a href="https://www.cnbc.com/2026/04/11/openai-identifies-security-issue-involving-third-party-tool.html">https://www.cnbc.com/2026/04/11/openai-identifies-security-issue-involving-third-party-tool.html</a></p><p>Elastic Security Labs. (2026, April). <em>Inside the Axios supply chain compromise: One RAT to rule them all</em>. <a href="https://www.elastic.co/security-labs/axios-one-rat-to-rule-them-all">https://www.elastic.co/security-labs/axios-one-rat-to-rule-them-all</a></p><p>GuardianMSSP. (2026, April 14). <em>AI-driven Pushpaganda scam exploits Google Discover to spread scareware and ad fraud</em>. 
<a href="https://www.guardianmssp.com/2026/04/14/ai-driven-pushpaganda-scam-exploits-google-discover-to-spread-scareware-and-ad-fraud/">https://www.guardianmssp.com/2026/04/14/ai-driven-pushpaganda-scam-exploits-google-discover-to-spread-scareware-and-ad-fraud/</a></p><p>Infosecurity Magazine. (2026, April 15). <em>Systemic flaw in MCP protocol could expose 150 million downloads</em>. <a href="https://www.infosecurity-magazine.com/news/systemic-flaw-mcp-expose-150/">https://www.infosecurity-magazine.com/news/systemic-flaw-mcp-expose-150/</a></p><p>INSEAD. (2026, April 14). <em>INSEAD and KPMG launch global AI Board Governance Principles as AI reshapes board oversight</em>. <a href="https://www.insead.edu/news/insead-and-kpmg-launch-global-ai-board-governance-principles-ai-reshapes-board-oversight">https://www.insead.edu/news/insead-and-kpmg-launch-global-ai-board-governance-principles-ai-reshapes-board-oversight</a></p><p>KPMG International. (2026, April 14). <em>KPMG and INSEAD launch global AI Board Governance Principles as AI reshapes board oversight</em>. <a href="https://kpmg.com/xx/en/media/press-releases/2026/04/kpmg-and-insead-launch-global-ai-board-governance-principles.html">https://kpmg.com/xx/en/media/press-releases/2026/04/kpmg-and-insead-launch-global-ai-board-governance-principles.html</a></p><p>OpenAI. (2026, April 14). <em>Trusted access for the next era of cyber defense</em>. <a href="https://openai.com/index/scaling-trusted-access-for-cyber-defense/">https://openai.com/index/scaling-trusted-access-for-cyber-defense/</a></p><p>OX Security. (2026, April 15). <em>The mother of all AI supply chains: Critical, systemic vulnerability at the core of Anthropic&#8217;s MCP</em>. <a href="https://www.ox.security/blog/the-mother-of-all-ai-supply-chains-critical-systemic-vulnerability-at-the-core-of-the-mcp/">https://www.ox.security/blog/the-mother-of-all-ai-supply-chains-critical-systemic-vulnerability-at-the-core-of-the-mcp/</a></p><p>Plural Policy. 
(2026, April). <em>AI Governance Watch: Nineteen new AI bills passed into law</em>. <a href="https://pluralpolicy.com/blog/the-ai-governance-watch-april-2026-nineteen-new-ai-bills-passed-into-law/">https://pluralpolicy.com/blog/the-ai-governance-watch-april-2026-nineteen-new-ai-bills-passed-into-law/</a></p><p>SiliconANGLE. (2026, April 14). <em>OpenAI launches GPT-5.4-Cyber model for vetted security professionals</em>. <a href="https://siliconangle.com/2026/04/14/openai-launches-gpt-5-4-cyber-model-vetted-security-professionals/">https://siliconangle.com/2026/04/14/openai-launches-gpt-5-4-cyber-model-vetted-security-professionals/</a></p><p>Sysdig. (2026, April). <em>Marimo OSS Python notebook RCE: From disclosure to exploitation in under 10 hours</em>. <a href="https://www.sysdig.com/blog/marimo-oss-python-notebook-rce-from-disclosure-to-exploitation-in-under-10-hours">https://www.sysdig.com/blog/marimo-oss-python-notebook-rce-from-disclosure-to-exploitation-in-under-10-hours</a></p><p>The Hacker News. (2026, April 14). <em>AI-driven Pushpaganda scam exploits Google Discover to spread scareware and ad fraud</em>. <a href="https://thehackernews.com/2026/04/ai-driven-pushpaganda-scam-exploits.html">https://thehackernews.com/2026/04/ai-driven-pushpaganda-scam-exploits.html</a></p><p>The Hacker News. (2026, April). <em>Marimo RCE flaw CVE-2026-39987 exploited within 10 hours of disclosure</em>. <a href="https://thehackernews.com/2026/04/marimo-rce-flaw-cve-2026-39987.html">https://thehackernews.com/2026/04/marimo-rce-flaw-cve-2026-39987.html</a></p><p>The Hacker News. (2026, April). <em>OpenAI revokes macOS app certificate after malicious Axios supply chain incident</em>. <a href="https://thehackernews.com/2026/04/openai-revokes-macos-app-certificate.html">https://thehackernews.com/2026/04/openai-revokes-macos-app-certificate.html</a></p><p>The Register. (2026, April 15). <em>Anthropic&#8217;s Project Glasswing CVE count is still guesswork</em>. 
<a href="https://www.theregister.com/2026/04/15/project_glasswing_cves/">https://www.theregister.com/2026/04/15/project_glasswing_cves/</a></p><p>Troutman Pepper Locke. (2026, April 13). <em>Proposed state AI law update: April 13, 2026</em>. <a href="https://www.troutmanprivacy.com/2026/04/proposed-state-ai-law-update-april-13-2026/">https://www.troutmanprivacy.com/2026/04/proposed-state-ai-law-update-april-13-2026/</a></p><p>Washington Post. (2026, April 13). <em>Man accused in Molotov cocktail attack of OpenAI CEO&#8217;s home charged with attempted murder</em>. <a href="https://www.washingtonpost.com/business/2026/04/13/chatgpt-sam-altman-fire-arrest/098c4bce-376c-11f1-90c4-9772c7fabc03_story.html">https://www.washingtonpost.com/business/2026/04/13/chatgpt-sam-altman-fire-arrest/098c4bce-376c-11f1-90c4-9772c7fabc03_story.html</a></p>]]></content:encoded></item></channel></rss>