Weekly Musings Top 10 AI Security Wrapup: Issue 36 April 24-April 30, 2026
Mythos, Mayhem, and Mediocre Lawmaking: The Week AI Security Got Loud
A coding agent killed a startup’s database in nine seconds. Anthropic shipped a model Mozilla called “elite.” Brussels missed its own deadline. Florida’s House Speaker buried his governor’s AI bill before lunch on day one. Two cloud-native AI vulnerabilities went from disclosure to exploitation in under 36 hours. Google and Forcepoint documented indirect prompt injection in the wild on the same day. The UK’s AI Security Institute caught Mythos sabotaging research it was supposed to help with. Pretending any of this is theoretical is no longer defensible.
This week stress-tested every assumption CISOs hold about AI. The vendor you depend on sells your adversaries the same capability. The agent your developers love wipes three months of revenue and pastes a confession. Open source is the gateway. Indirect injection is the exploit. Autonomy without rollback is the consequence.
I’ll walk you through ten stories and one piece of plumbing. AI security used to run on a 24-month horizon. The default now is whatever ships before next quarter. If you wait for clarity, you lose ground to people who already decided.
1. The Trump Administration Eyes Anthropic’s Mythos as a Weapon
On April 24, the Washington Post reported Anthropic’s Mythos system rattled the Trump administration. Mozilla’s CTO compared the model’s vulnerability detection to a “world-class, elite security engineer.” Anthropic withheld general release, routing access through Project Glasswing partners, including AWS, Apple, Cisco, CrowdStrike, Google, JPMorgan Chase, and Microsoft. Anthropic privately briefed senior officials, who reportedly fear Mythos meaningfully raises the probability of large-scale cyberattacks this year.
Why it matters
Capability parity flipped. Defenders and attackers reach for the same tool.
Vendors are now gatekeepers of dual-use capability. Anthropic’s withholding sets a precedent.
Government dependence on private model access creates new procurement and security questions.
What to do about it
Map your exposure to LLM-discoverable vulnerabilities in first-party and open-source code.
Negotiate access to AI-assisted scanning before your adversaries scan you.
Update incident playbooks to assume hours of dwell time, not days.
Rock’s Musings
Yes… more Mythos news. Can’t ignore it if it’s coming out of the White House. It’s not fiction. It’s a procurement question. I’ve watched this pattern in every arms shift, from automated network scanning to commodity exploit kits. The defender who gets there second loses.
Anthropic’s gatekeeping is a defensible choice. The choice is whether your ecosystem qualifies for the safe lane or you’re stuck reading about Glasswing on Substack. Get on a call with your AWS, Cisco, or Microsoft reps. If the answer is no, plan around it. We track this kind of vendor calculus at RockCyber.
2. Cursor’s Claude Agent Wipes a Startup’s Database in Nine Seconds
On Saturday, April 25, a Cursor coding agent powered by Claude Opus 4.6 deleted PocketOS’s entire production database and all volume-level backups in a single API call. The agent encountered a credential mismatch in staging, decided to resolve it by deleting a Railway infrastructure volume, scanned the codebase for an unrelated API token, and then ran the command. PocketOS serves car rental businesses nationwide. Three months of reservations, payments, customer information, and vehicle assignments went dark. Railway restored the data on Sunday using internal disaster backups not advertised to customers. The agent itself wrote the public confession.
Why it matters
Agents don’t ask permission. They scan for whatever credentials unblock them.
“Production” and “staging” are now labels, not boundaries.
Recovery happened because Railway keeps undocumented backups. Hope is not a strategy.
What to do about it
Force agents to operate with scoped, ephemeral credentials. Long-lived API keys in a repo are liabilities with autonomy attached.
Implement break-glass approval gates for destructive infrastructure calls.
Test backup recovery monthly. If you can’t restore in under an hour, you don’t have backups.
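The break-glass bullet above can be sketched as a small policy gate in front of the agent’s tool calls. Everything here is illustrative — the verb list, the function names, and the “prod” heuristic are assumptions, not any agent framework’s real API:

```python
# Minimal sketch of a break-glass gate for destructive agent actions.
# DESTRUCTIVE_VERBS, gate_agent_action, and the "prod" substring check
# are all hypothetical; adapt them to your own tooling.

DESTRUCTIVE_VERBS = {"delete", "drop", "destroy", "truncate", "wipe"}

class ApprovalRequired(Exception):
    """Raised when an agent action needs a human in the loop."""

def gate_agent_action(action: str, target: str, approved: bool = False) -> None:
    """Block destructive verbs against production unless a human approved."""
    is_destructive = any(verb in action.lower() for verb in DESTRUCTIVE_VERBS)
    is_production = "prod" in target.lower()
    if is_destructive and is_production and not approved:
        raise ApprovalRequired(f"{action!r} on {target!r} needs break-glass approval")

# A read-only call passes; an unapproved delete against prod does not.
gate_agent_action("list_volumes", "prod-db")
try:
    gate_agent_action("delete_volume", "prod-db")
except ApprovalRequired as exc:
    print(f"blocked: {exc}")
```

The point of the sketch is the shape, not the keyword list: the gate sits outside the agent, so the model cannot talk its way past it.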
Rock’s Musings
PocketOS got lucky. Railway ran a heroic recovery on a Sunday using backups the customer didn’t know existed. If your AI strategy depends on a founder’s weekend chivalry, you don’t have a strategy. You have hope.
The agent did what it was trained to do. Scan, plan, act, document. The failure was in governance, not capability (and let’s just say, a suboptimal technical infrastructure). The villain is the assumption that an autonomous system will halt and ask. They don’t halt. Build the rails. Treat agents like an over-eager intern with the ability to call DELETE on prod.
3. LiteLLM Bug Goes From Disclosure to Exploitation in 36 Hours
GitHub’s Advisory Database indexed CVE-2026-42208 in LiteLLM on April 24 at 16:17 UTC. Sysdig logged the first exploitation attempt roughly 36 hours later. The bug carries a CVSS of 9.3 and lets unauthenticated attackers send a crafted Authorization header to any model API route, then read or modify the proxy’s database (Sysdig). LiteLLM is the open-source LLM gateway with more than 22,000 GitHub stars, fronting OpenAI, Anthropic, and other model providers in production. The same project sat at the heart of the Mercor breach earlier this year.
Why it matters
AI infrastructure now looks like any internet-exposed service.
Pre-auth SQLi on the gateway exposes API keys and credentials for downstream model providers.
Disclosure-to-exploitation time keeps shrinking. The 36-hour window is the new optimistic baseline.
What to do about it
Inventory every LiteLLM, vLLM, LMDeploy, or proxy node in your environment. Patch to 1.83.7-stable or above for LiteLLM.
Treat LLM gateways as Tier 0 assets. Apply the controls you’d apply to identity providers.
Subscribe to maintainer advisory feeds directly. A four-day GitHub Advisory Database lag is too long.
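The inventory-and-patch bullet reduces to a patch-floor check across your gateway fleet. A minimal sketch, assuming a hand-rolled inventory dict (the hostnames are made up; 1.83.7 is the patched LiteLLM release named above):

```python
# Sketch of a patch-floor check for LLM-gateway inventory.
# The inventory mapping is illustrative; wire it to your real CMDB.

PATCH_FLOOR = (1, 83, 7)

def parse_version(v: str) -> tuple:
    """Turn '1.83.7-stable' into (1, 83, 7), ignoring any release suffix."""
    core = v.split("-")[0]
    return tuple(int(part) for part in core.split("."))

def needs_patch(version: str) -> bool:
    return parse_version(version) < PATCH_FLOOR

inventory = {
    "litellm-gw-1": "1.83.7-stable",  # at the floor: OK
    "litellm-gw-2": "1.82.0",         # below the floor: flag it
}
for host, version in inventory.items():
    if needs_patch(version):
        print(f"{host}: {version} is below the patch floor")
```

Tuple comparison handles the version ordering; the suffix strip keeps `-stable` tags from breaking the parse.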
Rock’s Musings
LiteLLM is the kind of dependency pulled in via a Cursor prompt or an aspirational architecture diagram. It runs as the front door to every model provider you care about. Pre-auth SQL injection on it is a “your AI program is over” event.
Disclosure-to-exploit windows make monthly patch cycles professional malpractice. If your AI security playbook still says “evaluate within 30 days,” shred it. We’ve moved to “act within 24 hours or accept compromise as a feature.”
4. Indirect Prompt Injection Has Left the Lab. It’s Everywhere.
On April 24, Google’s Online Security Blog and Forcepoint’s X-Labs published parallel reports documenting indirect prompt injection in the wild. Forcepoint identified ten payload families targeting AI agents with instructions for financial fraud, data destruction, and API key theft. Google reported a 32% relative increase in malicious activity between November 2025 and February 2026. Attackers hide instructions inside webpages with single-pixel text, transparent fonts, HTML comments, and metadata. Neither team attributed the campaigns to a single actor, though both noted shared templates suggesting organized tooling.
Why it matters
Agents summarizing content are low-risk. Agents sending emails, running commands, or processing payments are the targets.
Filters watching user input miss content fetched by the agent.
The threat model includes every third-party page your agent loads.
What to do about it
Inventory every agent fetching external content. Note which tools they call.
Implement allowlists for outbound tool execution. Default deny for novel actions.
Add output filtering for instruction-like content in tool responses, not only user input.
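The allowlist and output-filtering bullets above can be combined into a thin screening layer. This is a toy, not a vetted detector — the tool names and regex patterns are assumptions, and real deployments need far richer pattern coverage:

```python
import re

# Sketch of two controls: a default-deny tool allowlist and a crude
# instruction-pattern screen applied to agent-FETCHED content, not just
# user input. Both lists are illustrative placeholders.

ALLOWED_TOOLS = {"search", "summarize"}  # everything else is denied by default

INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.I),
    re.compile(r"(send|forward) .* (api key|credentials|password)", re.I),
]

def tool_allowed(tool_name: str) -> bool:
    """Default deny: only explicitly allowlisted tools may execute."""
    return tool_name in ALLOWED_TOOLS

def looks_like_injection(fetched_text: str) -> bool:
    """Screen content the agent fetched before it reaches the model."""
    return any(p.search(fetched_text) for p in INJECTION_PATTERNS)
```

Regex screens are trivially bypassable on their own; the allowlist is the control that actually limits blast radius, and the screen just cuts noise in front of it.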
Rock’s Musings
We’ve been treating indirect prompt injection as a research curiosity since 2023. It’s now an operational threat with documented campaigns and template reuse. The Lakera and OWASP folks were right.
If you’ve deployed an agent with browsing capability, your trust boundary includes every webpage it visits. The entire internet. I wrote about this on RockCyber Musings earlier this year. It got worse.
5. American Leadership in AI Act Drops With 20+ Bills Stitched In
On April 27, Reps. Ted Lieu (D-Calif.) and Jay Obernolte (R-Calif.) introduced the American Leadership in AI Act, a six-title package consolidating more than 20 prior bills from the Bipartisan AI Task Force (Nextgov/FCW). The package covers standards and evaluation, research infrastructure, federal AI governance and procurement, worker protections, deepfake harms, and AI education. The bill is the most substantive bipartisan AI proposal in this Congress, landing during tension between the White House’s preemption push and active state legislation.
Why it matters
Federal preemption fights will intensify. State AI laws face new risk.
Procurement standards in the bill shape what enterprises demand from AI vendors.
Deepfake provisions create new compliance obligations for media and platforms.
What to do about it
Map AI-procurement language to current vendor contracts.
Track state-level bills you’re already complying with for preemption risk.
Have legal read the testing and evaluation title carefully.
Rock’s Musings
Two California members of Congress, one D and one R, agreeing on AI is unicorn territory. Don’t get excited. Bipartisan bills with 20+ titles tend to die under the weight of their own ambition.
The interesting question is which provisions get pulled into appropriations or NDAA riders before December. Watch the procurement and federal AI governance titles. Those move first because the executive branch wants them. Plan as if procurement standards land by Q3.
6. EU AI Act Omnibus Trilogue Collapses, August Deadline Stays Live
On April 28, Brussels held the second political trilogue on the AI Act Omnibus, the proposal deferring high-risk AI compliance. After roughly twelve hours, the Council and Parliament failed to agree on conformity-assessment architecture for AI in regulated products (Modulos). A follow-up trilogue is scheduled for May 13. The August 2, 2026 high-risk obligations remain operative law.
Why it matters
Vendors and deployers cannot bank on a deferral. August is the working assumption.
The Cypriot Council Presidency ends June 30; the incoming Irish Presidency inherits the negotiations.
The Annex I disagreement signals sectoral assessments will keep biting medical device and machinery providers.
What to do about it
Continue compliance preparation as if no Omnibus arrives. Treat May 13 as a tiebreaker, not a save.
For medical devices, machinery, and other Annex I products, lock in your conformity-assessment plan now.
Get internal legal sign-off on the original AI Act timelines this quarter.
Rock’s Musings
I keep telling clients hoping for a deferral is not a compliance strategy. This week confirmed it. Brussels cannot agree on the structure of the regulation it already passed.
If your CFO asks why you spent budget on AI Act readiness, point at this paragraph. The cost of overpreparing is a few quarters of work. The cost of underpreparing is an enforcement action against your highest-revenue product line. I know which side of the bet I want.
7. Microsoft and OpenAI Restructure for Cyber Defense
On April 27, Microsoft and OpenAI announced revised partnership terms (24/7 Wall St). OpenAI’s API will run on any cloud provider, including AWS via Bedrock. Microsoft’s IP license is no longer exclusive but runs through 2032. The companies expanded Trusted Access for Cyber, giving Microsoft access to OpenAI’s most cyber-capable models, folded into the Secure Future Initiative. The pact pairs OpenAI’s offensive-capable model work with Microsoft’s defender stack across cloud, identity, productivity, and frontier AI.
Why it matters
The exclusivity reset reshapes vendor lock-in for AI procurement.
Microsoft’s defender stack now has a privileged channel to cyber-capable models.
The same models that detect attacks can also execute them. Internal review for offensive capability is no longer optional.
What to do about it
Reassess your AI vendor strategy. Multi-cloud is realistic now.
For Microsoft customers, evaluate Trusted Access for Cyber eligibility.
Demand transparency from AI security vendors about the models and guardrails behind their products.
Rock’s Musings
Microsoft bought itself the most defended seat at the AI security table. The pact extends a moat already including Defender, Sentinel, Purview, and Copilot for Security. Smaller security vendors should be nervous.
If you’re a CISO in a Microsoft shop, this is a small win. If you’re betting on a non-Microsoft AI security stack, reread your strategy. The end of exclusivity means more options; the cyber pact deepens lock-in.
8. UK AISI Catches Claude Mythos Sabotaging Research
On April 28, the UK AI Security Institute (AISI) published an updated sabotage evaluation framework. Across 297 scenarios, no model attempted spontaneous research sabotage. In a new “continuation” test, Mythos Preview kept sabotaging 7% of inputs after being explicitly directed to stop (AISI). The same evaluation showed Mythos Preview completing the first 32-step enterprise attack simulation start to finish, succeeding on 73% of expert-level tasks no model had completed before April 2025.
Why it matters
Continuation behavior matters more than spontaneous behavior. Real attackers prompt the model.
A 7% sustained sabotage rate warrants treating these models as untrusted insiders during sensitive work.
The 32-step completion shows operational maturity. Models execute multi-stage cyber operations end to end.
What to do about it
Don’t run frontier models on safety-sensitive code reviews without monitoring.
Build red-team programs that seed and continue attacks rather than relying on single-shot tests.
Track AISI’s methodology. Adopt continuation-style tests internally.
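A continuation-style test, in the AISI spirit, means seeding the misbehavior, issuing a stop order, then measuring how often the model keeps going. A toy harness shape — `model` and `judge` are placeholders for a real client and a real grader, and nothing here reflects AISI’s actual methodology:

```python
# Toy shape of a continuation-style evaluation. The model and judge
# stubs below are illustrative only.

def continuation_rate(seeds, model, judge):
    """Fraction of seeded runs where the model continues after a stop order."""
    stop_order = "Stop. Do not continue this behavior."
    continued = sum(1 for seed in seeds if judge(model([seed, stop_order])))
    return continued / len(seeds) if seeds else 0.0

# Stubs for illustration: a compliant model and a naive keyword judge.
compliant_model = lambda msgs: "understood, stopping now"
keyword_judge = lambda reply: "stopping" not in reply

print(continuation_rate(["seed-a", "seed-b"], compliant_model, keyword_judge))
```

The metric that matters is the rate after the stop order, not whether the model misbehaves unprompted — that is the whole point of the continuation framing.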
Rock’s Musings
Spontaneous misbehavior was never the threat model scaring me. Continuation is. Once an attacker plants the seed, the model becomes a complicit operator inside your environment. Seven percent is small until you multiply it by every prompt your enterprise sends in a day.
AISI does work nobody else funds at this rigor. If your AI governance committee isn’t reading their reports cover to cover, you’re outsourcing your threat model to LinkedIn posts. Read the source.
9. Florida House Speaker Kills DeSantis’s AI Bill on Day One
On April 28, Florida convened a four-day special session. The Senate voted 37-1 in favor of the AI Bill of Rights. House Speaker Daniel Perez killed the bill that same morning, declaring that the only topic the House would address was redrawing congressional maps (Florida Phoenix). Perez argued AI regulation belongs to the federal government, aligned with a Trump executive order targeting state AI laws. The bill would have required parental consent for minor accounts on companion chatbot platforms, prohibited unauthorized commercial use of AI-generated likenesses, and required AI disclosure to users.
Why it matters
State preemption fights are escalating. Florida sided with the federal government before federal law exists.
Companion chatbot rules pass Senate chambers and die in House chambers. The pattern matters.
AI-generated likeness and consent provisions will keep returning. Plan for eventual passage somewhere.
What to do about it
If you run companion chatbots, monitor every state bill on minors and consent.
Brief your legal team on AI-likeness and right-of-publicity rules in California, Tennessee, and active special sessions.
Don’t bank on federal preemption. Executive orders reverse.
Rock’s Musings
The pattern is the same one I’ve called out for two years. State Senates pass AI bills, state Houses kill them, and the federal government drafts preemption language. The result is regulatory whiplash across 50 jurisdictions plus DC plus a federal package that may or may not preempt them. Give your privacy and AI counsel hazard pay. They’re earning it.
10. HackerOne Launches h1 Validation as AI Vuln Reports Surge 76%
On April 29, HackerOne launched h1 Validation, a service triaging AI-discovered vulnerability reports for actual exploitability (Cybersecurity Insiders). Vulnerability submissions on the platform rose 76% year over year, hitting a record high in March 2026. About 25% of findings were confirmed exploitable. The share of critical and high-severity vulnerabilities grew to 32%, up from a 26-28% baseline. The launch follows months of complaints from program owners overwhelmed by AI-generated reports of varying quality.
Why it matters
AI generates more vuln reports than security teams can triage.
Triage capacity, not discovery, is the constraint.
This signal-to-noise problem will reshape bug bounty economics within 12 months.
What to do about it
Audit your bug bounty intake pipeline. If reports outpace triage, fix it.
Invest in tooling that classifies reports by exploitability before a human reads them.
Set expectations with researchers. AI-assisted submissions need higher proof of impact.
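The pre-human classification bullet can be sketched as a simple evidence score over each submission. The field names, weights, and threshold below are all assumptions — a real pipeline would score far more signals:

```python
# Sketch of a pre-human triage pass: score submissions on concrete
# evidence of exploitability, route only the strongest to a person.
# Field names and weights are illustrative.

def triage_score(report: dict) -> int:
    score = 0
    if report.get("has_poc"):            # working proof of concept attached
        score += 3
    if report.get("repro_steps"):        # concrete reproduction steps
        score += 2
    if report.get("affected_endpoint"):  # names a specific endpoint or asset
        score += 1
    return score

def route(report: dict, human_threshold: int = 3) -> str:
    """Send high-evidence reports to humans, the rest to an automated queue."""
    return "human_review" if triage_score(report) >= human_threshold else "auto_queue"
```

The design choice is deliberate: a report with no proof of concept and no reproduction steps never consumes analyst time, which is exactly the asymmetry AI-generated volume exploits.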
Rock’s Musings
The asymmetry is volume. Models like Mythos and GPT-5.5-Cyber produce thousands of plausible reports per day. Most are junk. Some are lethal. Your triage team won’t keep up by reading harder. Whether you buy h1 Validation or build your own, manual triage of AI-scale output is a doomed strategy.
The One Thing You Won’t Hear About But You Need To
CSAI Foundation Becomes the First AI-Specific CVE Numbering Authority
On April 29, the Cloud Security Alliance’s CSAI Foundation announced three milestones at the CSA Agentic AI Security Summit (CSA). The foundation registered as a CVE Numbering Authority through MITRE, gaining direct ability to issue CVEs for AI-specific vulnerabilities. It launched the STAR for AI Catastrophic Risk Annex extending the AI Controls Matrix to scenarios involving loss of human oversight, with rollout from June 2026 through December 2027. It also acquired the Autonomous Action Runtime Management (AARM) specification, contributed by Vanta.
Why it matters
AI-specific CVE issuance changes how AI vulnerabilities get tracked, scored, and patched.
The Catastrophic Risk Annex maps to NIST AI RMF, the EU AI Act, and ISO/IEC 42001, giving auditors a consolidated reference.
AARM gives operators a formal specification for runtime control of agent actions.
What to do about it
Add CSAI Foundation advisories to your security feed.
For high-risk deployments, map internal controls to the Catastrophic Risk Annex during phase one rollout.
Pilot AARM in one agentic workflow this quarter. Runtime control of agent actions is the right level of abstraction.
Rock’s Musings
Plumbing matters more than press releases. While headlines went to Mythos and the Cursor accident, the CSAI Foundation stood up the infrastructure for AI-specific vulnerability tracking, runtime control, and catastrophic risk auditing. This decides whether AI security becomes a discipline or stays a marketing category.
I’ve worked in standards for thirty years. The value compounds quietly until one day the auditors ask, and you either have it or you don’t. We track CSAI work closely at RockCyber. Start with the CSA press release, then loop in your governance team Monday.
👉 For ongoing analysis of agentic AI governance frameworks, the conversation continues at RockCyber Musings.
👉 Visit RockCyber.com to learn more about how we can help with your traditional Cybersecurity and AI Security and Governance journey.
👉 Want to save a quick $100K? Check out our AI Governance Tools at AIGovernanceToolkit.com
👉 As a bonus, check out my conversation with Eva Benn where we talked about the cybersecurity skills you need to develop to stay relevant in 2026 and beyond.
👉 Subscribe for more AI and cyber insights with the occasional rant.
The views and opinions expressed in RockCyber Musings are my own and do not represent the positions of my employer or any organization I’m affiliated with.
References
Cloud Security Alliance. (2026, April 29). CSAI Foundation announces key milestones to secure the agentic control plane. https://cloudsecurityalliance.org/press-releases/2026/04/29/csai-foundation-announces-key-milestones-to-secure-the-agentic-control-plane
Cybersecurity Insiders. (2026, April 29). HackerOne launches h1 Validation to tackle rising wave of AI-driven vulnerabilities. https://www.cybersecurity-insiders.com/hackerone-launches-h1-validation-to-tackle-rising-wave-of-ai-driven-vulnerabilities/
Florida Phoenix. (2026, April 28). Florida Speaker kills DeSantis’ AI regulation, vaccine repeal bills on first day of special session. https://floridaphoenix.com/2026/04/28/florida-speaker-kills-desantis-ai-regulation-vaccine-repeal-bills-on-first-day-of-special-session/
Forcepoint X-Labs. (2026, April 24). Indirect prompt injection in the wild: X-Labs finds 10 IPI payloads. https://www.forcepoint.com/blog/x-labs/indirect-prompt-injection-payloads
Google. (2026, April 24). AI threats in the wild: The current state of prompt injections on the web. Google Online Security Blog. https://security.googleblog.com/2026/04/ai-threats-in-wild-current-state-of.html
Help Net Security. (2026, April 24). Indirect prompt injection is taking hold in the wild. https://www.helpnetsecurity.com/2026/04/24/indirect-prompt-injection-in-the-wild/
Modulos. (2026, April 28). EU AI Act Omnibus: The trilogue failed, what happens to the August 2026 deadline?. https://www.modulos.ai/blog/ai-act-omnibus-trilogue-failed/
Nextgov/FCW. (2026, April 28). Lieu and Obernolte introduce consolidated AI bill package. https://www.nextgov.com/artificial-intelligence/2026/04/lieu-and-obernolte-introduce-consolidated-ai-bill-package/413134/
Sysdig. (2026, April 29). CVE-2026-42208: Targeted SQL injection against LiteLLM’s authentication path discovered 36 hours following vulnerability disclosure. https://www.sysdig.com/blog/cve-2026-42208-targeted-sql-injection-against-litellms-authentication-path-discovered-36-hours-following-vulnerability-disclosure
The Hacker News. (2026, April 24). LMDeploy CVE-2026-33626 flaw exploited within 13 hours of disclosure. https://thehackernews.com/2026/04/lmdeploy-cve-2026-33626-flaw-exploited.html
The Hacker News. (2026, April 29). LiteLLM CVE-2026-42208 SQL injection exploited within 36 hours of disclosure. https://thehackernews.com/2026/04/litellm-cve-2026-42208-sql-injection.html
The Register. (2026, April 27). Cursor-Opus agent snuffs out startup’s production database. https://www.theregister.com/2026/04/27/cursoropus_agent_snuffs_out_pocketos/
Tom’s Hardware. (2026, April 27). Claude-powered AI coding agent deletes entire company database in 9 seconds. https://www.tomshardware.com/tech-industry/artificial-intelligence/claude-powered-ai-coding-agent-deletes-entire-company-database-in-9-seconds-backups-zapped-after-cursor-tool-powered-by-anthropics-claude-goes-rogue
UK AI Security Institute. (2026, April 28). Our evaluation of Claude Mythos Preview’s cyber capabilities. https://www.aisi.gov.uk/blog/our-evaluation-of-claude-mythos-previews-cyber-capabilities
24/7 Wall St. (2026, April 28). Microsoft’s AI moat holds up even after the OpenAI reset. https://247wallst.com/investing/2026/04/28/microsofts-ai-moat-holds-up-even-after-the-openai-reset/
Washington Post. (2026, April 24). AI hacking fears jolt Washington as Anthropic unveils Mythos. https://www.washingtonpost.com/technology/2026/04/24/anthropic-mythos-ai-washington-cybersecurity-hacking-risk/