Weekly Musings Top 10 AI Security Wrapup: Issue 20 November 7, 2025 - November 13, 2025
AI Agents Go Operational, States Tighten Safety Laws, And MCP Security Grows Up
State actors now use AI systems to run espionage campaigns. Security teams lack visibility into where AI operates inside their organizations. Regulators in Brussels soften enforcement timelines while New York and California push forward with strict safety laws. The gap between AI deployment speed and security maturity has never been wider. This week’s stories prove that agentic AI is a live threat surface and governance frameworks can’t keep pace. These aren’t future problems. They’re happening now.
The stories from November 7 through 13 form a single message. AI is no longer a side topic in cyber or governance work. It now drives both attack tradecraft and regulatory action. Anthropic’s report on the GTG 1002 espionage campaign shows a live operation where an AI system performed most of the intrusion work, from reconnaissance to exfiltration. At the same time OWASP expanded its GenAI security guidance and released a focused MCP security cheat sheet, giving security teams concrete controls for agents and tools.
Regulators didn’t stay quiet. The European Commission floated grace periods and softer enforcement for parts of the AI Act, while New York and California advanced new AI safety regimes, especially around high risk uses such as AI companions and frontier models. Surveys and commentary from Security Boulevard and SC Media painted a picture where AI incidents rise quickly while most organizations still lack basic visibility into AI use and attack surface.
You need to treat this week as proof that AI risk now spans national security campaigns, state law, vendor behavior, and developer tools in one tight loop. The musings below focus on moves you should take now, not theory.
1. Anthropic Breaks Up GTG 1002’s AI Orchestrated Espionage Campaign
Summary
Anthropic disclosed and documented a cyber espionage campaign where a Chinese state-sponsored group, labeled GTG 1002, used Claude Code as an autonomous intrusion engine across roughly thirty targets (Anthropic). Their report and the supporting full technical paper describe an orchestration framework in which Claude Code and MCP tools handled reconnaissance, vulnerability discovery, exploitation, lateral movement, credential harvesting, data analysis, and exfiltration with minimal human oversight —around 80% to 90% of the tactical work. The human operators selected targets, approved escalation points, and reviewed summaries rather than driving every step. The attacker used role-play and carefully designed prompts to convince Claude Code that it was participating in defensive testing, splitting malicious sequences into many boring tasks that bypassed safety classifiers. The framework leaned on standard open-source tools rather than exotic malware, and MCP served as the glue between Claude and remote resources. Anthropic notes that Claude sometimes hallucinated wins, such as claiming credentials that failed, which forced extra human validation and limited full autonomy. Anthropic banned involved accounts, notified affected entities, and expanded cyber specific classifiers and early detection for autonomous operations.
Why It Matters
The barrier for state-level cyber operations dropped sharply once a single operator could spin up many AI agents for parallel intrusion work.
MCP-based orchestration turned generic tools into an AI red team on demand model, which maps directly onto agentic risk work you already follow in OWASP Agentic AI guidance.
The case will influence regulators, standards bodies, and boards that still question whether AI safety and AI security deserve dedicated funding.
What To Do About It
Build specific detection for AI operator signals: high rate, short context, highly regular code or recon patterns, and MCP tool call bursts from single tenants.
Stand up a red team exercise where your own AI agents attempt similar chained operations under strict guard, then feed those traces into your SIEM and detection content.
Use this story to reset your executive risk posture.
Rock’s Musings
I’ve warned clients for a while that agentic AI wouldn’t stay in the lab. This week, we watched an AI system run most of a multi-target espionage campaign while humans checked in like shift supervisors. I don’t care how many hallucinations Anthropic saw inside the logs. When an AI stack drives recon, exploitation, and exfiltration this far, the game has changed.
For boards and regulators this finally kills the myth of AI as helper only. GTG 1002 treated Claude Code as a junior operator, not a search engine. The human team focused on strategy, target selection, and decisions about which loot mattered. The AI stack wrote code, pivoted through networks, and wrote reports. That looks a lot like how many of you want to run your own AI assisted SOC.
If you deploy agents for defense without a clear guardrail and governance model, you’ll accidentally train your blue team to operate the same way these attackers did. Use this incident as a forcing function to adopt an agent governance approach such as the AAGATE pattern I wrote about, tie every agent to explicit scopes, tools, and approval gates, and stop pretending you still live in a human only security world.
2. OWASP GenAI Solutions Reference Guide Q2-Q3 2025
Summary
OWASP’s GenAI Security Project published the Q2-Q3 2025 Solutions Reference Guide, a massive catalog of GenAI and LLM security tools mapped to a full LLMSecOps lifecycle (OWASP). The guide breaks AI application types into four buckets: static prompt centric applications, agentic applications, plug ins or extensions, and complex enterprise systems with LLMs at the core. For each type it describes characteristic risks such as prompt injection, data leakage, agent misuse, and complex dependency chains. The document then walks through LLMOps stages, from scope and plan through data augmentation, development, test, release, deploy, operate, monitor, and govern, overlaying security activities and example solutions at each step. The heart of the guide is a solutions matrix that lists open source and commercial tools across categories such as threat modeling, secure data handling, LLM firewalls, AI posture management, agent security, and red team platforms. Recent updates add an agentic section tying the Agentic Threats and Mitigations taxonomy to concrete products for threats like tool misuse, privilege compromise, and memory poisoning. OWASP keeps the reference vendor neutral and plans quarterly refresh cycles plus an online directory so teams track fast moving product shifts.
Why It Matters
Security leaders finally have a single catalog for GenAI and agent security tools instead of reacting to vendor pitches one by one.
The mapping between OWASP Top 10 for LLMs, LLMSecOps stages, and concrete tools gives you a traceable way to show coverage and gaps to boards and regulators.
The agentic security section lines up with board level concerns on agent misuse and autonomy that I hear almost daily.
What To Do About It
Assign one architect and one risk owner to own the Solutions Reference Guide as a living control map for your GenAI program.
Cross check the guide with your existing RockCyber AI risk assessment and EU AI Act readiness work to see where you already own capabilities and where spend actually moves risk.
When vendors pitch AI security platforms, ask them to show where they sit in the OWASP matrix and which LLM Top 10 risks they really cover.
Rock’s Musings
I love this document because it turns GenAI security buying from chaos into something closer to an engineering decision. Instead of marketing one pagers, you get a matrix that says which tool helps during scope, which helps during data augmentation, which sits in front of your agents at runtime, and which only shows value during red team work. That shift alone saves you both money and political capital.
In my client work I see the same mistake again and again. Teams buy three tools that all claim to block prompt injection but nobody owns a view of where they sit in the stack or which risks they lower. The OWASP guide gives you the missing reference. Even better, it highlights boring but essential categories such as AI posture management and secure data handling rather than only shiny guardrails.
If you read my piece on integrated AI strategy and governance, this guide is perfect as the control catalog that sits under RISE and CARE. Use it to drive a program level view: which controls sit with security engineering, which with data, which with dev tools, and which with your virtual chief AI officer. The board will respond far better to that picture than to another vendor demo.
3. OWASP Cheat Sheet For Secure Use Of Third Party MCP Servers
Summary
OWASP’s GenAI Security Project released another gem this week. A Practical Guide for Securely Using Third Party MCP Servers v1.0. This is a concise cheat sheet for teams that consume MCP servers rather than build them (OWASP). The guide starts with a clear explanation of MCP as a client server protocol that links AI hosts to external tools, data sources, and prompt templates over JSON RPC with transports such as STDIO and HTTP. The document outlines four primary attack themes: tool poisoning and rug pull attacks where a tool manifest carries hostile descriptions or a trusted server silently swaps for a malicious one, prompt injection through tool responses, memory poisoning of agent stores, and tool interference when multiple MCP servers interact in unexpected ways. It then moves into concrete mitigations such as transparent tool manifests, code review for open source servers, version pinning and checksums, strict schemas for tool inputs, segmentation of agent memory, and execution timeouts for tools. A substantial section covers MCP client security: treat servers as untrusted, run them in containers, apply just in time permissions, monitor tool use, and use a registry based discovery flow rather than ad hoc connections. The cheat sheet also recommends OAuth 2.1 and OIDC for authentication, scoped tokens, and human in the loop approvals for sensitive actions.
Why It Matters
MCP now sits in the center of many agentic stacks, including the Anthropic attack above, yet most security teams have little intuition for its threat model.
The cheat sheet turns vague advice like sandbox tools into a crisp workflow for discovery, verification, authentication, and governance of third party servers.
A trusted MCP registry plus workflow aligns neatly with the way your organization already approves SaaS and APIs, which reduces friction for engineers.
What To Do About It
Inventory all MCP servers referenced in your agent projects and classify each as first party, partner, or internet sourced; then require third party entries to follow the cheat sheet workflow before production use.
Adopt containerized execution for every external MCP server, even local ones pulled from GitHub, and limit host filesystem and network reach as described in the guide.
Integrate automated scanners such as MCP Scan, Semgrep MCP rules, and mcp watch into your CI for MCP registries so tool poisoning and credential issues surface early.
Rock’s Musings
MCP is spreading faster than most governance teams realize. I have stated very emphatically that I never want to use an API ever again in favor of MCP. Developers love one protocol for tools. Security teams mostly see a new acronym and move on. That mismatch worries me more than yet another model release. This cheat sheet bridges the gap.
I like how it treats MCP servers like any other risky dependency. You get discovery and registry steps, clear guidance for origin verification, and a governance workflow that assigns roles such as submitter, security reviewer, domain owner, and operator. That maps cleanly to how your organization already approves cloud services. No need for a separate parallel process.
Pair this cheat sheet with RockCyber’s AI supply chain work and you have a realistic plan for tool level control before your agent stack turns into a soft target for tool poisoning. If your AI team wants to plug in every interesting MCP server from the public registry, this document gives you a polite way to say only after security and governance finish their work.
4. Brussels Floats Grace Periods And Eased Registration Under The EU AI Act
Summary
Reports from The Guardian and Reuters describe a European Commission plan, inside a Digital Omnibus package, to soften parts of the EU AI Act through grace periods and narrower scope for some high risk systems (The Guardian, Reuters). Draft text points toward a one year grace period where authorities delay fines for high risk AI violations until August 2027 and toward exemptions from central registration for systems that perform limited or procedural tasks. The package also aims to phase in requirements for labeling AI generated content over a longer period. The Commission frames these moves as targeted simplification so companies adapt without sudden enforcement shocks, especially providers of general purpose models and generative services already on the market. Industry voices, including large European firms and US tech companies, argue for even longer clock stops to protect competitiveness. Lawmakers and civil society groups warn that broad delays risk legal uncertainty and weaker protection for people exposed to high risk systems. Final text will arrive on November 19 and then move through Parliament and Council review, so nothing is locked yet.
Why It Matters
EU AI Act planning inside your company now sits on shifting ground: enforcement dates might loosen without any change to core obligations.
The grace period debate signals continued pressure from industry and foreign governments, which may encourage other states to pursue narrower or slower AI laws.
Boards and executive teams may misread this as an excuse to slow AI governance work, which would leave them exposed when enforcement finally hits.
What To Do About It
Treat potential grace periods as extra time to finish controls, not as permission to delay. Use RockCyber’s EU AI Act checklist as a tracker, and aim to hit original deadlines even if regulators blink.
Separate obligations by category: high risk applications, general purpose models, low risk tooling. Then document which controls you already satisfy and which still need work.
Engage legal and public policy teams early so they help interpret final Omnibus text, rather than leaving security teams to guess which AI deployments fall into each bucket.
Rock’s Musings
I read this as political reality more than retreat. The EU pushed out a broad AI law, then businesses and foreign governments responded hard. That pressure was always coming. The risk now isn’t that Europe abandons AI safety. The risk is that executives hear delay and think they gained a multi year holiday from governance.
From a risk perspective nothing changed. The models in production still carry prompt injection exposure, completion drift, data leakage risk, and opaque training data. The legal exposure also stays high because plaintiffs and regulators will use general safety and privacy frameworks the moment something goes wrong. You don’t get a pause on reputational damage or civil suits.
If anything, the GTG 1002 (Musing #1) operation should push you toward faster AI control work, not slower. Agentic AI is now part of real state sponsored campaigns. Waiting for final Omnibus language before you mature controls feels irresponsible. Use the extra time, if it appears, to tighten documentation, testing, and board reporting around AI risk rather than to reduce spend.
5. New York’s AI Companion Safety Law Moves Into Active Enforcement
Summary
New York’s AI companion safety law, effective November 5, moved into active enforcement this week as Governor Hochul sent formal letters to AI companion companies reminding them of new obligations (New York State). The law covers AI systems that simulate ongoing human style relationships and interact with users over time. Operators now must disclose clearly that users interact with AI, maintain protocols to detect potential self harm cues, interrupt long sessions, and direct users toward crisis resources. Press coverage from Albany outlets frames the law as the first of its kind in the United States, focused on protecting minors and vulnerable people from manipulative or harmful chatbot behavior. Regional reporting notes that New York also launched AI training programs and paired this law with broader digital harm work. Enforcement power sits with state authorities including the Attorney General.
Why It Matters
AI companion experiences now trigger explicit safety requirements in a major state, including self harm detection and clear disclosure obligations.
Other states already passed similar requirements or will mirror New York’s approach, especially where earlier incidents tied chatbots to tragic outcomes.
Any enterprise with consumer facing AI interactions, even outside companion territory, now faces rising expectations for safety features and incident reporting.
What To Do About It
Review your AI features to see whether any experience meets the law’s companion definition, including long running personalized interactions inside apps that you didn’t label as companions.
Align safety controls with clear disclosure copy, age aware UX flows, escalation into human support, and logged self harm detection events with clear response playbooks.
Track New York enforcement statements closely and treat them as practical guidance for your own safety policies, even if your company has no users in the state today.
Rock’s Musings
This law sits right in the crosshairs of AI ethics, product, and security. Many teams still treat AI companions as engagement features where only product managers and growth teams have strong opinions. New York just pulled those features firmly into the safety and compliance world. That’s good.
From a practical angle the hardest part won’t be the detection algorithm. Vendors already offer classifiers for self harm intent. The hard work sits around them, such as designing interruption rules that don’t trigger false alerts every time a teen vents, routing alerts into human review, and logging outcomes in ways that stand up in court and to regulators. You also need clear guardrails on what these systems never say in response.
I want security leaders inside product companies to step toward this, not away. AI companions sit on piles of sensitive mental health hints, location, and relationship data. They blur boundaries between entertainment, therapy, and marketing. If you own AI security or governance and you don’t know how your company handles companion interactions, this week is your reminder to find out.
6. California’s Frontier Model Law And Workplace AI Rules Take Shape
Summary
Paul Hastings summarized a set of recent California moves that include new regulations from the California Privacy Protection Agency on automated decision making, the Transparency in Frontier Artificial Intelligence Act (SB 53), and Governor Newsom’s veto of the No Robo Bosses Act (SB 7) (Paul Hastings). The CPPA regulations require businesses to provide pre use notices when automated decision tools meaningfully replace human judgment in significant decisions such as hiring, and to run risk assessments for high risk uses. These obligations roll in over several years. SB 53 sets safety disclosure and incident reporting duties for developers of frontier models trained over specified compute thresholds or meeting revenue criteria. Covered developers must publish safety frameworks, run risk assessments for catastrophic outcomes, report critical safety incidents to California’s Office of Emergency Services, and maintain internal processes for governance and whistleblower protection. Newsom vetoed SB 7, which would have heavily regulated automated employment decisions, citing ambiguities and overlap with the new CPPA rules.
Why It Matters
California continues to treat AI risk as an area for detailed state law, especially around high compute models and employment decisions.
Frontier developers now have explicit obligations for safety frameworks, incident reporting, and transparency that exceed current federal rules.
Employers using vendor models gain indirect obligations once vendors respond to SB 53 and related regulations, even if those employers don’t meet the frontier definition.
What To Do About It
For any AI work touching California residents, identify where your company qualifies as a frontier developer versus a deployer, then assign owners for each SB 53 requirement.
Review current HR and ADMT tools, map them against CPPA rules, and decide where you need notices, opt out flows, and risk assessments before the effective dates arrive.
Update your AI governance documentation so it references SB 53, CPPA AI rules, and New York AI companion laws side by side rather than tracking each in a separate silo.
Rock’s Musings
States now treat AI as core safety and employment infrastructure, not as a niche tech issue. California’s moves on frontier models will shape how large developers structure their internal safety teams. I expect similar ideas in federal talks down the road, even if language differs.
From the CISO chair this means AI safety programs no longer sit purely with research leaders. Security and risk leaders must plug into safety frameworks, incident reporting, and whistleblower mechanisms around frontier models. When a law uses phrases like catastrophic risk and specifies reporting timelines, you can’t leave interpretation to a small research group. You need cross functional governance that includes you.
The veto of the No Robo Bosses Act also matters. It shows lawmakers still struggle to distinguish between truly risky AI employment tools and mundane automation. That tension will keep surfacing. Use the time before new drafts appear to map your employment related AI stack, define high risk use cases, and set your own standards before the next wave of bills arrives.
7. State Attorneys General, OpenAI, And Microsoft Form A Bipartisan AI Safety Task Force
Summary
North Carolina Attorney General Jeff Jackson and Utah Attorney General Derek Brown launched a nationwide bipartisan AI Task Force, joined by OpenAI, Microsoft, and other industry participants (CNN). The task force will work with law enforcement and experts to identify emerging AI issues, define basic safeguards for developers, and maintain a standing forum for coordination across states. Early focus areas include harms to children, misuse of AI for deepfake exploitation, and broader consumer protection. Coverage from CNN affiliates and regional press stresses that any resulting standards remain voluntary, yet presence of many attorneys general signals an intent to coordinate enforcement and future litigation across states. OpenAI and Microsoft praised the effort as a way to build trust and shared guidance.
Why It Matters
State attorneys general now organize around AI safety as a shared enforcement frontier, not separate experiments in each state.
Large model providers accept a seat at that table, which increases reputational pressure if they later ignore agreed safeguards.
Companies will face a de facto baseline of reasonable safeguards shaped by this group, even before new laws arrive.
What To Do About It
Track publications and recommendations from the task force and compare them with your AI governance policies, especially around youth protection and consumer deception.
Ask your major AI vendors how they plan to align with safeguards discussed by the task force and request written commitments where exposure is high.
Prepare for multistate investigations that reference this task force’s work in defining expected practice, similar to how privacy investigations reuse earlier guidance.
Rock’s Musings
I read this task force as an early warning. State attorneys general don’t launch new multi state bodies for fun. They do it when they expect enforcement needs over a long horizon. When they invite model providers into the room they also build a record of what reasonable safeguards look like. That record later appears in court filings.
For enterprise AI leaders this means you shouldn’t treat voluntary guidelines as optional. Every time your program violates a future safeguard that you see in these recommendations, you also hand plaintiffs and regulators an argument that you ignored known risk. Even if the guidelines never become binding rules, they’ll shape expectations.
I would assign one member of your AI governance team to follow this task force closely and summarize outputs for your internal steering group. Think of it like following NIST AI guidance, but at the state enforcement layer. The cost of that attention is tiny compared to the cost of being the test case when an attorney general decides to prove the task force has teeth.
8. Survey Shows AI Attack Incidents Rising Fast And Visibility Still Low
Summary
Security Boulevard reported on a survey of 500 security practitioners and decision makers in the United States and Europe, sponsored by Traceable by Harness (Security Boulevard). The survey found that attacks on AI applications are rising, with prompt injection incidents at 76% of respondents, vulnerable LLM generated code at 66%, and jailbreaking at 65%. Sixty three percent reported no way to know where LLMs run across their organizations, and 75% expect shadow AI to become a larger problem than previous shadow IT patterns. Nearly three quarters described shadow AI as a significant gap in security posture. Respondents also reported poor secure development practices around AI features, with fewer than half stating that developers consistently build security into AI applications, and only about a third informing security early in AI projects. The study highlighted weak visibility into AI software bills of materials and limited insight into LLM outputs.
Why It Matters
Prompt injection, fragile AI code, and jailbreak attempts already hit most organizations that deploy AI, which means these are present tense issues, not speculative ones.
Shadow AI now mirrors earlier shadow IT problems but moves faster and uses external APIs and tools that security never reviewed.
Lack of AI BOMs and poor collaboration between developers and security keeps CISO level risk reporting disconnected from real exposure.
What To Do About It
Stand up an AI asset inventory across applications, models, and external APIs, even if it starts as a spreadsheet, then link it with your existing RockCyber AI risk assessment work.
Require AI BOMs for any new use of LLMs, similar to software BOMs, listing models, vector stores, guardrails, and MCP servers, then pull those into your risk register.
Use these survey figures in board discussions to highlight why AI security budgets and staffing should grow beyond generic cloud security spending.
Rock’s Musings
These numbers match what my clients describe privately. AI features appear all over internal tools, marketing experiments, support workflows, and coding assistants. In many cases security leaders learn about them only after something breaks. That’s the definition of shadow AI.
I’m less interested in the exact percentages and more in the pattern. If three quarters of people in a survey say they lack visibility into LLM usage, then boards should assume the same in their own organizations unless proven otherwise. The right response isn’t panic. The right response is methodical inventory, clear policy, and the same discipline you already show for cloud resources and SaaS.
One interesting detail in the survey is the gap between developer perception and security perception. Developers see AI as a way to move faster. Security sees an attack surface that grows faster than they can test. Bridging that gap requires shared metrics and joint playbooks, not more lectures. Use this survey to open that conversation rather than to shame engineering teams.
9. Enterprise Chatbots Expose A Wide Attack Surface
Summary
Another Security Boulevard piece this week walked through the attack surface of AI chatbots in enterprise settings (Security Boulevard). It describes how chatbots moved from simple support tools to central interfaces for employees and customers, wired into CRMs, HR systems, knowledge bases, and internal APIs. The article identifies core exposure points: user interfaces, APIs, model behavior, data pipelines, and identity flows. Key vulnerabilities include prompt injection that drives data leakage, exposure of internal documents, API abuse for data extraction or privilege escalation, model poisoning through tainted training data, weak authentication around chatbots, and adversarial inputs that drive unwanted behavior. It also highlights insider risk through staff with configuration or training data access. The author ties these risks to business impacts such as reputational damage through leaked customer information, regulatory penalties under GDPR and sector rules, and operational disruption when attackers abuse chatbot integrations. The article closes with a structured assessment method and best practices such as mapping all integration points, reviewing data flows, log monitoring, role based authorization, encryption, and zero trust principles around chatbot use.
Why It Matters
Chatbots now sit on top of some of your most sensitive data sources and processes, yet they often skip the rigorous security review given to core applications.
Many attack modes here line up with OWASP LLM Top 10 risks, especially prompt injection, data leakage, and model poisoning, which simplifies threat modeling.
Practical scenarios in the article mirror real environments in banking, telecom, and healthcare, so you can treat them as test cases for your own deployments.
What To Do About It
Run a focused threat modeling session around each major chatbot.
Insert explicit controls into chatbot design reviews: strong authentication, scoped permissions, strict prompt management, and clear red lines on what the bot must refuse to do.
Add chatbots to regular penetration testing scopes and red team exercises rather than treating them as low risk front ends.
Rock’s Musings
I like this piece because it translates academic chatbot risk talk into simple diagrams your engineering teams already understand: interfaces, APIs, data, identity. That makes it easier to drop into your existing SDL processes. There are many AI threat modeling methods out there. Pick one and be disciplined about its application.
When I talk with boards about AI chatbots they often focus on reputational risk from weird responses. That matters, but the bigger risk sits behind the scenes. Chatbots wired into ticketing, billing, or HR systems often receive privileges broader than any single human user. Attackers don’t need to hack AI in a fancy way. They only need to persuade the bot to call existing APIs with parameter sets developers never imagined.
If your organization already has strong identity and access control, the right move now is to treat chatbots as strong privileged clients with carefully evaluated scopes. The moment you view them that way instead of as harmless help fronts, your thinking shifts in a healthier direction.
10. Most CISOs Now Own AI Security
Summary
SC Media reported on HackerOne research showing that 84% of CISOs now hold responsibility for AI security and 82% for data privacy, alongside their traditional duties (SC Media). The article notes high AI adoption rates, with Stanford’s 2025 AI Index showing 78% of organizations using AI in 2024, up from 55% a year earlier. CISOs face expanded attack surfaces and limited internal expertise for many AI related vulnerabilities. The piece argues that CISOs need strong external partnerships and more systematic offensive security work, including bug bounties, vulnerability disclosure programs, and red teaming focused on AI and data privacy exposures. Leaders who use comprehensive crowdsourced security approaches are twice as likely to view those programs as highly effective. The article frames this as part of a broader move toward continuous threat exposure management.
Why It Matters
Boards and executives now expect CISOs to oversee AI security even when budgets, staffing, and background lean heavily toward traditional security domains.
Crowdsourced security and external testing offer a realistic way to extend coverage into AI features when internal skills lag.
This study provides external data you can quote when requesting resources or redefining CISO scope to include AI governance structures.
What To Do About It
Clarify in writing which AI risks sit with the CISO, which with the chief data officer, and which with product or research leadership, then present this split to the board.
Design a targeted AI security testing plan that includes crowdsourced work around prompt injection, data leakage, and model behavior, integrated with your standard vulnerability management.
Tie your AI security strategy to RockCyber’s RISE and CARE frameworks so boards see a coherent model where security, governance, and business value move together.
Rock’s Musings
From my perspective this research finally catches up with reality. Most CISOs I speak with quietly own AI security already. They answer board questions about AI incidents, they sign off on AI vendor risk, and they handle the fallout when an AI experiment leaks data. The difference now is that surveys and trade press recognize this formally.
The concerning part is the gap between expanded responsibility and unchanged budgets or staffing. You can’t meaningfully secure AI features with the same team that already struggles with cloud, endpoint, and SaaS coverage. That’s where crowdsourced options and specialist partners become important. They give you burst capacity and niche skills while you grow internal strength.
I also see this as an argument for clear AI governance structures. The CISO shouldn’t have to chase AI experiments across the enterprise. Your organization needs a central register of AI use, clear owners, and a steering group where the CISO voice sits next to legal, product, and data. Without that, AI security becomes one more area where success depends on individual heroics instead of system design.
The One Thing You Won’t Hear About But You Need To: Cupcake Policy Enforcement For AI Coding Agents
Summary
Cupcake is an open source project from EQTYLab with research support from Trail of Bits (GitHub). It reached a new release (v0.2.0) on November 9. Cupcake sits between AI coding agents such as Claude Code and Cursor and the actions they propose: shell commands, file edits, MCP tool calls, and similar operations. Policies written in Open Policy Agent Rego compile to WebAssembly and run in a sandbox. When an agent proposes an action, Cupcake evaluates that action against policy and returns decisions such as allow, block, warn, or require review. The project supports multi harness setups and keeps policies separate per harness so teams maintain clarity across different agent frameworks. Cupcake also adds guardrail libraries, signals, and audit logging. It supports MCP tools directly and offers governance as code, with rule sets living in version controlled repositories. This project complements MCP security efforts by enforcing rules before an agent executes a tool call, rather than trusting prompts alone to prevent misuse. You can bet I will implement it in my next vibe coding session.
Why It Matters
Cupcake turns agent guardrails from vague aspirations into explicit policies enforced before AI coding agents run actions on developer machines or repositories.
The approach avoids burning model context on rules and instead treats enforcement as a separate layer, which matters when context windows strain under complex tasks.
MCP support means you gain a practical way to govern tools across agents using the same policy engine you already use for infrastructure and access control.
What To Do About It
Ask your developer tools and platform teams to evaluate Cupcake in a lab environment with Claude Code or Cursor agents, focusing on policies that prevent risky shell commands, production credential use, and unsafe MCP calls.
Create a small shared policy library that encodes your existing secure coding rules, then reuse it inside Cupcake rather than relying only on prompt text.
Use Cupcake’s logs as training material for both developers and SOC staff so they see how agents attempt risky actions and how policy blocks or guides them.
Rock’s Musings
This is the sleeper of the week for me. Every CISO I know feels nervous about AI coding agents editing files, running shells, and wiring into MCP tools on developer laptops. Yet most of the conversation focuses on prompts and generic guardrails from model vendors. Cupcake takes a different path and plugs into the same policy world you use for Kubernetes admission control or API gateways.
The project’s design also hints at the future of agent control. Policy as code, compiled to WebAssembly, running near the agent, with clear allow or deny decisions and audit trails. That feels far more sustainable than asking prompts to police agents through natural language. If you already embraced OPA and Rego elsewhere, this becomes a natural extension.
If you want a practical experiment for the next quarter, I would rank pilot Cupcake or a similar enforcement layer with AI coding agents high on the list. Turn my AAGATE ideas into reality by treating agents as policy subjects, not as mysterious helpers. That move does more for agent safety than another round of generic awareness training.
Citations
• Anthropic. (2025, November 13). Disrupting the first reported AI orchestrated cyber espionage campaign. https://www.anthropic.com/news/disrupting-AI-espionage
• Anthropic. (2025, November). Disrupting the first reported AI orchestrated cyber espionage campaign – Full report. https://assets.anthropic.com/m/ec212e6566a0d47/original/Disrupting-the-first-reported-AI-orchestrated-cyber-espionage-campaign.pdf
• OWASP GenAI Security Project. (2025, November 4). GenAI Security Project – Solutions Reference Guide Q2–Q3 2025. https://genai.owasp.org/resource/owasp-genai-security-project-solutions-reference-guide-q2_q325/
• OWASP GenAI Security Project. (2025, October 23). A Practical Guide for Securely Using Third Party MCP Servers v1.0. https://genai.owasp.org/resource/cheatsheet-a-practical-guide-for-securely-using-third-party-mcp-servers-1-0/
• Rankin, J. (2025, November 7). EU could water down AI Act amid pressure from Trump and big tech. The Guardian. https://www.theguardian.com/world/2025/nov/07/european-commission-ai-artificial-intelligence-act-trump-administration-tech-business
• Foo Yun Chee. (2025, November 7). Big Tech may win reprieve as EU mulls easing AI rules, document shows. Reuters. https://www.reuters.com/sustainability/boards-policy-regulation/big-tech-may-win-reprieve-eu-mulls-easing-ai-rules-document-shows-2025-11-07/
• Hochul, K. (2025, November 10). Governor Hochul pens letter to AI companion companies notifying them that safeguard requirements are now in effect. Office of the Governor of New York. https://www.governor.ny.gov/news/governor-hochul-pens-letter-ai-companion-companies-notifying-them-safeguard-requirements-are
• Day, F. (2025, November 10). New York enacts first in the nation AI safety law to protect users from digital harm. CBS 6 WRGB Albany. https://cbs6albany.com/news/local/new-york-enacts-first-in-the-nation-ai-safety-law-to-protect-users-from-digital-harm-governor-hochul-senator-kristen-gonzalez-attorney-general-letitia-james-assemblymember-steve-otis-cbs6-wrgb
• FingerLakes1.com Staff. (2025, November 13). New York rolls out nation leading AI safety laws. FingerLakes1.com. https://www.fingerlakes1.com/2025/11/13/new-york-rolls-out-nation-leading-ai-safety-laws/
• Featherstun, B., Gage, K., Richards, D., & Tomezsko, S. (2025, November 11). California enacts new AI safety and transparency laws while vetoing “No Robo Bosses Act.” Paul Hastings LLP / JD Supra. https://www.jdsupra.com/legalnews/california-enacts-new-ai-safety-and-7108117/
• Sidley Austin. (2025, October 16). Latest AI developments – California SB 53 and SB 243 updates. https://www.sidley.com/en/insights/resources/sidley-ai-monitor-developments
• North Carolina Department of Justice. (2025, November 13). Attorneys General Jeff Jackson and Derek Brown launch nationwide bipartisan AI task force. https://ncdoj.gov/attorneys-general-jeff-jackson-and-derek-brown-launch-nationwide-bipartisan-ai-task-force/
• Security Boulevard. Vizard, M. (2025, November 12). Survey surfaces sharp rise in cybersecurity incidents involving AI. Security Boulevard. https://securityboulevard.com/2025/11/survey-surfaces-sharp-rise-in-cybersecurity-incidents-involving-ai/
• Security Boulevard. Goyal, A. (2025, November 11). Evaluating the attack surface of AI chatbots deployed in enterprise settings. Security Boulevard. https://securityboulevard.com/2025/11/evaluating-the-attack-surface-of-ai-chatbots-deployed-in-enterprise-settings/
• Entrekin, B. (2025, November 12). Most CISOs now own AI security: Here’s what that means for your business. SC Media / SC World. https://www.scworld.com/perspective/most-cisos-now-own-ai-security-heres-what-that-means-for-your-business
• EQTYLab. (2025, November 9). Cupcake: Policy enforcement for AI agents [v0.2.0]. GitHub. https://github.com/eqtylab/cupcake
• RockCyber. (2025). AI strategy and governance – RISE and CARE frameworks. https://www.rockcyber.com/ai-strategy-and-governance
• RockCyber. (2025, July 11). EU AI Act readiness checklist: Hit every deadline. https://www.rockcyber.com/ebooks-and-whitepapers/eu-ai-act-checklist
• RockCyber. Lambros, R. (2025, July 30). Securing agentic applications: The OWASP GenAI security blueprint. RockCyber Musings. https://www.rockcybermusings.com/p/securing-agentic-applications-the
• RockCyber. Lambros, R. (2025, October 21). AI supply chain security that stops fake tools and poisoned models before they hit production. RockCyber Musings. https://www.rockcybermusings.com/p/ai-supply-chain-security-that-stands
• RockCyber. Lambros, R. (2025, July 28). AI security baseline playbook: My take on ETSI TS 104 223. RockCyber Musings. https://www.rockcybermusings.com/p/ai-security-baseline-etsi-ts-104-223



