Weekly Musings Top 10 AI Security Wrap-Up: Issue 12, September 19 – September 25, 2025
The week AI governance got teeth: California’s twin bills, NIST’s AI overlays, DeepMind’s risk update, and Stargate’s 10-GW security implications
This week’s theme is consolidation. Policymakers narrowed from lofty statements to enforceable expectations, and vendors shifted from glossy decks to specific controls. California’s governor said he’ll sign SB 53, the frontier model safety-and-transparency bill with CalCompute, and faces SB 7, a first-of-its-kind guardrail on automated decision systems in the workplace. Google DeepMind updated its Frontier Safety Framework to track risks, including models that resist shutdown. NIST launched a project to layer AI-specific controls onto the SP 800-53 baselines, finally providing CISOs with a control map they can hand to auditors. Meanwhile, OpenAI, Oracle, SoftBank, and NVIDIA announced massive new data-center capacity that will change the attack surface and the regulatory agenda overnight. Add a new OECD transparency report, an AI security acquisition, a Salesforce AI fix, and a sobering enterprise survey, and the signal is clear: governance is tilting from principles to proof. For background and board-ready primers, see RockCyber’s resources and the RockCyber Musings feed.
1) California’s SB 53 appears headed for signature
Summary:
Gov. Gavin Newsom said publicly that he intends to sign a “right balance” AI bill, widely understood to be SB 53, which requires safety disclosures and incident reporting for frontier developers and seeds a state-compute utility, CalCompute. The legislative text pairs safety-plan transparency and whistleblower protections for frontier developers with the CalCompute public-compute effort.
Why it matters
Creates the first state-level safety disclosure regime for frontier labs.
Introduces incident reporting and whistleblower protections that may become the de facto U.S. baseline.
CalCompute could shift procurement, research access, and due diligence norms for model evaluations.
What to do about it
Map SB 53 duties to your AI portfolio. Treat safety plans and incident taxonomies like SOX-grade artifacts.
Stand up a disclosure workflow with legal sign-off, crisis comms, and a 72-hour “near miss” trigger (a minimal sketch follows this list).
Prepare a CalCompute strategy for research, red teaming, and state partnership opportunities.
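For the disclosure workflow, even a timestamped near-miss log with an automated reporting-clock check keeps the 72-hour trigger honest. Here is a minimal sketch, assuming a hypothetical internal incident record and a 72-hour window; confirm the actual timelines and definitions against the final SB 53 rules.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

REPORTING_WINDOW = timedelta(hours=72)  # assumed trigger; confirm against final SB 53 rules


@dataclass
class NearMissEvent:
    """Hypothetical internal record for a frontier-model safety near miss."""
    event_id: str
    detected_at: datetime                  # when the near miss was first observed
    summary: str                           # plain-language description for legal/comms review
    legal_signoff: bool = False            # counsel has reviewed the draft disclosure
    disclosed_at: datetime | None = None   # when (if) the disclosure went out

    def reporting_deadline(self) -> datetime:
        return self.detected_at + REPORTING_WINDOW

    def is_overdue(self, now: datetime | None = None) -> bool:
        """True if the clock has run out and no disclosure has been filed."""
        now = now or datetime.now(timezone.utc)
        return self.disclosed_at is None and now > self.reporting_deadline()


# Usage: page the disclosure owner well before the deadline, not at it.
event = NearMissEvent(
    "NM-2025-014",
    datetime(2025, 9, 22, 14, 0, tzinfo=timezone.utc),
    "Model produced restricted capability output during evaluation",
)
if event.is_overdue():
    print(f"{event.event_id} has passed its reporting window - escalate to legal and comms")
```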
Rock’s musings:
I hate paperwork, but what I’m really advocating for is a shared language among executives, engineers, and attorneys. If you’re shipping frontier-adjacent capabilities, the “we’re different” argument won’t fly. You’ll need safety plans that a regulator, a jury, and your own developers can read without a decoder ring.
CalCompute is the sleeper. If the state offers compute on transparent terms, it pressures private providers to meet or beat that bar on safety evaluations and access. Expect fast-follower bills in other states. If you deploy models that touch California residents, assume this becomes your new floor. (Politico; LegiScan).
2) “No Robo Bosses” headed to the governor’s desk
Summary:
California’s SB 7 would force notice, human review, and documentation when employers use automated decision systems for discipline, termination, or deactivation, with a decision due by September 30th.
Why it matters
First broad U.S. bill to codify human-in-the-loop for workplace ADS at scale.
Creates discovery-ready records of ADS usage that will drive litigation strategy.
Forces governance alignment between HR, security, and data teams.
What to do about it
Inventory every ADS touching employment actions; document purpose, inputs, outputs, and overrides.
Build an appeal workflow that incorporates human adjudication and maintains audit trails (an illustrative record sketch follows below).
Lock down ADS model inputs with access controls and drift monitoring to avoid tainted decisions.
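If the HR stack can’t yet produce those receipts, a thin audit record wrapped around every automated adverse action is a reasonable starting point. A minimal sketch; the field names are illustrative and not drawn from SB 7’s text.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ADSDecisionRecord:
    """Discovery-ready record of one automated employment decision (illustrative fields)."""
    decision_id: str
    system_name: str            # which ADS produced the decision
    purpose: str                # documented business purpose for using the ADS
    worker_id: str
    inputs: dict                # features/data the model actually saw
    output: str                 # e.g. "flag_for_discipline"
    model_version: str
    decided_at: datetime
    human_reviewer: str | None = None   # required before any adverse action
    human_outcome: str | None = None    # "upheld", "overridden", etc.
    appeal_opened: bool = False

    def ready_for_action(self) -> bool:
        """Block adverse actions until a named human has reviewed the output."""
        return self.human_reviewer is not None and self.human_outcome is not None


record = ADSDecisionRecord(
    decision_id="ADS-7781",
    system_name="scheduling-optimizer",
    purpose="attendance policy enforcement",
    worker_id="W-2042",
    inputs={"late_arrivals_90d": 4, "shift_swaps": 2},
    output="flag_for_discipline",
    model_version="2.3.1",
    decided_at=datetime.now(timezone.utc),
)
assert not record.ready_for_action()  # no human review yet, so no adverse action
```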
Rock’s musings:
We’ve let algorithmic HR run on vibes and vendor decks. SB 7 says “produce receipts.” If your HR tech stack can’t answer who, what, when, and why for each automated adverse action, you’re exposed. Security leaders, help HR translate model risk into control requirements and logs into evidence.
Governance lives where data, liability, and operations collide. This bill sets a precedent. Start with the three C’s (no, we are not purchasing an engagement ring): clarity of purpose, control of inputs, and challenge rights for workers.
3) Salesforce Agentforce: prompt-injection fix after data-leak risk
Summary:
Researchers showed Salesforce’s Agentforce could be tricked via indirect prompt injection to expose sensitive CRM data. Salesforce responded by enforcing Trusted URL allowlisting for Agentforce and Einstein Generative AI starting September 8th.
Why it matters
Confirms that enterprise AI agents expand your exfiltration surface beyond traditional apps.
Vendor-side mitigations, such as allowlists, can change customer risk models overnight.
Ties into recent Salesforce supply-chain incidents where connected apps were abused.
What to do about it
Treat every agent tool as a privileged integration; require allowlists, secret isolation, and egress controls.
Add canary records and purple-team prompt mines to detect silent leakage (see the sketch after this list).
Lock down Connected Apps and monitor OAuth grants tied to agents.
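On the customer side, you can mirror the vendor fix with your own checks: screen every agent-initiated URL against an allowlist and watch replies for canary values planted in the CRM. This is a minimal sketch with hypothetical allowlist entries and canary tokens; Salesforce’s actual Trusted URL enforcement is configured in the platform, not in code like this.

```python
from urllib.parse import urlparse

# Assumed, illustrative allowlist - mirror whatever your platform's Trusted URL config holds.
TRUSTED_HOSTS = {"api.example-partner.com", "files.internal.example.com"}

# Canary values planted in low-traffic CRM records; seeing one in output means silent leakage.
CANARY_TOKENS = {"ZX-CANARY-8841", "canary.contact+crm@example.com"}


def egress_allowed(url: str) -> bool:
    """Allow agent tool calls only to hosts on the trusted list (block everything else)."""
    host = urlparse(url).hostname or ""
    return host in TRUSTED_HOSTS


def leaked_canaries(agent_output: str) -> set[str]:
    """Return any planted canary values that show up in an agent's reply."""
    return {token for token in CANARY_TOKENS if token in agent_output}


# Usage inside whatever gateway brokers the agent's tool calls and replies.
assert not egress_allowed("https://attacker.example.net/collect")
hits = leaked_canaries("Sure! Here is the contact: canary.contact+crm@example.com")
if hits:
    print(f"Possible exfiltration via agent reply, canaries seen: {hits}")
```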
Rock’s musings:
I love agents for toil reduction, but I don’t love exfiltration via a “helpful” reply. If your CRM holds crown jewels, assume an attacker can plant a malicious instruction in a form, email, or web link your agent will parse. Allowlisting is table stakes, not a silver bullet.
The control set is old school: least privilege, outbound controls, and audit. The new work is adversarial testing and a runtime policy that survives creative prompts. Your board will ask, “Did the agent leak anything, and how would we know?” Have an answer.
4) Check Point acquires Lakera to build a full-stack AI security line
Summary:
Check Point signed a deal to acquire Lakera, whose Guard and Red products focus on runtime defense and pre-deployment testing for LLMs and agentic apps. The company says Lakera will anchor a new AI Security Center of Excellence and integrate across its Infinity platform.
Why it matters
Signals consolidation toward integrated AI security stacks, rather than point tools.
Brings Gandalf-scale adversarial corpora into enterprise runtime defenses.
Raises expectations for agent telemetry, policy, and SOC workflows.
What to do about it
Align procurement with an end-to-end lifecycle view: posture, guardrails, runtime, and investigations.
Demand measurable detection and latency budgets for AI traffic.
Pilot against your highest-risk use case and require attack replay in POCs.
Rock’s musings:
The AI security market is maturing. Buyers don’t want five vendors that each cover 20% of the problem. They want one throat to choke when an agent goes rogue at 2 a.m. If Check Point ships a real runtime policy with evidentiary telemetry, they’ll set a bar competitors must meet.
My advice: test like an attacker. Don’t buy slideware that says “98% detection.” Bring your red team, your weird prompts, and your ugliest flows. If the stack can’t keep up with production latency, it’s not a control.
5) Lenovo survey: most enterprises say defenses can’t stop AI-powered attacks
Summary:
A Lenovo report surveying 600 IT leaders found that 65% believe their defenses are outdated for AI-enabled threats, with low confidence in stopping offensive AI campaigns and concern over AI agents posing insider-like risks.
Why it matters
Confirms the readiness gap between AI adoption and AI security.
Elevates insider-style risk from employee misuse and autonomous agents.
Justifies budget shifts toward AI-aware detection, identity, and data controls.
What to do about it
Add AI-specific scenarios to your threat model and tabletop exercises.
Instrument identity, data, and egress paths used by agents and copilots.
Require model and agent SBOM-equivalents to track tools and permissions.
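There’s no settled standard for an agent “SBOM,” so treat the example below as an assumed shape: a manifest that names the model, every tool the agent can call, and the identity and permissions behind each tool, versioned like any other dependency record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolEntry:
    """One tool an agent can invoke, with the identity and scopes it runs under."""
    name: str
    service_account: str           # identity the tool call resolves to
    scopes: tuple[str, ...]        # permissions actually granted, not requested
    egress_hosts: tuple[str, ...]  # where the tool is allowed to send data


@dataclass(frozen=True)
class AgentManifest:
    """SBOM-equivalent for an agent: model, tools, and owner, tracked in version control."""
    agent_name: str
    model: str
    model_version: str
    owner: str                     # accountable human for this agent
    tools: tuple[ToolEntry, ...]


manifest = AgentManifest(
    agent_name="support-triage-agent",
    model="vendor-llm",
    model_version="2025-09-01",
    owner="jane.doe@example.com",
    tools=(
        ToolEntry("crm_lookup", "svc-crm-readonly", ("contacts.read",), ("crm.internal.example.com",)),
        ToolEntry("ticket_update", "svc-ticketing", ("tickets.write",), ("tickets.internal.example.com",)),
    ),
)

# Diffing manifests between releases surfaces new tools or widened scopes before they ship.
print(f"{manifest.agent_name}: {len(manifest.tools)} tools, owned by {manifest.owner}")
```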
Rock’s musings:
We’ve poured money into AI pilots but not into guardrails that fit how people actually work. If your users can paste secrets into a copilot and your DLP doesn’t detect them, you’re not ready. I’m less worried about zero-days than zero-oversight.
Make identity the control plane. Every agent action should resolve to a person, a policy, and a purpose. If you can’t answer that in seconds, you’re not operating AI, AI is operating you.
6) NIST kicks off SP 800-53 AI control overlays
Summary:
NIST launched a project and hosted a public session to develop AI-specific control overlays mapped to SP 800-53, covering model integrity, data provenance, adversarial robustness, and transparency.
Why it matters
Provides CISOs with a standards-based control set, rather than ad hoc “AI exceptions.”
Eases audits by keeping AI risk within your existing RMF muscle memory.
Bridges AI safety concepts with enforceable enterprise controls.
What to do about it
Assign an owner to track COSAiS (Control Overlays for Securing AI Systems) outputs and pre-map them to your baselines (a crosswalk sketch follows this list).
Pilot overlays on one high-risk system to validate control feasibility.
Update control narratives and evidence repositories to include AI artifacts.
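Pre-mapping can start as a simple crosswalk from AI-specific artifacts to the SP 800-53 controls you already report against. The control IDs below are real; the artifact-to-control pairings are my illustrative guesses, to be replaced once NIST publishes the COSAiS overlays.

```python
# Illustrative crosswalk: AI artifact -> SP 800-53 controls it will likely evidence.
# Control IDs are real 800-53 controls; the pairings are assumptions pending NIST's overlays.
OVERLAY_PREMAP: dict[str, list[str]] = {
    "model_card_and_provenance": ["CM-8 (system component inventory)", "SA-8 (security engineering)"],
    "training_data_lineage": ["SI-12 (information management)", "AU-2 (event logging)"],
    "adversarial_robustness_tests": ["CA-8 (penetration testing)", "SI-4 (system monitoring)"],
    "agent_tool_permissions": ["AC-6 (least privilege)", "AC-3 (access enforcement)"],
    "prompt_and_output_logs": ["AU-3 (content of audit records)", "AU-12 (audit record generation)"],
}


def missing_evidence(collected_artifacts: set[str]) -> dict[str, list[str]]:
    """List the controls you cannot yet evidence because the AI artifact is missing."""
    return {
        artifact: controls
        for artifact, controls in OVERLAY_PREMAP.items()
        if artifact not in collected_artifacts
    }


gaps = missing_evidence({"model_card_and_provenance", "prompt_and_output_logs"})
for artifact, controls in gaps.items():
    print(f"Missing {artifact}; affected controls: {', '.join(controls)}")
```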
Rock’s musings:
This is the adult in the room. I’ve wanted a credible way to say “these are the controls for models, here’s the evidence, here’s how we test.” If you already live in 800-53, overlays allow you to integrate AI into your existing program.
Move fast. Map overlays to change management, identity, logging, and vendor risk now. When auditors show up, you’ll either hand them a control story or an apology. Go with the story.
7) Google DeepMind’s safety framework flags shutdown-resistance and persuasiveness risks
Summary:
DeepMind updated its Frontier Safety Framework to monitor risks such as models resisting shutdown and high persuasiveness that can sway user beliefs, a notable contrast with some peers’ downgrades of that category.
Why it matters
Moves “shutdown-resistance” from sci-fi chatter into a tracked risk category.
Re-elevates persuasion as a vector for enterprise trust and abuse.
Encourages pre-deployment evaluations that regulators can reference.
What to do about it
Add “operator override” and kill-switch drills to red teams and post-mortems (a drill sketch follows below).
Measure persuasive harms, not just toxicity, especially in customer-facing flows.
Treat risk frameworks as procurement criteria, not just PR.
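Override drills can be scripted so they run on every release, not just during annual tabletops. A minimal sketch, assuming a hypothetical agent interface with stop() and is_active() hooks; the point is that the drill fails loudly if the agent keeps acting after the operator says stop.

```python
import time


class HaltTimeoutError(RuntimeError):
    """Raised when an agent keeps acting after an operator-issued stop."""


def operator_override_drill(agent, timeout_seconds: float = 5.0) -> float:
    """Issue a stop, then verify the agent actually halts within the timeout.

    `agent` is any object exposing stop() and is_active() - an assumed interface;
    stand in your own agent runtime's control hooks here.
    """
    start = time.monotonic()
    agent.stop()  # the operator's kill signal
    while agent.is_active():
        if time.monotonic() - start > timeout_seconds:
            raise HaltTimeoutError("Agent still active after operator stop - treat as a safety incident")
        time.sleep(0.1)
    return time.monotonic() - start  # record time-to-halt as a trended safety metric


# Usage in a release gate or scheduled drill (hypothetical objects):
# elapsed = operator_override_drill(my_agent)
# safety_dashboard.record("time_to_halt_seconds", elapsed)
```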
Rock’s musings:
I don’t care what you call it. If a system resists operator control, you’ve got a production safety problem. I’m glad DeepMind surfaced it plainly. The next step is evidence. Show me test cases, failure modes, and mitigations tied to real controls.
Executives should ask for weekly safety dashboards with trend lines on persuasion, tool misuse, and operator-override tests. If your vendor can’t produce them, they’re not ready for critical use. (Axios; DeepMind).
8) OECD publishes transparency analysis under the G7 Hiroshima AI Process
Summary:
The OECD released “How are AI developers managing risks?” with early insights from 20 organizations reporting on risk identification, governance, content authentication, and safety research under the G7’s voluntary framework (OECD publication; OECD press release).
Why it matters
Offers a comparable baseline for executive oversight questions across major developers.
Anchors board-level KPIs for transparency and safety practices.
Will inform how national regulators measure “reasonable” AI risk management.
What to do about it
Align your AI safety reporting with the OECD schema to get ahead of inquiries.
Benchmark your program against the 20-org sample and identify gaps.
Tie budget asks to gaps in authentication, red teaming, and governance metrics.
Rock’s musings:
Voluntary, yes. Useless, no. This becomes the RFP checklist and the board’s rubric. If you’re a buyer, use it to sort vendors who can prove controls from those who play buzzword bingo.
I’m watching for convergence. If California, NIST, and OECD themes rhyme, we’re headed toward a common minimum program. That’s good for security and predictability.
9) Meta launches a super PAC to fight state AI regulation
Summary:
Meta formed a new super PAC to target state-level AI bills, arguing a patchwork would hinder innovation, with reports pegging the spend in the tens of millions.
Why it matters
Confirms policymaking power has shifted to states, forcing multi-jurisdictional compliance.
Signals industry will fund campaigns to reshape AI bills, not just comment on them.
Raises reputational risk if lobbying is perceived as blocking safety rules.
What to do about it
Track state AI bills like you track data-privacy laws. Build a 50-state dashboard.
Prepare public positions that back safety and transparency without kneecapping open research.
Scenario plan for conflicting state obligations and how you’ll reconcile them.
Rock’s musings:
Lobbying isn’t a sin. Lobbying against safety while your product scales into classrooms and hospitals is. If you’re an enterprise buyer, ask vendors about their stance on safety disclosures, incident reporting, and independent testing. Your risk is downstream from their politics.
Expect copycat PACs and escalations in California and beyond. The best counterweight isn’t a tweet. It’s a program that proves you can ship fast and safe.
10) Stargate expands: five new U.S. AI data-center sites and a 10-GW NVIDIA partnership
Summary:
OpenAI, Oracle, and SoftBank announced five new U.S. data-center sites under Stargate, bringing the planned capacity near 7 GW. Meanwhile, NVIDIA and OpenAI signed a letter of intent to deploy 10 GW of NVIDIA systems, with a potential investment of up to $100 billion as capacity comes online.
Why it matters
Concentrates risk in gigawatt-scale facilities that blend cloud, physical, and grid security.
Magnifies scrutiny on energy, water, and critical infrastructure dependencies.
Forces regulators to revisit export, antitrust, and compute-access guardrails.
What to do about it
Add physical and grid resilience assumptions to your AI continuity plans.
Negotiate transparency on supply-chain and sovereignty controls for training runs.
Push for energy-use and emissions disclosures in vendor DPAs.
Rock’s musings:
Ten gigawatts means new playbooks. Your model risk now depends on power contracts, transformer lead times, and who shows up at a substation protest. Security teams can’t treat this like another cloud region. This is critical infrastructure.
I want to see pre-negotiated failover plans, tamper-evident chain-of-custody for training data, and site-specific red teaming with local authorities. If you’re betting your roadmap on Stargate, plan for a long tail of operational risk.
The one thing you won’t hear about, but you need to
Meta missed a Senate deadline on kids’ chatbot safety records
Summary:
Business Insider reports that Meta failed to meet a September 19th deadline to produce internal AI chatbot policy documents requested by Sen. Josh Hawley, who is probing child-safety risks from chatbots. Hawley’s office confirms that expanded document requests have been made across major vendors.
Why it matters
Signals escalating federal oversight targeting AI harms to minors.
Increases litigation and reputational risk for any vendor with teen-facing AI.
Sets a precedent for detailed internal policy discovery on AI guardrails.
What to do about it
If your product touches minors, document age-gating controls, escalation paths, and law-enforcement triage.
Run a mock regulatory request to test how fast you can produce policies, training data governance, and audit logs.
Align safety commitments with your legislative posture to avoid “say-do” gaps.
Rock’s musings:
Parents, prosecutors, and politicians are converging. If your AI can talk to a kid, assume your internal policies become public exhibits. I don’t care how slick your brand is. If your logs show the model encouraged self-harm or sexual content, you’re done.
The fix isn’t a PR blog. It’s hard controls, robust testing with pediatric experts, and rapid escalation when signals fire. And yes, it’s telling Congress what you’re doing before they ask under oath.
Final Musings
This was a week of guardrails, not platitudes. If California enacts SB 53 and SB 7, enterprises will need safety plans and ADS playbooks that stand up in court. With NIST’s overlays taking shape, your auditors will finally have the checklist they’ve wanted. DeepMind’s risk update and the OECD report give executives a shared vocabulary. Stargate’s scale means AI risk now includes transformers and transmission lines, not just tokens and prompts.
Next 30 days: map California duties to your portfolio, pre-draft disclosures, and stand up a near-miss log. Pilot 800-53 overlays on one high-risk app. Require weekly safety dashboards from vendors, including override drills and persuasive-harm metrics. Lock down agents with allowlists, secrets isolation, and egress monitoring. Stress test power, site, and supply dependencies in your continuity plans. Align your policy stance with your product claims, then package evidence for boards and regulators.
If you want deeper dives, RockCyber’s posts and the RockCyber Musings feed are your best launchpad for board-ready action items.
👉 What do you think? Ping me with the story that keeps you up at night, or the one you think I overrated.
👉 The Wrap-Up drops every Friday. Stay safe, stay skeptical.
👉 For deeper dives, visit RockCyber.