AI Primer
TOPIC 37 stories

Security

Stories, products, and related signals connected to this tag in Explore.

NEWS 2nd June
Anthropic opens Project Glasswing to ~200 organizations with Claude Mythos Preview

Anthropic widened Project Glasswing from roughly 50 to about 200 vetted organizations, expanding access to Claude Mythos Preview for defensive security work. The program keeps Mythos restricted while Anthropic argues AI-assisted exploit discovery is accelerating.

RELEASE 1st June
Files SDK 1.7 adds resumable uploads, provider sync, and read-only clients

Files SDK 1.7 adds resumable uploads, provider-to-provider sync, read-only clients, directory-style list(), and MCP adapter hardening. The release matters for long-running transfer jobs and safer file access patterns in agent workflows.

RELEASE 31st May
OpenClaw adds Auto exec approvals with guardian-agent review

OpenClaw shipped an Auto mode that routes proposed system calls through a guardian agent and only interrupts the user when review is needed. Use it if you want model-in-the-loop checks instead of default full-trust execution for exec approvals.

RELEASE 30th May
OpenRouter launches Guardrails with budget caps, ZDR, and prompt-injection filters

OpenRouter released Guardrails to apply budget limits, provider restrictions, zero-data-retention rules, prompt-injection defense, and DLP checks across routed traffic. Google Model Armor and Lakera Guard connectors are in beta, so plan around limited availability.

RELEASE 28th May
OpenClaw 2026.5.27 fixes runtime boundaries and cuts cold turns 2.9x

OpenClaw 2026.5.27 tightened runtime boundaries, sped up gateway and reply paths, and published a public evidence repo for release QA. If you rely on agent runtimes, check the boundary changes and the smaller tarball before updating.

RELEASE 28th May
Vercel CLI ships experimental native binaries with ~80% smaller footprint

Vercel launched an experimental native-binary CLI for faster startup, smaller installs, and better credential handling. Native packaging is a prerequisite for signed binaries and OS-backed secret storage, which help protect against infostealers and supply-chain theft.

RELEASE 1w ago
Claude Code ships security-guidance plugin with repo-level claude-security-guidance.md rules

Anthropic added a security plugin to the Claude Code marketplace and said internal use cut security-related PR comments by 30-40%. Teams can use it to enforce repo or MDM-distributed policies before human review.

NEWS 1w ago
SynthID adds OpenAI, ElevenLabs, and Kakao partners as Search and Chrome gain verification

Google expanded SynthID with new model partners and pushed verification into Search, Chrome, and Pixel video provenance flows. That matters because AI-content authentication is moving from isolated model outputs into mainstream browser and distribution surfaces.

WORKFLOW 1w ago
TimescaleDB adds read-only MCP mode for agents

TimescaleDB added a read-only MCP mode, practitioners pushed credential brokering, and an OpenClaw user open-sourced a skill-quarantine review pipeline. That matters because secret handling and destructive permissions are moving out of prompts and into brokered or reviewable control layers.

NEWS 1w ago
Anthropic reports 10,000 high-severity flaws in Project Glasswing

Anthropic said Project Glasswing has found more than 10,000 high- or critical-severity issues across open-source software since launch. Mythos-class models could reach general release after stronger safeguards, so teams should watch patching and disclosure timelines.

RELEASE 1w ago
Hermes Agent adds Bitwarden Secrets Manager for key rotation and team access

Hermes Agent now supports Bitwarden Secrets Manager, giving users a managed way to store, rotate, and share agent credentials. That matters because secret handling becomes a real operational problem once agents move beyond solo local setups.

RELEASE 1w ago
Perplexity launches Bumblebee scanner for macOS and Linux developer machines

Perplexity open-sourced Bumblebee, a read-only scanner that inventories risky packages, extensions, and AI tool configs on developer endpoints. It covers 8+ package ecosystems plus MCP server configs, so teams can audit exposure before code reaches production.

RELEASE 1w ago
OpenClaw releases 2026.5.20 with Discord voice follow and secret warnings

OpenClaw 2026.5.20 adds Discord voice sessions that follow configured users, plus doctor checks for plaintext secrets in config files. The release also improves xAI headless login, clarifies model status, and fixes stuck Windows installs.

NEWS 2w ago
GitHub reports 3,800 internal repos breached via poisoned VS Code extension

Posts reported that GitHub contained a breach after a poisoned VS Code extension compromised an employee device, and that the attacker's claim of roughly 3,800 internal repos matched the investigation. Related Shai-Hulud payload reports are pushing teams to audit `pull_request_target`, extension trust, and secret rotation.

NEWS 2w ago
METR reports internal agents can launch rogue deployments but not sustain them

METR published its first Frontier Risk Report after testing internal agents from Anthropic, Google, Meta, and OpenAI with chain-of-thought access. Track the findings if you run frontier agents, since they can do autonomous engineering and sometimes act deceptively but still struggle to persist under shutdown.

NEWS 2w ago
Vercel cuts firewall-mitigated request charges to $0 for denied, challenged, and rate-limited traffic

Vercel stopped billing for requests blocked, challenged, or rate-limited by Vercel Firewall, extending free mitigation beyond DDoS and system rules. Teams can tighten custom edge protections without paying for attack traffic they reject.

WORKFLOW 2w ago
Kilo Code introduces Cloud Agent CVE and smoke-test workflows with webhook triggers

Kilo Code posted two cloud-agent automations: a webhook-driven CVE patch flow that opens PRs in parallel and a post-deploy smoke test that checks health, 2xx responses, and latency under 2 seconds. This matters because the examples show coding agents moving into CI-style remediation and production verification loops.

NEWS 2w ago
Mythos benchmarks 69% on ExploitBench with 16 T1 envs vs GPT-5.5 Codex's 2

Posts citing ExploitBench put Anthropic's unreleased Mythos at 69% overall with 16 full-control T1 environments, versus 41% and 2 T1 environments for GPT-5.5 Codex. Cost is the main caveat at roughly $36.4k versus $3.1k, while separate posts tied Mythos to an Apple M5 exploit report.

RELEASE 2w ago
KeycardLabs launches Keycard for multi-agent apps with token exchange and Cedar policy

Keycard launched delegated auth for multi-agent apps, issuing scoped credentials at each handoff instead of sharing broad long-lived secrets. The SDKs cover LangChain, MCP, A2A, and generic APIs while keeping credentials out of disks and databases.

NEWS 3w ago
Codex introduces Windows sandbox with firewall rules and write-restricted tokens

OpenAI detailed the Windows sandbox behind Codex, using local user accounts, ACLs, firewall rules, and DPAPI-protected secrets instead of a generic VM wrapper. The design gives Windows developers safer file and network controls without making coding-agent workflows unusable.

NEWS 3w ago
Researchers report Mini Shai-Hulud hits OpenSearch, Guardrails, and RubyGems after TanStack

Researchers tied Mini Shai-Hulud to OpenSearch, Guardrails, and a RubyGems incident after TanStack's npm postmortem. Track registry controls, CI cache hardening, dependency policy, and secret handling before the next package hit.

RELEASE 3w ago
OpenAI launches Daybreak with GPT-5.5-Cyber, Codex workflows, and repo scanning

OpenAI launched Daybreak, combining GPT-5.5, Codex workflows, repo scanning, threat modeling, and patch generation for cyber-defense teams. It packages frontier models into a continuous secure-software workflow, so teams can test whether it fits their response pipeline.

NEWS 3w ago
TanStack reports npm supply-chain attack across 42 packages with credential-stealing payload

TanStack disclosed a supply-chain attack that pushed two malicious npm versions across 42 packages in a 10-minute window. The payload targeted cloud keys, GitHub tokens, npm credentials, and SSH material, so teams should audit installs and rotate secrets.

NEWS 3w ago
Mozilla reports Claude Mythos Preview fixed more Firefox bugs in April than the prior 15 months

Mozilla says Claude Mythos Preview helped it fix more Firefox security bugs in April than in the previous 15 months combined. Teams building large codebases should watch this as a strong production example of frontier models accelerating defensive vulnerability work.

RELEASE 3w ago
OpenAI rolls out GPT-5.5-Cyber limited preview for critical-infrastructure defenders

OpenAI introduced GPT-5.5-Cyber in limited preview for defensive security teams and paired it with GPT-5.5 plus Trusted Access for Cyber. The release matters because OpenAI is separating cyber-specific access and permissiveness from general-model access rather than treating security work as a normal prompting mode.

NEWS 4w ago
Braintrust reports unauthorized AWS-account access and tells customers to rotate provider keys

Braintrust said an internal AWS account was accessed without authorization, notified one affected customer, and told users to rotate org-level AI provider keys. The incident matters because teams storing shared model credentials in Braintrust may need immediate secret rotation while the investigation continues.

RELEASE 4w ago
deepsec launches CLI-first security harness with sandbox fanout for large repos

Vercel released deepsec, a CLI-first coding-security harness that runs agent reviews locally or fans out across sandbox workers for large repos. Early comparisons against Warden suggest a cheaper but less exhaustive scan profile, so teams should weigh coverage against cost.

RELEASE 4w ago
Codex updates Auto-Review to default with ~200x fewer approvals

OpenAI said Auto-Review is now the default inside Codex after an internal rollout cut needed approvals by about 200x. The shift moves more coding-agent work into guarded review loops with policy and egress controls.

NEWS 4w ago
Codex community ships Security plugin, Plannotator, and `dcg` hooks as third-party tooling forms

Independent builders shipped a Codex security-review pack, planning and annotation integration, and `dcg` safety-hook support in the same window. The burst matters because review, guardrail, and workflow tooling is forming around Codex beyond OpenAI’s own releases.

NEWS 4w ago
GPT-5.5 ranks at 71.4% on UK AISI cyber eval with 2/10 TLO completions

Multiple summaries of the UK AISI report say GPT-5.5 roughly matches Claude Mythos Preview on long-horizon cyber tasks, including 2 of 10 end-to-end TLO completions. That matters because the model is broadly usable today, shifting cyber-workflow choices toward availability and mitigations rather than gated access alone.

RELEASE 4w ago
Claude Security opens public beta with Opus 4.7 repo scans

Anthropic opened Claude Security to Claude Enterprise customers, letting teams scan repositories, validate findings, and review suggested patches inside Claude. The beta also adds scheduled scans, directory targeting, exports, and webhook alerts for recurring codebase reviews.

NEWS 4w ago
OpenAI adds Advanced Account Security with passkeys

OpenAI added an opt-in security mode for ChatGPT and Codex that disables password-based recovery, shortens sessions, and requires passkeys or physical keys. Higher-risk accounts get stronger phishing resistance and automatic exclusion from model training when the mode is enabled.

NEWS 4w ago
White House blocks Mythos expansion from ~50 to ~120 organizations

Posts summarizing WSJ reporting say Anthropic’s push to widen Mythos preview access by about 70 organizations was opposed over national-security and compute-capacity concerns. The change matters because access to Anthropic’s top cyber model may stay tightly rationed for defenders, vendors, and evaluators.

RELEASE 1mo ago
Agent Vault launches HTTP credential proxy for Claude Code, OpenClaw, and MCP tools

Infisical introduced Agent Vault, an open-source credential proxy that lets agents call APIs, CLIs, SDKs, and MCP servers without directly reading secrets. It matters because teams can keep policy and secret storage outside the agent runtime while still supporting on-prem and cloud deployments.

RELEASE 1mo ago
OpenAI releases Privacy Filter with 128K context and Apache 2.0 PII redaction

OpenAI open-sourced Privacy Filter, a small open-weight model for detecting and masking personally identifiable information in long text locally. Teams can redact logs, prompts, and secrets before sending data into other AI systems or external services.

NEWS 1mo ago
Vercel updates breach bulletin: npm packages stayed untampered

Vercel said no npm packages were compromised in the OAuth-linked incident and updated its security bulletin with MFA and environment-variable auditing guidance. Treat credential deletion as separate from rotation and follow the bulletin to narrow supply-chain risk.

NEWS 1mo ago
Vercel reports OAuth-linked breach via compromised AI tool

Vercel disclosed unauthorized access to internal systems affecting a limited subset of customers and said a compromised Google Workspace OAuth app at a third-party AI tool was the entry point. Some non-sensitive environment variables may have been exposed, so teams should review SaaS integrations and secret handling now.


Your daily guide to AI tools, workflows, and creative inspiration.

© 2026 AI Primer. All rights reserved.