Fresh stories
Nous Research releases TST with 2-3x pretraining speedup at matched FLOPs
Nous Research introduced Token Superposition Training, which bags tokens early in pretraining before returning to next-token prediction. The team says TST cuts wall-clock training 2-3x at matched FLOPs while leaving the deployed model unchanged.

LangChain launches SmithDB, LangSmith Engine, and Sandboxes at Interrupt
LangChain unveiled SmithDB, LangSmith Engine, Managed Deep Agents, and GA sandboxes at Interrupt. The stack gives agent teams a purpose-built trace database, autonomous failure triage, and managed execution environments for production workflows.

Cursor launches cloud development environments with rollback and scoped secrets
Cursor added reusable cloud development environments for agents with multi-repo setup, rollback, and scoped secrets. The update moves cloud agents closer to laptop-style setups while keeping long-running work isolated and auditable.


Anthropic adds $20-$200 monthly Claude Agent SDK credits starting June 15
Anthropic will move Claude Agent SDK, claude -p, GitHub Actions, and third-party agent apps onto separate monthly credits on June 15. Watch the new bucket closely, since it changes the cost model for autonomous runs and subscription-backed harnesses.

Notion launches Developer Platform with External Agents API and Workers
Notion opened a developer platform with an External Agents API plus Workers, webhooks, and a headless CLI. The release lets external agents query Notion, extend workflows, and stay in sync with other systems.

Cline launches SDK and hits 74.2% on Terminal-Bench 2.0
OpenAI offers 2 free months of Codex to enterprise switchers
Codex introduces Windows sandbox with firewall rules and write-restricted tokens
Top stories this week
Google introduces Gemini Intelligence on Android with browser use, AppFunctions, and Rambler
Google unveiled Gemini Intelligence at the Android Show with cross-app task automation, Gemini in Chrome, Rambler voice cleanup, custom widgets, and AppFunctions. The rollout moves Gemini into core Android workflows on Pixel and Galaxy devices this summer.


Perceptron releases Mk1 with 2 FPS video reasoning, 32K context, and $0.15 per 1M input
Perceptron launched Mk1, a multimodal model for video and embodied reasoning with native 2 FPS video, 32K context, and structured spatial outputs. OpenRouter access and the low input price make it usable for deployment, not just demos.

Researchers report Mini Shai-Hulud hits OpenSearch, Guardrails, and RubyGems after TanStack
Researchers tied Mini Shai-Hulud to OpenSearch, Guardrails, and a RubyGems incident after TanStack's npm postmortem. Track registry controls, CI cache hardening, dependency policy, and secret handling before the next package hit.

Claude Opus 4.7 opens fast mode with ~2.5x speed as Cursor, v0, Droid, and OpenRouter add support
Anthropic rolled fast mode for Opus 4.7 into Claude Code and tools including Cursor, v0, Droid, Conductor, and OpenRouter. Use it where latency matters, but watch pricing: Cursor disclosed a 6x multiplier and others treat it as premium.

SophontAI releases Medmarks v1.0 with 30 medical benchmarks and 61 models
SophontAI released Medmarks v1.0, expanding its open medical LLM evaluation suite to 30 benchmarks and 61 models alongside a technical report. It gives teams a larger open baseline for medical post-training and model selection, with more benchmarks and model coverage still planned.









