Skip to content
AI Primer

Explore what's new in AI

Where people deep in AI come to stay current.

Filters

Category

Tags

Breaking

Local users report DeepSeek V4 Flash, Qwen 3.6, and Gemma 4 at 40-200 tok/s on Macs and 3090s

Developers posted new local-model measurements for DS4, Qwen 3.6, and Gemma 4: about 40 tok/s on an M3 Ultra, 70+ tok/s on MacBooks with MPS, and 120-200 tok/s for Qwen3.6-27B on a single RTX 3090. The numbers suggest coding-capable local runs are moving from demos toward regular use.

Local users report DeepSeek V4 Flash, Qwen 3.6, and Gemma 4 at 40-200 tok/s on Macs and 3090s
New
Gemma·10th May·6 min read
Breaking

ElevenLabs cuts Flash TTS 55%, Scribe 45%, and Agents 20% with pay-as-you-go billing

ElevenLabs lowered self-serve pricing for ElevenAPI and ElevenAgents and added pay-as-you-go billing. The biggest listed drops are to $0.05 per 1,000 tokens for Flash TTS, $0.22 for Scribe v2 speech-to-text, and $0.08 per minute for agent calls.

ElevenLabs cuts Flash TTS 55%, Scribe 45%, and Agents 20% with pay-as-you-go billing
New
Voice Agents·7th May·3 min read
Anthropic doubles Claude Code 5-hour limits after SpaceX Colossus 1 compute deal

Anthropic doubles Claude Code 5-hour limits after SpaceX Colossus 1 compute deal

Anthropic said a SpaceX compute deal will add 300+ MW and 220,000+ NVIDIA GPUs, and it doubled Claude Code 5-hour limits across paid plans. It also raised Opus API ceilings; users should still watch the unchanged weekly caps.

Claude Code6th May·4 min read
See all stories →

Briefs forMay 11

AI Primer mascot

Daily AI Digest

Get the best stories delivered
to your inbox

Skills Spotlighttop by stars

AI PrimerAI Primer

Your daily guide to AI tools, workflows, and creative inspiration.

© 2026 AI Primer. All rights reserved.