Update: June 10, 2026

Engineers compare Fable 5, Codex, and local models on token bills

New posts compare Fable 5's $10/$50 pricing and fast spend caps with large Codex and Claude token bills, while local-model advocates push hybrid routing and smaller open weights. The comparison is increasingly framed around cost per completed task, not benchmark scores alone.

TL;DR

Anthropic's launch post puts Claude Fable 5 at $10 per million input tokens and $50 per million output tokens, double the list price Simon Willison quoted for the Opus 4.x line in his first-impressions post.
According to fresh HN discussion, engineers describe Fable 5 as better than Opus 4.8 at high-level reasoning, plan review, and ugly debugging sessions, while the main HN discussion roundup keeps circling back to the same catch: it burns through budget fast.
In his product-market-fit post, Simon Willison estimated that 30 days of his Claude Code and Codex usage would have cost about $2,180.16 at API token rates, against $200 in combined Max and Pro subscription fees, which is why flat monthly plans still look like a bargain for heavy agent users.
The local-model camp in the HN summary of Local AI Needs to be the Norm is making a different argument entirely: route narrow tasks to smaller local models, keep tool use and retrieval close to the machine, and escalate only the hard calls to frontier APIs.
Anthropic buried two expensive footnotes outside the headline pricing, a new classifier-fallback path and a mandatory 30-day retention policy for Mythos-class traffic, in its model docs and launch post.

You can read Anthropic's launch post, skim the API model guide, and then compare that vendor framing with Simon Willison's hands-on writeup, his earlier coding-agent cost post, and the HN local-first argument. The weird bit is that all three camps are talking about the same thing now: not benchmark supremacy, but how many expensive agentic runs you can afford before the model choice stops penciling out.

Fable 5's price card

Anthropic shipped Claude Fable 5 and Claude Mythos 5 as the same underlying model with different access and safeguards. Fable 5 is public through the API, while Mythos 5 starts behind Project Glasswing and a trusted-access program.

Anthropic Launches Claude Fable 5 and Restricted Claude Mythos 5 Models

Anthropic has released Claude Fable 5, a Mythos-class model optimized for general use, and Claude Mythos 5, a version with specialized safeguards removed for specific use cases. Claude Fable 5 is available to the public via the Claude API. Claude Mythos 5, which features enhanced cybersecurity capabilities, is currently restricted to Project Glasswing partners and will soon be available to selected biology researchers under a restricted trusted access program. Both models are priced at $10 per million input tokens and $50 per million output tokens.

The public sticker price is simple:

Input: $10 per million tokens
Output: $50 per million tokens
Context window: 1 million tokens
Max output: 128,000 tokens

Anthropic's model guide confirms the same numbers. In Simon Willison's first-impressions post, Willison notes that this is twice the Opus 4.5 through 4.8 price band, even before you get to any long-running agent loop.
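
To make the price card concrete, here is a minimal sketch of the per-request arithmetic. The Fable 5 rates come from the launch post; the $5/$25 comparison rates are simply half of those, derived from Willison's "twice the price" remark, and the token counts are made-up illustration values.

```python
# List prices quoted in the launch post, in USD per million tokens.
FABLE5_INPUT_PER_MTOK = 10.00
FABLE5_OUTPUT_PER_MTOK = 50.00


def request_cost(input_tokens: int, output_tokens: int,
                 input_per_mtok: float = FABLE5_INPUT_PER_MTOK,
                 output_per_mtok: float = FABLE5_OUTPUT_PER_MTOK) -> float:
    """Return the list-price cost in USD of a single request."""
    total = input_tokens * input_per_mtok + output_tokens * output_per_mtok
    return total / 1_000_000


# Hypothetical agent turn: 200k tokens of context in, 8k tokens out.
fable_turn = request_cost(200_000, 8_000)

# Same turn at half the rates (assumed Opus 4.x band, derived from "twice the price").
opus_turn = request_cost(200_000, 8_000, input_per_mtok=5.00, output_per_mtok=25.00)

# Worst case for one call: a full 1M-token context and the 128,000-token output cap.
max_call = request_cost(1_000_000, 128_000)

print(f"Fable 5 turn: ${fable_turn:.2f}, Opus-band turn: ${opus_turn:.2f}, max call: ${max_call:.2f}")
```

An agent loop repeats a turn like this many times, which is why the doubling matters more for long traces than for one-off prompts.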

What engineers say the extra spend buys

The strongest pro-Fable argument in the evidence pool is not a benchmark chart. It is engineers saying the model changes how they work through a hard problem.

Fresh discussion on Claude Fable 5

Today's fresh signal is mostly hands-on reports from people using Fable 5 in Claude Code and Claude.ai. Several commenters say it materially improves higher-level reasoning, plan review, and complex bug work; one describes it as strong enough to set up its own testing lab for a Windows process lifecycle issue, and another says it can find directional simplifications after Opus and Codex have exhausted obvious fixes. The other new theme is friction: people report hitting spend or usage limits very quickly, and some argue the model is too expensive to be a default choice. A separate comment questions Anthropic's benchmark changes and moving scores into the PDF, while another raises legal concerns about the model's data-retention and access policies.

Initial impressions of Claude Fable 5

I didn't have early access to today's Claude Fable 5 release, but I've spent the past ~5.5 hours putting it through its paces. My initial impressions are that this is something of a beast. It's slow, expensive and has been quite happily churning through everything I've thrown at it so far. As is frequently the case with current frontier models the challenge is finding tasks that it can't do.

First, let's review the key characteristics. Anthropic claim that Claude Fable 5 offers the same performance as Claude Mythos 5, except with much more strict guardrails in place to prevent it being used for harmful things. Those guardrails trigger often enough that the Claude API has new mechanisms for letting you know when you hit them, and even has a new option to request it falls back to another model automatically if something gets rejected. Claude Mythos 5 is out today as well, Anthropic say it "Shares Claude Fable 5's capabilities without the safety classifiers".

The models have a 1 million token context window, 128,000 maximum output tokens and a knowledge cut-off date of January 2026. They are priced at twice the price of Claude Opus 4.5/4.6/4.7/4.8: $10/million input tokens and $50/million output tokens. There's no increase in price for longer context usage. Other than that the upgrade guide is substantially thinner than the similar guide for Opus 4.8.

The big model smell

The best way to describe Fable is that it feels big. Not just in terms of speed and cost, but also in how muc

Three recurring claims show up in the HN thread and Willison's post:

Better high-level reasoning and plan review than Opus 4.8, per fresh HN discussion
More willingness to pivot toward larger simplifications instead of local patching, according to the HN discussion roundup
Stronger long-horizon debugging, including the report in fresh HN discussion that it built a whole test harness around a Windows lifecycle bug before proposing fixes

That is Christmas-come-early material for coding-agent nerds. It also sounds expensive on purpose: the model is being praised for doing more work, over longer traces, with bigger context and more elaborate self-checking.

The subscription arbitrage

One reason sticker price is not the whole story is that many heavy users are not paying list API rates for every run. They are getting frontier-model usage through monthly plans, then back-solving what that usage would have cost at token rates.

I think Anthropic and OpenAI have found product-market fit

Simon Willison posits that OpenAI and Anthropic have achieved product-market fit as of April 2026, driven by the adoption of coding and general-purpose agent products like Claude Code/Cowork and Codex. This shift is characterized by significant enterprise spending on API usage, leading to reports of companies facing unexpected costs. Willison contrasts this with the earlier consumer-focused success of ChatGPT, noting that current revenue is now substantial enough to potentially cover the high operational costs of these frontier labs.

Discussion around I think Anthropic and OpenAI have found product-market fit

Thread discussion highlights:
- simonw on Real-world token usage and perceived value: Simonw shares very large Claude Code and OpenAI Codex token totals and says he believes he got real value for the API price, even at full price.
- trjordan on Skepticism about required spending: Argues the labs may need around $1T/year in token spend to recoup their hardware buildouts, and says current productivity gains may be too small to justify that scale.
- binary0010 on Open-source models as a cheaper alternative: Questions how OpenAI and Anthropic keep customers when models like GLM-5.1 and other open-source systems are already good enough for heavy agentic work and much cheaper.

In Simon Willison's post, he ran ccusage on his laptop and estimated about $2,180.16 in Claude Code and Codex tokens over 30 days while paying $100 for Anthropic Max and $100 for OpenAI Pro. The HN thread around that post adds the same pattern from other users: these tools chew through credits on test suites, compiler runs, and large agentic loops, but the perceived value can still beat pay-as-you-go if you live in them all day.
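
The size of that gap is easy to work out from the figures quoted above. This is a minimal sketch of the arithmetic only; the variable names are illustrative, and it says nothing about how ccusage itself computes its totals.

```python
# Figures quoted from Willison's post.
api_equivalent_usd = 2180.16   # 30 days of Claude Code + Codex usage at token rates
subscription_usd = 100 + 100   # Anthropic Max plus OpenAI Pro, per month

# How many dollars of list-price usage each subscription dollar bought.
usage_multiple = api_equivalent_usd / subscription_usd

# Share of the list-price bill that the flat plans covered.
effective_discount = 1 - subscription_usd / api_equivalent_usd

print(f"{usage_multiple:.1f}x usage per subscription dollar, "
      f"{effective_discount:.0%} below list price")
```

On these numbers the flat plans delivered roughly eleven times their price in token-rate usage, which is the arbitrage the section title refers to.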

That is why the main Fable 5 HN thread reads less like a launch thread and more like a cost-accounting session. Engineers are comparing models by completed work per dollar, not by who won one more vendor benchmark.
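
One way to express "completed work per dollar" is to divide total spend, including failed runs, by the number of tasks that actually finished. The sketch below assumes a hypothetical `Run` record and uses invented numbers; neither comes from the posts, and real comparisons would need logged costs and an agreed definition of "completed".

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Run:
    """One agent run: what it cost and whether it finished the task (hypothetical record)."""
    cost_usd: float
    completed: bool


def cost_per_completed_task(runs: Iterable[Run]) -> float:
    """Total spend, failed runs included, divided by the number of completed tasks."""
    runs = list(runs)
    total = sum(r.cost_usd for r in runs)
    done = sum(1 for r in runs if r.completed)
    # With no completed tasks the cost per task is unbounded.
    return total / done if done else float("inf")


# Invented example: a cheap model that finishes 4 of 10 tasks
# versus a model at twice the per-run cost that finishes 9 of 10.
cheap = [Run(0.50, i < 4) for i in range(10)]
pricey = [Run(1.00, i < 9) for i in range(10)]

cheap_cpt = cost_per_completed_task(cheap)    # 5.00 / 4
pricey_cpt = cost_per_completed_task(pricey)  # 10.00 / 9
```

In this made-up case the model with double the per-run cost is still cheaper per completed task, which is the shape of the argument engineers are making in the thread.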

Local routing

The local-first counterargument is not that frontier models are weak. It is that too many product teams are sending cheap tasks to expensive models because chat APIs are the default abstraction.

Local AI needs to be the norm

The engineering takeaway is that many product features may be better built as local, task-specific inference pipelines rather than cloud chat integrations. The discussion highlights narrow-model design, tool use/RAG, OS-level APIs, and hybrid routing as the likely implementation path, while also flagging real constraints around memory, storage, and model refresh economics.

Discussion around Local AI needs to be the norm

Thread discussion highlights:
- wrxd on small specialized models: local models to succeed they need to be "good enough" ... able to do a small task well ... tool use ... did way more to solve hallucinations than getting a bigger model
- FrasiertheLion on OS/browser model APIs: bullish on standardized local APIs that ship with the browser or platform ... split along two axes: does the task touch private data, and does it need frontier intelligence?
- try-working on hybrid local/cloud routing: building a protocol and router runtime for hybrid local/cloud AI ... assign roles to models based on tasks, capabilities and observed performance

The local-first stack described in Local AI Needs to be the Norm and its HN thread breaks that into a routing problem:

Use small local models for narrow tasks that only need to be good enough
Lean on tool use and retrieval to reduce hallucinations instead of scaling the base model first
Keep privacy-sensitive work on device when possible
Route only the hard, frontier-intelligence calls to cloud models
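
The routing rules above can be sketched as a small decision function over the two axes raised in the thread: does the task touch private data, and does it need frontier intelligence? Everything here (`Task`, `Target`, `route`, and the `allow_private_escalation` switch) is a hypothetical illustration, not the protocol or router any commenter is building.

```python
from dataclasses import dataclass
from enum import Enum


class Target(Enum):
    LOCAL_SMALL = "local-small"
    CLOUD_FRONTIER = "cloud-frontier"


@dataclass
class Task:
    name: str
    touches_private_data: bool
    needs_frontier_intelligence: bool


def route(task: Task, allow_private_escalation: bool = False) -> Target:
    """Pick a target for a task; escalate to the cloud only when it is both needed and allowed."""
    # Narrow, good-enough tasks stay on the small local model.
    if not task.needs_frontier_intelligence:
        return Target.LOCAL_SMALL
    # Hard task on private data: stay on device unless policy explicitly allows escalation.
    # (This tie-break is an assumption; the source only says "when possible".)
    if task.touches_private_data and not allow_private_escalation:
        return Target.LOCAL_SMALL
    # Hard task that is allowed to leave the machine goes to the frontier API.
    return Target.CLOUD_FRONTIER


autocomplete = Task("autocomplete", touches_private_data=True, needs_frontier_intelligence=False)
hard_debug = Task("cross-service debugging", touches_private_data=False, needs_frontier_intelligence=True)
private_hard = Task("contract review", touches_private_data=True, needs_frontier_intelligence=True)
```

A production router would likely also weigh observed performance per model, as the try-working comment suggests, rather than rely on two static flags.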

HN commenters add two practical constraints to that story. According to the HN discussion of local-first design, standardized OS or browser model APIs would make hybrid routing easier, while the same thread's hardware discussion points out that memory, VRAM, and latency still make serious local inference painful for many teams.

Policy and billing footnotes

Anthropic's own docs surface two less glamorous details that showed up in the HN criticism almost immediately.

Claude Fable 5

For AI engineers, the thread is mainly a practical read on where Fable 5 sits in the model ladder: commenters describe better high-level reasoning, plan review, and complex debugging than Opus 4.8, but at a much higher cost and with tight spend/usage limits. The discussion also highlights benchmark skepticism and policy friction, which matters if you're choosing a model for production workflows or comparing vendor claims.

First, Fable 5 can refuse requests through new safety classifiers, and Anthropic's cookbook guide says integrators need to handle direct refusals, optional server-side fallback, and revised billing behavior. Anthropic also says blocked requests are not billed for input tokens, and fallback requests can be billed like cache reads in some flows.
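
For integrators, the control flow this implies looks roughly like the sketch below. It is a client-side illustration under loud assumptions: `ModelResult`, `complete_with_fallback`, and the stub models are all hypothetical, no real Anthropic SDK calls or response fields are used, and the server-side fallback option mentioned above is not modeled because its parameters are not described here.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ModelResult:
    """Hypothetical normalized response; real API fields will differ."""
    text: Optional[str]
    refused: bool
    model: str


ModelCall = Callable[[str], ModelResult]


def complete_with_fallback(prompt: str, primary: ModelCall,
                           fallback: Optional[ModelCall] = None) -> ModelResult:
    """Try the primary model; on a refusal, retry once on the fallback if one is configured."""
    result = primary(prompt)
    if not result.refused or fallback is None:
        # Either the request succeeded, or the caller wants to see the refusal directly.
        return result
    return fallback(prompt)


# Stub models for illustration only: the strict one refuses any prompt containing "exploit".
def strict_stub(prompt: str) -> ModelResult:
    if "exploit" in prompt:
        return ModelResult(text=None, refused=True, model="strict-stub")
    return ModelResult(text="answer from strict stub", refused=False, model="strict-stub")


def lenient_stub(prompt: str) -> ModelResult:
    return ModelResult(text="answer from lenient stub", refused=False, model="lenient-stub")


ok = complete_with_fallback("summarize this diff", strict_stub, lenient_stub)
rerouted = complete_with_fallback("explain this exploit", strict_stub, lenient_stub)
surfaced = complete_with_fallback("explain this exploit", strict_stub)
```

Recording which model actually answered, as the `model` field does here, matters for cost accounting, since the two paths are billed differently.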

Second, Anthropic's launch post says Mythos-class models now require 30-day retention for all traffic on first- and third-party surfaces. Fresh HN discussion flagged that policy friction alongside benchmark skepticism on day one, which means the bill engineers are comparing is not only dollars and tokens. It also includes which requests can be sent at all, and under what data-handling terms.