Skip to content
AI Primer
breaking

Anthropic reports Claude writes 90% of its code with lighter scaffolding

Anthropic said Claude now writes more than 90% of its code, while Alex Albert pushed lighter scaffolding, real-trace evals and self-check loops. Teams using Claude Design, Projects and skill descriptions should keep layout, routing and long-running agent work aligned.

6 min read
Anthropic reports Claude writes 90% of its code with lighter scaffolding
Anthropic reports Claude writes 90% of its code with lighter scaffolding

TL;DR

You can read the written interview, watch Alex Albert's full talk, browse Anthropic's Claude Design post, and skim the big HN discussion. The interesting bit is how consistent the advice is across all four surfaces: simplify the harness, keep memory and routing explicit, and let the agent see enough of its own output to correct course.

90% of the code

The most arresting number here is simple. Anthropic says Claude writes more than 90% of its code, per minchoi's summary of the linked Economic Times report.

The second half of the claim matters more than the first. That same report, as relayed by minchoi's post, says the company is still hiring, which matches the broader pattern in the rest of the evidence: once the model handles more execution, the work shifts into routing, review, eval design, tool access, and system upkeep.

Model and harness

In Peter Yang's interview, Alex Albert describes Anthropic as designing the model and the harness together. In that thread, Albert says the team thinks about how Claude is exposed across API, Claude Code, and Cowork, because each surface changes the end-user experience.

He also says Anthropic uses Claude to cluster user feedback, find recurring themes, and turn synthetic versions of user problems into evals, as summarized in petergyang's post and expanded in the written interview. That is a cleaner description of modern product work than most launch posts manage.

One more detail from the interview thread is easy to miss: Albert treats model character as a product issue, not a branding issue, because long-running agents make judgment calls. Once a model is acting across hours, files, and tools, personality turns into operational behavior.

Lighter scaffolding

Peter Yang's five takeaways from Albert's talk read like a teardown of the over-prompted app stack. The full talk and written notes line up with the thread.

  1. Shrink scaffolding, per the opening post.
  2. Audit systems for prompt slop and legacy workarounds, per the bonsai-tree post.
  3. Let the model decide when to think versus act, per the reasoning post.
  4. Build self-check loops, including computer-use QA for coding agents, per the feedback-loop post.
  5. Build evals from real customer traces, not generic benchmark theater, per the evals post.

The throughline is that stronger models punish stale scaffolding. A brittle wrapper that used to rescue weaker outputs can turn into token waste and routing noise once the base model gets better.

Skills route on descriptions

The most useful creator-facing detail in the evidence is not a benchmark. It is the claim in aakashgupta's routing post that Claude only scans the name and description of installed skills at session start, and loads the full instructions only after it decides a skill is relevant.

That makes the description field routing logic. Gupta's fixes across the routing post and the 75-test audit are concrete:

  • Describe the skill in plain language, not a cute one-liner.
  • Add trigger phrases a real user would type.
  • Put exclusions in the description, not buried in the body.
  • Point to the neighboring skill when a task belongs elsewhere.
  • Use an output template, or Claude will improvise a new format.
  • Prefer one strong input-output example over a pile of rules.
  • Keep skill files short enough that the bottom half still gets read.

That is boring infrastructure work, but it is the part that decides whether a reusable creative workflow actually fires.

Projects, memory, and style

What individual creators are doing with Claude is more specific than the usual "just add context" advice. Aakash Gupta's 10-step card maps a stack that starts with Cowork instead of chat, then layers skills, MCP connectors, a routing file, example-based memory, Claude Code, GitHub sync, and mobile Dispatch.

The Japanese thread from 7_eito_7's post lands on three smaller tactics that fit the same pattern:

  • Ask Claude to ask questions first, so the brief gets clarified before generation, per the thread opener.
  • Store role, goals, tone, output preferences, and disliked phrasing in Projects, per the Projects post.
  • Feed Claude three to five past pieces so it learns rhythm, vocabulary, line breaks, and endings, per the style-learning post.

Gupta's infographic adds one more detail from the card: the stack is meant to compound. Skills get better after connector setup, routing only scales once knowledge is split into files, and example-driven memory only starts paying off after enough good and bad examples accumulate.

Claude Design handoff

Y
Hacker News

Claude Design

1.2k upvotes · 762 comments

The clearest downstream effect for creatives showed up a month earlier in Anthropic's Claude Design announcement and the heavy HN discussion. In the HN summary, people describe Claude Design as a fast way to sketch layout directions, refine them with inline comments, and hand off polished artifacts instead of blank canvases.

Y
Hacker News

Discussion around Claude Design

1.2k upvotes · 762 comments

Three HN examples in the discussion roundup are worth pulling out as separate patterns:

  • A "backend-in-a-prompt" setup that spins up InstantDB and injects credentials.
  • A non-designer workflow where Claude Design explores layouts and Claude Code carries the result into implementation.
  • A design-to-implementation handoff where an app gets mocked visually first, then rebuilt in the business stack with existing tests and playbooks.

That last pattern is the real ending here. Anthropic's own team is talking about model-plus-harness, users are rebuilding their routing and memory layers, and Claude Design shows the same idea on the visual side: the valuable artifact is no longer just the output, it is the system around the output that makes the next pass faster.

Further reading

Discussion across the web

Where this story is being discussed, in original context.

On X· 3 threads
TL;DR2 posts
Lighter scaffolding3 posts
Projects, memory, and style3 posts
Share on X