Claude Code supports 21-agent pipelines at Code with Claude San Francisco
At Code with Claude San Francisco, builders showed Claude Code running 21-agent app pipelines across Figma, Jira, Confluence, and TestFlight. Users should watch for reliability strain as posts and conference recaps tie recent slowdowns to Anthropic's reported 80x growth.

TL;DR
- At Anthropic's Code with Claude event, the official agenda put Claude Code, GitHub-scale builds, and Managed Agents at the center of the day, while ClaudeDevs' rate-limit update simultaneously announced doubled 5-hour limits and the removal of peak-hour reductions for Pro and Max.
- The most concrete creator workflow in the evidence came from aakashgupta's 21-agent breakdown, which claims a hockey-rules iOS app went from idea to TestFlight in one session using Claude Code plus Confluence, Jira, Figma Make, Simulator, and TestFlight.
- According to aakashgupta's org-chart post, the trick was not one giant prompt but a staffed roster, with a system analyst, spec architect, designers, implementation agents, a test architect, and a product council.
- Reliability strain sat next to the demos all week: petergyang's event quote said Dario Amodei reported 80x growth in usage and revenue earlier this year, while levelsio's slowdown report and om_patel5's Reddit screenshot captured users describing Claude Code as slow, limit-prone, and instruction-forgetful.
- The most useful implementation detail may be aakashgupta's markdown-file graphic: each agent is just a markdown file, which turns the reusable part of the workflow into a portable folder instead of a single brittle conversation.
You can browse the Code with Claude agenda, the xAI post on the Anthropic compute partnership, and the short YouTube clips linked from aakashgupta's workflow post and aakashgupta's TestFlight thread. The weirdest split in the evidence is simple: the conference showed Claude Code behaving like a production pipeline, while the same week also produced a flood of posts about degraded speed, revised limits, and emergency compute expansion.
Code with Claude
Code with Claude was not pitched as a generic developer conference. ClaudeDevs' event kickoff put four sessions on the board: keynote, What's new in Claude Code, Building on Claude at GitHub scale, and Get to production faster with Managed Agents.
By mid-event, bcherny's event post paired the in-person builder feedback with two concrete policy changes: doubled 5-hour limits across paid Claude Code plans, and no more peak-hours limit reduction for Pro and Max.
Those changes landed alongside Anthropic's compute expansion. xai's partnership post said SpaceXAI would provide Anthropic access to Colossus 1 for extra Claude capacity, and the linked xAI partnership post framed it as additional compute for Claude.
The 21-agent roster
The best evidence from the event week is not a benchmark. It is an org chart. In aakashgupta's workflow post, Gabor Mayer's Claude Code setup is described as 21 specialized agents rather than one coding chat.
The roles listed across the workflow post and the roster graphic break down into a familiar software team shape:
- System analyst, writes specs and requirements
- Product spec architect, checks whether the spec is actually clear
- CTO agent, makes technical decisions
- UX flow architect, builds clickable flows
- Designer and brand agents, keep screens and visual consistency aligned
- Implementation agents, write the code
- Code maintainability or "Spaghetti" agent, catches circular references, naming problems, and comment quality
- Performance agent, handles optimization
- Test architect, sets quality gates
- Product council, reviews privacy and data-handling constraints
That system analyst role shows up as the linchpin more than once. The OCR in the roster graphic says the agent asks clarifying questions one at a time, uses Atlassian and Figma MCP tools, and outputs Confluence docs plus Jira tickets with dependencies mapped.
Confluence, Jira, Figma, TestFlight
The workflow only becomes interesting once the handoffs are concrete. aakashgupta's event recap and his main thread describe the same chain: spec into Confluence, tickets into Jira, screens out of Figma Make, implementation in Claude Code, validation in Simulator, then TestFlight.
Across the posts, the pipeline has a few repeated mechanics:
- A system analyst ingests the spec and breaks it into requirements.
- Confluence stores the docs.
- Jira gets populated with frontend tickets, often with design links attached.
- Figma Make generates screens from brand guidance.
- Claude Code builds the screens and app logic.
- Simulator validates the result.
- TestFlight becomes the shipping surface.
aakashgupta's 72-minute infographic adds a sharper claim: idea to App Store submission in 72 minutes, across four sprints and three parallel tracks, with the coding phase described as the fastest part. The same graphic says screenshots attached to tickets were the unlock for getting brand-faithful builds instead of generic output.
Context compression
The most persuasive explanation for why this setup exists comes from aakashgupta's context-compression post. He argues that one long Claude conversation eventually summarizes itself, and details drop out. His example is specific: brand guidelines specified orange, Figma Make kept orange in the sheet, and the downstream output lost it.
That post frames the 21-agent structure as a context-management hack:
- each agent gets a fresh context
- each agent gets scoped permissions
- each agent gets one job
- no single context window has to carry the whole company
aakashgupta's Spaghetti agent clip makes the same point from the quality side. A dedicated maintainability agent checks circular references, naming conventions, and comments after changes, which is a narrower task than asking one session to plan, implement, review, and police itself.
Compute strain showed up in public
The event week also contained a less magical story. petergyang's event quote said Dario Amodei told the audience Anthropic saw 80x growth earlier this year on usage and revenue, and ClaudeDevs' update raised limits the same day.
Outside the event, users were already connecting performance complaints to traffic. levelsio's slowdown report described Claude Code as slow and risky enough to stop coding for the night, levelsio's follow-up speculated that European evening and American daytime overlap might explain the degradation, and DannyLimanseta's reply said the slowdown seemed to coincide with the US waking up.
A harsher version came from om_patel5's Reddit screenshot thread, which summarized a bug report from a longtime engineer claiming Opus 4.7 turned seconds into 30-second waits, minutes into 45-minute runs, and made Claude ignore instructions like timeout preferences and no-auto-commit rules. That is still a user report, not a product note, but it matches the broader pattern of posts about slower behavior and tighter limits during the same window.
Agent files
The last useful detail is the least glamorous one. According to aakashgupta's markdown-file graphic, each member of the 21-agent team is just a markdown file that defines role, behavior, constraints, tools, and output.
That turns the reusable asset into a folder:
system-analyst.md- design and brand agent files
- QA and test files
- maintainability and performance files
- review and council files
the same graphic explicitly pitches those files as portable across projects, with every workaround and painful lesson encoded into the next run. That is a more concrete takeaway than the usual multi-agent hype. The app can change, the stack can change, but a folder of agent definitions is easy to copy, edit, and version.