Ramp reports business AI token spend at 13x January 2025 levels
Ramp data and operator reports said enterprise AI token spending is rising far faster than budget controls and procurement cycles. Teams should plan for routing, cheaper defaults, and spend caps to become core engineering infrastructure.

TL;DR
- arakharazian's Ramp summary said the average business is spending 13x more on AI tokens than in January 2025, while Ramp's May 2026 AI Index also put Anthropic ahead of OpenAI in business adoption.
- Simon Willison's post argued coding agents already look like product-market fit, and Simon Willison's Weblog put his own last-30-day usage at about $2,180 worth of tokens for $200 in subscriptions.
- emollick's budgeting post and his follow-up both described the same problem inside large organizations: tokens have become mandatory for coding work faster than budgeting and control processes have adapted.
- Cost control is already showing up as routing and gateway infrastructure, according to GergelyOrosz's routing thread, Hacubu's LangSmith gateway retweet, and kimmonismus's Ramp recap on the rise of inference platforms selling access to cheaper open models.
- levie's enterprise thread added a second-order effect: once agents touch production systems, companies need new implementation roles for observability, access control, workflow redesign, and repeated rework every time models change.
You can read Ramp's AI Index, Simon Willison's product-market-fit post, and the sprawling Hacker News thread. The interesting split is that the top-line adoption story came with a cost-discipline warning inside Ramp's own framing, while operators on X and HN were already talking about routing layers, spend caps, and entire AI budgets getting torched early.
Ramp's index turned adoption into a cost story
Ramp's May 2026 index gave Anthropic 34.4% of business adoption versus OpenAI's 32.3%, but the more useful number was the spend curve: average business token spend was up 13x from January 2025, according to arakharazian's post linking back to Ramp's report.
That same summary said tokens are still less than 2% of total business spend even for the highest spenders. Ramp's framing, as relayed by arakharazian and kimmonismus, was that this share is small today but the growth rate is the part that looks hard to sustain.
kimmonismus also highlighted the contradiction inside the report: Anthropic was winning the business-adoption snapshot at the same moment inference platforms offering cheaper open models were among the fastest growers on Ramp's platform.
Budgeting broke before governance existed
Ethan Mollick's most concrete point was temporal. Tokens went from something companies did not budget for a year ago to something many teams now treat as required for coding, per his first post.
His follow-up claimed some large organizations had already blown through their entire token budget in the first couple months of the year, according to Mollick's follow-up. A third Mollick post pushed the next problem down to the manager level: who gets the expensive model, by skill level, by project, or by some other rule.
That same budget-cap logic showed up in TheEthanDing's CIO and CFO anecdote, which described enterprises waiting for someone to normalize explicit LLM spend ceilings in the $1,000 to $3,000 per user per month range. The point was not that one cap has won, but that token allocation is turning into a policy problem rather than a quiet API line item.
Routing and gateways are becoming the control plane
The operator playbook in the evidence pool was unusually consistent:
- Route simpler workloads to cheaper providers, as GergelyOrosz put it.
- Use cheaper open models through vendors like Fireworks and Baseten, again per that thread.
- Make the default model the cheap one, also from GergelyOrosz.
- Add a gateway layer so cost controls are enforced centrally, which Hacubu's retweet of Harrison Chase said is one of the main benefits customers get from LangSmith's LLM gateway.
Simon Willison's post explains why that infrastructure category is suddenly hot. He estimated that his own last 30 days of Claude Code and Codex use would have cost about $2,180 at API prices, while flat-rate plans cost him $200. In the accompanying Hacker News thread, commenters split between people saying they would gladly pay full price for the value and people arguing the economics invite open-weight substitution and smarter routing.
The HN discussion summary captured both sides cleanly: one branch focused on real value from heavy token use, while another focused on open-source competition and skepticism that premium model vendors keep those margins once cost discipline arrives.
AI deployment is creating a new implementation layer
The last useful reveal came from levie's thread, which shifted the story from token bills to org design. He argued that once enterprises move from chat-plus-search to agents wired into production systems, the work multiplies into data protection, access controls, observability, workflow redesign, human review points, and repeated upgrades whenever model capabilities change.
That thread also sketched who does the work. Some companies are repurposing internal IT talent, according to levie, but he also described demand for internal FDE-like roles, vendor-side applied AI architecture teams, and new services firms built around AI implementation. Rising token spend is only one line on the invoice, the bigger story is how much new operating machinery enterprise AI is dragging in behind it.