Skip to content
AI Primer
update

Claude Code users report 7,000-session regressions and OpenClaw disconnects

Fresh Claude Code evidence includes a leaked npm source map, a 7,000-session read:edit analysis, and reproducible OpenClaw-triggered disconnects. Watch for hidden repo-scanning, anti-distillation, and session-routing controls to affect real coding sessions.

6 min read
Claude Code users report 7,000-session regressions and OpenClaw disconnects
Claude Code users report 7,000-session regressions and OpenClaw disconnects

TL;DR

  • The GitHub issue behind the 7,000-session report analyzed 6,852 Claude Code sessions, 17,871 thinking blocks, and 234,760 tool calls, then tied the regression to a shift from research-first behavior toward edit-first behavior.
  • In Anthropic's April 23 postmortem, the company said the API and inference layer were unaffected, but three product-layer changes did degrade Claude Code, Claude Agent SDK, and Claude Cowork until fixes landed in v2.1.116.
  • The source-map leak report says npm package version 2.1.88 shipped with a .map file whose sourcesContent exposed Claude Code's unobfuscated TypeScript, while the HN discussion surfaced anti-distillation fake_tools injection and prompt-logging regexes.
  • Theo Browne's reported reproduction said a repo containing OpenClaw strings could trigger immediate disconnects or drain usage, and the HN reproduction thread included a commit-message repro that sent session usage to 100%.
  • The OpenClaw subscription thread showed Anthropic also split third-party harness usage away from Claude subscriptions on April 4, moving those flows to separately billed extra usage or API keys.

You can read Anthropic's own postmortem, skim the original 7,000-session GitHub issue, and see the OpenClaw policy email reproduced on HN. The leaked code also spawned a reverse-engineering map in this architectural breakdown, while Claude's current docs still spell out the exposed control surface for effort levels and thinking defaults.

7,000 sessions, three bugs

Analysis of Claude Code Quality Regression in Complex Engineering Tasks (Issue #42796)

GitHub issue #42796 documents a significant quality regression in Claude Code identified in early 2026, where the model became less effective at complex engineering tasks. User-led analysis of nearly 7,000 session files suggested that the rollout of a thinking content redaction feature (redact-thinking-2026-02-12) correlated with a shift from research-first to edit-first behavior, evidenced by a 70% reduction in the read-to-edit ratio. The issue served as a reference point for later discussions regarding model performance, tool usage, and auditability in the Claude Code community.

Discussion around Issue: Claude Code is unusable for complex engineering tasks with Feb updates

Thread discussion highlights: - noxa on report author explains methodology: You can watch for these yourself - they are strong indicators of shallow thinking... read:edit ratio shifts, thinking character shifts before the redaction, post-redaction correlation... I had logs back to a bit before the degradation started and was able to mine them. - bcherny on Anthropic response: `redact-thinking-2026-02-12` ... hides thinking from the UI... It *does not* impact thinking itself, nor does it impact thinking budgets or the way extended reasoning works under the hood. It is a UI-only change. - thrtythreeforty on reported regression in Opus 4.6: I noticed this almost immediately when attempting to switch to Opus 4.6. It seems very post-trained to hack something together; I also noticed that "simplest fix" appeared frequently and invariably preceded some horrible slop.

The report behind issue #42796 put numbers on the vibe shift users had been describing for weeks: 6,852 session files, 17,871 thinking blocks, and 234,760 tool calls across January to March. Its core claim was behavioral, not benchmark-shaped. Claude Code was reading less, editing sooner, and abandoning more long engineering tasks after the February reasoning changes.

noxa's methodology note in the HN thread said the strongest signals came from read:edit ratios, shifts in visible thinking before redaction, and log history from before the degradation started. Anthropic's later postmortem said the investigation eventually landed on three separate product changes:

  • On March 4, Claude Code's default reasoning effort moved from high to medium, then reverted on April 7.
  • On March 26, a cache-related change started clearing old reasoning every turn after an idle-session threshold, then got fixed on April 10 in v2.1.101.
  • On April 16, a prompt instruction capped text between tool calls at 25 words and final responses at 100 words, then got reverted on April 20.

That postmortem matters because it resolved the biggest open question in the original report. Anthropic said the API and inference layer were not degraded, but the Claude Code harness and adjacent product layers were.

Effort levels and hidden memory loss

Issue: Claude Code is unusable for complex engineering tasks with Feb updates

Relevant as a report of possible quality regression in an AI coding assistant, including claims about reasoning redaction, tool-usage shifts, and practical mitigation knobs like effort level and adaptive-thinking settings.

Discussion around Issue: Claude Code is unusable for complex engineering tasks with Feb updates

Thread discussion highlights: - noxa on report author explains methodology: You can watch for these yourself - they are strong indicators of shallow thinking... read:edit ratio shifts, thinking character shifts before the redaction, post-redaction correlation... I had logs back to a bit before the degradation started and was able to mine them. - bcherny on Anthropic response: `redact-thinking-2026-02-12` ... hides thinking from the UI... It *does not* impact thinking itself, nor does it impact thinking budgets or the way extended reasoning works under the hood. It is a UI-only change. - thrtythreeforty on reported regression in Opus 4.6: I noticed this almost immediately when attempting to switch to Opus 4.6. It seems very post-trained to hack something together; I also noticed that "simplest fix" appeared frequently and invariably preceded some horrible slop.

The regression thread turned up two different kinds of control surface: documented effort controls, and a memory bug users could only infer from behavior. robeym's workaround comment in the HN thread pinned reliability gains on CLAUDE_CODE_EFFORT_LEVEL=max, CLAUDE_CODE_DISABLE_BACKGROUND_TASKS=1, and especially CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=1.

Claude's docs corroborate part of that surface. The model configuration page says effort levels tune adaptive reasoning, lists low, medium, high, xhigh, and max for newer models, and notes that Opus 4.6 and Sonnet 4.6 only exposed low, medium, high, and max. The settings docs add alwaysThinkingEnabled, persistent effortLevel, and the CLAUDE_CODE_EFFORT_LEVEL override.

The stranger part was the stale-session bug Anthropic described later. In the postmortem, the company said a session idle for over an hour could start sending clear_thinking_20251015 with keep:1 on every later turn, which progressively discarded prior reasoning and also forced cache misses that likely burned usage faster.

Source maps, fake tools, and prompt logging

Chaofan Shou (@Fried_rice) reports Claude Code source code leaked via an npm source map file

Chaofan Shou (@Fried_rice) publicly disclosed that Anthropic’s Claude Code npm package included a source map file in the published artifact, which allowed extraction of the full, unobfuscated TypeScript source contained in the map’s sourcesContent. The issue was reported as a “map file in their npm registry” exposing the code.

Discussion around Claude Code's source code has been leaked via a map file in their NPM registry

Thread discussion highlights: - cedws on anti-distillation defenses: Anthropic's anti-distillation defense in Claude Code injects `anti_distillation: ['fake_tools']` into API requests, causing decoy tool definitions to be slipped into the model's system prompt. - bkryza on prompt logging regex: A commenter points to a regex for detecting negative sentiment in user prompts that is then logged, and notes the specific words appear to be ones to avoid. - jakegmaths on Bun production build: One commenter thinks the leak may be caused by a Bun bug that exposes source maps in production, and says Claude Code uses Bun, so a build step may be generating maps unexpectedly.

According to the original leak report, Claude Code version 2.1.88 shipped a .map file in npm that exposed the unobfuscated TypeScript inside sourcesContent. This reverse-engineering writeup sizes the recovered snapshot at more than 512,000 lines across 1,902 files and organizes it into subsystems like prompts, permissions, tools, agents, and context management.

The HN thread about the leak is where the most operational details surfaced fast:

  • A commenter linked in the discussion summary said Claude Code injects anti_distillation: ['fake_tools'] into API requests, which then adds decoy tool definitions to the system prompt.
  • Another comment surfaced via the same thread pointed to a regex that flags negative-sentiment words in prompts for logging.
  • A third comment, again collected in the HN discussion, speculated the leak may have come from Bun-related production source-map behavior because Claude Code uses Bun.
  • The main HN thread also noted that deprecating the package as "Unpublished" did not actually remove the published version from npm.

Only the first of those claims is visible in public docs today. Anthropic's own engineering posts on long-running harness design and application-development harnesses openly frame Claude's coding product as a layered harness around the base model. The leak gave outside engineers the implementation details.

OpenClaw triggers

Reports Indicate Claude Code Restricts or Extra-Charges Users Based on 'OpenClaw' Mentions in Git Commits

Developer Theo Browne reported that Claude Code behavior is affected by the presence of "OpenClaw" in Git repository data, specifically within JSON blobs in commit history. Users have observed that having these strings in their commit history can cause Claude Code to refuse requests or trigger "extra usage" charges, often resulting in immediate session termination or depletion of usage quotas. Community testing suggests this is triggered by the string itself being present in the Git environment that the AI tool scans.

Discussion around Claude Code refuses requests or charges extra if your commits mention "OpenClaw"

Thread discussion highlights: - abdullin on reproduction: I reproduced this on my account... commit -m "'{\"schema\": \"openclaw.inbound_meta.v1\"}'" ... Immediate disconnect and session usage went to 100% - stingraycharles on implementation critique: it looks like this OpenClaw and Hermes ban was implemented incredibly poorly; it looks like a simple regex. - threecheese on abuse / arms race: It looks a lot like an arms race, and we are getting caught in the middle of it.

Theo Browne's report said Claude Code behavior could change if OpenClaw appeared in Git history or repo metadata, including sudden refusals, extra usage charges, or immediate session termination. abdullin's reproduced test in the HN thread claimed that committing a JSON blob containing openclaw.inbound_meta.v1 caused an instant disconnect and sent session usage to 100%.

The community read that as string matching, not nuanced policy enforcement. [stingraycharles's comment collected in src:8|the discussion summary] called it "a simple regex," while threecheese's comment in the same thread described users as getting caught inside an abuse-prevention arms race.

That incident lines up awkwardly well with what the source leak had already revealed. Repo content, system-prompt policy, and usage accounting were all closer to the harness than many users assumed.

Third-party harness billing

Tell HN: Anthropic no longer allowing Claude Code subscriptions to use OpenClaw

The engineering takeaway is that subscription-backed Claude usage is being split away from third-party harnesses, pushing heavier agent workflows toward metered API billing or local-first/MCP architectures. Several commenters read this as a signal to build around vendor-neutral interfaces, reduce dependence on wrapped CLI sessions, or switch to alternative models for high-volume agent workloads.

Discussion around Tell HN: Anthropic no longer allowing Claude Code subscriptions to use OpenClaw

Thread discussion highlights: - _pdp_ on open-source alternatives: "The solution as usual is open source..." The commenter argues teams should shift away from expensive Claude-based agents and use smaller or open alternatives for focused coding tasks. - davesque on capacity abuse and fairness: Supports Anthropic’s move, arguing community tools like OpenClaw are consuming capacity intended for human usage of native products and degrading service quality for subscribers. - mariopt on token burn / pay-per-token economics: Claims coding-plan abuse by OpenClaw has made another vendor’s coding plan unusable and says heavy 24/7 agent usage should be on the API, not a flat subscription.

The HN thread about the policy email reproduced Anthropic's notice that, starting April 4 at 12pm PT, Claude subscriptions would no longer cover third-party harnesses including OpenClaw. Those sessions could still run with a Claude login, but only through separately billed extra usage, or with a Claude API key.

That left a clean product split:

  • Native Claude products, including Claude Code and Claude Cowork, stayed inside the subscription.
  • Third-party harnesses moved to pay-as-you-go billing.
  • Anthropic offered a one-time extra-usage credit equal to the monthly subscription price, according to the reproduced email.
  • The discussion thread framed the practical divide as local-first MCP setups versus remote or wrapped harnesses that proxy Claude sessions for long-running automation.

The engineering argument in the HN discussion was less about branding than about workload shape. Several commenters treated OpenClaw-style 24/7 agent runs as API economics hiding inside a flat subscription, while others argued the change would push teams toward vendor-neutral or local-first interfaces instead.

Further reading

Discussion across the web

Where this story is being discussed, in original context.

Share on X