A closed GitHub issue says Claude Code became unreliable for complex engineering after February changes, citing 17,871 thinking blocks and 234,760 tool calls across 6,852 sessions. Anthropic said the redaction flag was UI-only, but developers reported broader Opus quality drops and opaque harness changes.

Anthropic's position is that redact-thinking-2026-02-12 is a UI-only header, and that showThinkingSummaries: true restores summaries in the interface. You can read the original issue, the HN thread, Anthropic's settings docs, and the Claude Code release notes. The odd part is how many different knobs surfaced at once: hidden thinking summaries, /effort, adaptive thinking flags from the discussion, and a changelog entry saying thinking summaries were no longer generated by default.
Claude Code is unusable for complex engineering tasks with the Feb updates · Issue #42796 · anthropics/claude-code
1.3k upvotes · 716 comments
The complaint started in issue #42796, opened April 2 and closed April 6, with unusually heavy engagement for a product bug report: hundreds of reactions and a large Hacker News thread.
The core claim was narrow and testable. According to the issue text, the analysis covered January through March session logs and tied the regression window to the rollout of thinking-content redaction. The author's conclusion was blunt: Claude Code had regressed to the point that it could not be trusted for complex engineering work.
That claim landed because it was not just vibes. The issue attached concrete counts: 17,871 thinking blocks, 234,760 tool calls, and 6,852 session files, which gave the argument more weight than the usual "it feels worse this week" post.
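Counts like these are straightforward to reproduce against locally stored transcripts. Here is a minimal sketch, assuming a hypothetical JSONL session layout in which each line holds a message whose content blocks carry a "type" field such as "thinking" or "tool_use"; the real Claude Code transcript schema may differ:

```python
import json
from collections import Counter
from pathlib import Path

def count_events(jsonl_text: str) -> Counter:
    """Count content-block types in one session transcript.

    Assumes a hypothetical JSONL layout: one JSON object per line,
    with message.content holding a list of typed blocks. Lines that
    are blank or fail to parse are skipped.
    """
    counts = Counter()
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue
        content = event.get("message", {}).get("content", [])
        if isinstance(content, list):
            for block in content:
                if isinstance(block, dict):
                    counts[block.get("type", "unknown")] += 1
    return counts

def scan_sessions(root: Path) -> Counter:
    """Aggregate block counts across all *.jsonl session files under root."""
    total = Counter()
    for path in root.rglob("*.jsonl"):
        total += count_events(path.read_text(encoding="utf-8"))
    return total
```

Pointed at a directory of session files, scan_sessions yields the kind of aggregate figures the issue cites. Note the caveat from later in the thread: if the transcripts themselves stop persisting raw thinking, the thinking-block count drops regardless of what the model did internally.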
In the HN thread, Anthropic engineer bcherny said redact-thinking-2026-02-12 only hides thinking from the UI. The comment adds that the header does not change thinking budgets or extended reasoning behavior under the hood, and that users can opt out with showThinkingSummaries: true.
That matters to the issue's method more than to its symptoms. According to that same explanation, if users analyze locally stored transcripts after the header is set, they will not see raw thinking in those files even though the model still used it internally. In other words, the visible transcript got thinner whether or not the underlying reasoning did.
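Per that comment, the opt-out is a user setting. A sketch of what it might look like in a Claude Code settings file (the exact file location and key placement are assumptions, not confirmed by the thread):

```json
{
  "showThinkingSummaries": true
}
```

With that set, summaries reappear in the interface; by bcherny's account it changes display only, not thinking budgets or extended reasoning behavior.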
The thread did not end there. Other top comments said thinking depth had already dropped before the redaction change, while Gergely Orosz argued the bigger problem was zero transparency around harness changes that only become obvious after a workflow breaks.
The most useful part of the discussion is the symptom list, because it points past a single header dispute. One commenter, for instance, had kept the same CLAUDE.md guardrails over time and still saw noticeable degradation in Opus outputs and thinking. That mix makes this look less like one broken toggle and more like a classic coding-agent headache: model behavior, harness policy, and transcript visibility all changed close enough together that users could not cleanly separate them.
The official Claude Code release notes add a few details that explain why users were reaching for settings in the thread. One changelog entry says the default effort level changed from medium to high for several user tiers, another says thinking summaries were no longer generated by default, and a third mentions an autocompact thrash-loop fix.
The settings docs also expose alwaysThinkingEnabled as a configurable option. That does not prove the regression report, but it does show how many moving parts were live around the same workflow: effort level, hidden summaries, transcript persistence, and compaction behavior.
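As a hypothetical fragment, the docs-exposed flag would sit alongside the summary toggle in the same settings file (key names come from the docs and thread; values and placement are assumptions):

```json
{
  "alwaysThinkingEnabled": true,
  "showThinkingSummaries": true
}
```

That adjacency is part of the confusion: two independent knobs, one governing whether thinking happens by default and one governing whether you can see it, changed visibility in the same window.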