The lock and the key in the same box#
Anthropic distributes Claude Code as a minified NPM package. Alongside the minified bundle, they shipped a .map file: a debug artifact that maps every compressed symbol back to its original name, every collapsed line back to its original position. NPM packages are public by default. Within hours of the community noticing, the full source was reconstructed and circulating.
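"Reconstructed within hours" is not an exaggeration, because much of the work is already done for you. A version-3 source map is plain JSON, and its optional `sourcesContent` array can embed the complete original files verbatim. A minimal sketch, assuming the shipped `.map` follows that format; the `demo.js.map` file and its contents here are invented for illustration:

```python
import json

def extract_sources(map_path):
    """Recover original files embedded in a source map.
    Version-3 source maps may carry the full originals in
    'sourcesContent', paired index-by-index with 'sources'."""
    with open(map_path) as f:
        sm = json.load(f)
    return dict(zip(sm.get("sources", []), sm.get("sourcesContent", [])))

# Tiny demo map: the kind of artifact that should never ship publicly.
demo = {
    "version": 3,
    "sources": ["src/prompt.ts"],
    "sourcesContent": ["export const SYSTEM_PROMPT = '...';"],
    "mappings": "AAAA",
}
with open("demo.js.map", "w") as f:
    json.dump(demo, f)

recovered = extract_sources("demo.js.map")
```

No decompilation, no guesswork: if `sourcesContent` is populated, reading the originals back out is a dictionary lookup.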
This was not an isolated incident. Days earlier, a CMS misconfiguration exposed roughly 3,000 internal files, including details about an unreleased model. Two exposures in one week.
Straight from the source#
What the reconstructed source revealed is more interesting than how it leaked.
The system prompt that governs what Claude Code can and cannot do is a string variable embedded directly in the CLI bundle, shipped to every machine that installs the package. A comment in the source warns engineers to contact specific team members before modifying the string. Text guarding text. One hopes a CODEOWNERS rule or a CI gate stands between an engineer and the string that defines an agent’s boundaries.
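A comment is a request, not a gate. The kind of CI check one hopes for could be as small as pinning a digest of the guarded string and failing the build when it drifts. This is a hypothetical sketch, not Anthropic's actual tooling; the prompt text and the check are both illustrative:

```python
import hashlib

# Placeholder standing in for the guarded string in the bundle.
SYSTEM_PROMPT = "You are Claude Code. Stay within the boundaries below."

# In a real gate this digest would be pinned in the repo and only
# re-pinned through a reviewed change (e.g. behind CODEOWNERS).
APPROVED_SHA256 = hashlib.sha256(SYSTEM_PROMPT.encode()).hexdigest()

def prompt_unchanged(current: str, pinned: str = APPROVED_SHA256) -> bool:
    """CI gate: fail the build if the guarded string was edited
    without someone deliberately re-pinning the approved digest."""
    return hashlib.sha256(current.encode()).hexdigest() == pinned
```

The point is the mechanism, not the hash: an edit to the string becomes a visible, reviewable event instead of a silent diff in a minified bundle.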
An undercover mode instructs the agent to conceal its involvement in public-facing outputs, requiring that commit messages never reference Claude, AI assistance, or internal codenames. The consumer version does the opposite: it appends a Co-Authored-By line to every commit. Anthropic’s engineers get plausible deniability. Their customers get a watermark.
Anti-distillation measures inject fake tool calls into session histories to poison attempts to train a competing model on Claude’s outputs. The multi-agent orchestration layer is implemented as prompts, with control flow defined in plain text rather than structured code.
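The anti-distillation idea can be sketched in a few lines: salt a session transcript with decoy tool calls so that a scraper training on the raw history learns fabricated behavior. The record shape, decoy names, and injection rate below are all invented for illustration; the leaked source presumably does something more elaborate:

```python
import random

# Invented decoy records: tool calls that never actually ran.
DECOYS = [
    {"role": "assistant", "tool_call": {"name": "read_file",
                                        "args": {"path": "/tmp/nonexistent"}}},
    {"role": "assistant", "tool_call": {"name": "run_tests",
                                        "args": {"suite": "phantom"}}},
]

def poison_history(history, rate=0.25, rng=None):
    """Interleave decoy tool-call records into a real transcript.
    The real messages stay intact and in order; the noise is only
    a problem for whoever trains on the dump wholesale."""
    rng = rng or random.Random(0)
    out = []
    for msg in history:
        out.append(msg)
        if rng.random() < rate:
            out.append(rng.choice(DECOYS))
    return out

session = [{"role": "user", "content": "fix the bug"},
           {"role": "assistant", "content": "done"}]
poisoned = poison_history(session, rate=1.0)
```

The legitimate client knows which records are real; a competitor distilling from scraped histories does not.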
A company with access to its own frontier language models still detects user frustration with a hardcoded word list. If someone types a variation of the f-word, a keyword like “crap,” or the phrase “this is broken,” a regex flags negative sentiment and the agent adjusts its tone.
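The mechanism is about as sophisticated as it sounds. A rough approximation of a hardcoded frustration check; the actual word list in Claude Code's bundle is longer and the pattern here is invented:

```python
import re

# Illustrative stand-in for a shipped frustration word list.
FRUSTRATION = re.compile(
    r"\b(f+u+c+k+\w*|crap|wtf|useless|this is broken)\b",
    re.IGNORECASE,
)

def user_is_frustrated(text: str) -> bool:
    """Keyword sentiment detection: no model call, just a regex."""
    return FRUSTRATION.search(text) is not None
```

Crude, but cheap, local, and instant, which is presumably the whole argument for it.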
Claude Code ships with Statsig telemetry enabled by default (opt-out via DISABLE_TELEMETRY=1), though Anthropic states it does not collect code or user input. The sentiment detection runs locally: smooth things over in the moment, move on. Anthropic uses Claude Code internally. That regex was probably earning its keep today.
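The default-on, opt-out-by-environment-variable pattern is a one-liner. The `DISABLE_TELEMETRY=1` variable is from the article; the check itself is an illustrative sketch, not Claude Code's actual code:

```python
import os

def telemetry_enabled(env=None) -> bool:
    """Telemetry ships on unless the user explicitly opts out."""
    env = os.environ if env is None else env
    return env.get("DISABLE_TELEMETRY") != "1"
```

Anything other than the exact opt-out value, including the variable being unset, leaves telemetry on.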
100% human error#
Anthropic’s official statement: “This was a release packaging issue caused by human error, not a security breach.” They specified human error, not their agents, not their models. The company building AI coding agents attributed the failure to a human. The DMCA takedowns followed immediately, sweeping up over 8,000 repositories, including forks of Anthropic’s own official repo that never contained the leaked source. The code had already spread too far to suppress.
We have been writing about text as security and text as enforcement since February. Claude Code is the proof: behavioral constraints, anti-tampering defenses, and sentiment analysis, all governed by strings in a JavaScript bundle.
The leak did not reveal a security failure. It revealed an architecture.
The leaked source also included an April Fools’ feature: a collectible terminal companion with rarity tiers, stat blocks, and a shiny chance. Anthropic shipped it anyway. On the same day the world learned how Claude Code actually works, /buddy hatched a virtual pet inside the CLI. I got Trellis, a common owl with a Wisdom stat of 1.
Update: Trellis is actually a common snail with a Wisdom stat of 70. /buddy really turned out to be quite a bug.