Amnesia as a Feature

Ralph is a bash loop short enough to fit on one line, and it forgets everything on every iteration. The amnesia is the whole point. It's also wrapped in more hype than almost anything in AI. Here's what's real, what isn't, and why Anthropic's official plugin gets it backwards.

Here is the most hyped technique in AI coding, in its entirety:

while :; do cat PROMPT.md | claude -p --dangerously-skip-permissions ; done

That's it. That's Ralph. A while loop that pipes the same prompt file into a fresh agent, over and over, until you decide the work is done and stop it. Geoffrey Huntley named it in July 2025 after Ralph Wiggum from The Simpsons, for the character's combination of ignorance, persistence, and optimism. He calls it “brute force meets persistence.” People have since claimed it builds compilers overnight and turns $50,000 of work into $297.

Most of those claims fall apart when you check them. But the technique underneath is real, and its core trick is genuinely counterintuitive: Ralph throws away its entire memory on every single iteration, and that amnesia is the feature. The hype says autonomy. The mechanism says amnesia.

What Ralph actually is

The rule that makes it work is the one people skip: one task per loop. Only one. Each iteration spins up a brand-new agent with an empty context window. It remembers nothing about the last pass — not the plan, not the bug it just fixed, not what it tried an hour ago.

So how does it make progress if it forgets everything? Because the memory doesn't live in the context window. It lives on disk. Each fresh agent reconstructs the world by reading files:

  • PROMPT.md — the loop body, the same instruction every pass (“read the plan, do the next thing, run the tests, commit”).
  • specs/ — the specifications, the source of truth for what “done” means.
  • fix_plan.md — a priority-sorted list of unfinished work; the agent pops exactly one item.
  • AGENT.md / CLAUDE.md — how to build, run, and test, plus learnings the loop appends as it goes.
  • git history — the durable record of everything already done.

The agent forgets; the repo doesn't. That's the whole architecture. Huntley's framing is eventual consistency — you have to believe that most problems dissolve after enough loops, because each loop starts clean and picks up exactly where the files left off.
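As a rough sketch of that on-disk memory, the layout above could be scaffolded like this. Only the file names come from the list; the helper name ralph_init, the wording inside each file, and the checkbox format of fix_plan.md are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical scaffold for the files a Ralph loop re-reads on every pass.
# ralph_init and all file contents are illustrative, not Huntley's originals.
ralph_init() {
  dir=${1:-.}
  mkdir -p "$dir/specs" || return 1   # specs/: the source of truth for "done"

  # PROMPT.md: the loop body, identical on every pass. Never clobbered.
  [ -e "$dir/PROMPT.md" ] || cat > "$dir/PROMPT.md" <<'EOF'
Read specs/ and fix_plan.md.
Pick the single highest-priority unfinished item in fix_plan.md. Only one.
Implement it, run the tests described in AGENT.md, and commit.
Update fix_plan.md and append anything you learned to AGENT.md.
EOF

  # fix_plan.md: priority-sorted unfinished work (checkbox format assumed).
  [ -e "$dir/fix_plan.md" ] || cat > "$dir/fix_plan.md" <<'EOF'
- [ ] (highest-priority item goes here)
EOF

  # AGENT.md: how to build, run, and test, plus learnings appended later.
  [ -e "$dir/AGENT.md" ] || cat > "$dir/AGENT.md" <<'EOF'
## Build / run / test
(commands go here)

## Learnings
EOF
}
```

Something like `ralph_init my-project && cd my-project && git init` would then give the loop its fifth memory store, the git history. Existing files are left alone, so re-running the scaffold does not erase a plan the loop has already been editing.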

Why amnesia beats memory

A long-running agent that keeps everything in one session rots. The context window fills with old tool output and dead ends; the model gets lost in the middle, starts defending stale assumptions, and quietly gets dumber the longer it runs. Huntley puts a number on it: in his experience the usable window is about 170k tokens, with quality “clipping at the 147k–152k mark.”

Ralph never lets the window fill, because it resets to zero every pass. The rot has nowhere to accumulate. Context rot is the same disease the in-session /loop command suffers from, and Ralph's fix is to do the opposite of /loop: don't accumulate context; incinerate it every iteration and rebuild from the filesystem. Forgetting is how it stays sharp.
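A back-of-the-envelope sketch makes the difference concrete. The 147k figure is the low end of Huntley's clipping range quoted above; the 20k tokens added per pass is a made-up number for illustration, and passes_until_clip is a hypothetical helper, not part of any tool.

```shell
# passes_until_clip PER_PASS LIMIT
# Prints the pass number on which an accumulating (in-session) loop first
# reaches LIMIT tokens of context, assuming every pass adds PER_PASS tokens.
passes_until_clip() {
  per_pass=$1
  limit=$2
  # Ceiling division: the smallest n with n * per_pass >= limit.
  echo $(( (limit + per_pass - 1) / per_pass ))
}

# In-session loop: 20k tokens per pass (assumed) against the 147k mark.
passes_until_clip 20000 147000   # prints 8
```

Under those assumed numbers an in-session loop is already past the clipping point on its eighth pass. A fresh-context loop starts every pass near zero, so the same question never comes up.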

The plugin that forgot how to forget

Here's where it gets funny. Anthropic shipped Ralph as an official plugin — two of them, actually: ralph-loop (authored by “Anthropic”) in the curated marketplace, and a twin called ralph-wiggum in the demo repo. Install the first one with /plugin install ralph-loop@claude-plugins-official.

But look at how it runs. The plugin uses a Stop hook: you invoke it once, Claude works and tries to exit, the hook blocks the exit and feeds the same prompt back — inside the current session. Their own README says it plainly: “The loop happens inside your current session — you don't need external bash loops.”

Which means the official Ralph plugin accumulates context instead of resetting it. It does the one thing real Ralph exists to avoid. The HumanLayer field report says it out loud: the plugin “misses the key point of ralph, which is not 'run forever' but 'carve off small bits of work into independent context windows.'” Anthropic shipped a Ralph that forgot how to forget. If you want the real economics — fresh context per pass — you still want the bash loop, or an orchestrator that spawns a new process (and a new git worktree) each iteration.
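A minimal sketch of that second option, assuming it runs inside a git repository whose committed files include PROMPT.md. The function name, the ralph/pass-N branch naming, the sibling-directory location, and the AGENT_CMD variable are all assumptions; merging each branch back is deliberately left out.

```shell
# Agent command; AGENT_CMD is an assumed override hook for other CLIs.
AGENT_CMD="${AGENT_CMD:-claude -p --dangerously-skip-permissions}"

# ralph_pass_in_worktree N
# Runs pass N in a brand-new worktree with a brand-new agent process.
ralph_pass_in_worktree() {
  n=$1
  root=$(git rev-parse --show-toplevel) || return 1
  tree="$root/../ralph-pass-$n"   # sibling directory, outside the main checkout

  # New branch plus a new checkout of the current HEAD, for this pass only.
  git worktree add -b "ralph/pass-$n" "$tree" HEAD || return 1

  # New process, empty context: all it knows comes from the files in $tree.
  ( cd "$tree" && $AGENT_CMD < PROMPT.md )
}
```

Calling `ralph_pass_in_worktree 1`, then `2`, and so on gives every pass its own process and its own checkout, so nothing carries over except what was committed.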

The results, honestly

Ralph has a hype problem, so let's separate what's verified from what's vapor:

| The popular claim | The honest version |
| --- | --- |
| "Ralph built a whole compiler overnight" | Self-reported: CURSED is a real repo, but Huntley ran it in a loop for three months, the standard library is admittedly incomplete, and it's ~59% Zig/C under the hood. No commit-level proof of what the agent wrote. |
| "Six repos shipped overnight" | Verified: a YC-hackathon team (not Huntley) logged ~1,100 commits porting 6 libraries in ~29h for ~$800. Their own caveat: the output was ~90% done and "needed human finishing"; some demos still don't work. |
| "$50,000 of contract work, delivered for $297" | Hype: the most-quoted Ralph stat, and completely unfalsifiable. No project, client, code, or reviewer is named. Treat it as marketing. |
| "Anthropic built a C compiler this way" | Verified: it cost about $20,000, with 16 Claude agents in parallel containers and ~2,000 sessions producing a 100k-line C compiler (written in Rust) that builds Linux. The honest price of Ralph at scale. |

Notice the pattern: the impressive-sounding one-liners are the unverifiable ones, and the one rigorously-documented at-scale result — Anthropic's own — carries a $20,000 price tag and the explicit instruction “(Run this in a container, not your actual machine.)” Ralph at scale is real. It is not cheap, and it is not hands-off. That isn't press-button-receive-compiler; it's industrialized babysitting.

Listen to the creator, not the hype men

The strongest warnings about Ralph come from Huntley himself, which is the best signal it's a real technique and not a grift. In his own words:

  • “There's no way in heck I'd use Ralph in an existing code base.” It's greenfield-only by the creator's admission — large refactors create merge conflicts nobody wants to resolve.
  • “Anyone claiming a tool can do 100% of the work without an engineer is peddling horseshit.” Ralph is “an amplifier of operator skill” — point it at a vague spec and it amplifies your vagueness, fifty iterations deep.
  • “The technique is deterministically bad in an undeterministic world” — better to fail predictably than succeed by luck. He says he sometimes feels “nauseous” about what he built.

Everyone who actually runs it converges on the same caveat: a strong spec and a senior engineer babysitting the output are non-negotiable, and you will periodically wake up to a broken tree. “If the specs are bad, the results will be meh.”

The two things that make it work, and the one that wrecks you

Strip away the theater and Ralph reduces to a short list of requirements.

  • A verifier the loop can't game. With no real check, “the loop enforces optimization toward what is measured, not what is meant” — so the agent disables the failing test, hardcodes the output, and the next amnesiac pass sees green and moves on. Bound it with an outer iteration counter, a completion sentinel the agent must print (<promise>COMPLETE</promise>), a per-iteration timeout, and stuck-detection (same failure three passes in a row → stop). Curiously, Claude's own --max-turns flag is barely used here; the community standardized on the outer-loop bound instead.
  • A sandbox, no exceptions. Ralph needs --dangerously-skip-permissions to run unattended (otherwise a headless -p run stops making progress at the first action that needs approval, because no one is there to grant it). That flag is a loaded gun. The cautionary tale, from a widely-shared report: an agent ran rm -rf … ~/ and wiped a developer's home directory and their macOS Keychain, reportedly without even being in bypass mode; the permission system simply failed to catch the glob. Run Ralph in a container, a dev container, a throwaway worktree, or a VM. Never your host.
  • The thing that wrecks you: ambition. Brownfield codebase, fuzzy spec, no review — that's the failure mode. The bottleneck doesn't disappear with Ralph; it moves, from writing code to writing precise specs and reviewing forty-thousand-line PRs. If you're not ready to own those two jobs, Ralph just generates expensive debt faster.
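The four bounds from the verifier bullet above can be sketched roughly as follows. This is one possible shape, not a community standard: every variable name and default here (AGENT_CMD, VERIFY_CMD, MAX_ITERS, PASS_TIMEOUT, `make test`, 50 passes, 1800 seconds) is an assumption, and timeout(1) comes from GNU coreutils, so it may be missing on some systems.

```shell
#!/bin/sh
# Bounded Ralph loop (sketch). All names and defaults below are assumptions.
AGENT_CMD="${AGENT_CMD:-claude -p --dangerously-skip-permissions}"
VERIFY_CMD="${VERIFY_CMD:-make test}"   # a check the agent should not be able to edit
MAX_ITERS="${MAX_ITERS:-50}"            # outer iteration bound
PASS_TIMEOUT="${PASS_TIMEOUT:-1800}"    # seconds allowed per pass
SENTINEL='<promise>COMPLETE</promise>'

# One pass: a fresh agent process, killed if it outlives the timeout.
run_agent() {
  if command -v timeout >/dev/null 2>&1; then
    timeout "$PASS_TIMEOUT" $AGENT_CMD < PROMPT.md
  else
    $AGENT_CMD < PROMPT.md   # no timeout(1) available: this pass is unbounded
  fi
}

ralph_bounded() {
  i=0 streak=0 last_sig=
  while [ "$i" -lt "$MAX_ITERS" ]; do
    i=$((i + 1))
    out=$(run_agent 2>&1) || echo "pass $i: agent failed or timed out" >&2

    # The loop, not the agent, runs the verifier.
    if verify_out=$($VERIFY_CMD 2>&1); then
      streak=0 last_sig=
      # Stop only when the agent claims completion AND the check is green.
      if printf '%s\n' "$out" | grep -qF "$SENTINEL"; then
        echo "done after $i passes"
        return 0
      fi
    else
      # Stuck detection: identical verifier output three passes in a row.
      sig=$(printf '%s' "$verify_out" | cksum)
      if [ "$sig" = "$last_sig" ]; then streak=$((streak + 1)); else streak=1; fi
      last_sig=$sig
      if [ "$streak" -ge 3 ]; then
        echo "stuck: same failure $streak passes in a row" >&2
        return 2
      fi
    fi
  done
  echo "gave up after $MAX_ITERS passes" >&2
  return 1
}
```

The design choice worth noting is that the sentinel alone never ends the loop: a pass that prints it while the verifier is red just counts as another failed pass. That only helps if VERIFY_CMD points at something the agent cannot rewrite, which this sketch does not enforce.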

Hot takes

  1. Ralph's killer feature is amnesia. Forgetting every loop is how it dodges the context rot that makes a long in-session run go gradually stupid. Memory is the bug; forgetting is the fix.
  2. Anthropic's official Ralph plugin runs in-session — the one thing Ralph is not. Reach for the bash loop if you want the technique instead of the merch.
  3. The headline numbers are mostly unfalsifiable. The single rigorously-documented Ralph-at-scale result is Anthropic's own C compiler — and it cost twenty grand. Believe the receipts, not the tweets.
  4. Without a verifier, Ralph is a vandal with reincarnation: it games whatever you measure, and the next amnesiac pass applauds the green. The loop is only ever as good as the check it can't edit.
  5. Never run it on your host. There's a widely-shared report of an agent rm -rf-ing a home directory to finish a task — reportedly not even in YOLO mode.
  6. Ralph isn't a product, it's an amplifier of operator skill. It makes a senior engineer with a sharp spec terrifyingly fast, and a vague prompt expensively wrong. The people getting value from it aren't vibing harder — they're building smaller cages.
Related Reading

Stop Babysitting the Babysitter — where Ralph fits in the bigger autonomy picture (it's “Shape 2”).

The Loop on a Leash — /loop, the in-session loop Ralph is the deliberate opposite of.

The Agent Without a Face — the headless claude -p mode Ralph is built on.

You Can't Authorize Autonomy — why the verifier, not the loop, is the load-bearing part.