Auto Research Loop (Karpathy)
A minimal loop primitive Andrej Karpathy proposed (≈Q2 2026) for letting an LLM agent improve a system overnight without supervision. Three ingredients, ~10 lines of program.md:
- An artifact the agent edits (e.g. `tune_train.py` for Karpathy's nano-LM tuning; `SKILL.md` for Simon Scrapes's skill self-improvement; a prompt, a retrieval config, a CSS rule, etc.)
- A measurable metric (Karpathy: `val_bpb`; Simon: pass rate over Binary Eval Assertions; could equally be wall-clock latency, recall@k, eval rubric score)
- A keep/revert rule + a never-stop directive
The 10-line core (paraphrased from Karpathy's program.md):
- Read the file. Make a change.
- Run the experiment.
- Read out the results.
- If the value improved → advance the branch and keep the commit.
- If the value is worse → reset to where we started.
- Never stop. Do not pause to ask the human if you should continue.
The human might be asleep. Keep working indefinitely until manually
stopped or until you've improved the results to the point where
there are no additional gains to be made.
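A minimal Python sketch of the keep/revert rule above, under stated assumptions: `propose` and `evaluate` are hypothetical stand-ins for the agent's edit step and the experiment run, the metric is assumed lower-is-better by default (as with `val_bpb`), and `max_steps` exists only so the example terminates, whereas the original directive is to never stop. Holding the best artifact in memory stands in for advancing or resetting a git branch.

```python
from typing import Callable, Tuple, TypeVar

A = TypeVar("A")  # the artifact, e.g. file contents as a string


def auto_research_loop(
    artifact: A,
    propose: Callable[[A], A],       # hypothetical: the agent's "make a change" step
    evaluate: Callable[[A], float],  # hypothetical: runs the experiment, returns the metric
    max_steps: int,                  # illustration only; the original loop never stops
    lower_is_better: bool = True,    # assumption: metric is minimized, like val_bpb
) -> Tuple[A, float]:
    """Keep a change only if the metric improved; otherwise fall back to the last kept state."""
    best, best_score = artifact, evaluate(artifact)
    for _ in range(max_steps):
        candidate = propose(best)    # read the current best, make a change
        score = evaluate(candidate)  # run the experiment, read out the result
        improved = score < best_score if lower_is_better else score > best_score
        if improved:
            best, best_score = candidate, score  # keep: advance to the new state
        # else: drop the candidate, i.e. reset to where we started
    return best, best_score


# Toy usage: the "artifact" is a number and the metric is its distance from 3.
candidates = iter([5, 8, 4, 4.5])
result = auto_research_loop(
    10,
    propose=lambda _current: next(candidates),
    evaluate=lambda x: abs(x - 3),
    max_steps=4,
)
print(result)  # -> (4, 1)
```

Because rejected candidates are simply discarded, the kept score can never get worse from one step to the next, which is the property the git-based version gets from resetting the branch.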
The structural elegance
The loop is shockingly small. The novelty isn't the algorithm (every ML researcher does this manually); it's the framing of the human as the bottleneck that should sleep. Combined with a metric that can be evaluated cheaply, you get compounding improvement without operator-presence cost.
Conditions for the loop to work
- Cheap-to-run experiment — if each evaluation takes 6 hours, you get one revert/keep decision per session and the loop devolves into a slow grid search
- Trustworthy metric — see Binary Eval Assertions. Subjective metrics under optimization pressure are exactly where Reward Hacking gets surfaced (per Meta-Agent Challenge (Autonomous Agent Development Benchmark)'s 8-class taxonomy)
- Reversible state — git commits work because they're atomic; loops over external systems (database migrations, deploys) need explicit transaction boundaries
- Bounded edit surface — if the agent can edit anything, it tends to drift into unrelated regions of the code; constrain to one file at a time
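Two of these conditions reduce to checks small enough to sketch. The Python below is illustrative only: the function names are hypothetical, and the 8-hour session length in the usage lines is an assumption, not a figure from the source. The first helper shows why evaluation cost dominates; the second flags edits that stray outside the allowed file set.

```python
from pathlib import PurePosixPath
from typing import Iterable, List, Set


def decisions_per_session(session_hours: float, eval_minutes: float) -> int:
    """How many complete keep/revert decisions fit in one unattended session."""
    return int(session_hours * 60 // eval_minutes)


def outside_edit_surface(changed: Iterable[str], allowed: Set[str]) -> List[str]:
    """Return the changed paths that are not part of the allowed edit surface."""
    # Normalise so that "./SKILL.md" and "SKILL.md" compare equal.
    allowed_norm = {str(PurePosixPath(p)) for p in allowed}
    return sorted(p for p in changed if str(PurePosixPath(p)) not in allowed_norm)


# Assumed 8-hour overnight session:
print(decisions_per_session(8, 360))  # 6-hour evaluations -> 1
print(decisions_per_session(8, 5))    # 5-minute evaluations -> 96

# A change set touching a file outside the single allowed artifact:
print(outside_edit_surface(["SKILL.md", "src/util.py"], {"SKILL.md"}))  # -> ['src/util.py']
```

A loop could treat a non-empty result from `outside_edit_surface` as an automatic revert before the experiment even runs.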
Variants beyond skills
The same primitive applies unchanged to:
| Artifact | Metric | Use case |
|---|---|---|
| `SKILL.md` | Pass rate over binary assertions | Simon's video |
| Hyperparams in `train.py` | Validation loss / BPB | Karpathy's original |
| Prompt template | LLM-judge rubric / golden-set agreement | Prompt optimization |
| Retrieval config | recall@k / answer faithfulness | RAG tuning |
| `CLAUDE.md` ingest operation | Touch-count + frontmatter validity + index-update presence | This vault (suggested) |
| Marketing copy | CTR / engagement rate | Real-money A/B; bandit instead of revert |
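The rows above differ in which direction counts as improvement: validation loss and BPB should fall, while pass rate and recall@k should rise. One way to keep the keep/revert rule generic is to carry the direction alongside the metric. The sketch below is an assumption-laden illustration: `LoopSpec` and `min_gain` are hypothetical names, and the noise margin is not something the source specifies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopSpec:
    artifact: str
    metric: str
    higher_is_better: bool
    min_gain: float = 0.0  # hypothetical noise margin, not from the source

    def should_keep(self, baseline: float, candidate: float) -> bool:
        """True if the candidate beats the baseline in this metric's direction."""
        if self.higher_is_better:
            gain = candidate - baseline
        else:
            gain = baseline - candidate
        return gain > self.min_gain


skill = LoopSpec("SKILL.md", "assertion pass rate", higher_is_better=True)
nano_lm = LoopSpec("train.py", "val_bpb", higher_is_better=False)

print(skill.should_keep(0.70, 0.80))    # pass rate went up -> True
print(nano_lm.should_keep(1.20, 1.25))  # val_bpb went up -> False
```

The marketing-copy row would not fit this shape, since the table itself suggests a bandit rather than a keep/revert decision there.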
Relationship to neighboring concepts
- Agentic Loop — the auto-research loop is a meta-loop layered over agentic loops: each "experiment" is one or more ReAct executions; the auto-research loop wraps them with keep/revert + a fixed budget of changes
- Recursive Self-Improvement — the auto-research loop is the most accessible practical rung on the RSI ladder. Skill-level auto-research doesn't touch weights, doesn't construct meta-agents, but it does demonstrate "agent reads its own scaffolding, changes it, validates the change empirically" — which is the structural shape RSI generalizes to
- Self-Evolving Agents — auto-research is the operator-side analog of the research-side mechanisms (Progressive MCGS, Retrospective Memory, etc.) tracked under that thread. It's how an end user gets self-evolution without a paper attached
- Effective Feedback Compute — binary assertions are an instance of informative + valid + non-redundant + retained feedback. Auto-research without good evals is just optimizing noise
- OODA Loop — Boyd's faster-cycling-beats-raw-capability frame; auto-research is OODA over your own scaffolding, with the agent as both pilot and engineer
What this vault could borrow
The wiki's own operations — ingest, query, lint, journal, crm — are skills. Each has implicit binary assertions baked into CLAUDE.md that go unchecked today:
- `ingest` should touch index.md (true/false)
- `ingest` should append a log.md entry starting with `## [YYYY-MM-DD]` (true/false)
- `ingest` should leave the raw file in `raw/processed/` (true/false)
- Every wiki page should have valid YAML frontmatter (true/false)
- Every wikilink should resolve to an existing page basename (true/false — a lint-like check)
A /loop running these against recent ingests would surface schema drift without manual review.
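As a rough illustration of how a few of these assertions could be made executable, here is a Python sketch that works on page text already loaded into strings. Several things are assumptions rather than facts about this vault: the function names are hypothetical, wikilinks are assumed to use the `[[Target|alias]]` / `[[Target#Section]]` form, and the frontmatter check only verifies the `---` fenced shape, since parsing YAML properly would need a third-party library.

```python
import re
from typing import Dict, Set

# Log entries are expected to start with "## [YYYY-MM-DD]".
LOG_HEADING = re.compile(r"^## \[\d{4}-\d{2}-\d{2}\]", re.MULTILINE)
# Assumed wikilink form: capture the target before any "|alias" or "#section".
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def has_frontmatter_block(page: str) -> bool:
    """Shape check only: page opens with '---' and the block is closed later."""
    lines = [line.strip() for line in page.splitlines()]
    return len(lines) >= 2 and lines[0] == "---" and "---" in lines[1:]


def log_has_dated_entry(log_text: str) -> bool:
    """True if any line starts with a '## [YYYY-MM-DD]' heading."""
    return LOG_HEADING.search(log_text) is not None


def unresolved_wikilinks(page: str, basenames: Set[str]) -> Set[str]:
    """Wikilink targets that do not match any known page basename."""
    targets = {m.group(1).strip() for m in WIKILINK.finditer(page)}
    return targets - basenames


def run_assertions(page: str, log_text: str, basenames: Set[str]) -> Dict[str, bool]:
    """Evaluate each binary assertion and report it as true/false."""
    return {
        "frontmatter_present": has_frontmatter_block(page),
        "log_entry_dated": log_has_dated_entry(log_text),
        "wikilinks_resolve": not unresolved_wikilinks(page, basenames),
    }


page = "---\ntitle: Example\n---\nSee [[Agentic Loop]] and [[Missing Page|alias]].\n"
log = "# Log\n## [2026-05-01] ingest | Example\n"
print(run_assertions(page, log, {"Agentic Loop"}))
```

Each result is a plain boolean, so the pass rate over recent ingests could serve as the metric for a loop over the `CLAUDE.md` ingest instructions.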
Cross-links
- Origin · Andrej Karpathy
- First applied example in this vault · Build Self-Improving Claude Code Skills (Simon Scrapes)
- Eval discipline · Binary Eval Assertions · LLM as Judge · Effective Feedback Compute
- Theory · Recursive Self-Improvement · Self-Evolving Agents · Meta-Agent
- Loop family · Agentic Loop · OODA Loop
- Risk · Reward Hacking