Auto Research Loop (Karpathy)

A minimal loop primitive Andrej Karpathy proposed (≈Q2 2026) for letting an LLM agent improve a system overnight without supervision. Three ingredients, ~10 lines of program.md:

- An artifact the agent edits (e.g. train.py for Karpathy's nano-LM tuning; SKILL.md for Simon Scrapes's skill self-improvement; or a prompt, a retrieval config, a CSS rule, etc.)
- A measurable metric (Karpathy: val_bpb; Simon: pass rate over Binary Eval Assertions; it could equally be wall-clock latency, recall@k, or an eval rubric score)
- A keep/revert rule plus a never-stop directive

The 10-line core (paraphrased from Karpathy's program.md):

- Read the file. Make a change.
- Run the experiment.
- Read out the results.
- If the value improved → advance the branch and keep the commit.
- If the value is worse → reset to where we started.
- Never stop. Do not pause to ask the human if you should continue.
  The human might be asleep. Keep working indefinitely until manually
  stopped or until you've improved the results to the point where
  there are no additional gains to be made.
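The steps above can be sketched as an ordinary driver function. This is a minimal illustration, not Karpathy's code: in his setup the agent itself follows program.md, whereas here the loop is written out explicitly. All five callables and the `max_iterations` cap are hypothetical names introduced for the sketch; with git, `keep` could wrap `git commit` and `revert` could wrap `git reset --hard`.

```python
from typing import Callable, Optional


def auto_research_loop(
    propose_change: Callable[[], None],
    run_experiment: Callable[[], float],
    keep: Callable[[], None],
    revert: Callable[[], None],
    is_improvement: Callable[[float, float], bool],
    max_iterations: Optional[int] = None,
) -> float:
    """Run the keep/revert loop and return the best metric value seen.

    All callables are hypothetical hooks supplied by the caller.
    max_iterations=None mirrors the "never stop" directive; a finite
    value is an assumption added here so the sketch can be tested.
    """
    best = run_experiment()  # baseline measurement before any change
    iteration = 0
    while max_iterations is None or iteration < max_iterations:
        propose_change()          # "Read the file. Make a change."
        score = run_experiment()  # "Run the experiment. Read out the results."
        if is_improvement(score, best):
            keep()                # advance the branch and keep the commit
            best = score
        else:
            revert()              # reset to where we started
        iteration += 1
    return best
```

Passing `is_improvement=lambda new, best: new < best` suits a metric where lower is better, such as val_bpb; a pass-rate metric would flip the comparison.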

The structural elegance

The loop is shockingly small. The novelty isn't the algorithm (every ML researcher runs this cycle by hand); it's the framing of the human as the bottleneck, who should be asleep while the loop runs. Pair that framing with a metric that is cheap to evaluate and improvements compound without anyone needing to be present.

Conditions for the loop to work

- Cheap-to-run experiment: if each evaluation takes 6 hours, you get one keep/revert decision per session and the loop devolves into a slow grid search.
- Trustworthy metric: see Binary Eval Assertions. Subjective metrics under optimization pressure are exactly where Reward Hacking surfaces (per the 8-class taxonomy in Meta-Agent Challenge (Autonomous Agent Development Benchmark)).
- Reversible state: git commits work because they're atomic; loops over external systems (database migrations, deploys) need explicit transaction boundaries.
- Bounded edit surface: if the agent can edit anything, it tends to drift into unrelated regions of the code; constrain it to one file at a time.
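Two of these conditions lend themselves to a mechanical check. The sketch below is illustrative only: both function names are hypothetical, the 8-hour (480-minute) session is an assumed figure rather than something from the source, and the list of changed paths is assumed to come from something like `git diff --name-only`.

```python
from typing import Iterable, List


def decisions_per_session(session_minutes: int, eval_minutes: int) -> int:
    """Count the complete keep/revert decisions that fit in one session."""
    if eval_minutes <= 0:
        raise ValueError("eval_minutes must be positive")
    return session_minutes // eval_minutes


def outside_edit_surface(
    changed_paths: Iterable[str], allowed_paths: Iterable[str]
) -> List[str]:
    """Return the changed paths that fall outside the allowed edit surface."""
    allowed = set(allowed_paths)
    return [path for path in changed_paths if path not in allowed]


# Assumed 8-hour session: a 6-hour evaluation leaves room for one decision.
print(decisions_per_session(480, 360))
# A hypothetical change set that strays beyond the single permitted file.
print(outside_edit_surface(["train.py", "prepare.py"], ["train.py"]))
```

A driver could treat a non-empty result from `outside_edit_surface` as an automatic revert, which keeps the edit surface bounded without relying on the agent's restraint.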

Variants beyond skills

The same primitive applies unchanged to other artifact/metric pairs:

| Artifact | Metric | Source or use case |
| --- | --- | --- |
| `SKILL.md` | Pass rate over binary assertions | Simon's video |
| Hyperparams in `train.py` | Validation loss / BPB | Karpathy's original |
| Prompt template | LLM-judge rubric / golden-set agreement | Prompt optimization |
| Retrieval config | recall@k / answer faithfulness | RAG tuning |
| `CLAUDE.md` ingest operation | Touch-count + frontmatter validity + index-update presence | This vault (suggested) |
| Marketing copy | CTR / engagement rate | Real-money A/B; bandit instead of revert |
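One detail the rows above leave implicit is the direction of "improved": pass rate and recall@k should go up, while validation loss and BPB should go down. A small sketch of how a variant might carry that direction with it follows; `LoopVariant` and its fields are hypothetical names, and the numeric values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopVariant:
    """One row of the table: what is edited, what is measured, which way is up."""

    artifact: str
    metric: str
    higher_is_better: bool

    def improved(self, new: float, best: float) -> bool:
        """Apply the keep rule in the direction this metric requires."""
        return new > best if self.higher_is_better else new < best


skill = LoopVariant("SKILL.md", "pass rate over binary assertions", True)
tuning = LoopVariant("train.py", "validation BPB", False)

# Illustrative values only: a higher pass rate is kept, a higher BPB is not.
print(skill.improved(0.9, 0.8))
print(tuning.improved(1.10, 1.05))
```

The marketing-copy row does not fit this shape cleanly, since the table itself recommends a bandit rather than a keep/revert rule there.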

Relationship to neighboring concepts

- Agentic Loop: the auto-research loop is a meta-loop layered over agentic loops. Each "experiment" is one or more ReAct executions; the auto-research loop wraps them with keep/revert and a fixed budget of changes.
- Recursive Self-Improvement: the auto-research loop is the most accessible practical rung on the RSI ladder. Skill-level auto-research touches no weights and constructs no meta-agents, but it does demonstrate "agent reads its own scaffolding, changes it, validates the change empirically", which is the structural shape that RSI generalizes.
- Self-Evolving Agents: auto-research is the operator-side analog of the research-side mechanisms (Progressive MCGS, Retrospective Memory, etc.) tracked under that thread. It's how an end user gets self-evolution without a paper attached.
- Effective Feedback Compute: binary assertions are an instance of informative + valid + non-redundant + retained feedback. Auto-research without good evals is just optimizing noise.
- OODA Loop: Boyd's faster-cycling-beats-raw-capability frame; auto-research is OODA over your own scaffolding, with the agent as both pilot and engineer.

What this vault could borrow

The wiki's own operations — ingest, query, lint, journal, crm — are skills. Each has implicit binary assertions baked into CLAUDE.md that go unchecked today:

- ingest should touch index.md (true/false)
- ingest should append a log.md entry starting with ## [YYYY-MM-DD] (true/false)
- ingest should leave the raw file in raw/processed/ (true/false)
- Every wiki page should have valid YAML frontmatter (true/false)
- Every wikilink should resolve to an existing page basename (true/false; a lint-like check)

A /loop running these against recent ingests would surface schema drift without manual review.
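A few of these assertions could be expressed as pure functions over file contents, as in the sketch below. Everything here is an assumption made for illustration: the function names are hypothetical, the frontmatter check only looks for the opening and closing `---` delimiters (real YAML validity would need a parser), and the wikilink pattern assumes the `[[Page]]`, `[[Page|alias]]`, and `[[Page#Heading]]` forms.

```python
import re

# A log entry is expected to start with "## [YYYY-MM-DD]".
LOG_HEADING = re.compile(r"^## \[\d{4}-\d{2}-\d{2}\]")
# Capture the link target, stopping at an alias "|", a heading "#", or "]".
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def log_entry_is_dated(entry: str) -> bool:
    """True when the entry starts with a '## [YYYY-MM-DD]' heading."""
    return bool(LOG_HEADING.match(entry))


def has_frontmatter_block(page: str) -> bool:
    """True when the page opens with '---' and a later '---' line closes it.

    This checks delimiters only; it does not parse the YAML in between.
    """
    lines = [line.strip() for line in page.splitlines()]
    return len(lines) >= 2 and lines[0] == "---" and "---" in lines[1:]


def unresolved_wikilinks(page: str, basenames: set) -> list:
    """Return wikilink targets that match no known page basename."""
    targets = (m.group(1).strip() for m in WIKILINK.finditer(page))
    return [target for target in targets if target not in basenames]


def pass_rate(results: list) -> float:
    """Fraction of binary assertions that passed (0.0 for an empty list)."""
    return sum(results) / len(results) if results else 0.0
```

Each function returns a plain true/false (or an empty list for "all links resolve"), so the results can be fed straight into `pass_rate` and used as the loop's metric.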

Cross-links

Origin · Andrej Karpathy
First applied example in this vault · Build Self-Improving Claude Code Skills (Simon Scrapes)
Eval discipline · Binary Eval Assertions · LLM as Judge · Effective Feedback Compute
Theory · Recursive Self-Improvement · Self-Evolving Agents · Meta-Agent
Loop family · Agentic Loop · OODA Loop
Risk · Reward Hacking