SecondBrain
Updated Tue Jun 09 2026 08:00:00 GMT+0800 (Philippine Standard Time)

Auto Research Loop (Karpathy)

Tags: karpathy, self-improvement, autonomous-loops, evals, recursive-self-improvement, agent-design

A minimal loop primitive Andrej Karpathy proposed (≈Q2 2026) for letting an LLM agent improve a system overnight without supervision. Three ingredients, ~10 lines of program.md:

  1. An artifact the agent edits (e.g. train.py for Karpathy's nano-LM tuning; SKILL.md for Simon Scrapes's skill self-improvement; a prompt, a retrieval config, a CSS rule, etc.)
  2. A measurable metric (Karpathy: val_bpb; Simon: pass rate over Binary Eval Assertions; could equally be wall-clock latency, recall@k, eval rubric score)
  3. A keep/revert rule + a never-stop directive

The 10-line core (paraphrased from Karpathy's program.md):

- Read the file. Make a change.
- Run the experiment.
- Read out the results.
- If the value improved → advance the branch and keep the commit.
- If the value is worse → reset to where we started.
- Never stop. Do not pause to ask the human if you should continue.
  The human might be asleep. Keep working indefinitely until manually
  stopped or until you've improved the results to the point where
  there are no additional gains to be made.
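
The keep/revert rule above can be sketched in a few lines of Python. This is a minimal in-memory sketch, not Karpathy's program.md: the names auto_research_loop, propose, evaluate, and max_iters are hypothetical, the artifact is held as a string rather than a git branch, and max_iters stands in for "until manually stopped" so the example terminates. It assumes a lower-is-better metric, as with val_bpb.

```python
from typing import Callable, Tuple


def auto_research_loop(
    artifact: str,
    propose: Callable[[str, int], str],
    evaluate: Callable[[str], float],
    max_iters: int,
) -> Tuple[str, float, int]:
    """Keep/revert loop over a text artifact; lower metric is better."""
    best = artifact
    best_score = evaluate(best)
    kept = 0
    for step in range(max_iters):  # stand-in for "never stop"
        candidate = propose(best, step)   # read the file, make a change
        score = evaluate(candidate)       # run the experiment, read the result
        if score < best_score:
            # Improved: keep the change (the git analog is advancing the branch).
            best, best_score = candidate, score
            kept += 1
        # Otherwise: revert, which here just means discarding the candidate.
    return best, best_score, kept


# Toy stand-ins (hypothetical): the "artifact" is a one-line config "x=<float>",
# and the metric is squared distance from an optimum at 3.0.
def toy_evaluate(text: str) -> float:
    x = float(text.split("=")[1])
    return (x - 3.0) ** 2


def toy_propose(text: str, step: int) -> str:
    x = float(text.split("=")[1])
    delta = 1.0 if step % 2 == 0 else -1.0
    return f"x={x + delta}"


if __name__ == "__main__":
    print(auto_research_loop("x=0.0", toy_propose, toy_evaluate, max_iters=10))
```

In a real run, propose would be the agent editing the file and evaluate would be the experiment itself; only the comparison and the discard step carry over unchanged.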

The structural elegance

The loop is shockingly small. The novelty isn't the algorithm (every ML researcher does this manually); it's the framing of the human as the bottleneck that should sleep. Combined with a metric that can be evaluated cheaply, you get compounding improvement without operator-presence cost.

Conditions for the loop to work

  • Cheap-to-run experiment — if each evaluation takes 6 hours, you get one revert/keep decision per session and the loop devolves into a slow grid search
  • Trustworthy metric — see Binary Eval Assertions. Subjective metrics under optimization pressure are exactly where Reward Hacking gets surfaced (per Meta-Agent Challenge (Autonomous Agent Development Benchmark)'s 8-class taxonomy)
  • Reversible state — git commits work because they're atomic; loops over external systems (database migrations, deploys) need explicit transaction boundaries
  • Bounded edit surface — if the agent can edit anything, it tends to drift into unrelated regions of the code; constrain to one file at a time
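
For the reversible-state condition, one way to get an explicit transaction boundary outside git is to let the database transaction play the role of the commit. The sketch below uses Python's standard sqlite3 module; the config table, the top_k key, try_change, and toy_metric are all hypothetical names chosen for illustration, and the metric is again assumed to be lower-is-better.

```python
import sqlite3
from typing import Callable


def try_change(
    conn: sqlite3.Connection,
    key: str,
    new_value: float,
    metric: Callable[[sqlite3.Connection], float],
) -> bool:
    """Apply one UPDATE; commit if the metric improves, otherwise roll back."""
    baseline = metric(conn)
    # sqlite3 opens a transaction implicitly before this UPDATE.
    conn.execute("UPDATE config SET value = ? WHERE key = ?", (new_value, key))
    try:
        improved = metric(conn) < baseline
    except Exception:
        conn.rollback()  # a failed experiment must not leave state behind
        raise
    if improved:
        conn.commit()    # keep
    else:
        conn.rollback()  # revert
    return improved


def make_db() -> sqlite3.Connection:
    """Hypothetical one-row config table used only for this example."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE config (key TEXT PRIMARY KEY, value REAL)")
    conn.execute("INSERT INTO config VALUES ('top_k', 20.0)")
    conn.commit()
    return conn


def toy_metric(conn: sqlite3.Connection) -> float:
    """Pretend the best top_k is 8; return the distance from it."""
    (value,) = conn.execute(
        "SELECT value FROM config WHERE key = 'top_k'"
    ).fetchone()
    return abs(value - 8.0)


if __name__ == "__main__":
    db = make_db()
    print(try_change(db, "top_k", 10.0, toy_metric))  # improvement is committed
    print(try_change(db, "top_k", 30.0, toy_metric))  # regression is rolled back
```

Deploys and migrations rarely offer a rollback this clean, which is why the edit surface for those loops needs a hand-written undo step before the loop is allowed to run unattended.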

Variants beyond skills

The same primitive applies unchanged to:

| Artifact | Metric | Use case |
|---|---|---|
| SKILL.md | Pass rate over binary assertions | Simon's video |
| Hyperparams in train.py | Validation loss / BPB | Karpathy's original |
| Prompt template | LLM-judge rubric / golden-set agreement | Prompt optimization |
| Retrieval config | recall@k / answer faithfulness | RAG tuning |
| CLAUDE.md ingest operation | Touch-count + frontmatter validity + index-update presence | This vault (suggested) |
| Marketing copy | CTR / engagement rate | Real-money A/B (bandit instead of revert) |
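
The marketing-copy row swaps keep/revert for a bandit. One plausible reading is that live CTR is noisy and every impression costs money, so a single before/after comparison is replaced by gradually shifting traffic toward the better variant. The sketch below uses epsilon-greedy, which is only one of several bandit strategies; the class name, the default epsilon, and the seed are assumptions for illustration.

```python
import random
from typing import List


class EpsilonGreedyBandit:
    """Epsilon-greedy allocation over copy variants ("arms")."""

    def __init__(self, n_arms: int, epsilon: float = 0.1, seed: int = 0) -> None:
        self.epsilon = epsilon
        self.counts: List[int] = [0] * n_arms     # impressions per variant
        self.values: List[float] = [0.0] * n_arms  # running mean reward (e.g. CTR)
        self._rng = random.Random(seed)

    def select(self) -> int:
        """Pick the variant to show next."""
        # Show every variant once before trusting any estimate.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self._rng.random() < self.epsilon:
            return self._rng.randrange(len(self.counts))
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm: int, reward: float) -> None:
        """Fold one observation (1.0 = click, 0.0 = no click) into the mean."""
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


if __name__ == "__main__":
    bandit = EpsilonGreedyBandit(n_arms=2, epsilon=0.1, seed=0)
    true_ctr = [0.1, 0.6]  # hypothetical click-through rates, for simulation only
    sim = random.Random(1)
    for _ in range(2000):
        arm = bandit.select()
        bandit.update(arm, 1.0 if sim.random() < true_ctr[arm] else 0.0)
    print(bandit.counts, bandit.values)
```

Unlike keep/revert, no variant is ever fully discarded: the weaker copy keeps receiving a small share of traffic, which guards against an early unlucky sample locking in the wrong choice.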

Relationship to neighboring concepts

  • Agentic Loop — the auto-research loop is a meta-loop layered over agentic loops: each "experiment" is one or more ReAct executions; the auto-research loop wraps them with keep/revert + a fixed budget of changes
  • Recursive Self-Improvement — the auto-research loop is the most accessible practical rung on the RSI ladder. Skill-level auto-research doesn't touch weights, doesn't construct meta-agents, but it does demonstrate "agent reads its own scaffolding, changes it, validates the change empirically" — which is the structural shape RSI generalizes to
  • Self-Evolving Agents — auto-research is the operator-side analog of the research-side mechanisms (Progressive MCGS, Retrospective Memory, etc.) tracked under that thread. It's how an end user gets self-evolution without a paper attached
  • Effective Feedback Compute — binary assertions are an instance of informative + valid + non-redundant + retained feedback. Auto-research without good evals is just optimizing noise
  • OODA Loop — Boyd's faster-cycling-beats-raw-capability frame; auto-research is OODA over your own scaffolding, with the agent as both pilot and engineer

What this vault could borrow

The wiki's own operations (ingest, query, lint, journal, crm) are skills. Each has implicit binary assertions baked into CLAUDE.md that go unchecked today:

  • ingest should touch index.md (true/false)
  • ingest should append a log.md entry starting with ## [YYYY-MM-DD] (true/false)
  • ingest should leave the raw file in raw/processed/ (true/false)
  • Every wiki page should have valid YAML frontmatter (true/false)
  • Every wikilink should resolve to an existing page basename (true/false — a lint-like check)

A /loop running these against recent ingests would surface schema drift without manual review.
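
Three of the assertions above can be sketched as plain true/false checks over file contents. The function names are hypothetical, the frontmatter check only looks for the two "---" delimiters rather than parsing YAML, and the wikilink pattern assumes the [[Page]], [[Page|alias]], and [[Page#heading]] forms; the index.md and raw/processed/ checks are omitted because they depend on the vault's file layout.

```python
import re
from typing import Dict, Set

DATED_HEADING = re.compile(r"^## \[\d{4}-\d{2}-\d{2}\]")
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def last_log_entry_is_dated(log_text: str) -> bool:
    """True if the most recent '## ' entry starts with '## [YYYY-MM-DD]'."""
    headings = [ln for ln in log_text.splitlines() if ln.startswith("## ")]
    return bool(headings) and DATED_HEADING.match(headings[-1]) is not None


def has_frontmatter_block(page_text: str) -> bool:
    """True if the page opens with '---' and has a closing '---' line.

    Delimiter check only; it does not validate the YAML in between.
    """
    lines = page_text.splitlines()
    return (
        len(lines) >= 2
        and lines[0].strip() == "---"
        and any(ln.strip() == "---" for ln in lines[1:])
    )


def unresolved_wikilinks(page_text: str, basenames: Set[str]) -> Set[str]:
    """Return wikilink targets that do not match any known page basename."""
    targets = {target.strip() for target in WIKILINK.findall(page_text)}
    return targets - basenames


def run_assertions(page_text: str, log_text: str, basenames: Set[str]) -> Dict[str, bool]:
    """Evaluate every check and report one boolean per assertion."""
    return {
        "log_entry_dated": last_log_entry_is_dated(log_text),
        "frontmatter_block": has_frontmatter_block(page_text),
        "wikilinks_resolve": not unresolved_wikilinks(page_text, basenames),
    }


if __name__ == "__main__":
    page = "---\ntitle: Demo\n---\nSee [[Agentic Loop]] and [[Missing Page|alias]].\n"
    log = "## [2026-06-09] ingest | demo\n- touched index.md\n"
    print(run_assertions(page, log, {"Agentic Loop"}))
```

The pass rate over such a dictionary is the kind of metric the loop needs: each entry is binary, cheap to compute, and hard to satisfy by accident.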

Cross-links