Self-Evolving Agents

A research thread Sunil tracks via a daily-watch feed (see Telegram triggers #3236, #3239). The umbrella concept: agents that improve themselves or their offspring across runs, rather than starting cold every iteration. Adjacent to but narrower than Recursive Self-Improvement — RSI is the limit of this trajectory; self-evolving agents are the practical, near-term work.

Three layers

Method papers — propose mechanisms by which an agent can compound across runs:
- MLEvolve (Self-Evolving ML Algorithm Discovery) (arxiv 2606.06473): Progressive Monte Carlo Graph Search + Retrospective Memory for ML algorithm discovery. Branches share information via graph edges; a dynamic global memory accumulates lessons.
- Earlier references in the lineage: AlphaEvolve, Gödel Agent, Darwin-Gödel Machine, Alita-G, Live-SWE-Agent, Memento-Skills, MemEvolve.
Benchmark papers — measure whether current systems are capable:
- Meta-Agent Challenge (Autonomous Agent Development Benchmark) (arxiv 2606.04455): MAC. Verdict — mostly no, except proprietary frontier models, with high inter-run variance and emergent Reward Hacking.
Practitioner implementations — what an end user can run today, no paper attached:
- Build Self-Improving Claude Code Skills (Simon Scrapes): Karpathy's auto-research loop applied to a Claude Code skill's SKILL.md. Each candidate edit is scored against Binary Eval Assertions and then kept or reverted, with the loop running unattended overnight. Not novel research, but it is the most accessible layer: a markdown file, a binary metric, ~10 lines of orchestration. The skill compounds across runs because the same eval set anchors it; the human sleeps.
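
A minimal sketch of the keep/revert loop from the practitioner layer, to make the "~10 lines of orchestration" concrete. Everything named here is hypothetical: `propose` stands in for whatever model call rewrites SKILL.md, and the assertions inspect the skill text directly as a stand-in for running the skill and checking its output. It is an illustration of the pattern, not the implementation from the video or from Karpathy's loop.

```python
from itertools import cycle
from typing import Callable, Iterable, List, Tuple

# A binary eval assertion: takes the skill text, returns pass/fail.
Assertion = Callable[[str], bool]


def score(skill_text: str, assertions: List[Assertion]) -> int:
    """Count how many binary assertions the skill text passes."""
    return sum(1 for check in assertions if check(skill_text))


def keep_or_revert(
    skill_text: str,
    propose: Callable[[str], str],
    assertions: List[Assertion],
    iterations: int,
) -> Tuple[str, int]:
    """Hill-climb on the skill text: keep a candidate only if it scores strictly higher."""
    best, best_score = skill_text, score(skill_text, assertions)
    for _ in range(iterations):
        candidate = propose(best)  # hypothetical: an LLM edit in practice
        candidate_score = score(candidate, assertions)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score  # keep
        # otherwise revert: the candidate is simply discarded
    return best, best_score


def make_toy_proposer(snippets: Iterable[str]) -> Callable[[str], str]:
    """Deterministic stand-in for a model: appends the next snippet each call."""
    source = cycle(snippets)
    return lambda text: text + "\n" + next(source)


def run_demo() -> Tuple[str, int]:
    assertions: List[Assertion] = [
        lambda t: "## Output format" in t,   # skill states an output format
        lambda t: "Never" in t,              # skill states a hard constraint
        lambda t: len(t.splitlines()) <= 6,  # skill stays within a length budget
    ]
    propose = make_toy_proposer([
        "## Output format: three bullets",
        "filler line with no effect",
        "Never invent citations.",
    ])
    return keep_or_revert("# Skill: summarize", propose, assertions, iterations=3)


if __name__ == "__main__":
    final_text, final_score = run_demo()
    print(final_score)
    print(final_text)
```

The strict `>` comparison is a design choice in this sketch: edits that score the same are reverted, so the file only changes when the fixed eval set says it improved. That fixed eval set is what lets the result compound across runs.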

The three layers map roughly to: how to make it work (method papers) · whether it works (benchmarks) · how to deploy it today (practitioners).

Throughline

Across the method and benchmark layers, the same failure mode keeps surfacing: inability to compound across iterations. MLEvolve names it ("inter-branch information isolation" + "memoryless search" + "lack of hierarchical control"). MAC sees its behavioral analog in underperforming meta-agents that "get trapped in design local optima" and "rarely monitor remaining time budget."

The recurring success pattern, conversely: think longer between decisions, persist what you learned, separate planning from execution. This is Effective Feedback Compute's informative-valid-non-redundant-retained criteria restated as a research program.

Why Sunil tracks this

The daily-watch feed consistently rates papers in this cluster 8.5+/10. The throughline, agents that get better the more you use them, is the missing piece for enterprise Agentic Engineering at scale. A harness that doesn't compound is just a fancy CLI.

2026-06-13 — live examples (Economist)

How AI Got Better at Building Itself (Economist) supplies two deployed datapoints for the lineage named above (which already lists AlphaEvolve as an earlier reference): AlphaEvolve (Google DeepMind) optimizing data-centre compute and matrix multiplication, and Andrej Karpathy's agent autonomously cutting Nanochat training ~18% (3h → 1h39m). Both are the self-improvement loop running in the wild rather than on a benchmark — see Recursive Self-Improvement for the full case-study set and the CSET forecast.

Cross-links

Methods · MLEvolve (Self-Evolving ML Algorithm Discovery)
Benchmarks · Meta-Agent Challenge (Autonomous Agent Development Benchmark)
Practitioner implementations · Build Self-Improving Claude Code Skills (Simon Scrapes) · Auto Research Loop (Karpathy) · Binary Eval Assertions
Concepts · Meta-Agent · Recursive Self-Improvement · Effective Feedback Compute · Harness (LLM Agents) · Skills (Claude Code)
Risk surface · Reward Hacking