Updated Sat May 09 2026 08:00:00 GMT+0800 (Philippine Standard Time)

Andrej Karpathy on Agentic Engineering (Sequoia AI Ascent)

karpathy, vibe-coding, agentic-engineering, software-3.0, jagged-intelligence, verifiability

Andrej Karpathy, interviewed by Stephanie Zhan at Sequoia AI Ascent 2026. A year after coining "vibe coding," he argues the field has progressed from raising the floor (vibe coding) to raising the ceiling (Agentic Engineering). He also lays out Software 3.0, Jagged Intelligence, and the "ghosts, not animals" framing.

Key claims

  • December 2024 was a stark transition. With latest models, code chunks just came out fine — Karpathy stopped correcting and started trusting. His side-projects folder exploded. "You really had to look again."
  • Software 3.0: programming-by-prompting. Software 1.0 = explicit rules. Software 2.0 = learned weights. 3.0 = LLM as programmable computer; the context window is your lever. Examples: OpenClaw install is a copy-paste blob for the agent, not a shell script. Menu Gen → Nano Banana eliminates the whole app.
  • Vibe Coding vs Agentic Engineering: vibe coding raises the floor (anyone can build software). Agentic engineering raises the ceiling — preserving the professional quality bar while going much faster. He thinks the speedup for top practitioners is well above 10×.
  • Jagged Intelligence: state-of-the-art models can refactor 100K-line codebases or find zero-days, yet still tell you to walk to a car wash 50 m away (leaving behind the car you meant to wash). Pattern: capabilities follow what's verifiable AND what labs care about. (Chess in GPT-3.5→4 jumped because data was added, not because of generic capability progression.)
  • Verifiability is the wedge for founders. Where you can build RL environments, you can fine-tune to specialty capability the labs haven't covered. He hints at "one domain that is very [valuable]" but doesn't name it on stage.
  • Ghosts, not animals. LLMs are statistical simulation circuits — pre-training substrate + RL appendages. Yelling at them doesn't help. They lack intrinsic motivation, fun, curiosity. Useful framing: don't anthropomorphize.
  • Taste/judgment is the bottleneck. The agent codes; you hold the spec, the user-ID design, the architecture. He's wary of "plan mode" as a panacea — wants explicit specs/docs co-written with the agent. Generated code is often "bloaty, copy-paste, awkward abstractions" — works but ugly. RL hasn't been pointed at aesthetics yet.
  • You can outsource your thinking but not your understanding. Tweet that "blew his mind." His own LLM Wiki Pattern use is partly to keep understanding in the loop — every read article re-projected into a personal wiki.
  • Hiring is broken for agentic engineers. Whiteboard puzzles ≠ this work. Real test: "build a Twitter clone with agents, secure it, then I run 10 codecs against it for X hours and try to break it." Watch how someone uses tools, builds big projects, configures their setup.
  • Long-term extrapolation: neural-first computers — diffusion-rendered UIs, raw video/audio in, classical CPUs as co-processors. Currently inverted; will likely flip piecewise.
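
The verifiability claim above can be made concrete with a small sketch. This is not something shown in the talk. It is a minimal illustration, under assumptions, of what "a task you can build an RL environment around" might look like: the reward is computed by a program (here, test cases) and not by human judgment. The names `verifiable_reward`, `solve`, and `TESTS` are hypothetical, and the toy spec (square the input) is chosen only for brevity.

```python
from typing import Sequence, Tuple

# Hypothetical task: the policy must emit Python source that defines `solve(x)`.
TestCase = Tuple[int, int]  # (input, expected output)


def verifiable_reward(candidate_source: str, tests: Sequence[TestCase]) -> float:
    """Return the fraction of test cases the candidate passes (0.0 to 1.0)."""
    namespace: dict = {}
    try:
        # Assumption: trusted input. A real environment would sandbox this call.
        exec(candidate_source, namespace)
        solve = namespace["solve"]
    except Exception:
        return 0.0  # code that fails to load, or lacks `solve`, earns nothing

    passed = 0
    for arg, expected in tests:
        try:
            if solve(arg) == expected:
                passed += 1
        except Exception:
            pass  # a crash on one case counts as a failure for that case
    return passed / len(tests) if tests else 0.0


# Illustrative spec: square the input.
TESTS = [(0, 0), (2, 4), (-3, 9)]
good = "def solve(x):\n    return x * x\n"
bad = "def solve(x):\n    return x + x\n"

print(verifiable_reward(good, TESTS))  # all three cases pass
print(verifiable_reward(bad, TESTS))   # fails only the (-3, 9) case
```

The design point is that the score needs no human in the loop, which is what would let an optimizer run against it at scale. Aesthetic qualities such as "not bloaty" have no equally cheap checker, which is consistent with the note that RL has not been pointed at them yet.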

Cross-source resonance

Cross-links