SecondBrain
Ask the Brain
Index/Conceptupdated Wed Jun 24 2026 08:00:00 GMT+0800 (Philippine Standard Time)

Context Development Lifecycle

context-engineeringlifecycledevops-paralleldeboiscdlc

Context Development Lifecycle

Patrick Debois's framing: if context is the new code, then the loop you wrap around it should look like a software development lifecycle. Four phases, drawn as an infinity loop in the talk:

Generate → Evaluate → Distribute → Observe → (back to Generate)

The whole idea is that every piece of context — prompts, agents.md/CLAUDE.md, skills, library docs, tickets, specs — should pass through this loop with the same rigor a code change passes through SDLC.

The four phases

1. Generate

Where most teams already are. Methods:

  • Direct prompting
  • Reusable instruction files (agents.md, CLAUDE.md, etc — a soft standardization is emerging)
  • Pulling library/API docs (avoid model hallucination on stale versions)
  • MCP-style pulls from GitLab/GitHub/Slack/Jira
  • Spec-driven development (write the spec; the agent breaks it into a plan)

2. Evaluate

The least practiced phase. Patrick's testing layers, by analogy to code testing:

  • Linting — schema validation (e.g. a skill must have a description; under N chars)
  • Grammarly-style checks — ask an LLM "is this context understandable?" Get back "not verbose enough" / "missing pieces"
  • LLM as Judge — given the prompt and a piece of generated code, did it follow the rule? (e.g. "every API endpoint must use /awesome prefix" → judge checks)
  • Sandboxed agent tests — bind the judge to tools, let it execute the generated code (curl an endpoint, run the test)
  • Error budgets, not pass/fail. Evals are non-deterministic — run 5 times, count successes. "You cannot do exact testing all the time."

CI/CD-friendly with the error-budget caveat. This is the part Patrick says people most underestimate the work of: "you thought you'd save time by writing context instead of code; you'll spend that time writing the right evals."

3. Distribute

How a context piece moves from "in my repo" to "shareable across teams":

  • Repo check-in (zero friction; teammate clones, has it)
  • Packaging — bundle context as a library (skills are a package format that all the major coding agents now recognize)
  • Registries — a marketplace to discover packages (Tessl, Anthropic skill registry, etc)
  • Dependency hell — yes, context will get this too. React-context-package conflicts with front-end-package
  • Security scanning — Snyk and others now scan context for credentials, third-party exposure
  • AI SBOM — provenance: who built this skill, with what model, on which dependencies

4. Observe

Closing the loop:

  • Read agent logs (Agent ND standard is emerging) to see when agents say "I'm missing context X" — surface that across the org
  • PR feedback as context signal: a PR comment "this isn't right" is feedback on the context that produced the PR. Improve the context, not just the patch.
  • Production failures: instrument code, push to prod, capture failures with input/output, propose a test case. Self-healing context loop.
  • Sandbox + context filter — agents are very resourceful at finding env vars / secrets. Sandboxes don't filter context loaded into the agent (agents.md, skills auto-load); you need a "context filter" — Patrick's analog of a web application firewall — to strip prompt injections and unsafe patterns before they reach the agent.

The context flywheel

Patrick's coda: there are three nested loops:

  1. Solo loop — you crafting your own markdown
  2. Team loop — make it a reflex: missing context → add context. Library author / improvement loop
  3. Org-of-teams loop — fix it once, every other team benefits. The flywheel.

Better context → better agent output → better observations → better context.

Why this matters

  • Most teams are doing all four phases ad-hoc. Patrick's contribution is to name them and demand rigor — same move he made with DevOps in 2009.
  • Maps cleanly to existing engineering muscle: anyone who can run a CI/CD pipeline can stand up evals; anyone who maintains a library can package skills.
  • Naming the lifecycle creates a vocabulary for the gaps. "We're strong on Generate, weak on Observe" is a sentence a team can act on.

How it relates to other framings in this wiki

Sources