Index / Comparison · updated Sat May 09 2026 08:00:00 GMT+0800 (Philippine Standard Time)

CLI vs API vs MCP

Tags: agents, tooling, cli, api, mcp, claude-code
Confidence: 83/100 (Corroborated)
Evidence 5/5 · Triangulation 4/5 · Reasoning 5/5 · Groundedness 4/5
4 sources · 4 independent outlets · updated 55d ago
Judge’s rationale & how this score was produced

Best evidence hygiene in the vault: the unverified 35x/72% benchmark is explicitly footnoted 'verify before quoting', every quote and number (55K-token GitHub MCP, 132K-to-2K School demo, Next.js failure mode) matches its source page, and the page's core move is reconciling three genuinely conflicting positions rather than flattening them. Triangulation is good but not perfect — the MCP-bloat claim rests on two sources, and the API column is largely the page's own inference.

What would raise confidence: Tracking down and linking the original Printing Press benchmark (or an independent token/reliability measurement) so the page's one flagged-unverified number becomes verified.

Score = 70% LLM judge (four dimensions above, graded by Claude against the cited sources on Thu Jun 11 2026 08:00:00 GMT+0800 (Philippine Standard Time)) + 30% deterministic metrics (source count, outlet diversity, recency). Levels: 85+ High confidence · 70–84 Corroborated · 50–69 Emerging · <50 Exploratory.

CLI vs API vs MCP

How LLM agents (esp. Claude Code) talk to external tools. Three sources in this wiki argue about this; the picture is more nuanced than a flat tier list.

Side-by-side

| Dimension | CLI | API | MCP |
|---|---|---|---|
| Built for | Agents | Code/humans | Tool discovery |
| Output | Short, pre-formatted text (~200 tokens) | Raw JSON (often huge) | Tool descriptions + JSON results |
| Context bloat | None (output digested before reaching agent) | Full JSON in context | Tool descriptions in context every turn, even when unused |
| Discovery | Lazy (you ask the CLI what it does) | Out-of-band (docs) | Eager (all tools loaded upfront) |
| Auth | One-time, CLI holds the token | Per-request | Per-server (MCP server manages OAuth, refresh, IDs) |
| Multi-user / audit | Hard to retrofit | Hard to retrofit | Built in |
| Backend | Local SQLite mirror possible → no rate limits, no round trips | Network round trip per call | Server must stay running |
| Composability | Pipe / chain commands natively | Manual | Each tool call independent |
| Cited benchmark¹ | 100% reliability | | 35× more tokens, 72% reliability on harder tasks |
| GitHub MCP server² | | | 80 tools = ~55K tokens injected per session |

¹ Benchmark cited by Nate Herk (AI Automation); original source not linked. Verify before quoting.
² Per CLI vs MCP (IBM Technology): even if you only use 1–2 of 80 tools, all definitions load.
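
The GitHub MCP figure in footnote ² can be turned into a rough per-tool cost. This sketch uses only the two numbers from the table (80 tools, ~55K tokens). The even spread of tokens across tools and the choice of 2 tools used are assumptions made for illustration, not measurements.

```python
# Back-of-envelope arithmetic on the GitHub MCP figure from the table.
# Assumption: definition tokens are spread evenly across tools (real
# definitions vary in length).
TOOLS_LOADED = 80
TOKENS_INJECTED = 55_000  # approximate, per the IBM Technology source


def idle_overhead(tools_used: int) -> tuple[float, float]:
    """Return (tokens per tool, tokens spent on tools that were never called)."""
    per_tool = TOKENS_INJECTED / TOOLS_LOADED
    unused = (TOOLS_LOADED - tools_used) * per_tool
    return per_tool, unused


per_tool, unused = idle_overhead(tools_used=2)
print(f"~{per_tool:.0f} tokens per tool definition")
print(f"~{unused:,.0f} tokens spent on definitions that were never called")
```

Under the even-spread assumption, nearly all of the injected tokens pay for tools the session never touches, which is the point footnote ² makes.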

Three positions in this wiki

| Source | Position | Best summary |
|---|---|---|
| Printing Press (Nate Herk video) | CLI > API > MCP flat tier list | Default to CLI; build one if you have to. MCP loses on tokens. |
| CLI vs MCP (IBM Technology) | Use both | CLI when commands map to the job; MCP when raw tool gaps are wide, or when auth/audit/multi-user matters. |
| Boris Cherny on Coding Is Solved (Sequoia AI Ascent) | It doesn't matter | "To the model, it's just tokens." Pick whatever fits. Computer use as a catch-all when nothing else exists. |

These can be reconciled: Printing Press's argument is correct for single-user local dev workflows. IBM's nuance applies when you cross enterprise boundaries (auth, audit, multi-user). Boris's view holds in the long term, as models get better at picking the right interface themselves.

Concrete failure mode (from CLI vs MCP (IBM Technology))

Fetching modelcontextprotocol.io (a Next.js SPA):

  • MCP Fetcher: 1 call, ~250 tokens, seconds. Done.
  • CLI curl: gets only the JS bundle. Agent improvises — strips HTML, looks for embedded JSON, eventually writes a Python script to reverse-engineer Next.js's streaming format. Several minutes, 2000+ tokens.

Quotable: "If the agent ever starts reverse-engineering a JavaScript framework just to read a webpage, that's a good sign it picked the wrong one."

When each wins

CLI wins for:

  • File ops, git, text processing, scripts
  • Commands the model already knows from its training data
  • Composability via pipes
  • Single-user local dev

API wins when:

  • No CLI exists and building one would be too much work
  • You explicitly want raw JSON for downstream processing

MCP wins when:

  • The raw tool returns the wrong shape (rendered web pages, complex APIs)
  • Auth needs server-side management
  • Per-user access control + audit trails are required (org-level deployment)
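
The three lists above can be read as a rough decision rule. The sketch below is one possible encoding, not something any of the sources prescribe: the `Task` fields and the `pick_interface` function are hypothetical names, and checking the MCP criteria first is an assumption about precedence.

```python
from dataclasses import dataclass


@dataclass
class Task:
    # All fields are hypothetical labels for the criteria listed above.
    needs_per_user_audit: bool = False    # org-level access control / audit trails
    needs_server_side_auth: bool = False  # auth managed by a server, not the client
    raw_output_wrong_shape: bool = False  # e.g. rendered web pages, complex APIs
    cli_available: bool = True            # a CLI exists or is cheap to build
    wants_raw_json: bool = False          # downstream code consumes the JSON


def pick_interface(task: Task) -> str:
    """Return 'MCP', 'API', or 'CLI'. MCP criteria are checked first (assumed order)."""
    if (
        task.needs_per_user_audit
        or task.needs_server_side_auth
        or task.raw_output_wrong_shape
    ):
        return "MCP"
    if task.wants_raw_json or not task.cli_available:
        return "API"
    return "CLI"


print(pick_interface(Task()))                             # local dev default
print(pick_interface(Task(raw_output_wrong_shape=True)))  # e.g. a JS-rendered page
print(pick_interface(Task(cli_available=False)))          # nothing to shell out to
```

The default falls through to CLI, which matches the single-user local dev case; the enterprise criteria override it.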

The "context bloat" problem with MCP

Every loaded MCP server injects its tool list and descriptions into the agent's context on every turn. Run /context in Claude Code with several MCP servers loaded and the cost is visible. Even idle MCPs cost tokens.
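
To get a feel for this cost without running an agent, the size of a tool list can be estimated from its serialized form. This is a minimal sketch under stated assumptions: the two tool definitions are invented for illustration (shaped like name/description/input-schema entries), and the 4-characters-per-token ratio is a crude rule of thumb rather than a real tokenizer.

```python
import json

# Hypothetical tool definitions; not taken from any real MCP server.
TOOLS = [
    {
        "name": "create_issue",
        "description": "Create an issue in a repository.",
        "inputSchema": {
            "type": "object",
            "properties": {"repo": {"type": "string"}, "title": {"type": "string"}},
            "required": ["repo", "title"],
        },
    },
    {
        "name": "list_pull_requests",
        "description": "List open pull requests for a repository.",
        "inputSchema": {
            "type": "object",
            "properties": {"repo": {"type": "string"}},
            "required": ["repo"],
        },
    },
]

CHARS_PER_TOKEN = 4  # assumption: rough average, not a tokenizer


def estimate_tokens(tools: list[dict]) -> int:
    """Estimate the tokens the serialized tool list occupies in context."""
    return len(json.dumps(tools)) // CHARS_PER_TOKEN


def session_overhead(tools: list[dict], turns: int) -> int:
    """Tokens attributable to tool definitions if they are present on every turn."""
    return estimate_tokens(tools) * turns


print(estimate_tokens(TOOLS), "tokens per turn (estimate)")
print(session_overhead(TOOLS, turns=20), "tokens over a 20-turn session (estimate)")
```

For real numbers, the /context command mentioned above is the better source; the sketch only shows why the overhead scales with both tool count and turn count.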

The "JSON bloat" problem with APIs

APIs return whatever the upstream service returns — often a 100K-token JSON blob when you wanted three fields. With a CLI, the binary digests the response and emits 200 tokens. The 132K → 2K compression in the Printing Press (Nate Herk video) School demo is the canonical example.
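
The digesting step can be sketched in a few lines. This is an illustration of the idea only, not the Printing Press implementation: the `digest` function, the field names, and the payload are all hypothetical.

```python
import json


def digest(raw_json: str, fields: tuple[str, ...]) -> str:
    """Reduce a JSON list of records to one short text line per record."""
    records = json.loads(raw_json)
    lines = []
    for record in records:
        # Keep only the requested fields; everything else never reaches the agent.
        lines.append(" | ".join(str(record.get(f, "")) for f in fields))
    return "\n".join(lines)


# Hypothetical upstream payload: each record carries bulky fields the agent
# does not need.
payload = json.dumps([
    {"id": 1, "title": "Welcome", "author": "ana", "body": "x" * 5000, "meta": {"views": 10}},
    {"id": 2, "title": "Roadmap", "author": "ben", "body": "y" * 5000, "meta": {"views": 42}},
])

summary = digest(payload, fields=("id", "title", "author"))
print(summary)
print(f"{len(payload)} chars in, {len(summary)} chars out")
```

The reduction happens before anything is printed, so only the short summary lands in the agent's context.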

Caveats

  • CLIs don't bypass upstream rate limits or quotas (e.g. YouTube API daily comment cap still applies).
  • CLIs that wrap auth-gated sites still need credentials stored somewhere — same secrets-management problem as APIs, in a different file.
  • Trends to watch: skill systems that load full tool definitions on demand (per OpenClaw) port the "lazy discovery" win to MCP-shaped systems.
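
The on-demand loading mentioned in the last caveat can be sketched as a registry that keeps a one-line index in context and hands out full definitions only when asked. This is a hypothetical design for illustration, not OpenClaw's actual mechanism; `LazyToolRegistry` and the tool entries are invented.

```python
class LazyToolRegistry:
    """Expose short summaries upfront; return full definitions only on request."""

    def __init__(self, definitions: dict[str, dict]):
        self._definitions = definitions
        self.loaded: set[str] = set()  # tools whose full definition was requested

    def index(self) -> str:
        """Cheap listing intended to sit in context on every turn."""
        return "\n".join(
            f"{name}: {d['summary']}" for name, d in sorted(self._definitions.items())
        )

    def load(self, name: str) -> dict:
        """Fetch one full definition when the agent decides it needs the tool."""
        self.loaded.add(name)
        return self._definitions[name]


# Hypothetical tools for illustration.
registry = LazyToolRegistry({
    "fetch_page": {"summary": "Fetch a rendered web page", "schema": {"url": "string"}},
    "search_issues": {"summary": "Search issues by keyword", "schema": {"query": "string"}},
})

print(registry.index())             # short lines only, no schemas
full = registry.load("fetch_page")  # the full definition is paid for only now
print(full["schema"])
```

The per-turn cost is then the index, and the schema cost is paid once per tool actually used.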

Sources