CLI vs API vs MCP
How LLM agents (especially Claude Code) talk to external tools. Three sources in this wiki take conflicting positions on this; the picture is more nuanced than a flat tier list.
Side-by-side
| Dimension | CLI | API | MCP |
|---|---|---|---|
| Built for | Agents | Code/humans | Tool discovery |
| Output | Short, pre-formatted text (~200 tokens) | Raw JSON (often huge) | Tool descriptions + JSON results |
| Context bloat | None — output digested before reaching agent | Full JSON in context | Tool descriptions in context every turn, even when unused |
| Discovery | Lazy (you ask the CLI what it does) | Out-of-band (docs) | Eager (all tools loaded upfront) |
| Auth | One-time, CLI holds the token | Per-request | Per-server (MCP server manages OAuth, refresh, IDs) |
| Multi-user / audit | Hard to retrofit | Hard to retrofit | Built in |
| Backend | Local SQLite mirror possible → no rate limits, no round trips | Network round trip per call | Server must stay running |
| Composability | Pipe / chain commands natively | Manual | Each tool call independent |
| Cited benchmark¹ | 100% reliability | — | 35× more tokens, 72% reliability on harder tasks |
| GitHub MCP server² | — | — | 80 tools = ~55K tokens injected per session |
¹ Benchmark cited by Nate Herk (AI Automation); original source not linked. Verify before quoting.
² Per CLI vs MCP (IBM Technology): even if you only use 1–2 of 80 tools, all definitions load.
Three positions in this wiki
| Source | Position | Best summary |
|---|---|---|
| Printing Press (Nate Herk video) | CLI > API > MCP flat tier list | Default to CLI; build one if you have to. MCP loses on tokens. |
| CLI vs MCP (IBM Technology) | Use both | CLI when commands map to the job; MCP when raw tool gaps are wide, or when auth/audit/multi-user matters. |
| Boris Cherny on Coding Is Solved (Sequoia AI Ascent) | It doesn't matter | "To the model, it's just tokens." Pick whatever fits. Computer use as a catch-all when nothing else exists. |
These can be reconciled: Printing Press's argument is correct for single-user local dev workflows. IBM's nuance applies when you cross enterprise boundaries (auth, audit, multi-user). Boris's view is correct long-term, as models get better at picking the right interface on their own.
Concrete failure mode (from CLI vs MCP (IBM Technology))
Fetching modelcontextprotocol.io (a Next.js SPA):
- MCP Fetcher: 1 call, ~250 tokens, seconds. Done.
- CLI curl: gets only the JS bundle. Agent improvises — strips HTML, looks for embedded JSON, eventually writes a Python script to reverse-engineer Next.js's streaming format. Several minutes, 2000+ tokens.
Quotable: "If the agent ever starts reverse-engineering a JavaScript framework just to read a webpage, that's a good sign it picked the wrong one."
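One way to catch this failure early is to check whether a raw fetch returned real page text or only a JavaScript shell before the agent starts improvising. The sketch below is a hypothetical guard, not something described in the IBM source: the function name `looks_like_js_shell` and the 200-character threshold are assumptions chosen for illustration.

```python
from html.parser import HTMLParser


class _VisibleTextCounter(HTMLParser):
    """Counts characters of text that sit outside <script> and <style> tags."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.visible_chars = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.visible_chars += len(data.strip())


def looks_like_js_shell(html: str, min_visible_chars: int = 200) -> bool:
    """Heuristic (assumed threshold): very little visible text suggests a client-rendered page."""
    counter = _VisibleTextCounter()
    counter.feed(html)
    counter.close()
    return counter.visible_chars < min_visible_chars


if __name__ == "__main__":
    # Hypothetical response bodies, not captured from any real site.
    shell = (
        '<html><head><script src="/app.js"></script></head>'
        '<body><div id="root"></div><script>window.__DATA__ = {};</script></body></html>'
    )
    article = "<html><body><p>" + "word " * 100 + "</p></body></html>"
    print(looks_like_js_shell(shell))
    print(looks_like_js_shell(article))
```

If the check fires, that is a cue to switch to a tool that returns rendered content rather than to keep parsing the bundle.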
When each wins
CLI wins for:
- File ops, git, text processing, scripts
- Tools whose usage is already in the model's training data
- Composability via pipes
- Single-user local dev
API wins when:
- No CLI exists and building one would be too much work
- You explicitly want raw JSON for downstream processing
MCP wins when:
- The raw tool returns output in the wrong shape (rendered web pages, complex APIs)
- Auth needs server-side management
- Per-user access control + audit trails are required (org-level deployment)
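The three lists above can be read as a rough decision rule. The sketch below encodes one possible ordering of those criteria; the `Task` fields, the function name `pick_interface`, and the precedence (org-level requirements first, then output shape, then raw JSON, then CLI availability) are all assumptions for illustration, not a rule stated by any of the sources.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical description of a job an agent needs to do."""
    cli_exists: bool = False
    cli_cheap_to_build: bool = False
    wants_raw_json: bool = False
    raw_output_wrong_shape: bool = False  # e.g. a client-rendered web page
    needs_org_controls: bool = False      # server-side auth, per-user access, audit


def pick_interface(task: Task) -> str:
    """Return 'CLI', 'API', or 'MCP' using the criteria listed on this page."""
    # Assumed precedence: organisational requirements outrank token cost.
    if task.needs_org_controls:
        return "MCP"
    if task.raw_output_wrong_shape:
        return "MCP"
    if task.wants_raw_json:
        return "API"
    if task.cli_exists or task.cli_cheap_to_build:
        return "CLI"
    # No CLI and building one would be too much work.
    return "API"


if __name__ == "__main__":
    print(pick_interface(Task(cli_exists=True)))
    print(pick_interface(Task(cli_exists=True, needs_org_controls=True)))
    print(pick_interface(Task()))
```

A different precedence would be equally defensible; the point is only that the criteria are checkable per task rather than a fixed ranking.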
The "context bloat" problem with MCP
Every loaded MCP server injects its tool list and descriptions into the agent's context on every turn. Run /context in Claude Code with several MCP servers loaded and the cost is visible. Even idle MCP servers cost tokens.
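A back-of-envelope calculation makes the idle cost concrete. It uses the GitHub MCP figure from the table (80 tools, roughly 55K tokens) and assumes, purely for illustration, that every tool definition is the same size; real definitions vary.

```python
# Figures taken from the side-by-side table on this page.
TOTAL_TOOLS = 80
TOTAL_DEFINITION_TOKENS = 55_000


def definition_overhead(tools_used: int) -> dict:
    """Split definition tokens into used and idle, assuming equal size per tool."""
    per_tool = TOTAL_DEFINITION_TOKENS / TOTAL_TOOLS
    used = per_tool * tools_used
    idle = TOTAL_DEFINITION_TOKENS - used
    return {
        "tokens_per_tool": per_tool,
        "used_tokens": used,
        "idle_tokens": idle,
        "idle_share": idle / TOTAL_DEFINITION_TOKENS,
    }


if __name__ == "__main__":
    stats = definition_overhead(tools_used=2)
    print(f"about {stats['tokens_per_tool']:.0f} tokens per tool definition")
    print(f"{stats['idle_share']:.1%} of definition tokens belong to unused tools")
```

Under that equal-size assumption, using 2 of the 80 tools leaves 97.5% of the definition tokens attached to tools the agent never calls.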
The "JSON bloat" problem with APIs
APIs return whatever the upstream service returns — often a 100K-token JSON blob when you wanted three fields. With a CLI, the binary digests the response and emits 200 tokens. The 132K → 2K compression in the Printing Press (Nate Herk video) School demo is the canonical example.
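The "digest" step is simple to picture. The sketch below is a minimal stand-in for what such a CLI might do internally, assuming the upstream response is a JSON list of records; the function name `digest`, the field names, and the sample payload are hypothetical and are not taken from the Printing Press demo.

```python
import json


def digest(raw_json: str, fields: list[str]) -> str:
    """Reduce a JSON list of records to one short text line per record."""
    records = json.loads(raw_json)
    lines = []
    for record in records:
        # Keep only the requested fields; everything else never reaches the agent.
        parts = [f"{name}={record.get(name)}" for name in fields]
        lines.append(" ".join(parts))
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical upstream payload: each record carries a large field the agent does not need.
    raw = json.dumps([
        {"id": 1, "title": "Intro", "views": 120, "body_html": "<p>" + "x" * 5000 + "</p>"},
        {"id": 2, "title": "Setup", "views": 87, "body_html": "<p>" + "y" * 5000 + "</p>"},
    ])
    summary = digest(raw, ["id", "title", "views"])
    print(summary)
    print(f"raw: {len(raw)} chars, digest: {len(summary)} chars")
```

The same filtering could be done by the agent after an API call, but by then the full payload has already been paid for in context.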
Caveats
- CLIs don't bypass upstream rate limits or quotas (e.g. YouTube API daily comment cap still applies).
- CLIs that wrap auth-gated sites still need credentials stored somewhere — same secrets-management problem as APIs, in a different file.
- Trends to watch: skill systems that load full tool definitions on demand (per OpenClaw) port the "lazy discovery" win to MCP-shaped systems.
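The on-demand loading idea in the last caveat can be sketched as a registry that exposes only a one-line summary per tool until the agent asks for more. This is a hypothetical illustration: the class `LazyToolRegistry`, its methods, and the sample tool names are assumptions, and it does not reproduce how OpenClaw or any MCP client actually works.

```python
class LazyToolRegistry:
    """Hypothetical registry: short summaries up front, full definitions on demand."""

    def __init__(self, definitions: dict[str, dict]):
        self._definitions = definitions
        self.loaded: set[str] = set()

    def index(self) -> str:
        """Cheap listing kept in context every turn: name plus one-line summary."""
        return "\n".join(
            f"{name}: {definition['summary']}"
            for name, definition in sorted(self._definitions.items())
        )

    def load(self, name: str) -> dict:
        """Return the full definition only when the agent asks for this tool."""
        self.loaded.add(name)
        return self._definitions[name]


if __name__ == "__main__":
    # Illustrative tool names and schemas, not a real server's tool list.
    registry = LazyToolRegistry({
        "list_issues": {"summary": "List issues in a repo", "schema": {"repo": "string"}},
        "create_pr": {"summary": "Open a pull request", "schema": {"repo": "string", "title": "string"}},
    })
    print(registry.index())
    print(registry.load("create_pr")["schema"])
```

Only the index is paid for on every turn; a full schema enters context the first time its tool is requested.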
Sources
- Printing Press (Nate Herk video) (CLI-favoring vendor view)
- CLI vs MCP (IBM Technology) (nuanced "use both")
- Boris Cherny on Coding Is Solved (Sequoia AI Ascent) ("doesn't matter")
- Agentic AI in the Enterprise (Praveen Akkiraju, CXOTalk) (granular OAuth via API; abstracted via MCP; policy .md files)