CLI vs API vs MCP

How LLM agents (esp. Claude Code) talk to external tools. Three sources in this wiki argue about this; the picture is more nuanced than a flat tier list.

Side-by-side

Dimension	CLI	API	MCP
Built for	Agents	Code/humans	Tool discovery
Output	Short, pre-formatted text (~200 tokens)	Raw JSON (often huge)	Tool descriptions + JSON results
Context bloat	None — output digested before reaching agent	Full JSON in context	Tool descriptions in context every turn, even when unused
Discovery	Lazy (you ask the CLI what it does)	Out-of-band (docs)	Eager (all tools loaded upfront)
Auth	One-time, CLI holds the token	Per-request	Per-server (MCP server manages OAuth, refresh, IDs)
Multi-user / audit	Hard to retrofit	Hard to retrofit	Built in
Backend	Local SQLite mirror possible → no rate limits, no round trips	Network round trip per call	Server must stay running
Composability	Pipe / chain commands natively	Manual	Each tool call independent
Cited benchmark¹	100% reliability	—	35× more tokens, 72% reliability on harder tasks
GitHub MCP server²	—	—	80 tools = ~55K tokens injected per session

¹ Benchmark cited by Nate Herk (AI Automation); original source not linked. Verify before quoting. ² Per CLI vs MCP (IBM Technology) — even if you only use 1–2 of 80 tools, all definitions load.

Three positions in this wiki

Source	Position	Best summary
Printing Press (Nate Herk video)	CLI > API > MCP flat tier list	Default to CLI; build one if you have to. MCP loses on tokens.
CLI vs MCP (IBM Technology)	Use both	CLI when commands map to the job; MCP when raw tool gaps are wide, or when auth/audit/multi-user matters.
Boris Cherny on Coding Is Solved (Sequoia AI Ascent)	It doesn't matter	"To the model, it's just tokens." Pick whatever fits. Computer use as a catch-all when nothing else exists.

These can be reconciled: Printing Press's argument is correct for single-user local dev workflows. IBM's nuance applies when you cross enterprise boundaries (auth, audit, multi-user). Boris's view is correct long-term, as the model gets better at picking.

Concrete failure mode (from CLI vs MCP (IBM Technology))

Fetching modelcontextprotocol.io (a Next.js SPA):

MCP Fetcher: 1 call, ~250 tokens, seconds. Done.
CLI curl: gets only the JS bundle. Agent improvises — strips HTML, looks for embedded JSON, eventually writes a Python script to reverse-engineer Next.js's streaming format. Several minutes, 2000+ tokens.

Quotable: "If the agent ever starts reverse-engineering a JavaScript framework just to read a webpage, that's a good sign it picked the wrong one."

When each wins

CLI wins for:

File ops, git, text processing, scripts
Knowledge already in training data
Composability via pipes
Single-user local dev

API wins when:

No CLI exists and CLI would be too much work
You explicitly want raw JSON for downstream processing

MCP wins when:

Raw tool returns the wrong shape (rendered web pages, complex APIs)
Auth needs server-side management
Per-user access control + audit trails are required (org-level deployment)

The "context bloat" problem with MCP

Every loaded MCP server injects its tool list and descriptions into the agent's context on every turn. Run /context in Claude Code with several MCP servers loaded and the cost is visible. Even idle MCPs cost tokens.

The "JSON bloat" problem with APIs

APIs return whatever the upstream service returns — often a 100K-token JSON blob when you wanted three fields. With a CLI, the binary digests the response and emits 200 tokens. The 132K → 2K compression in the Printing Press (Nate Herk video) School demo is the canonical example.

Caveats

CLIs don't bypass upstream rate limits or quotas (e.g. YouTube API daily comment cap still applies).
CLIs that wrap auth-gated sites still need credentials stored somewhere — same secrets-management problem as APIs, in a different file.
Trends to watch: skill systems that load full-tool definitions on demand (per OpenClaw) port the "lazy discovery" win to MCP-shaped systems.

Sources

Printing Press (Nate Herk video) (CLI-favoring vendor view)
CLI vs MCP (IBM Technology) (nuanced "use both")
Boris Cherny on Coding Is Solved (Sequoia AI Ascent) ("doesn't matter")
Agentic AI in the Enterprise (Praveen Akkiraju, CXOTalk) (granular OAuth via API; abstracted via MCP; policy .md files)