
Where Do Your Claude Code Tokens Go?

Intro + Chapter 1 from Token Book — Cut Claude Code Token Usage in Half · 800+ hours of data

Opus 4.7 readers: Since April 18, 2026, Claude Code users have reported 1.46x token consumption, a 40%+ cost increase, and auto mode's safety classifier being hardcoded to Opus 4.6 (#49618). If you landed here because of 4.7 pain, the hooks in Chapter 4 are model-independent. See also the free Survival Guide (66 sections).

Intro — Why this book exists

"I'm not doing anything and my quota keeps disappearing" — this is the most common complaint in the Claude Code community. But you are doing something. Tokens are being consumed in 4 invisible layers: the system prompt, the context window, tool calls, and subagents. You don't see them, so you can't cut them.

This book came out of 800+ hours of autonomous Claude Code operation. The author hit every token trap personally, documented 48 distinct waste symptoms (Chapter 8), and built the cc-safe-setup hook library to automate prevention. Most "save your tokens" advice tells you to do less with Claude Code. This book tells you how to do the same work and pay less for it.

Chapter 1 (below) shows the 4 invisible layers. Chapters 2–10 give you the tools.

Chapter 1 — The 4 Layers of Token Consumption

Layer 1: System Prompt

Every time Claude Code starts a turn, it loads the internal system prompt + your CLAUDE.md + tool definitions + MCP configs. The longer your CLAUDE.md, the higher the base cost per turn: a 500-line file costs roughly five times as much as a 100-line one, and you pay that on every turn.
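
To put a rough number on that base cost, here is a minimal Python sketch. It assumes about 4 characters per token, which is a common rule of thumb and not Claude's actual tokenizer, and it ignores prompt caching, which changes the price per token but not the count. The function names and the sample files are hypothetical.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate. 4 chars/token is a rule of thumb, not a tokenizer."""
    return round(len(text) / chars_per_token)


def per_session_overhead(claude_md: str, turns: int) -> int:
    """Tokens spent re-sending CLAUDE.md across a session of `turns` turns."""
    return estimate_tokens(claude_md) * turns


# Hypothetical files: 100 and 500 lines of 60 characters each.
small = "\n".join(["x" * 60] * 100)
large = "\n".join(["x" * 60] * 500)

small_cost = per_session_overhead(small, turns=50)
large_cost = per_session_overhead(large, turns=50)
print(f"100 lines: ~{small_cost:,} tokens over 50 turns")
print(f"500 lines: ~{large_cost:,} tokens over 50 turns")
```

The absolute numbers depend entirely on the assumed line length and turn count. The useful part is the ratio: the overhead scales linearly with file length and with the number of turns.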

Layer 2: Context Window

As conversation grows, past exchanges accumulate. If the prompt cache is working, these are reused cheaply. But when the cache breaks, the same prompt costs several times more (covered in Chapter 8 — 48 diagnostic symptoms).
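
A small model makes the "several times more" claim concrete. This is a sketch under stated assumptions, not a billing calculator: it assumes cached context is billed at 10% of the base input price (the `cached_rate` parameter, adjust it to your plan's actual pricing), that every turn adds the same number of tokens, and it ignores cache-write surcharges. All names are hypothetical.

```python
def session_input_cost(
    turns: int, tokens_per_turn: int, cached_rate: float = 0.1
) -> tuple[float, float]:
    """Return (cost_with_cache, cost_without_cache) in base-price token units.

    On each turn the whole prior context is re-sent. With a working cache the
    prior context is billed at `cached_rate`; without it, at full price.
    """
    with_cache = 0.0
    without_cache = 0.0
    for turn in range(turns):
        prior = turn * tokens_per_turn  # context accumulated before this turn
        new = tokens_per_turn           # this turn's fresh input
        with_cache += prior * cached_rate + new
        without_cache += prior + new
    return with_cache, without_cache


warm, cold = session_input_cost(turns=40, tokens_per_turn=2_000)
print(f"cache working: {warm:,.0f} token-units")
print(f"cache broken:  {cold:,.0f} token-units ({cold / warm:.1f}x)")
```

Because the prior context grows every turn, the gap between the two totals widens as the session gets longer. A cache break late in a long session is therefore more expensive than one early on.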

Layer 3: Tool Calls

Every file read, file write, and Bash execution consumes tokens. The output of ls -la vs cat large-file.txt differs by orders of magnitude. Tool results have an internal 200K token cap — exceed it and you get truncation → retry → more consumption.
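
One way to keep a single read from dominating a turn is to check the size before reading. The sketch below is an illustration of that idea, not a Claude Code feature or a hook from the book: the function names, the 5,000-token default budget, and the 4-characters-per-token estimate are all assumptions.

```python
import os
import tempfile


def estimated_file_tokens(path: str, chars_per_token: float = 4.0) -> int:
    """Estimate tokens from file size in bytes (treats 1 byte as 1 character)."""
    return round(os.path.getsize(path) / chars_per_token)


def read_within_budget(
    path: str, max_tokens: int = 5_000, chars_per_token: float = 4.0
) -> str:
    """Return the whole file if it fits the budget, else only the leading slice."""
    max_chars = int(max_tokens * chars_per_token)
    with open(path, "r", encoding="utf-8", errors="replace") as fh:
        text = fh.read(max_chars + 1)  # one extra character to detect overflow
    if len(text) <= max_chars:
        return text
    return text[:max_chars] + "\n[truncated: file exceeds token budget]"


# Demo on a throwaway 100-character file with a 10-token (40-character) budget.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("a" * 100)
size_estimate = estimated_file_tokens(tmp.name)
demo = read_within_budget(tmp.name, max_tokens=10)
os.remove(tmp.name)
```

Truncating to the head of the file is the simplest policy. For logs, the tail is usually more useful, so a real guard would likely make the slice direction configurable.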

Layer 4: Subagents

Each subagent maintains its own context. Convenient, but overuse causes token explosion. One user lost 101K+ tokens to a subagent retry loop (#46968).
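
The arithmetic behind a retry loop is simple enough to sketch. The model below assumes that every attempt reloads the subagent's starting context and redoes its work, which is a simplification of real retry behavior. The token figures are invented for illustration and are not taken from the report cited above.

```python
def subagent_tokens(base_context: int, work_tokens: int, attempts: int = 1) -> int:
    """Tokens for one subagent, assuming each attempt reloads context and redoes the work."""
    return (base_context + work_tokens) * attempts


# Hypothetical numbers: each subagent starts with 8K of context and does 12K of work.
clean_run = sum(subagent_tokens(8_000, 12_000) for _ in range(3))
retry_loop = subagent_tokens(8_000, 12_000, attempts=5)

print(f"3 subagents, no retries: {clean_run:,} tokens")
print(f"1 subagent, 5 attempts:  {retry_loop:,} tokens")
```

Under these assumptions, a single subagent stuck in a loop costs more than three subagents that each finish on the first attempt.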

What Eats the Most Tokens (800 Hours of Data)

  1. Large file reads — a single cat of a big file can cost 50K+ tokens
  2. Context bloat — not running /compact during long sessions
  3. Prompt cache destruction — any change to CLAUDE.md invalidates the cache
  4. Unnecessary tool calls — Claude "exploring" by reading files it doesn't need
  5. CLAUDE.md bloat — 500+ lines means thousands of tokens per turn, every turn
  6. Frequent state file updates — writing to todo.md or progress.md every turn

The key insight: Most token waste is invisible. You don't see the system prompt loading, the cache miss, or the 200K truncation retry. The only way to know is to measure, which is what the remaining 9 chapters cover.
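
As a starting point for measuring, the sketch below tallies per-turn usage records and reports how much of the input side was served from cache. The four field names are those of the `usage` object in the Anthropic Messages API. How you collect such records from your own Claude Code sessions is left open here, and the sample data and function names are hypothetical.

```python
from collections import Counter

USAGE_FIELDS = (
    "input_tokens",
    "output_tokens",
    "cache_creation_input_tokens",
    "cache_read_input_tokens",
)


def tally_usage(records: list[dict]) -> Counter:
    """Sum each usage field across records; missing or null fields count as 0."""
    totals = Counter()
    for usage in records:
        for field in USAGE_FIELDS:
            totals[field] += usage.get(field) or 0
    return totals


def cache_hit_ratio(totals: Counter) -> float:
    """Share of input-side tokens that were read from cache."""
    read = totals["cache_read_input_tokens"]
    denom = read + totals["cache_creation_input_tokens"] + totals["input_tokens"]
    return read / denom if denom else 0.0


# Hypothetical three-turn session.
sample = [
    {"input_tokens": 1200, "output_tokens": 300,
     "cache_creation_input_tokens": 9000, "cache_read_input_tokens": 0},
    {"input_tokens": 400, "output_tokens": 250,
     "cache_creation_input_tokens": 0, "cache_read_input_tokens": 9000},
    {"input_tokens": 500, "output_tokens": 100, "cache_read_input_tokens": 9900},
]

totals = tally_usage(sample)
ratio = cache_hit_ratio(totals)
```

A ratio that drops sharply between two sessions with similar work is a hint that the cache was invalidated somewhere, which is the kind of invisible waste described above.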

What's in the Rest of the Book

Get the full Token Book

10 chapters · 48 diagnostic symptoms · Copy-paste templates

ROI quick-math: Max ($200/mo) saves ~10% = $20/mo → pays back in month 1. Pro ($20/mo) saves ~10% = $2/mo → pays back in ~8 months, plus hours not spent debugging quota.

Buy on Ko-fi — $17 · Buy on Zenn — ¥2,500

Not buying today? Run the free Token Checkup — 5 questions, 30 seconds, tells you which chapter(s) you most need.

💌 Want ongoing, not one-shot? CC Safety Lab Founder — ¥500/month. Monthly issue covering the last month's Claude Code incidents, new safety hooks, and measured token-saving experiments. May 2026 issue ships 2026-05-05.