Why Your Claude Code Max Plan Runs Out So Fast

If your 5-hour Max Plan limit runs out after 1-2 hours of normal usage, hidden token sinks are the likely cause. This guide helps you find and fix them.

Quick install — two hooks that diagnose the problem:

npx cc-safe-setup --install-example prompt-usage-logger
npx cc-safe-setup --install-example compact-alert-notification

Step 1: Track What's Consuming Tokens

The prompt-usage-logger hook logs every prompt, with a timestamp, to /tmp/claude-usage-log.txt. After a session, compare the log with your usage dashboard to see which prompts line up with jumps in usage.

# Example log output:
12:34:56 prompt=Read the file src/main.ts and explain the error handling
12:35:23 prompt=Fix the bug in the validateInput function
12:35:45 prompt=(agent spawned — new context window)
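To get a quick overview of a session, the log can be tallied with standard tools. The function below is a minimal sketch that assumes the log looks exactly like the sample above (one entry per line, each containing `prompt=`, with subagent spawns marked by the text "agent spawned"). The function name `summarize_usage_log` is hypothetical and not part of cc-safe-setup.

```shell
# Tally a prompt log in the format shown in the sample output.
# Assumption: one entry per line; spawn entries contain "agent spawned".
summarize_usage_log() {
  log="${1:-/tmp/claude-usage-log.txt}"
  if [ ! -f "$log" ]; then
    echo "no log at $log" >&2
    return 1
  fi
  # grep -c exits non-zero when the count is 0, so "|| true" keeps the 0.
  entries=$(grep -c 'prompt=' "$log" || true)
  agents=$(grep -c 'agent spawned' "$log" || true)
  echo "entries=$entries agent_spawns=$agents"
}
```

Call `summarize_usage_log` with no argument to read the default path, or pass another file. A high `agent_spawns` count relative to `entries` suggests subagents are worth a closer look.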

Step 2: Count Auto-Compactions

The compact-alert-notification hook logs when auto-compaction fires. Each compaction cycle burns tokens to re-summarize the entire context.

# Check after a session:
cat /tmp/claude-compact-log.txt
# If you see 3+ entries, compact-rebuild cycles are a major token sink
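The same check can be scripted. This sketch assumes the hook writes one line per compaction, which the guide does not state explicitly; `count_compactions` and `check_compactions` are hypothetical helper names.

```shell
# Count compaction entries and compare against the 3-entry warning level.
# Assumption: the hook appends one non-empty line per auto-compaction.
count_compactions() {
  log="${1:-/tmp/claude-compact-log.txt}"
  if [ ! -f "$log" ]; then
    echo 0
    return 0
  fi
  # Count non-empty lines; "|| true" keeps the printed 0 when nothing matches.
  grep -c . "$log" || true
}

check_compactions() {
  n=$(count_compactions "$1")
  if [ "$n" -ge 3 ]; then
    echo "$n compactions: compact-rebuild cycles are likely a major token sink"
  else
    echo "$n compactions: below the 3-entry warning level"
  fi
}
```

Run `check_compactions` after a session. A missing log file is treated as zero compactions.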

Step 3: Check Common Token Sinks

Cause	How to Check	Fix
Too many MCP servers	`claude mcp list`	Remove unused servers
Large CLAUDE.md / MEMORY.md	`wc -c CLAUDE.md`	Move reference content to separate files
Auto-compact cycles	3 or more entries in /tmp/claude-compact-log.txt	Run `/compact` manually before the threshold
Large file reads	Check /tmp/claude-usage-log.txt	Use `offset` and `limit` parameters
Subagent spawning	Count "agent spawned" in log	Spawn fewer agents; each Agent call opens a new context window
Deferred Tool Loading (v2.1.89+)	`ToolSearch` calls in transcript	Set `ENABLE_TOOL_SEARCH=false` in settings.json `env`
`--resume` cache breakage	"hi" costs 2-5% quota after resume	Start fresh sessions instead of resuming
Session file self-reads	Claude reads its own `.jsonl` files	Install `read-budget-guard` hook

Step 3b: Disable Deferred Tool Loading (v2.1.89+)

In v2.1.89+, Deferred Tool Loading can break the cache prefix, causing every prompt to rebuild context from scratch. This is one of the most impactful token consumption bugs reported (#41617, #40524).

Fix: Add to .claude/settings.json:

{
  "env": {
    "ENABLE_TOOL_SEARCH": "false"
  }
}

This prevents ToolSearch deferred loading and preserves the cache prefix across turns. Typical symptoms of the bug: "hi" consuming 2% of quota, simple questions using 20% or more, and a session hitting 100% in under 70 minutes.
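To confirm the setting is actually present, a crude text check is often enough. The sketch below greps for the key and value; it does not parse JSON, so it cannot tell whether the key sits inside the `env` object. The function name `tool_search_disabled` is hypothetical.

```shell
# Succeeds if the settings file contains "ENABLE_TOOL_SEARCH": "false".
# Assumption: key and value are on one line, as in the snippet above.
tool_search_disabled() {
  settings="${1:-.claude/settings.json}"
  [ -f "$settings" ] || return 1
  grep -Eq '"ENABLE_TOOL_SEARCH"[[:space:]]*:[[:space:]]*"false"' "$settings"
}
```

For example: `if tool_search_disabled; then echo "set"; else echo "not set"; fi`, run from the project root.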

Step 4: Reduce Consumption

Manual /compact before threshold

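Auto-compact fires at a fixed threshold, regardless of what you are doing at that moment. Running /compact manually before that point is typically cheaper because you choose the timing, for example at a task boundary where less context needs to be carried forward.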

Use plan mode for complex tasks

Planning first reduces total tool calls vs. trial-and-error implementation.

Read files with offset/limit

Don't load full files when you only need a section. A full read of a 10K-line file costs the same tokens whether you need one line or all 10K.
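The saving is easy to see by comparing byte counts. This sketch uses `sed` and `wc` to measure a whole file against a line range, treating `offset` as a starting line and `limit` as a line count (an assumption about how those parameters behave). Bytes are only a proxy for tokens, and `slice_bytes` is a hypothetical helper.

```shell
# Print the size of a whole file next to the size of a line range.
# Usage: slice_bytes FILE START_LINE LINE_COUNT
slice_bytes() {
  file="$1"
  start="$2"
  count="$3"
  end=$((start + count - 1))
  # tr strips the padding some wc implementations add.
  full=$(wc -c < "$file" | tr -d ' ')
  slice=$(sed -n "${start},${end}p" "$file" | wc -c | tr -d ' ')
  echo "full=$full slice=$slice"
}
```

For instance, `slice_bytes src/main.ts 100 50` compares the full file with lines 100 through 149.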

Trim CLAUDE.md

Every byte of CLAUDE.md is loaded into every turn's context. Move reference-only content to files that Claude reads on demand.
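A rough size estimate helps decide whether trimming is worthwhile. The sketch below converts bytes to tokens at about 4 bytes per token, which is only a rule of thumb for English text and not a figure from this guide; real tokenization varies. `estimate_tokens` is a hypothetical helper.

```shell
# Estimate the per-turn token cost of a file from its byte size.
# Assumption: ~4 bytes per token (rough heuristic, not an exact count).
estimate_tokens() {
  file="${1:-CLAUDE.md}"
  if [ ! -f "$file" ]; then
    echo "no such file: $file" >&2
    return 1
  fi
  bytes=$(wc -c < "$file" | tr -d ' ')
  echo "$file: $bytes bytes, ~$((bytes / 4)) tokens per turn"
}
```

Run it before and after moving reference material out of CLAUDE.md to see the difference.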

Step 5: Audit MCP Servers

Each MCP server adds tool definitions to every turn's context. 5+ servers can silently double your token usage.

# List all active MCP servers
claude mcp list

# Check how many tools each server exposes
# More tools = more tokens per turn, even if you never call them

Fix: Remove MCP servers you don't use daily. Keep only what's needed for the current task. You can re-add them later.

Related Issues

This is a recurring problem. If you're experiencing it, you're not alone:

#41930 — Token consumption dramatically increased since the March update
#41866 — Abnormal token consumption making normal work impossible
#41956 — Usage limit reached unexpectedly
#41249 — Excessive token consumption rate
#41788 — Max 20 plan: 100% exhausted in ~70 minutes
#38335 — Max plan session limits exhausted abnormally fast
#41617 — Cache prefix breakage from Deferred Tool Loading
#40524 — Resume causes cache miss and excessive token usage

Full safety setup (8 core hooks + project-specific recommendations):

npx cc-safe-setup --shield


Quick diagnostic: Token Checkup — 5 questions to find where your tokens are going

Complete token optimization system:

📊 Token Optimization Book — 10 chapters, templates included (¥2,500)

📘 Hook Design Guide — Chapter 3 free (¥800 on Zenn)