← cc-safe-setup

Claude Code Extended Thinking Consumed 16 Million Tokens in 25 Minutes

April 20, 2026 · 3 min read · by yurukusa

TL;DR: Extended thinking can silently consume millions of tokens with no output. Install thinking-stall-detector.sh to get warned when thinking exceeds 5 minutes. Takes 30 seconds.

What Happened

A user on Sonnet 4.6 asked Claude Code to work on a task. The model entered its extended thinking phase — and never came out.

25 minutes later, 16 million tokens were consumed. No useful output was produced. The user's entire quota was gone. They requested a refund. (#51092)

This is not an isolated incident. A related pattern (#49884) shows Opus 4.7 reading a single file, then freezing for 30+ minutes before the API terminates the request. No output, but tokens consumed.

Why This Happens

Extended thinking has no built-in token ceiling. When the model enters a reasoning loop without finding a resolution:

  1. The model generates thinking tokens internally
  2. No tools are called, so no hooks fire
  3. From the outside, the session looks like it's "working"
  4. Tokens are consumed at full rate
  5. This continues until the context window limit is hit or the API times out

The critical gap: during thinking, there's no way for hooks to intervene. The thinking happens between tool calls, not during them.

The Fix: Thinking Stall Detector

30-second install:
npx cc-safe-setup --install-example thinking-stall-detector

This hook can't stop thinking mid-stream (nothing can). But it detects the aftermath: if more than 5 minutes pass between consecutive tool calls, it warns you that a thinking stall occurred. This gives you a signal to:

Manual Installation

Save as ~/.claude/hooks/thinking-stall-detector.sh:

#!/bin/bash
INPUT=$(cat)
TOOL=$(echo "$INPUT" | jq -r '.tool_name // empty' 2>/dev/null)
STATE_FILE="/tmp/cc-thinking-stall-last-call"
WARN_SECS="${CC_STALL_WARN_SECS:-300}"
NOW=$(date +%s)
LAST=$(cat "$STATE_FILE" 2>/dev/null || echo "$NOW")
echo "$NOW" > "$STATE_FILE"
GAP=$((NOW - LAST))
if [ "$GAP" -ge "$WARN_SECS" ]; then
    MINUTES=$((GAP / 60))
    echo "⚠️ Thinking stall: ${MINUTES}m with no tool activity." >&2
    echo "Check /cost. If tokens spiked, restart the session." >&2
fi
exit 0

Add to ~/.claude/settings.json:

{
  "hooks": {
    "PreToolUse": [{
      "matcher": "",
      "hooks": [{"type": "command",
        "command": "bash ~/.claude/hooks/thinking-stall-detector.sh"}]
    }]
  }
}

For API Users

If you're using the Claude API directly, you have more control:

For Max/Pro plan users in Claude Code, these API-level controls aren't available — the hook-based detector is your best defense.

Prevention Checklist

40 token drain symptoms diagnosed

The extended thinking drain is Symptom 40 of 40 documented in the Token Book. Each symptom includes root cause, hook-based fix, and prevention steps.

Token Book — $17 Details + Free Chapters Free Token Checkup (30 sec)

Related