thinking-stall-detector.sh to get warned when thinking exceeds 5 minutes. Takes 30 seconds.
A user on Sonnet 4.6 asked Claude Code to work on a task. The model entered its extended thinking phase — and never came out.
25 minutes later, 16 million tokens were consumed. No useful output was produced. The user's entire quota was gone. They requested a refund. (#51092)
This is not an isolated incident. A related pattern (#49884) shows Opus 4.7 reading a single file, then freezing for 30+ minutes before the API terminates the request. No output, but tokens consumed.
Extended thinking has no built-in token ceiling. When the model enters a reasoning loop without finding a resolution:
The critical gap: during thinking, there's no way for hooks to intervene. The thinking happens between tool calls, not during them.
npx cc-safe-setup --install-example thinking-stall-detector
This hook can't stop thinking mid-stream (nothing can). But it detects the aftermath: if more than 5 minutes pass between consecutive tool calls, it warns you that a thinking stall occurred. This gives you a signal to:
/cost immediatelySave as ~/.claude/hooks/thinking-stall-detector.sh:
#!/bin/bash
INPUT=$(cat)
TOOL=$(echo "$INPUT" | jq -r '.tool_name // empty' 2>/dev/null)
STATE_FILE="/tmp/cc-thinking-stall-last-call"
WARN_SECS="${CC_STALL_WARN_SECS:-300}"
NOW=$(date +%s)
LAST=$(cat "$STATE_FILE" 2>/dev/null || echo "$NOW")
echo "$NOW" > "$STATE_FILE"
GAP=$((NOW - LAST))
if [ "$GAP" -ge "$WARN_SECS" ]; then
MINUTES=$((GAP / 60))
echo "⚠️ Thinking stall: ${MINUTES}m with no tool activity." >&2
echo "Check /cost. If tokens spiked, restart the session." >&2
fi
exit 0
Add to ~/.claude/settings.json:
{
"hooks": {
"PreToolUse": [{
"matcher": "",
"hooks": [{"type": "command",
"command": "bash ~/.claude/hooks/thinking-stall-detector.sh"}]
}]
}
}
If you're using the Claude API directly, you have more control:
max_tokens to cap total output (including thinking)thinking parameterFor Max/Pro plan users in Claude Code, these API-level controls aren't available — the hook-based detector is your best defense.
thinking-stall-detector.sh/cost every 10-15 minutes during active workThe extended thinking drain is Symptom 40 of 40 documented in the Token Book. Each symptom includes root cause, hook-based fix, and prevention steps.
Token Book — $17 Details + Free Chapters Free Token Checkup (30 sec)