Updated 2026-06-06 · independent · not affiliated with Anthropic

Claude Code blocked your legitimate work as a "cyber" policy violation?

If "This request triggered cyber-related safeguards" or "appears to violate our Usage Policy" stopped you mid-task on your own legitimate engineering, and every message after it was blocked too, you're hitting a documented false-positive cluster. Here's why one block kills the whole session, how to keep working today, and how to claim back the tokens it burned.

The usage-policy / "cyber" classifier over-triggers on ordinary, vendor-documented engineering: embedded firmware flashing and eFuse provisioning (#64405), routine sysadmin audit commands (#61185), hardening your own software (#63751), and the session getting stuck in a blocked state afterward (#62071). It's well past anecdote: The Register tracked the escalation from ~2–3 reports/month in mid-2025 to 30+ in April 2026 alone. Reaction and comment counts move over time; the linked issues are the source of truth.

The short version. The classifier doesn't just judge your latest message: its sensitivity rises with the security-adjacent context accumulated in the session. A firmware or security session naturally piles up that vocabulary, so eventually a completely benign sentence ("flash the remaining boards", "continue") trips it. Once it does, every later turn re-reads the now-"poisoned" context and re-blocks. In this mode, /clear or a fresh session is the usual escape, and it kills any in-flight work and bills you again to rebuild context. You can't fix the classifier, but you can stop feeding it and contest the billing.

June 2026 update: the cluster widened past "cyber". In June 2026 (Claude Code 2.1.161–2.1.167), users report the same false positive firing on work with no security or sensitive vocabulary at all: ordinary CRUD bug-fixing on your own Node/Mongoose API (#65867), fintech/trading-app development against a testnet (#65873), and pharmacometrics/PK SaaS (#65846, #65866). In some reports it tripped on a bare git worktree status turn, or literally "hi", carrying zero domain content. A full per-account transcript audit (#65873) found every blocked event stamped 2.1.161, the earliest on 2026-06-01, with the same block firing across four unrelated projects. So either the window opens at or before 2.1.161, or the change is server-side and not tied to a CLI release at all. Reports span Windows, macOS and Linux, which points the same way: a server- or model-side change rather than anything OS- or vocabulary-specific.

Practical consequence: when it fires on plain, non-security turns, the "stop feeding it security vocabulary" mitigations below have nothing to reduce. Run the test under "First, find out which failure mode you're in" before anything else, to learn whether you can self-recover or have to escalate.

Why one block kills the whole session

The failure is not "one bad message." It's context accumulation: as a session collects security-shaped tokens (eFuse, secure-boot, iptables, /etc/shadow, exploit/audit terminology, even when entirely legitimate), the running context looks more and more like the thing the classifier is trained to stop. A single large security-shaped tool output can do it in one shot (#61185: a cat of a 17,000-line blocklist). After the first trip, the poisoned context is re-sent every turn, so even "ok" gets blocked (#62071, #63751). That's why the session, not the message, is what dies.

How to keep working today

Pin Sonnet for the security-adjacent work

The over-trigger signature is Opus-family-wide; operators report the same workflows pass far more often on a Sonnet variant. Switch with /model before the firmware/audit/security session. If Sonnet passes the exact message Opus blocked, that's both an immediate unblock and clean evidence for your report.

Don't let the context accumulate

Keep heavy secure-boot / eFuse / exploit-analysis discussion in a separate session from the execution turns, and keep execution terse ("flash board 3"). Read large security files with head -200 / grep -c instead of dumping the whole thing into context. Less accumulated security shape = fewer trips.
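A minimal sketch of the size-capped read, assuming a POSIX-style shell. The function name peek and its defaults are illustrative and not part of Claude Code or cc-safe-setup. It reports the file's line count and an optional match count, then prints only the first lines, so the full file never enters the session context.

```shell
# peek FILE [MAX_LINES] [PATTERN]
# Hypothetical helper: summarize a large file instead of dumping all of it.
peek() {
  peek_file=$1
  peek_max=${2:-200}      # cap on printed lines (default 200)
  peek_pattern=${3:-}     # optional grep pattern to count, not print

  # Total line count; tr strips the padding some wc implementations add.
  printf 'lines: %s\n' "$(wc -l < "$peek_file" | tr -d ' ')"

  # Count matching lines without printing them.
  if [ -n "$peek_pattern" ]; then
    printf 'matches: %s\n' "$(grep -c -- "$peek_pattern" "$peek_file")"
  fi

  # Print only the head of the file.
  head -n "$peek_max" "$peek_file"
}
```

For instance, peek blocklist.txt 50 '^DROP' (a hypothetical file and pattern) would print two summary lines followed by at most 50 lines of content.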

First, find out which failure mode you're in

A plain "new session" can't tell you, because it reuses the same ~/.claude.json and the same account. There are two very different causes and the recovery differs. Interactive version: 3 questions → recover-in-place vs escalate-only, with the exact commands for your case.

Cause	Sign	Recovery
Replayed local state	The block only returns when you `--continue`/`--resume` the poisoned session — the flagged turn is in that session's transcript	Start a genuinely fresh session, or resume from a copy of the transcript truncated to before the flagged turn — recoverable in place
Account / server-side	Even a fresh session, and a fresh git worktree, stay blocked (#65866)	No client-side action clears it; escalate with the `req_…` IDs

The one test that separates them — because it resets the two things a "new session" keeps — is to start fresh with config moved aside (back up, don't delete):

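# Move state aside rather than deleting it. If a .bak from an earlier test
# exists, pick a different name: mv would nest ~/.claude inside ~/.claude.bak.
mv ~/.claude.json ~/.claude.json.bak
mv ~/.claude ~/.claude.bak     # has projects/history; back it up
# re-auth, reopen the project, run one benign turn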

Works in the clean slate → you were replaying local state; recover by clearing just the offending session's transcript, then restore your backup selectively. Still blocked → it's account/server-side, and the req_… IDs from the blocked turns are what let Anthropic trace it independent of the (benign) content.
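For the replayed-local-state case, the truncation step might look like the sketch below. It assumes the session transcript is a line-oriented file with one event per line (for example JSONL) and that you can name a string unique to the flagged turn. Both are assumptions, as is the function name truncate_before. It writes a new copy and never modifies the original. Where Claude Code keeps transcripts, and how it resumes from a copy, is not covered here.

```shell
# truncate_before SRC MARKER OUT
# Hypothetical helper: copy SRC to OUT, stopping just before the first line
# that contains the fixed string MARKER. SRC is left untouched.
truncate_before() {
  # Refuse to write anything if the marker never appears.
  if ! grep -q -F -- "$2" "$1"; then
    echo "marker not found; nothing written" >&2
    return 1
  fi
  # Pass the marker through the environment so awk treats it literally.
  MARKER=$2 awk 'index($0, ENVIRON["MARKER"]) { exit } { print }' "$1" > "$3"
}
```

Writing to a separate output file keeps the original available as evidence for a report, and the not-found check avoids silently producing an untruncated copy.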

If you run hooks, the free, MIT-licensed cc-safe-setup repo ships four advisory-only hooks built for exactly this cluster. None of them block; they only warn or help you stop the token bleed:

npx cc-safe-setup

Hook	What it does
`aup-large-tool-output-warner`	PreToolUse: warns before a `cat`/`find` on a security-shaped path dumps a large output that can flip the classifier — suggests a size-capped variant
`aup-retry-loop-guard`	PostToolUse: detects 3+ blocks in a short window on the same tool and tells you to swap/restart before you re-ingest context on retries — stops the double-billing loop
`aup-block-pattern-logger`	Logs each block (timestamp / model / pattern) so you can show the classifier-shift over time when you report
`aup-false-positive-helper`	SessionStart advisory naming the cluster and the swap/refund paths

The billing harm — and the refund framing that lands

A blocked request that still consumes tokens, then forces a full-context-rebuild restart, charges you twice: the work doesn't happen and you pay to rebuild the context the kill-switch threw away. That is a defensible refund case — but the framing matters. Don't ask for a generic refund; frame it as:

"Tokens consumed by a classifier false-positive and the forced session rebuild it caused — not by requested output," with the request IDs from the blocked turns attached.

Capture the request ID shown with each block (the req_… string) at the time it happens — it's the single most actionable thing you can give support, and the vendor-defect-cost framing is the one that tends to land versus "the filter is too aggressive."
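A small sketch for collecting those IDs, assuming you paste each block message into a plain-text notes file as it happens. The function name and the ID pattern (req_ followed by letters and digits) are assumptions based on the req_… strings described above; adjust the pattern if your IDs look different.

```shell
# collect_req_ids FILE...
# Hypothetical helper: print each distinct req_ ID found in the given files.
collect_req_ids() {
  # -o prints only the matched text, -h omits file names, -E enables "+".
  # LC_ALL=C keeps the sort order the same across locales.
  grep -ohE 'req_[A-Za-z0-9]+' "$@" | LC_ALL=C sort -u
}
```

Running collect_req_ids on your notes file (for example a hypothetical blocks.txt) gives a deduplicated list you can attach to a support request or an issue comment.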

This classifier keeps changing. The trigger surface, the version boundaries, and the workarounds shift release to release — what passes on one build blocks on the next. The free hooks and field guide above are the operator playbook; if you want the evolving picture (new trigger patterns, version boundaries, and defenses as they're found) tracked monthly instead of re-searching GitHub each time, that's what the Claude Code Safety Lab digest (¥500/mo) is for. Start free: the Cluster 9 field guide and the 4-question diagnostic that routes you to the highest-leverage path for your model and frequency.

How to report it so it actually helps

The product-native paths are inconsistent (/feedback and /bug are unavailable in some surfaces) and the Cyber Verification Program appeal often auto-declines individual developers, so a clean public repro plus a private form submission is currently the best combination. Add your case to the matching open issue rather than filing a new duplicate — #64405 (embedded/firmware), #61185 (sysadmin), #63751 (own-software hardening) — and 👍 the request for a first-class private false-positive report path (#64287). A clean repro with the exact two messages (one that passes, one that blocks) and the request IDs is far more actionable than "the filter is too cautious." Check the failure-mode cluster tracker to see if your variant is already documented.

FAQ

Why does it block "ok" or "continue" after the first false positive?

Because the block isn't about that message — it's the accumulated context being re-sent each turn. Once the session holds enough security-shaped content to trip the classifier, every subsequent turn re-reads it and re-blocks. The fix is a fresh session (or /clear), which is also why it costs you a context rebuild.

Will switching to Sonnet really get me unblocked?

Often, yes — the over-trigger signature is Opus-family-wide and the same workflows pass more often on Sonnet. It's a mitigation, not a guarantee; treat a Sonnet pass/Opus block on the identical message as both an unblock and a data point for your report.

Can I get the tokens refunded?

It's a defensible case when the tokens were consumed by the false-positive and the forced restart rather than by output you asked for. Frame it exactly that way, attach the req_… IDs from the blocked turns, and submit through support / the Cyber Verification Program false-positive form. The vendor-defect-cost framing lands better than a generic refund request.

Is this me doing something wrong?

No. Flashing your own boards, auditing your own systems, hardening your own software, and reviewing security are first-class, documented engineering. The classifier cannot reliably tell legitimate work from abuse yet — that's the defect (#64405). Keep a clean repro; you're not the outlier.

Independent reference by an operator running Claude Code 800+ hours, maintainer of cc-safe-setup (free MIT safety hooks). Issue numbers and reaction counts are as of 2026-06-06 and move over time; the linked issues are the source of truth. Not affiliated with Anthropic. This page describes user reports and operator-side mitigations — it is not legal or account advice; for Usage Policy questions, confirm against Anthropic's own documentation and support.