"Server is temporarily limiting requests (not your usage limit) · Rate limited" — what it actually means

You're mid-task in Claude Code and every request starts failing with "API Error: Server is temporarily limiting requests (not your usage limit) · Rate limited", even though your usage meters show plenty of headroom (for example, Session 40%, Weekly 20%). Sometimes you also see a "You've hit your usage limit" banner that contradicts those same meters. Here's what's happening, how to confirm it, and the (small) set of things actually in your control.

Short version: it's a server-side overload, not your plan

The message says it outright: "not your usage limit." On the public API this surfaces as HTTP 529 overloaded_error ("The API is temporarily overloaded"), which the official errors docs describe as happening "when APIs experience high traffic across all users." It's a capacity condition on Anthropic's side, shared across everyone. Nothing in your config, your account, or your usage caused it, and meters that show headroom (40%/20% in the example above) confirm you're not actually limited.

Confirm it in 10 seconds

Tell the three lookalikes apart

Three different conditions show up with similar-looking "rate limited / usage" wording. They have different causes and different fixes:

| What you see | What it is | What you can do |
| --- | --- | --- |
| "Server is temporarily limiting requests (not your usage limit)" | 529 overload: server capacity, all users | Back off and retry; it clears when capacity frees up. Not fixable client-side. |
| "You've hit your usage limit" and your meters really are near 100% | Real plan limit: your session/weekly quota | Wait for the window to reset, or reduce consumption. See usage limits guide. |
| Failures cluster right when you fire heavy parallel / sub-agent work | 429 acceleration limit: a sharp spike in your traffic | The docs' fix: ramp up gradually, keep a steady pattern. Throttle parallel fan-out. |

The one true trap is the middle row's banner appearing while your meters show headroom — that's a display mislabel of a 529, not a real limit. Your meters (via /usage) are the authoritative source; believe them over the banner.
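The table and the banner rule above can be written down as a small decision function. This is only a heuristic sketch of the reasoning in this article, not anything Claude Code exposes: the `Meters` type, the `classify` function, the 95% "near the limit" threshold, and the `during_parallel_burst` flag are all hypothetical, and the meter values are assumed to be numbers you read off /usage by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Meters:
    """Percentages read manually from /usage (hypothetical container)."""
    session_pct: float
    weekly_pct: float


def classify(message: str, meters: Meters,
             during_parallel_burst: bool = False,
             near_limit_pct: float = 95.0) -> str:
    """Return the most likely row of the table for an observed failure.

    near_limit_pct is an assumed cut-off for "meters really are near 100%".
    """
    text = message.lower()
    peak = max(meters.session_pct, meters.weekly_pct)

    if "hit your usage limit" in text:
        # The banner only counts as a real limit if the meters agree with it.
        if peak >= near_limit_pct:
            return "plan_limit"
        return "server_overload_mislabeled"

    if during_parallel_burst:
        # Failures that line up with heavy fan-out point at a traffic spike.
        return "acceleration_limit"

    if "not your usage limit" in text:
        return "server_overload"

    return "unknown"
```

For example, `classify("You've hit your usage limit", Meters(40, 20))` returns `"server_overload_mislabeled"`, which is the trap described above: the banner claims a limit, the meters say otherwise, and the meters win.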

What you can and can't do

A 529 is not fixable from your side

It's Anthropic's capacity, shared across all users. No setting, retry-config, or plan change makes the overload go away. The honest answer is: back off and retry with a short delay, and it resolves on its own as load drops. These incidents are typically transient (minutes), not a state your account is stuck in.
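If you hit the same overload from a script of your own that calls the API, "back off and retry" might look like the sketch below. It uses capped exponential backoff with random jitter, which is a common general-purpose pattern rather than anything prescribed by the docs. `OverloadedError` is a hypothetical stand-in for whatever your client raises on a 529, `send` is any zero-argument callable you supply, and the delay values are assumptions to tune.

```python
import random
import time


class OverloadedError(Exception):
    """Hypothetical stand-in for the error your client raises on HTTP 529."""


def call_with_backoff(send, attempts=5, base=1.0, cap=30.0, sleep=time.sleep):
    """Call send() and retry on OverloadedError with jittered backoff.

    The wait before retry k is drawn uniformly from
    [0, min(cap, base * 2**k)] seconds, so retries spread out instead of
    arriving in synchronized waves.
    """
    for attempt in range(attempts):
        try:
            return send()
        except OverloadedError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the overload to the caller
            sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
```

Retrying does not make the overload go away; it only keeps your script from giving up during a transient incident, and the jitter keeps it from adding synchronized load while capacity recovers.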

The parts that are in your control:

"Does a failed/rate-limited request still consume my quota?"

This is the common follow-up, and the honest answer is: don't take anyone's word for it — measure it. Note /usage, trigger the error, check /usage again. A 529 that fails before any tokens are generated shouldn't bill tokens; but auto-retries that re-send context do reprocess tokens. If your meter advances across a window of pure errors with no successful output, that's a concrete bug worth reporting with a request-id attached.
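The before/after measurement can be reduced to a few lines of arithmetic. The sketch below assumes you copy the two /usage readings by hand and count successful responses yourself; `UsageReading`, `assess_error_window`, and the verdict labels are hypothetical names for illustration, not part of any tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageReading:
    """One manual reading of /usage, in percent (hypothetical container)."""
    session_pct: float
    weekly_pct: float


def assess_error_window(before: UsageReading, after: UsageReading,
                        successful_responses: int,
                        tolerance: float = 0.0) -> dict:
    """Compare two readings taken around a window of errors."""
    session_delta = after.session_pct - before.session_pct
    weekly_delta = after.weekly_pct - before.weekly_pct
    advanced = session_delta > tolerance or weekly_delta > tolerance

    if successful_responses > 0:
        # Real output was produced, so any increase has an innocent cause.
        verdict = "inconclusive"
    elif advanced:
        # Meter moved across pure errors: worth reporting with a request-id.
        verdict = "report"
    else:
        verdict = "no_charge_observed"

    return {
        "session_delta": session_delta,
        "weekly_delta": weekly_delta,
        "verdict": verdict,
    }
```

Two caveats on reading the result: a window reset between the two readings can make a delta negative, and meters shown as whole percentages can hide small amounts of usage, so "no_charge_observed" means only that nothing visible moved.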

The "an image could not be processed and was removed" error

If you also see this, it's unrelated to the rate limiting — it's image handling (an oversized or unsupported image in context; very large images can also hit the 32 MB request cap as a 413). Treat it as a separate issue.
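To get a rough sense of when an image alone can push a request past the size cap, the arithmetic below may help. It assumes the image travels base64-encoded inside the request body (base64 turns every 3 bytes into 4 characters) and reads "32 MB" as 32 × 1024 × 1024 bytes; both are assumptions, and `base64_len` and `fits_request_cap` are hypothetical helper names.

```python
REQUEST_CAP_BYTES = 32 * 1024 * 1024  # assumption: "32 MB" interpreted as MiB


def base64_len(raw_bytes: int) -> int:
    """Length of the padded base64 encoding of raw_bytes bytes."""
    return 4 * ((raw_bytes + 2) // 3)


def fits_request_cap(image_bytes: int, other_payload_bytes: int = 0,
                     cap: int = REQUEST_CAP_BYTES) -> bool:
    """True if the encoded image plus the rest of the request fits the cap.

    other_payload_bytes is your own estimate of everything else in the
    request (conversation context, JSON overhead).
    """
    return base64_len(image_bytes) + other_payload_bytes <= cap
```

Under these assumptions an image of about 24 MiB already fills the whole cap once encoded, so the practical ceiling for a raw image file is well below 32 MB.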

Related

Prevent the costly Claude Code failures you can control

npx cc-safe-setup

Free hooks that stop destructive operations, secret leaks, and runaway cost loops before they run. They can't fix a server-side overload, but they catch the failures that are actually on your side.

GitHub · npm

Based on Anthropic's public errors documentation. Behavior can change between versions; check the official docs for the latest. cc-safe-setup is an independent open-source project, not affiliated with Anthropic.
