
Claude Code team cost governance: the gaps the official controls don't cover

The official spend limits and usage reporting are real and worth turning on. But on a team, three gaps let one developer's runaway quietly burn the shared quota — and none of the official controls prevent it. Here's what they cover, what they don't, and how to close the difference.

The failure mode at team scale

One misconfigured developer machine or repo doesn't just hurt one person: it burns the shared quota or budget for the whole team. The public issue tracker has a steady stream of exactly this, and it's expensive:

A Max 20x subscriber's weekly limit gone in 2–2.5 days, forcing a cancellation (#65678).
A plugin silently spending ~$93 on a developer's API key over four days (#65543).
A 1M-context model accidentally pinned in settings, draining the limit on every session and surviving /clear (#65283).
A single run burning millions of tokens before anyone noticed.

For one developer this is an annoyance. Across twenty, it's a recurring budget line and a support burden.

What the official controls do cover

Credit where due — turn these on first; they're the baseline:

Org & per-member spend limits on usage credits (Team and seat-based Enterprise), plus a monthly usage-credit limit on Pro/Max via /usage-credits. The system checks you're within the limit before each request.
Month-to-date usage visibility per member, and /usage in the CLI with a breakdown across skills, subagents, plugins, and MCP servers.
Workspace spend limits on the auto-created "Claude Code" workspace for API/Console usage.

If you run on the API and pay in dollars, that covers a lot. The gaps below are where it stops.

The three gaps they don't cover

1. Subscription seats run on an included quota with no dollar cap

Pro, Max, and Team seats run on an included weekly / 5-hour usage allowance, not API dollars. The spend caps that exist govern usage-credit overage — the consumption after the included allowance — not the included quota itself. Per Anthropic's own docs, these limits are "not caps on base subscription seat costs." So when a runaway burns through the included weekly quota, no dollar limit stops it; the developer simply gets hard-blocked and their work stops for the rest of the week. Nothing prevents or refunds the wasted burn.

2. Bedrock / Vertex / Foundry usage is invisible to the Console

Straight from the Claude Code cost docs: "On Bedrock, Vertex, and Foundry, Claude Code does not send metrics from your cloud." The official recommendation is to bolt on a third-party gateway (LiteLLM) to track spend by key. If your org runs Claude Code through a cloud provider — which many enterprises do for data-residency reasons — the native org-level spend management can't see that usage at all.

3. Every control is post-hoc

The native limits and the excellent free tools like ccusage all act after the tokens are spent: they report what was consumed, or hard-stop once a budget is hit. None of them prevent the specific runaway from consuming the quota in the first place. The 1M pin keeps reprocessing a huge context every turn; the retry loop keeps firing; the plugin keeps billing — and you only find out when the limit is gone.

The net: if your team is on subscription seats or on Bedrock/Vertex, the dollar-based controls don't bound the quota that actually gets burned — and even where they do, they stop you after the waste, not before. The only lever left is preventing the burn locally, before it consumes the quota.

Closing the gaps: operator-side prevention

A small set of PreToolUse / SessionStart hooks can catch the common runaways before they consume the quota. These are free and MIT-licensed in cc-safe-setup; the relevant ones for cost:

Runaway	What prevents it
A `[1m]` model pinned in settings, draining the limit across `/clear` (#65283)	`persisted-1m-model-advisor.sh` — flags the pin at session start and names where it's set
A tool loop firing indefinitely (e.g. a research agent fetching URL after URL, #65684)	`webfetch-runaway-guard.sh` — caps a runaway fetch loop and tells the agent to summarize
A plugin/automation silently billing an API key (#65543)	`subscription-api-billing-warner.sh` — warns when usage is routing to API billing instead of the plan
A session quietly burning tokens far above your normal rate	`session-cost-alert.sh` / `quota-anomaly-detector.sh` — alert on abnormal burn before the limit is gone
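
To make the "scoped block" idea concrete, here is a minimal sketch of how a PreToolUse guard against a fetch loop could work. It is not the code of `webfetch-runaway-guard.sh` (that script is shell, and its contents aren't reproduced here); it is written in Python for readability. The cap `MAX_FETCHES`, the counter-file layout, and the function names are all assumptions made for illustration. It relies on the Claude Code hook contract as I understand it: the event arrives as JSON with `session_id` and `tool_name`, and exit code 2 blocks the tool call and passes the message back to the model.

```python
import json
import tempfile
from pathlib import Path

MAX_FETCHES = 25  # hypothetical per-session cap; tune to your workflow


def record_fetch(state_dir: Path, session_id: str) -> int:
    """Increment and return the WebFetch count stored for this session."""
    counter = state_dir / f"webfetch-{session_id}.count"
    count = int(counter.read_text()) + 1 if counter.exists() else 1
    counter.write_text(str(count))
    return count


def decide(payload: str, state_dir: Path, limit: int = MAX_FETCHES) -> tuple[int, str]:
    """Return (exit_code, message) for one PreToolUse event.

    Exit code 2 asks Claude Code to block the call and show the message
    to the model; 0 lets the call through.
    """
    try:
        event = json.loads(payload)
        if event.get("tool_name") != "WebFetch":
            return 0, ""  # scoped: every other tool is left alone
        count = record_fetch(state_dir, str(event.get("session_id", "unknown")))
    except (ValueError, OSError, AttributeError, TypeError):
        return 0, ""  # fail open: a broken guard must never block normal work
    if count > limit:
        message = (f"{count} WebFetch calls this session (cap {limit}). "
                   "Stop fetching and summarize what you have.")
        return 2, message
    return 0, ""


if __name__ == "__main__":
    # Demo with an in-memory event. A real hook would pass sys.stdin.read(),
    # print the message to stderr, and exit with the returned code.
    with tempfile.TemporaryDirectory() as tmp:
        event = json.dumps({"session_id": "demo", "tool_name": "WebFetch"})
        for _ in range(4):
            code, message = decide(event, Path(tmp), limit=3)
        print(code, message)
```

A guard like this would be registered under `PreToolUse` with a matcher of `WebFetch`, so it never runs for other tools. Keeping the decision in a pure function that returns an exit code makes the cap easy to test without starting a session.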

Wiring one up is a few lines in settings.json:

{ "hooks": { "SessionStart": [
  { "matcher": "", "hooks": [
    { "type": "command", "command": "~/.claude/hooks/persisted-1m-model-advisor.sh" } ] } ] } }
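
The check such a SessionStart hook performs can be approximated in a few lines. The sketch below is an illustration in Python, not the contents of `persisted-1m-model-advisor.sh`. It assumes the pin appears either as a `model` value containing `[1m]` in a settings file or in the `ANTHROPIC_MODEL` environment variable; the list of candidate files and the function name `find_1m_pins` are assumptions.

```python
import json
from pathlib import Path


def find_1m_pins(settings_files: list[Path], env: dict[str, str]) -> list[str]:
    """Return readable locations where a "[1m]" model appears to be pinned."""
    found = []
    for path in settings_files:
        try:
            settings = json.loads(path.read_text())
        except (OSError, ValueError):
            continue  # missing or unparsable file: skip it rather than fail
        if not isinstance(settings, dict):
            continue
        model = settings.get("model")
        if isinstance(model, str) and "[1m]" in model:
            found.append(f"{path}: model = {model}")
    env_model = env.get("ANTHROPIC_MODEL", "")
    if "[1m]" in env_model:
        found.append(f"environment: ANTHROPIC_MODEL = {env_model}")
    return found


if __name__ == "__main__":
    import os

    # Assumed locations: user, project, and local project settings.
    candidates = [
        Path.home() / ".claude" / "settings.json",
        Path(".claude") / "settings.json",
        Path(".claude") / "settings.local.json",
    ]
    for location in find_1m_pins(candidates, dict(os.environ)):
        # Advisory only: report where the pin lives and never block the session.
        print(f"Advisory: 1M-context model pinned at {location}")
```

Naming the file that carries the pin is the useful part: a pin that survives `/clear` lives in configuration, so the fix is an edit to that file rather than anything done inside the session.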

They're advisory or scoped blocks, fail open, and don't touch ordinary work — they only fire on the specific runaway. They don't replace the official limits; they cover the window the official limits leave open.
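
As an illustration of what "advisory" and "fail open" mean in practice, the sketch below flags a session whose token burn rate is far above a baseline and stays silent in every other case, including when its inputs are unusable. It is a hypothetical Python sketch, not the logic of `session-cost-alert.sh` or `quota-anomaly-detector.sh`. Where the token counts and baseline rates come from is left open, and the threshold `factor` is an assumed default.

```python
from statistics import median
from typing import Optional


def burn_alert(tokens_used: int, minutes_elapsed: float,
               baseline_rates: list[float], factor: float = 3.0) -> Optional[str]:
    """Return a warning if the session burns tokens far above the usual rate.

    baseline_rates holds tokens-per-minute figures from earlier sessions.
    Returns None when there is nothing to report, and also whenever the
    inputs are unusable, so the check stays advisory and fails open.
    """
    try:
        if minutes_elapsed <= 0 or not baseline_rates:
            return None
        rate = tokens_used / minutes_elapsed
        typical = median(baseline_rates)
        if typical > 0 and rate > factor * typical:
            return (f"Burn rate {rate:,.0f} tokens/min is {rate / typical:.1f}x "
                    f"the typical {typical:,.0f}. "
                    "Check for a loop or an oversized context.")
    except (TypeError, ValueError):
        pass  # bad input must not interrupt the session
    return None


if __name__ == "__main__":
    # Made-up numbers purely for illustration.
    print(burn_alert(900_000, 10, [12_000, 15_000, 18_000]))
```

Using the median keeps one earlier runaway session from inflating the baseline and hiding the next one.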

The team problem: enforcement

Per-developer hooks solve this for one machine. The team problem is making sure every developer and repo actually has them — one person who forgets, or a brand-new repo, reopens the gap for the whole org. A committed .claude/settings.json baseline plus the free CI gate covers a repo; central, audited enforcement across every repo is the piece a per-developer tool can't give you.
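
A per-repo gate of this kind can be quite small. The following is a sketch of the idea in Python, not the actual cc-safe-setup CI gate: it reads a committed `.claude/settings.json` and reports which required hook scripts are not registered. The `REQUIRED_HOOKS` baseline and both function names are assumptions.

```python
import json
from pathlib import Path

# Hypothetical baseline: script names every repo is expected to register.
REQUIRED_HOOKS = {
    "persisted-1m-model-advisor.sh",
    "webfetch-runaway-guard.sh",
}


def registered_commands(settings: dict) -> set[str]:
    """Collect the file name of every hook command in a settings document."""
    names = set()
    for groups in settings.get("hooks", {}).values():
        for group in groups:
            for hook in group.get("hooks", []):
                parts = hook.get("command", "").split()
                if parts:
                    names.add(Path(parts[0]).name)
    return names


def missing_hooks(settings_path: Path, required: set[str] = REQUIRED_HOOKS) -> set[str]:
    """Return the required hook scripts the settings file does not register."""
    try:
        settings = json.loads(settings_path.read_text())
    except (OSError, ValueError):
        return set(required)  # no readable baseline counts as everything missing
    if not isinstance(settings, dict):
        return set(required)
    return set(required) - registered_commands(settings)


if __name__ == "__main__":
    missing = missing_hooks(Path(".claude") / "settings.json")
    # A CI job would exit non-zero when this set is non-empty.
    print("Missing cost hooks:", ", ".join(sorted(missing)) or "none")
```

Unlike the hooks themselves, a gate like this should fail closed: an unreadable or absent settings file is treated as a missing baseline, since that is exactly the case a new repo presents.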

Is org-wide enforcement of cost-prevention hooks worth building? I'm gauging interest before I build it. A 👍 or a one-line note on the discussion "Would a 'cc-safe-setup for Teams' tier be useful?" directly shapes what comes first. The individual hooks stay free, forever.

Start here

cc-safe-setup on GitHub (free, MIT)
Team rollout playbook

If you'd rather have the baseline designed, enforced, and audited for you, see the cc-safe-setup team services (for corporate clients; in Japanese).

By the maintainer of cc-safe-setup — free, MIT safety hooks for Claude Code. Official behavior cited from Anthropic's Claude Code cost docs and spend-limit help; incident references link to the public anthropics/claude-code issue tracker.