
Troubleshooting

Diagnose and fix common token-hub errors — 401 invalid token, 402 insufficient balance, 429 rate limit, 502 upstream down, 504 timeout.

Every error response uses the OpenAI shape:

{ "error": { "type": "...", "code": "...", "message": "..." } }

Start by logging the full body — the message field usually tells you the exact upstream reason.
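For example, a small helper (illustrative, not part of any SDK) that turns an error response into a single log line using the fields above:

```python
import json

def describe_error(status, body_text):
    """Summarize a token-hub error body (OpenAI shape) as one log line."""
    try:
        err = json.loads(body_text).get("error", {})
    except (ValueError, TypeError):
        # Body was not JSON at all; log it raw so nothing is lost.
        return f"HTTP {status}: unparseable body: {body_text[:200]}"
    return f"HTTP {status}: {err.get('type')} ({err.get('code')}): {err.get('message')}"
```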

401 authentication_error

The request reached us but the key did not authenticate. Causes in rough order of frequency:

  1. Header is missing or malformed. It must read exactly Authorization: Bearer sk-th_....
  2. Key was revoked from /keys. Revocation takes effect within seconds.
  3. Key belongs to a deleted account.
  4. Someone pasted a truncated key (the UI copies the full string — use the copy button).

Fix: issue a new key at /keys, update the client, and do not commit keys to source control. If you see 401 immediately after creating a new key, wait 2–3 seconds for cache propagation and retry.
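A client-side guard can catch the malformed-header and truncated-key causes (items 1 and 4) before the request leaves your machine. The `sk-th_` prefix comes from the key format above; the helper itself is a sketch:

```python
def auth_header(key):
    """Build the Authorization header, failing early on obviously bad keys."""
    key = key.strip()  # whitespace from a sloppy paste is a common culprit
    if not key.startswith("sk-th_"):
        raise ValueError("token-hub keys start with 'sk-th_' (check for truncation)")
    return {"Authorization": f"Bearer {key}"}
```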

402 insufficient_balance

Your balance is below the estimated max cost of the request at upstream list price. We reject the call before forwarding to the provider, so no tokens are consumed.

Fix: top up at /topup. Balance updates within ~15 seconds. If you want to avoid hitting this mid-burst, watch your balance via GET /v1/account/balance (returns current credits in USD) and top up at a threshold.
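A sketch of the threshold pattern, assuming the balance endpoint returns a body like `{"balance": 12.34}` in USD (verify the actual field name against a real response; the threshold value is arbitrary):

```python
import requests

LOW_WATER_USD = 5.00  # example threshold; tune for your burst size

def needs_topup(balance_usd, threshold=LOW_WATER_USD):
    """Pure check, kept separate so the threshold logic is easy to test."""
    return balance_usd < threshold

def current_balance(base_url, key):
    """Fetch the current balance; assumes a {"balance": ...} response body."""
    r = requests.get(f"{base_url}/v1/account/balance",
                     headers={"Authorization": f"Bearer {key}"})
    r.raise_for_status()
    return float(r.json()["balance"])
```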

429 rate_limit_exceeded

You exceeded the limit of 60 RPM per key or 600 RPM per account. The response includes a Retry-After header giving the number of seconds to wait.

Fix pattern:

import time, requests

def call(url, payload, key, retries=3):
    """POST and retry on 429, honoring the Retry-After header."""
    for i in range(retries):
        r = requests.post(url, json=payload, headers={"Authorization": f"Bearer {key}"})
        if r.status_code != 429:
            return r
        wait = int(r.headers.get("Retry-After", "1"))
        time.sleep(wait + i * 0.5)  # wait a little longer on each attempt
    return r

For steady-state traffic above 60 RPM, contact support@sandboxclaw.com for a higher limit rather than retrying harder.

502 upstream_error

token-hub received an error from the upstream provider (Anthropic, OpenAI, Google, DeepSeek, etc.). The error body includes the upstream name and its reason.

Fix steps:

  1. Retry with exponential backoff (1s, 2s, 4s). Most 502s are transient.
  2. If the error persists for one model family, fall back to a sibling: claude-3-5-haiku instead of claude-3-5-sonnet, or gpt-4o-mini instead of gpt-4o.
  3. If several families fail at once, the issue is likely on our side — check support@sandboxclaw.com for an incident mail.

The request is not billed when we return 502.
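Steps 1 and 2 above can be sketched together; the fallback table simply encodes the sibling pairs named in step 2:

```python
import time
import requests

# Sibling fallbacks from step 2 (model IDs as written in this guide).
FALLBACKS = {"claude-3-5-sonnet": "claude-3-5-haiku", "gpt-4o": "gpt-4o-mini"}

def fallback_model(model):
    """Return the sibling to try after persistent 502s, or None."""
    return FALLBACKS.get(model)

def call_with_backoff(url, payload, key, delays=(1, 2, 4)):
    """Retry 502s with exponential backoff, then fall back to a sibling model."""
    headers = {"Authorization": f"Bearer {key}"}
    for delay in delays:
        r = requests.post(url, json=payload, headers=headers)
        if r.status_code != 502:
            return r
        time.sleep(delay)
    sibling = fallback_model(payload.get("model"))
    if sibling:
        return requests.post(url, json={**payload, "model": sibling}, headers=headers)
    return r
```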

504 upstream_timeout

The upstream provider accepted the request but did not respond within the timeout window (120s non-streaming, 600s streaming).

Common causes:

  • Prompt is very long and the model is slow (Claude Sonnet with a 150K-token prompt).
  • max_tokens is very large and the model is taking full budget.
  • Upstream is under load.

Fix: reduce max_tokens, trim the prompt, or switch to a faster model (Gemini 2.0 Flash for long-context, GPT-4o-mini for short). Retrying a 504 is usually fine — it does not consume tokens.
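A minimal way to apply the max_tokens fix before retrying; the cap value here is arbitrary and should match your latency budget:

```python
def tighten(payload, max_tokens_cap=1024):
    """Return a copy of the request with max_tokens capped for a 504 retry."""
    capped = dict(payload)  # leave the original request untouched
    capped["max_tokens"] = min(capped.get("max_tokens", max_tokens_cap), max_tokens_cap)
    return capped
```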

Other useful codes

  • 400 invalid_request_error — unknown model ID, malformed messages, prompt above the model’s context window. Fix the body.
  • 403 permission_denied — the requested model is geo-restricted or disabled for your account. Contact support.
  • 500 internal_server_error — our fault. Safe to retry; open a ticket if it persists.

When in doubt

Email support@sandboxclaw.com with the full request ID (header X-Request-Id) and the error body. We can look up the exact upstream response in the 7-day operational log.