Documentation
Start with the quickstart, then dive into the reference pages for rate limits, error codes, and SDK integration.
Guides
-
Quickstart
Go from zero to your first token-hub API call in under five minutes. Sign up, create a key, send a cURL request, and switch models by editing one string.
-
FAQ
Answers to the questions we get most often about token-hub — models, billing, SDK compatibility, data handling, rate limits, and failover.
-
Troubleshooting
Diagnose and fix common token-hub errors — 401 invalid token, 402 insufficient balance, 429 rate limit, 502 upstream down, 504 timeout.
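The status codes named under Troubleshooting can be sorted into those worth retrying and those that need a fix first. The following is a minimal sketch, not token-hub's documented behaviour: the `describe_error` helper and the retryable/non-retryable split are assumptions made for illustration, and only the code-to-description pairs come from the list above.

```python
# Hypothetical lookup table: the descriptions come from the Troubleshooting
# blurb; the retry flags are an assumption, not a documented policy.
ERRORS = {
    401: ("invalid token", False),         # fix the key before retrying
    402: ("insufficient balance", False),  # top up before retrying
    429: ("rate limit", True),             # back off, then retry
    502: ("upstream down", True),          # provider issue, retry later
    504: ("timeout", True),                # retry, possibly with a longer timeout
}


def describe_error(status: int) -> tuple[str, bool]:
    """Return (description, retryable); unknown codes are treated as non-retryable."""
    return ERRORS.get(status, ("unknown error", False))


if __name__ == "__main__":
    for code in (401, 429, 504):
        label, retry = describe_error(code)
        print(code, label, "retry" if retry else "do not retry")
```

A client could call a helper like this before deciding whether to back off and resend a request or surface the error to the user.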
Scenarios
-
RAG pipeline with model mixing
Build a retrieval-augmented generation pipeline using token-hub. Embed documents with one model, retrieve from your vector store, and generate answers with Claude — all through one API key.
-
Chat app with streaming responses
Build a Next.js chat app that streams LLM output token-by-token through token-hub. Works with any supported model; swap providers without touching your client code.
-
Classify 10k support tickets cheaply
Run large-volume classification jobs through token-hub using DeepSeek or Qwen — cheap, fast, and easy to parallelize with asyncio + aiohttp.
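The parallelism mentioned in the classification scenario usually comes down to capping how many requests are in flight at once. The sketch below shows that pattern with `asyncio` alone. `classify_ticket` is a hypothetical stand-in that labels text locally; in a real job it would send the ticket to the API with aiohttp, and the function names, labels, and concurrency limit are all assumptions.

```python
import asyncio


async def classify_ticket(text: str) -> str:
    """Hypothetical stand-in for one API call; a real job would POST the ticket with aiohttp."""
    await asyncio.sleep(0)  # yield control, as a network call would
    return "billing" if "invoice" in text.lower() else "other"


async def classify_all(tickets: list[str], max_concurrency: int = 8) -> list[str]:
    """Classify tickets concurrently with at most max_concurrency calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(text: str) -> str:
        async with semaphore:
            return await classify_ticket(text)

    # asyncio.gather returns results in the same order as the inputs
    return await asyncio.gather(*(bounded(t) for t in tickets))


if __name__ == "__main__":
    labels = asyncio.run(classify_all(["Invoice is wrong", "App crashes on login"]))
    print(labels)
```

The semaphore keeps a large batch from opening thousands of connections at once, which also helps stay under rate limits.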