Scenarios
Concrete, copy-paste-ready walkthroughs for the things developers actually build — retrieval, chat, classification — against one API key.
Chat app with streaming responses
Build a Next.js chat app that streams LLM output token-by-token through token-hub. Works with any supported model; swap providers without touching your client code.
- Next.js 14
- React
- Edge Runtime
- Server-Sent Events
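The streaming wire format the client consumes is the OpenAI-style Server-Sent Events stream: one `data:` line per chunk, ending with a `data: [DONE]` sentinel. A minimal Python sketch of that protocol is below (the walkthrough itself uses Next.js); the token-hub base URL, the `TOKEN_HUB_API_KEY` env var, and the model name are illustrative assumptions, as is the premise that token-hub exposes an OpenAI-compatible `/v1/chat/completions` endpoint.

```python
import json
import os


def parse_sse_line(line: str):
    """Extract the token delta from one SSE line, or None.

    Streaming chat completions arrive as lines like
      data: {"choices":[{"delta":{"content":"Hi"}}]}
    terminated by a final  data: [DONE]  sentinel.
    """
    line = line.strip()
    if not line.startswith("data:"):
        return None  # blank keep-alives, comments, event fields
    payload = line[len("data:"):].strip()
    if payload == "[DONE]":
        return None  # end-of-stream sentinel
    chunk = json.loads(payload)
    # Role-only chunks carry no "content"; .get() skips them cleanly.
    return chunk["choices"][0]["delta"].get("content")


def stream_chat(prompt: str, model: str = "deepseek-chat"):
    """Yield tokens from a streaming chat completion (network code)."""
    import requests  # deferred so the parser above stays import-safe offline

    resp = requests.post(
        "https://api.token-hub.example/v1/chat/completions",  # assumed URL
        headers={"Authorization": f"Bearer {os.environ['TOKEN_HUB_API_KEY']}"},
        json={
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "stream": True,
        },
        stream=True,
    )
    for raw in resp.iter_lines(decode_unicode=True):
        token = parse_sse_line(raw or "")
        if token:
            yield token


if __name__ == "__main__" and "TOKEN_HUB_API_KEY" in os.environ:
    for tok in stream_chat("Say hello"):
        print(tok, end="", flush=True)
```

Because the provider is selected by the `model` field, switching from DeepSeek to any other supported model changes one string, not the streaming loop.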
Classify 10k support tickets cheaply
Run large-volume classification jobs through token-hub using DeepSeek or Qwen — cheap, fast, and easy to parallelize with asyncio + aiohttp.
- Python
- asyncio
- aiohttp
- DeepSeek V3
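The fan-out pattern above can be sketched as follows: a semaphore caps in-flight requests while `asyncio.gather` drives the whole batch. The endpoint URL, env var, model id, and label set are illustrative assumptions; the key idea is bounded concurrency plus a normalizer that snaps free-form model replies onto a fixed label set.

```python
import asyncio
import os

LABELS = ["billing", "bug", "feature-request", "other"]


def normalize_label(raw: str) -> str:
    """Snap a free-form model reply onto one of the allowed labels."""
    cleaned = raw.strip().strip(".").lower()
    return cleaned if cleaned in LABELS else "other"


async def classify(session, sem, ticket: str) -> str:
    """Classify one ticket; the semaphore caps concurrent requests."""
    async with sem:
        async with session.post(
            "https://api.token-hub.example/v1/chat/completions",  # assumed URL
            headers={"Authorization": f"Bearer {os.environ['TOKEN_HUB_API_KEY']}"},
            json={
                "model": "deepseek-chat",
                "messages": [{
                    "role": "user",
                    "content": (
                        f"Classify this support ticket as one of "
                        f"{', '.join(LABELS)}. Reply with the label only.\n\n{ticket}"
                    ),
                }],
            },
        ) as resp:
            body = await resp.json()
            return normalize_label(body["choices"][0]["message"]["content"])


async def classify_all(tickets, concurrency: int = 50):
    """Fan out over all tickets with bounded concurrency."""
    import aiohttp  # deferred so the pure helper stays import-safe

    sem = asyncio.Semaphore(concurrency)
    async with aiohttp.ClientSession() as session:
        return await asyncio.gather(*(classify(session, sem, t) for t in tickets))


if __name__ == "__main__" and "TOKEN_HUB_API_KEY" in os.environ:
    results = asyncio.run(classify_all(["My invoice is wrong", "App crashes on login"]))
    print(results)
```

At 10k tickets, tune `concurrency` to whatever rate limit your token-hub plan allows; `asyncio.Semaphore` keeps the burst bounded regardless of batch size.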
RAG pipeline with model mixing
Build a retrieval-augmented generation pipeline using token-hub. Embed documents with one model, retrieve from your vector store, and generate answers with Claude — all through one API key.
- Python
- LangChain
- Chroma
- DeepSeek embeddings
- Claude 3.5 Sonnet
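The embed-retrieve-generate flow can be sketched end to end. For compactness this sketch talks to Chroma via the `chromadb` client directly rather than the LangChain wrappers the walkthrough uses; the base URL, env var, and model ids (`deepseek-embedding`, `claude-3.5-sonnet`) are illustrative assumptions, as is the premise that token-hub exposes OpenAI-compatible `/v1/embeddings` and `/v1/chat/completions` endpoints.

```python
import os


def build_prompt(question: str, chunks: list[str]) -> str:
    """Assemble the grounding prompt from retrieved chunks."""
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return (
        "Answer using only the context below. Cite chunk numbers.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


def answer(question: str, documents: list[str]) -> str:
    """Embed, retrieve, generate (network code)."""
    import chromadb   # deferred so build_prompt stays import-safe
    import requests

    base = "https://api.token-hub.example/v1"  # assumed base URL
    headers = {"Authorization": f"Bearer {os.environ['TOKEN_HUB_API_KEY']}"}

    def embed(texts):
        # One model for embeddings...
        r = requests.post(f"{base}/embeddings", headers=headers,
                          json={"model": "deepseek-embedding", "input": texts})
        return [d["embedding"] for d in r.json()["data"]]

    # Index the documents in an in-memory Chroma collection.
    coll = chromadb.Client().create_collection("docs")
    coll.add(ids=[str(i) for i in range(len(documents))],
             documents=documents, embeddings=embed(documents))

    # Retrieve the chunks closest to the question.
    hits = coll.query(query_embeddings=embed([question]), n_results=3)
    prompt = build_prompt(question, hits["documents"][0])

    # ...and a different model for generation, through the same key.
    r = requests.post(f"{base}/chat/completions", headers=headers,
                      json={"model": "claude-3.5-sonnet",
                            "messages": [{"role": "user", "content": prompt}]})
    return r.json()["choices"][0]["message"]["content"]
```

The model-mixing point is visible in the two `model` fields: embedding and generation hit different providers, but both requests carry the same token-hub key.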