Scenarios
Concrete, copy-paste-ready walkthroughs for the things developers actually build — retrieval, chat, classification — against one API key.
Chat app with streaming responses
Build a Next.js chat app that streams LLM output token-by-token through token-hub. Works with any supported model; swap providers without touching your client code.
- Next.js 14
- React
- Edge Runtime
- Server-Sent Events
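The streaming wire format the client consumes is the OpenAI-style Server-Sent Events stream: one `data:` line per chunk, ending with a `data: [DONE]` sentinel. A minimal Python sketch of that protocol is below (the walkthrough itself uses Next.js); the token-hub base URL, the `TOKEN_HUB_API_KEY` env var, and the model name are illustrative assumptions, as is the premise that token-hub exposes an OpenAI-compatible `/v1/chat/completions` endpoint.

```python
import json
import os


def parse_sse_line(line: str):
    """Extract the token delta from one SSE line, or None.

    Streaming chat completions arrive as lines like
      data: {"choices":[{"delta":{"content":"Hi"}}]}
    terminated by a final  data: [DONE]  sentinel.
    """
    line = line.strip()
    if not line.startswith("data:"):
        return None  # blank keep-alives, comments, event fields
    payload = line[len("data:"):].strip()
    if payload == "[DONE]":
        return None  # end-of-stream sentinel
    chunk = json.loads(payload)
    # Role-only chunks carry no "content"; .get() skips them cleanly.
    return chunk["choices"][0]["delta"].get("content")


def stream_chat(prompt: str, model: str = "deepseek-chat"):
    """Yield tokens from a streaming chat completion (network code)."""
    import requests  # deferred so the parser above stays import-safe offline

    resp = requests.post(
        "https://api.token-hub.example/v1/chat/completions",  # assumed URL
        headers={"Authorization": f"Bearer {os.environ['TOKEN_HUB_API_KEY']}"},
        json={
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "stream": True,
        },
        stream=True,
    )
    for raw in resp.iter_lines(decode_unicode=True):
        token = parse_sse_line(raw or "")
        if token:
            yield token


if __name__ == "__main__" and "TOKEN_HUB_API_KEY" in os.environ:
    for tok in stream_chat("Say hello"):
        print(tok, end="", flush=True)
```

Because the provider is selected by the `model` field, switching from DeepSeek to any other supported model changes one string, not the streaming loop.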
Classify 10k support tickets cheaply
Run large-volume classification jobs through token-hub using DeepSeek or Qwen — cheap, fast, and easy to parallelize with asyncio + aiohttp.
- Python
- asyncio
- aiohttp
- DeepSeek V3
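The fan-out pattern above can be sketched as follows: a semaphore caps in-flight requests while `asyncio.gather` drives the whole batch. The endpoint URL, env var, model id, and label set are illustrative assumptions; the key idea is bounded concurrency plus a normalizer that snaps free-form model replies onto a fixed label set.

```python
import asyncio
import os

LABELS = ["billing", "bug", "feature-request", "other"]


def normalize_label(raw: str) -> str:
    """Snap a free-form model reply onto one of the allowed labels."""
    cleaned = raw.strip().strip(".").lower()
    return cleaned if cleaned in LABELS else "other"


async def classify(session, sem, ticket: str) -> str:
    """Classify one ticket; the semaphore caps concurrent requests."""
    async with sem:
        async with session.post(
            "https://api.token-hub.example/v1/chat/completions",  # assumed URL
            headers={"Authorization": f"Bearer {os.environ['TOKEN_HUB_API_KEY']}"},
            json={
                "model": "deepseek-chat",
                "messages": [{
                    "role": "user",
                    "content": (
                        f"Classify this support ticket as one of "
                        f"{', '.join(LABELS)}. Reply with the label only.\n\n{ticket}"
                    ),
                }],
            },
        ) as resp:
            body = await resp.json()
            return normalize_label(body["choices"][0]["message"]["content"])


async def classify_all(tickets, concurrency: int = 50):
    """Fan out over all tickets with bounded concurrency."""
    import aiohttp  # deferred so the pure helper stays import-safe

    sem = asyncio.Semaphore(concurrency)
    async with aiohttp.ClientSession() as session:
        return await asyncio.gather(*(classify(session, sem, t) for t in tickets))


if __name__ == "__main__" and "TOKEN_HUB_API_KEY" in os.environ:
    results = asyncio.run(classify_all(["My invoice is wrong", "App crashes on login"]))
    print(results)
```

At 10k tickets, tune `concurrency` to whatever rate limit your token-hub plan allows; `asyncio.Semaphore` keeps the burst bounded regardless of batch size.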
RAG pipeline with model mixing
Build a retrieval-augmented generation pipeline using token-hub. Embed documents with one model, retrieve from your vector store, and generate answers with Claude — all through one API key.
- Python
- LangChain
- Chroma
- DeepSeek embeddings
- Claude 3.5 Sonnet
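The embed-retrieve-generate flow can be sketched end to end. For compactness this sketch talks to Chroma via the `chromadb` client directly rather than the LangChain wrappers the walkthrough uses; the base URL, env var, and model ids (`deepseek-embedding`, `claude-3.5-sonnet`) are illustrative assumptions, as is the premise that token-hub exposes OpenAI-compatible `/v1/embeddings` and `/v1/chat/completions` endpoints.

```python
import os


def build_prompt(question: str, chunks: list[str]) -> str:
    """Assemble the grounding prompt from retrieved chunks."""
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return (
        "Answer using only the context below. Cite chunk numbers.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


def answer(question: str, documents: list[str]) -> str:
    """Embed, retrieve, generate (network code)."""
    import chromadb   # deferred so build_prompt stays import-safe
    import requests

    base = "https://api.token-hub.example/v1"  # assumed base URL
    headers = {"Authorization": f"Bearer {os.environ['TOKEN_HUB_API_KEY']}"}

    def embed(texts):
        # One model for embeddings...
        r = requests.post(f"{base}/embeddings", headers=headers,
                          json={"model": "deepseek-embedding", "input": texts})
        return [d["embedding"] for d in r.json()["data"]]

    # Index the documents in an in-memory Chroma collection.
    coll = chromadb.Client().create_collection("docs")
    coll.add(ids=[str(i) for i in range(len(documents))],
             documents=documents, embeddings=embed(documents))

    # Retrieve the chunks closest to the question.
    hits = coll.query(query_embeddings=embed([question]), n_results=3)
    prompt = build_prompt(question, hits["documents"][0])

    # ...and a different model for generation, through the same key.
    r = requests.post(f"{base}/chat/completions", headers=headers,
                      json={"model": "claude-3.5-sonnet",
                            "messages": [{"role": "user", "content": prompt}]})
    return r.json()["choices"][0]["message"]["content"]
```

The model-mixing point is visible in the two `model` fields: embedding and generation hit different providers, but both requests carry the same token-hub key.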