Chat app with streaming responses — token-hub scenarios

Streaming is the baseline for conversational UX. TokenHub serves the OpenAI-compatible SSE format on /v1/chat/completions, so a server route written for an OpenAI-style backend can target TokenHub by changing the base URL and bearer key.

This scenario keeps the TokenHub key on the server and uses moonshot-v1-8k, the public model that is currently smoke-tested.

Architecture

browser --fetch--> /api/chat
                       |
                       v
               token-hub /v1/chat/completions
                       |
                       v
               enabled upstream model channel

The route handler exists for two reasons: it hides your TokenHub key from the browser, and it gives you a place for product-side auth, rate limits, and logging before forwarding.
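
As a rough illustration of the rate-limit hook mentioned above, here is a minimal fixed-window limiter sketch. The names createRateLimiter, limit, and windowMs are hypothetical and not part of TokenHub. The counters live in process memory, so on an edge runtime with several instances this would likely need a shared store instead.

```typescript
// Hypothetical helper: a fixed-window rate limiter keyed by user ID or IP.
// State is in-memory only, so it is per-instance and resets on cold start.
type Limiter = (key: string, now?: number) => boolean;

function createRateLimiter(limit: number, windowMs: number): Limiter {
  const windows = new Map<string, { start: number; count: number }>();

  return (key, now = Date.now()) => {
    const current = windows.get(key);
    // Start a fresh window if none exists or the previous one has expired.
    if (!current || now - current.start >= windowMs) {
      windows.set(key, { start: now, count: 1 });
      return true;
    }
    // Window is still open: reject once the quota is used up.
    if (current.count >= limit) return false;
    current.count += 1;
    return true;
  };
}
```

In the route handler, one might call the limiter with a user identifier before the upstream fetch and return a 429 response when it yields false.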

The server route

// app/api/chat/route.ts
export const runtime = "edge";

const TOKENHUB_URL = "https://llm.sandboxclaw.com/v1/chat/completions";

export async function POST(req: Request) {
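  // Reject malformed bodies early instead of forwarding them upstream.
  const body = await req.json().catch(() => null);
  const messages = body?.messages;
  if (!Array.isArray(messages) || messages.length === 0) {
    return new Response("Expected a non-empty messages array", { status: 400 });
  }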

  const upstream = await fetch(TOKENHUB_URL, {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${process.env.TOKENHUB_KEY!}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "moonshot-v1-8k",
      messages,
      stream: true,
    }),
  });

  if (!upstream.ok || !upstream.body) {
    return new Response(`Upstream error: ${upstream.status}`, { status: 502 });
  }

  return new Response(upstream.body, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache, no-transform",
      "Connection": "keep-alive",
    },
  });
}

The client

A minimal React component that reads SSE frames and appends each delta to the last assistant message:

// app/page.tsx
"use client";
import { useState } from "react";

type Msg = { role: "user" | "assistant"; content: string };

export default function Chat() {
  const [msgs, setMsgs] = useState<Msg[]>([]);
  const [input, setInput] = useState("");

  async function send() {
    const next: Msg[] = [...msgs, { role: "user", content: input }];
    setMsgs([...next, { role: "assistant", content: "" }]);
    setInput("");

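    const res = await fetch("/api/chat", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ messages: next }),
    });
    // On failure the route returns a non-2xx status with a plain-text body, not a stream.
    if (!res.ok || !res.body) return;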

    const reader = res.body.getReader();
    const decoder = new TextDecoder();
    let buffer = "";

    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      buffer += decoder.decode(value, { stream: true });

      const frames = buffer.split("\n\n");
      buffer = frames.pop() ?? "";

      for (const frame of frames) {
        const line = frame.replace(/^data:\s*/, "");
        if (line === "[DONE]") return;
        try {
          const chunk = JSON.parse(line);
          const delta = chunk.choices?.[0]?.delta?.content ?? "";
          if (delta) {
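            setMsgs((prev) => {
              // Replace the last message instead of mutating it: React may run
              // this updater twice in Strict Mode, which would duplicate the delta.
              const last = prev[prev.length - 1];
              return [...prev.slice(0, -1), { ...last, content: last.content + delta }];
            });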
          }
        } catch {
          // Ignore frames that are not JSON, such as SSE keep-alive comments.
        }
      }
    }
  }

  return (
    <div className="mx-auto max-w-2xl p-6">
      <div className="space-y-3">
        {msgs.map((m, i) => (
          <div key={i} className={m.role === "user" ? "text-right" : ""}>
            <span className="inline-block rounded-lg bg-slate-100 px-3 py-2">
              {m.content}
            </span>
          </div>
        ))}
      </div>
      <div className="mt-4 flex gap-2">
        <input
          value={input}
          onChange={(e) => setInput(e.target.value)}
          className="flex-1 rounded border p-2"
          placeholder="Ask something..."
        />
        <button onClick={send} className="rounded bg-blue-600 px-4 text-white">
          Send
        </button>
      </div>
    </div>
  );
}
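
The frame handling inside the read loop can also be pulled out into a pure function, which makes it testable without a network connection. The following is a sketch under assumptions: parseSseBuffer and its return shape are hypothetical names, and it assumes OpenAI-style chunks where text arrives in choices[0].delta.content.

```typescript
// Hypothetical helper: split an SSE buffer into complete frames and extract
// OpenAI-style content deltas. Returns the unconsumed tail so the caller can
// prepend it to the next network chunk.
type ParseResult = { deltas: string[]; rest: string; done: boolean };

function parseSseBuffer(buffer: string): ParseResult {
  // Normalize CRLF so frames are always separated by a blank line ("\n\n").
  const frames = buffer.replace(/\r\n/g, "\n").split("\n\n");
  const rest = frames.pop() ?? ""; // the last piece may be an incomplete frame
  const deltas: string[] = [];

  for (const frame of frames) {
    for (const line of frame.split("\n")) {
      if (!line.startsWith("data:")) continue; // skip comments and other fields
      const payload = line.slice(5).trim();
      if (payload === "[DONE]") return { deltas, rest: "", done: true };
      try {
        const delta = JSON.parse(payload)?.choices?.[0]?.delta?.content;
        if (typeof delta === "string" && delta) deltas.push(delta);
      } catch {
        // Not JSON: ignore the line rather than abort the whole stream.
      }
    }
  }
  return { deltas, rest, done: false };
}
```

In the read loop, one would call it with the accumulated buffer, append each returned delta to the last message, assign rest back to the buffer, and stop reading once done is true.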

Gotchas

Keep the key server-side. Never ship TOKENHUB_KEY to the browser.
Handle [DONE] explicitly. Some streams send a trailing data: [DONE] frame; others close the connection.
Abort abandoned streams. Pass an AbortController signal to the fetch and call abort() when the user navigates away.
Treat model lists as live data. Show only models returned by the authenticated model list or documented as verified.
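
For the abort gotcha, one possible shape is a small wrapper that owns the current AbortController and cancels the previous stream whenever a new one starts. StreamSession is a hypothetical name; the sketch relies only on the standard AbortController API.

```typescript
// Hypothetical helper: tracks the in-flight stream so it can be cancelled.
class StreamSession {
  private controller: AbortController | null = null;

  // Begin a new stream, aborting any previous one. Pass the returned
  // signal to fetch(url, { signal }).
  start(): AbortSignal {
    this.cancel();
    this.controller = new AbortController();
    return this.controller.signal;
  }

  // Abort the in-flight stream, if any. Safe to call repeatedly.
  cancel(): void {
    this.controller?.abort();
    this.controller = null;
  }
}
```

In a React component, one could keep a session in a ref, pass session.start() as the signal option of the /api/chat fetch, and call session.cancel() from a useEffect cleanup so that unmounting stops the stream. An aborted fetch rejects with an AbortError, which the caller would typically catch and ignore.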