Lesson 04

ChatGPT-style UI with token streaming

Your bot no longer sits silent for 8 seconds: words appear as they are generated. A streamed HTTP response + React.


Why streaming matters

A Claude response of 800 tokens takes ~6 seconds. If you wait for the full reply, the user stares at a spinner the whole time. If you stream, they see the first word in ~300ms.

The UX flips completely. Let’s go.

Streaming on the backend

Switch your endpoint to return a ReadableStream that forwards the SDK's text deltas as they arrive:

// app/api/chat/route.ts
import Anthropic from "@anthropic-ai/sdk";

// Reads ANTHROPIC_API_KEY from the environment.
const client = new Anthropic();

export async function POST(req: Request) {
  const { messages } = await req.json();

  const stream = new ReadableStream<Uint8Array>({
    async start(controller) {
      const encoder = new TextEncoder();

      // messages.stream() returns the stream object directly, so no await is needed.
      const ms = client.messages.stream({
        model: "claude-haiku-4-5",
        max_tokens: 1024,
        system: "You are a helpful assistant.",
        messages,
      });

      // Forward only the text deltas; every other event type is skipped.
      for await (const event of ms) {
        if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
          controller.enqueue(encoder.encode(event.delta.text));
        }
      }

      controller.close();
    },
  });

  return new Response(stream, {
    headers: {
      "Content-Type": "text/plain; charset=utf-8",
      // Asks nginx-style reverse proxies not to buffer the response.
      "X-Accel-Buffering": "no",
    },
  });
}
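
To try the forwarding pattern without calling the API, the core of the route can be pulled into a helper that accepts any async iterable of text. This is only a sketch: `textStreamFrom`, `fakeDeltas`, and `collect` are hypothetical names, and the generator merely stands in for the SDK's text deltas.

```typescript
// Hypothetical helper: turn any async iterable of text into a byte stream,
// the same shape the route hands to `new Response(...)`.
function textStreamFrom(deltas: AsyncIterable<string>): ReadableStream<Uint8Array> {
  const encoder = new TextEncoder();
  return new ReadableStream<Uint8Array>({
    async start(controller) {
      for await (const text of deltas) {
        controller.enqueue(encoder.encode(text));
      }
      controller.close();
    },
  });
}

// Stand-in for the SDK's text deltas (an assumption, for local testing only).
async function* fakeDeltas(): AsyncGenerator<string> {
  yield "Hello";
  yield ", ";
  yield "world";
}

// Drain a byte stream back into a string, as a client would.
async function collect(stream: ReadableStream<Uint8Array>): Promise<string> {
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  let out = "";
  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    out += decoder.decode(value, { stream: true });
  }
  // Flush any bytes the decoder is still holding.
  return out + decoder.decode();
}
```

In the route, the same helper could wrap a generator that yields `event.delta.text` for each text delta.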

Front-end that reads the stream

async function send() {
  if (!input.trim()) return;
  const next: Msg[] = [...messages, { role: "user", content: input }];
  setMessages(next);
  setInput("");

  // Add an empty assistant message that the stream will fill in.
  setMessages([...next, { role: "assistant", content: "" }]);

  const res = await fetch("/api/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ messages: next }),
  });

  // Do not stream an error page into the chat.
  if (!res.ok || !res.body) return;
  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let acc = "";

  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    // stream: true keeps a multi-byte character intact when it is split across chunks.
    acc += decoder.decode(value, { stream: true });

    // Replace the last (assistant) message with the text accumulated so far.
    setMessages((prev) => {
      const copy = [...prev];
      copy[copy.length - 1] = { role: "assistant", content: acc };
      return copy;
    });
  }
}

Each chunk from the backend is appended to acc and updates the last message. Visually, you watch it type.
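
The `{ stream: true }` option in the read loop matters because a network chunk can end in the middle of a multi-byte UTF-8 character. A small sketch, using a hypothetical `decodeChunks` helper and hand-built chunks, shows the difference:

```typescript
// Hypothetical helper: decode a sequence of byte chunks the way the read loop does.
function decodeChunks(chunks: Uint8Array[], streaming: boolean): string {
  const decoder = new TextDecoder();
  let acc = "";
  for (const chunk of chunks) {
    // With stream: true the decoder buffers an incomplete character
    // until the next chunk completes it.
    acc += decoder.decode(chunk, { stream: streaming });
  }
  // Flush anything still buffered at the end of the stream.
  return acc + decoder.decode();
}

// "é" is two bytes in UTF-8 (0xC3 0xA9); split it across two chunks.
const bytes = new TextEncoder().encode("café");
const chunks = [bytes.slice(0, 4), bytes.slice(4)];

const streamed = decodeChunks(chunks, true);    // "café"
const unstreamed = decodeChunks(chunks, false); // "caf" followed by U+FFFD replacement characters
```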

Details that separate amateur from pro

1. Smart autoscroll

If the user is reading older messages, do not yank the scroll. Only autoscroll if they were already at the bottom.

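useEffect(() => {
  const el = containerRef.current;
  if (!el) return;
  // This runs after the new content has rendered, so the 100px tolerance
  // has to absorb the height that was just added.
  const wasAtBottom = el.scrollHeight - el.scrollTop - el.clientHeight < 100;
  if (wasAtBottom) {
    el.scrollTop = el.scrollHeight;
  }
}, [messages]);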
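
One variation is to record whether the user is near the bottom on each scroll event and have the effect read that flag, so the decision does not depend on how much content was just added. The sketch below keeps the geometry in a pure function so it can be checked without a DOM; `ScrollMetrics`, `isNearBottom`, and the `stickRef` name in the note after it are hypothetical.

```typescript
// The three DOM properties the check needs (an HTMLElement satisfies this shape).
interface ScrollMetrics {
  scrollHeight: number; // total height of the content
  scrollTop: number;    // how far the content is scrolled from the top
  clientHeight: number; // visible height of the container
}

// True when the viewport's bottom edge is within `threshold` px of the end of the content.
function isNearBottom(m: ScrollMetrics, threshold = 100): boolean {
  const distanceFromBottom = m.scrollHeight - m.scrollTop - m.clientHeight;
  return distanceFromBottom < threshold;
}

// Pinned to the bottom: 2000 - 1500 - 500 = 0 px away.
const pinned = isNearBottom({ scrollHeight: 2000, scrollTop: 1500, clientHeight: 500 });
// Reading older messages: 2000 - 600 - 500 = 900 px away.
const readingHistory = isNearBottom({ scrollHeight: 2000, scrollTop: 600, clientHeight: 500 });
```

In the component, an `onScroll` handler could store `isNearBottom(e.currentTarget)` in a ref such as `stickRef`, and the effect would scroll only when `stickRef.current` is true.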

2. Stop button

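// Holds the controller for the request currently in flight, if any.
const abortRef = useRef<AbortController | null>(null);

async function send() {
  // Cancel any previous request before starting a new one.
  abortRef.current?.abort();
  const ctrl = new AbortController();
  abortRef.current = ctrl;

  await fetch("/api/chat", {
    method: "POST",
    signal: ctrl.signal,
    // ...
  });
}

// Wire this to the Stop button.
function stop() {
  abortRef.current?.abort();
}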
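
When `stop()` fires, the pending `fetch` (or the `reader.read()` in progress) rejects with an error whose `name` is "AbortError", so the caller should treat that case as a normal stop and not as a failure. A minimal sketch, where `isAbortError`, `runCancellable`, and `fakeRequest` are hypothetical and the fake request stands in for the fetch-and-read loop:

```typescript
// Hypothetical helper: recognise the rejection produced by an aborted request.
function isAbortError(err: unknown): boolean {
  return typeof err === "object" && err !== null && (err as { name?: unknown }).name === "AbortError";
}

// Hypothetical wrapper: run a cancellable task and report how it ended.
// Errors other than an abort are rethrown so real failures stay visible.
async function runCancellable(
  task: (signal: AbortSignal) => Promise<void>,
  ctrl: AbortController,
): Promise<"done" | "stopped"> {
  try {
    await task(ctrl.signal);
    return "done";
  } catch (err) {
    if (isAbortError(err)) return "stopped";
    throw err;
  }
}

// Stand-in for the fetch-and-read loop: it never finishes on its own and
// rejects with an AbortError-named error once the signal is aborted.
function fakeRequest(signal: AbortSignal): Promise<void> {
  return new Promise((_resolve, reject) => {
    const fail = () => reject(Object.assign(new Error("aborted"), { name: "AbortError" }));
    if (signal.aborted) return fail();
    signal.addEventListener("abort", fail, { once: true });
  });
}
```

With this shape, `send()` could keep whatever text has already streamed in when the result is "stopped", and show an error state only when the wrapper throws.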

3. Markdown in replies

Claude often replies with markdown. Use react-markdown:

npm install react-markdown remark-gfm

import ReactMarkdown from "react-markdown";
import remarkGfm from "remark-gfm";

<ReactMarkdown remarkPlugins={[remarkGfm]}>{m.content}</ReactMarkdown>

4. Code blocks with syntax highlighting

npm install rehype-highlight highlight.js

import rehypeHighlight from "rehype-highlight";
import "highlight.js/styles/github-dark.css";

<ReactMarkdown rehypePlugins={[rehypeHighlight]}>...</ReactMarkdown>

You now have a respectable UI. Final lesson: deploy to Vercel.

