Lesson 02

Conversation memory without a database

The bot has to remember what already happened in the chat. The messages array is the memory, and this lesson also shows where that approach breaks.

The big lie about LLMs

“Claude has no memory, you have to give it one.”

True. Each request is independent. The model remembers nothing between calls. “Memory” is simulated by resending the full history on every request.

The array is the memory

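import Anthropic from "@anthropic-ai/sdk";

// Reads ANTHROPIC_API_KEY from the environment by default.
const client = new Anthropic();

// The full conversation so far, oldest message first.
const history: Anthropic.MessageParam[] = [
  { role: "user", content: "My name is Ana." },
  { role: "assistant", content: "Nice to meet you, Ana." },
  { role: "user", content: "What's my name?" },
];

const res = await client.messages.create({
  model: "claude-haiku-4-5",
  max_tokens: 100,
  messages: history,
});
// → something like "Your name is Ana."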

If you don’t send the first two messages on the third call, the model will not know your name. Period.

On the front: React state

"use client";
import { useState } from "react";

type Msg = { role: "user" | "assistant"; content: string };

export default function Chat() {
  const [messages, setMessages] = useState<Msg[]>([]);
  const [input, setInput] = useState("");
  const [loading, setLoading] = useState(false);

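  async function send() {
    // Ignore empty input and block double sends while a request is in flight.
    if (!input.trim() || loading) return;
    const userMsg: Msg = { role: "user", content: input };
    const next = [...messages, userMsg];
    setMessages(next);
    setInput("");
    setLoading(true);

    try {
      // Send the whole history: the server keeps no state between requests.
      const res = await fetch("/api/chat", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ messages: next }),
      });
      if (!res.ok) throw new Error(`Request failed with status ${res.status}`);
      const data = await res.json();

      setMessages([...next, { role: "assistant", content: data.text }]);
    } catch (err) {
      // On failure the user message stays visible; the reply is simply missing.
      console.error(err);
    } finally {
      // Always clear the indicator, even if the request failed.
      setLoading(false);
    }
  }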

  return (
    <div className="max-w-2xl mx-auto p-6">
      <div className="space-y-4 mb-6">
        {messages.map((m, i) => (
          <div
            key={i}
            className={`p-3 rounded-lg ${
              m.role === "user" ? "bg-blue-100 ml-12" : "bg-gray-100 mr-12"
            }`}
          >
            <p className="text-xs text-gray-500 mb-1">{m.role}</p>
            <p>{m.content}</p>
          </div>
        ))}
        {loading && <p className="text-gray-400">Typing…</p>}
      </div>
      <div className="flex gap-2">
        <input
          value={input}
          onChange={(e) => setInput(e.target.value)}
          onKeyDown={(e) => e.key === "Enter" && send()}
          className="flex-1 border rounded-lg px-3 py-2"
          placeholder="Type..."
        />
        <button onClick={send} className="bg-blue-600 text-white px-4 rounded-lg">
          Send
        </button>
      </div>
    </div>
  );
}
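
Because the browser owns the history, the server receives whatever the client sends. A minimal sketch of validating that payload before it reaches the model follows. The names `isMsg` and `parseMessages` are hypothetical helpers, not part of Next.js or the Anthropic SDK, and the rules (non-empty content, last message from the user) are assumptions that fit this component.

```typescript
type Msg = { role: "user" | "assistant"; content: string };

// Type guard: true only for objects shaped like a chat message.
function isMsg(value: unknown): value is Msg {
  if (typeof value !== "object" || value === null) return false;
  const { role, content } = value as Record<string, unknown>;
  return (
    (role === "user" || role === "assistant") &&
    typeof content === "string" &&
    content.trim() !== ""
  );
}

// Hypothetical helper: returns the validated history, or null if the body is malformed.
function parseMessages(body: unknown): Msg[] | null {
  if (typeof body !== "object" || body === null) return null;
  const { messages } = body as Record<string, unknown>;
  if (!Array.isArray(messages) || messages.length === 0) return null;
  if (!messages.every(isMsg)) return null;

  // The newest message should be the one the user just typed.
  const last = messages[messages.length - 1];
  if (last.role !== "user") return null;

  // Copy only the two known fields so extra properties are dropped.
  return messages.map(({ role, content }) => ({ role, content }));
}
```

In the route behind `/api/chat`, a `null` result would typically become a 400 response instead of a model call.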

The problem: the history grows unbounded

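Each turn adds tokens, and every request resends all of them. By turn 50, each new request pays input tokens for the entire conversation so far, so the total cost of a chat grows roughly with the square of the number of turns.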
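
To put a number on that growth, here is a back-of-the-envelope sketch. It assumes every message is about the same size (the figure of 100 tokens per message is made up for illustration) and ignores system prompts and output tokens.

```typescript
// Total input tokens billed over a whole chat when the full history is resent.
// Assumes every message (user or assistant) is about `tokensPerMessage` tokens.
function totalInputTokens(turns: number, tokensPerMessage: number): number {
  let total = 0;
  for (let turn = 1; turn <= turns; turn++) {
    // Request number `turn` carries (turn - 1) earlier exchanges plus the new user message.
    const messagesSent = 2 * (turn - 1) + 1;
    total += messagesSent * tokensPerMessage;
  }
  return total;
}

console.log(totalInputTokens(10, 100)); // 10000
console.log(totalInputTokens(50, 100)); // 250000
```

Five times as many turns costs twenty-five times as many input tokens under these assumptions.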

Three strategies:

1. Sliding window

const recent = history.slice(-10);

Simple and brutal: anything older than the last 10 messages is forgotten. Usually enough for casual chats.
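
One detail worth guarding against: a plain slice can cut an exchange in half, so the window may open with an assistant reply whose question is gone. A minimal sketch that trims the window so it starts on a user turn follows. `slidingWindow` is a hypothetical helper, and it assumes the API expects a conversation to open with a user message.

```typescript
type Msg = { role: "user" | "assistant"; content: string };

// Keep at most `max` recent messages, then drop leading assistant messages
// so the window opens with a user turn.
function slidingWindow(history: Msg[], max: number): Msg[] {
  // slice(-0) would return the whole array, so handle max <= 0 explicitly.
  const recent = max > 0 ? history.slice(-max) : [];
  const firstUser = recent.findIndex((m) => m.role === "user");
  return firstUser === -1 ? [] : recent.slice(firstUser);
}
```

With an even `max` and a strictly alternating history that ends on a user message, this returns one message fewer than `max`, because the leading assistant reply is dropped.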

2. Summary + window

When the chat reaches a threshold (20 messages in the code below), ask the model to summarize the older messages, replace them with the summary, and keep the most recent ones (the last 10 below) verbatim.

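async function compressIfNeeded(messages: Msg[]): Promise<Msg[]> {
  // Short chats are sent as-is.
  if (messages.length < 20) return messages;

  // Everything except the 10 most recent messages gets summarized.
  const toCompress = messages.slice(0, -10);
  const summary = await client.messages.create({
    model: "claude-haiku-4-5",
    max_tokens: 300,
    messages: [{
      role: "user",
      content: `Summarize this conversation in 5 sentences:\n${JSON.stringify(toCompress)}`,
    }],
  });

  // The response is a list of content blocks; take the first text block.
  const block = summary.content.find((b) => b.type === "text");
  const summaryText = block && block.type === "text" ? block.text : "";

  return [
    { role: "user", content: "Previous summary: " + summaryText },
    ...messages.slice(-10),
  ];
}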

3. RAG (at scale)

If memory spans thousands of messages, store them in a vector store (Postgres + pgvector, Pinecone, etc.) and retrieve only the ones relevant to each turn. This is retrieval-augmented generation (RAG). It is out of scope here, but it is the solution that scales.
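
To make "retrieve only what is relevant" concrete, here is a toy in-memory sketch of the retrieval step. It ranks stored messages by cosine similarity to a query vector. The `StoredMsg` type and `topK` function are hypothetical, the 3-dimensional vectors are written by hand rather than produced by an embedding model, and a real vector store would run this search for you.

```typescript
type StoredMsg = { content: string; embedding: number[] };

// Cosine similarity of two vectors of equal length; 0 if either is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k stored messages most similar to the query, best first.
function topK(query: number[], store: StoredMsg[], k: number): StoredMsg[] {
  return store
    .map((msg) => ({ msg, score: cosineSimilarity(query, msg.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.msg);
}

// Toy data: hand-written vectors, not real embeddings.
const store: StoredMsg[] = [
  { content: "My name is Ana.", embedding: [1, 0, 0] },
  { content: "I live in Lisbon.", embedding: [0, 1, 0] },
  { content: "I prefer TypeScript.", embedding: [0, 0, 1] },
];

const relevant = topK([0.9, 0.1, 0], store, 1);
console.log(relevant[0].content); // "My name is Ana."
```

Only the retrieved messages, plus the recent window, would then go into the `messages` array for that turn.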

Up next

Tool use: we teach the bot to call real APIs when it lacks information.
