Lesson 04

Chain-of-thought: how to make the model reason

Four words that double precision on logic, math, and planning tasks. And how to combine them with JSON prefill.

The trick that works almost everywhere

There is a famous 2022 paper from Google that says something almost trivial but brutally useful:

If you add “Think step by step” to your prompt, the model reasons better.

Sounds like a joke. It isn’t. Here is what happens:

Without chain-of-thought: the model generates the answer directly.
With chain-of-thought: the model writes the reasoning first, then the answer.

Because later tokens depend on earlier ones, writing the reasoning improves the final answer. The model “leans” on its own text.

Example: logic problem

Without chain-of-thought:

const res = await client.messages.create({
  model: "claude-haiku-4-5",
  max_tokens: 50,
  messages: [
    {
      role: "user",
      content:
        "A box has 3 red balls and 5 blue. I draw 2 without looking. Probability both are red?",
    },
  ],
});
// → Often wrong or returns an unjustified short answer.

With chain-of-thought:

const res = await client.messages.create({
  model: "claude-haiku-4-5",
  max_tokens: 400,
  messages: [
    {
      role: "user",
      content:
        "A box has 3 red balls and 5 blue. I draw 2 without looking. Probability both are red?\n\nThink step by step before giving the final answer.",
    },
  ],
});
// → Computes 3/8 × 2/7 = 6/56 = 3/28 ≈ 10.7%

The pro pattern: thinking inside tags

In production, you don’t want to return the whole reasoning to the user. Trick: ask the model to think inside <thinking>, then give the final answer.

const res = await client.messages.create({
  model: "claude-opus-4-7",
  max_tokens: 800,
  system: `You are an assistant that reasons inside <thinking>...</thinking> tags before answering.
The final answer ALWAYS goes inside <answer>...</answer>.`,
  messages: [
    { role: "user", content: "What is 23 × 47?" },
  ],
});

const text = res.content[0].text;
const answer = text.match(/<answer>(.*?)<\/answer>/s)?.[1].trim();
console.log(answer); // → "1081"

Now the user only sees <answer> and the model had a place to think.

Killer combo: chain-of-thought + JSON prefill

When you need structured JSON and reasoning, mix both patterns:

const res = await client.messages.create({
  model: "claude-opus-4-7",
  max_tokens: 1000,
  system: `You are a ticket classifier.
First reason inside <thinking>. Then return JSON with "category" and "priority".`,
  messages: [
    {
      role: "user",
      content: "Email: 'The website has been down for all our users for 20 minutes.'",
    },
    {
      role: "assistant",
      content: "<thinking>",
    },
  ],
});

const raw = "<thinking>" + res.content[0].text;
const json = JSON.parse(raw.split("</thinking>")[1].match(/\{.*\}/s)?.[0] ?? "{}");
console.log(json);
// → { category: "bug", priority: "high" }

When NOT to use chain-of-thought

Latency matters: it adds seconds to the response.
Cost matters: you pay 2x-3x the tokens.
Trivial task: for translation or summarization it adds nothing.

Golden rule: if the problem requires more than one logical step, pay for chain-of-thought. Otherwise, save the tokens.

Challenge

Design a prompt that takes a dynamic programming problem (e.g. “longest common subsequence between two strings”) and returns:

A reasoning on how to approach it.
The step-by-step algorithm.
The final TypeScript code.

All inside different tags so you can parse them later.

In the next and final lesson we build the capstone project: a structured data extractor with automated tests.

LEVEL 2

Pro challenge

Pro challenge for this lesson

Same idea, no hints, graded by automated tests.

Unlock Pro Mode · €19 One-time payment · lifetime access · no subscription

LEVEL 3

Hard Mode

Extreme variant of the challenge. Time-bound with extra constraints.

Unlock Pro Mode · €19 One-time payment · lifetime access · no subscription

Previous Zero-shot vs few-shot: the pattern worth more than a bigger model Next Capstone: a structured data extractor