Pradhya Day 01 · Foundations 7 units · ~35 min reading + 15 min hands-on

What the model actually does.

Two hours that replace a year of magical thinking. By the end you will have written your first prompt with all three ingredients, run your first API call from a terminal, and seen — in your own browser — why the context window matters.

By the end of Day 01 · Foundations
  • Explain in one sentence what an LLM is doing on every token
  • Predict when a long conversation will start to lose context
  • Recognize a hallucination and choose the right defense for it
  • Write a prompt with all three ingredients and run it from your terminal
§ 01.01 · Unit 01

What an LLM actually is.

Large Language Models are pattern-matchers trained to predict the next word. That framing is almost everything you need.

A model like Claude takes the text you give it and produces the most plausible next text, one token at a time. It has no memory between conversations, no live access to the internet unless given tools, and no preferences of its own.

[Figure: An LLM, drawn · text in, one token out, repeat. Your prompt ("The coffee in...") goes into the model, which predicts the next token ("Phoenix"); the new token becomes part of the next prompt and the loop repeats.]

Watch a single sentence get assembled below — each highlighted token was chosen from a small set of plausible candidates.

Each yellow token is the one chosen. The faint ones beside it are runner-ups the model considered and rejected. Multiply this decision by every word in every response and you have the model.
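
The loop described above can be sketched in a few lines of Python. This is a toy, not how Claude is implemented: the candidate table and its probabilities are invented for illustration, and a real model scores a very large vocabulary using the whole prompt rather than only the last word.

```python
# Toy next-token loop. CANDIDATES and its probabilities are invented for
# illustration; a real model computes them from the entire prompt.
CANDIDATES = {
    "in": {"Phoenix": 0.5, "Seattle": 0.3, "the": 0.2},
    "Phoenix": {"is": 0.7, "was": 0.3},
    "is": {"strong": 0.6, "hot": 0.4},
}


def next_token(tokens):
    """Pick the most probable candidate that follows the last token."""
    options = CANDIDATES.get(tokens[-1])
    if options is None:
        return None  # the toy table has nothing plausible to add
    return max(options, key=options.get)


def generate(prompt, max_new_tokens=5):
    """Append one token at a time until the table runs out or the cap hits."""
    tokens = prompt.split()
    for _ in range(max_new_tokens):
        token = next_token(tokens)
        if token is None:
            break
        tokens.append(token)  # the new token becomes part of the next prompt
    return " ".join(tokens)


print(generate("The coffee in"))  # The coffee in Phoenix is strong
```

The runner-up candidates ("Seattle", "the") never appear in the output, which is the same chosen-versus-rejected split the animation shows.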

Why this matters in your work

  • The model has only what you put in front of it. If you want it to know your style, your project, your constraints — you have to tell it.
  • It can sound certain about things it has no way to know. Pattern-plausible isn’t the same as true. Your judgment is the check.
  • It does not remember. Each conversation is a fresh sheet unless you save it, name it, or paste it back in next time.
Test for yourself: Ask Claude what it knows about an obscure person from your industry. Notice when it’s confident versus when it hedges. The hedges are the model showing you its uncertainty. Trust them.

Beat the model at next-word prediction.

You’ll do
Predict Claude’s next word three times, then watch it actually predict. No setup — just claude.ai in a browser.
Steps
  1. Open claude.ai (free account is fine).
  2. On paper, write the single next word you think comes after each opener: (a) “The opposite of hot is”; (b) “In 2019 the company quietly”; (c) “My favorite thing about Mondays is”.
  3. Paste each opener into claude.ai, one at a time, and add: “Continue with exactly one word, then stop.”
  4. Write the word Claude returned next to your guess for all three.
Verify
You have three guess/result pairs on paper. At least one of Claude’s words differs from your prediction — that gap is the “small set of plausible candidates” from the animation above.

Stretch. The deterministic opener (a) should give the same word every run; the open-ended one (c) varies. Run each three times and confirm which is stable — that is temperature, previewed (§01.04).

§ 01.02 · Unit 02

The context window.

Everything the model can “see” right now is in its context window. Think of it as working memory for this conversation.

From the last unit: Now that you know the model is doing next-token prediction, here’s its biggest constraint — and how to design around it.

The context window holds your prompt, prior turns, any files Claude has read, and the response being generated. Once a conversation grows past the cap, older content falls outside the window — and the model loses track of it.

Drag the slider below. Watch what falls out the back.

[Interactive slider: conversation length, from turn 1 through turn 30 to turn 60. Legend: in context · current turn · dropped (out of window)]

Validation: this interaction is based on Claude’s context-window docs: platform.claude.com/docs/en/build-with-claude/context-windows.
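
One way to picture the window is as a token budget that keeps the newest turns and drops the oldest. The sketch below is an illustration only: `fit_to_window`, the 4-characters-per-token estimate, and the tiny budget are assumptions made for the example, and real products may manage long conversations differently.

```python
def estimate_tokens(text):
    """Rough estimate: about 4 characters per token in English."""
    return max(1, len(text) // 4)


def fit_to_window(turns, budget):
    """Keep the newest turns whose estimated tokens fit within the budget."""
    kept, used = [], 0
    for turn in reversed(turns):  # walk from the newest turn backwards
        cost = estimate_tokens(turn)
        if used + cost > budget:
            break  # this turn and everything older is out of the window
        kept.append(turn)
        used += cost
    return list(reversed(kept))


# Five dummy turns of 40 characters each, so about 10 tokens per turn.
turns = [f"turn {i}: " + "x" * 32 for i in range(1, 6)]

visible = fit_to_window(turns, budget=30)
print([t[:6] for t in visible])  # ['turn 3', 'turn 4', 'turn 5']
```

Turns 1 and 2 were never deleted from the list; they simply no longer fit, which is why restating an early instruction brings it back into view.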

Practical moves

  1. Put the most important context at the top of long prompts — it stays visible longest.
  2. Re-state constraints if you’re three or more turns deep. The model hasn’t “forgotten”; it can no longer see them.
  3. When you sense drift — wrong tone, lost requirement — start fresh. A new conversation is cheaper than fighting an old one.
One-line rule: The model is not being lazy when it loses the thread. It is being asked to remember more than it can carry.

Find where the thread snaps.

You’ll do
Make a long conversation lose an early instruction, then prove it lost it. No setup — claude.ai in a browser.
Steps
  1. In a new claude.ai chat, send: “For the rest of this chat, end every reply with the word PINEAPPLE. Confirm.”
  2. Have ~15 short back-and-forth turns about any topic (paste a long article, ask follow-ups — whatever fills the window fastest).
  3. Watch for the first reply that omits PINEAPPLE. Note the turn number.
  4. Then send: “What word did I ask you to end every reply with?”
Verify
You can state the turn number where PINEAPPLE first dropped. The instruction didn’t move — it fell out the back of the window, exactly like the grey blocks in the slider above.

Stretch. Re-paste the instruction at the point of failure (Practical move #2 above). Confirm PINEAPPLE returns for the next few turns — restating beats fighting an old thread.
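
If you copy Claude's replies into a list, a few lines of Python can find the turn number for step 3. A minimal sketch: `first_dropped_turn` is a hypothetical helper and the sample replies are made up, not output from a real chat.

```python
def first_dropped_turn(replies, marker="PINEAPPLE"):
    """Return the 1-based turn of the first reply that does not end with
    the marker, or None if every reply kept it."""
    for turn, reply in enumerate(replies, start=1):
        # Ignore trailing whitespace and punctuation after the marker.
        if not reply.rstrip(" .!\n").endswith(marker):
            return turn
    return None


# Made-up replies, for illustration only.
replies = [
    "Confirmed. PINEAPPLE",
    "Here is the summary you asked for. PINEAPPLE.",
    "Sure, the second section covers pricing.",
]
print(first_dropped_turn(replies))  # 3
```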

§ 01.03 · Unit 03

Hallucinations.

The model sometimes produces confident, plausible-sounding text that is wrong. Not a bug — a consequence of how prediction works.

From the last unit: The model can only attend to what is in its window. When you ask about something outside that window, the next-token machine still produces plausible text. That is what hallucination is.

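[Figure: The danger zone · confident but wrong. Axes: confidence (horizontal) and truth (vertical). Good answers are high on both; hallucinations are high on confidence but low on truth.]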

Why they happen

The model is predicting plausible text. When you ask for a fact it does not reliably know, the most plausible-sounding text is often almost-right text — a real citation that doesn’t exist, a quote attributed to the wrong author, a statistic close to but not the published number. It fills the gap rather than admit not knowing.

The three defenses

  1. Ground the model. Want a summary of a paper? Give it the paper. Want analysis of your contract? Paste the contract. Don’t ask the model to recall facts; give it the facts and ask it to think.
  2. Ask for sources, then check. If you cannot verify a claim, treat it as a hypothesis, not a fact.
  3. Reduce confident framing. Tell the model: “If you’re not sure, say so.” Models generally follow instructions to express uncertainty, though not perfectly, so keep checking.
If you're not certain about a specific fact, say "I'm not sure" rather than guessing. For everything you do state confidently, cite which document or source it came from.
Rule of thumb: Treat the model’s first answer like an intern’s first draft. Useful as a starting point. Check the load-bearing claims yourself.
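
Defenses 1 and 3 can be combined in a small helper that wraps the source text and the uncertainty instruction around your question. This is a sketch only: `grounded_prompt` and the `<document>` wrapper are illustrative choices, not a required format, and the sample contract sentence is made up.

```python
UNCERTAINTY_RULE = (
    "If you're not certain about a specific fact, say \"I'm not sure\" "
    "rather than guessing. For everything you do state confidently, "
    "cite which document or source it came from."
)


def grounded_prompt(question, source_text):
    """Build a prompt that hands the model the facts instead of asking
    it to recall them."""
    if not source_text.strip():
        raise ValueError("No source text: this prompt would not be grounded.")
    return (
        f"<document>\n{source_text.strip()}\n</document>\n\n"
        f"{UNCERTAINTY_RULE}\n\n"
        f"Using only the document above, answer: {question}"
    )


# Made-up one-line contract, for illustration only.
print(grounded_prompt(
    "What is the notice period for termination?",
    "Either party may terminate this agreement with 30 days written notice.",
))
```

Refusing to build a prompt without source text is a deliberate design choice here: it makes "ask the model to recall" an explicit decision rather than an accident.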

Catch a hallucination in the act.

You’ll do
Ask Claude an obscure, checkable fact from your own field; verify it against a real source; then re-ask with grounding and watch the answer change. No setup — claude.ai in a browser.
Steps
  1. Pick a narrow fact you can verify in one click — a specific clause in a standard you work with, the year a niche regulation changed, a stat from a report you know. Ask Claude for it plainly and save its answer (answer A).
  2. Open the actual source (the standard, the report, the regulator’s page) and check the real value.
  3. Now paste the source text into a fresh chat and re-ask, prefixed with the grounding prompt from this unit (the copy-able block above). Save that answer (answer B).
  4. Compare A and B against the source.
Verify
Both answers are saved. Either you identified a concrete factual error in answer A (wrong number, invented citation, mis-dated event) or answer A honestly refused / hedged — and answer B, grounded in the pasted source, is correct. You have caught the gap between plausible and true.

Stretch. Add “cite the exact line you used” to the grounded prompt. A real citation you can point to in the source is the difference between recall and grounding.

§ 01.04 · Unit 04

Tokens and temperature.

Two technical knobs you don’t need to obsess over, but should understand once.

From the last unit: You’ve seen what the model does (U01), what limits it (U02), and how it fails (U03). Two technical knobs change how all three of those behave.

Tokens

The model reads and writes in chunks called tokens — roughly 4 characters or about three-quarters of a word in English. A 500-word email is around 700 tokens. The context window and the model’s price are both measured in tokens.

750 words ≈ 1,000 tokens.
Claude Opus 4.7 & Sonnet 4.6: 1,000,000 tokens — roughly a 2,500-page book in working memory at once.
Claude Haiku 4.5: 200,000 tokens.
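
The rules of thumb above are easy to check with arithmetic. The sketch below uses the 0.75-words-per-token ratio from this unit; the 300-words-per-page figure is an assumption added here to reproduce the book comparison.

```python
WORDS_PER_TOKEN = 0.75   # rule of thumb from this unit (English text)
WORDS_PER_PAGE = 300     # assumption, used only for the book comparison


def words_to_tokens(words):
    """Estimate how many tokens a given number of English words takes."""
    return round(words / WORDS_PER_TOKEN)


def tokens_to_pages(tokens):
    """Estimate how many book pages fit in a given number of tokens."""
    return round(tokens * WORDS_PER_TOKEN / WORDS_PER_PAGE)


print(words_to_tokens(500))        # 667, i.e. "around 700 tokens"
print(words_to_tokens(750))        # 1000
print(tokens_to_pages(1_000_000))  # 2500
print(tokens_to_pages(200_000))    # 500
```

These are estimates for English prose; code, tables, and other languages can tokenize quite differently.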

Temperature

Controls how much randomness goes into each token choice. Low = consistent and predictable. High = varied and surprising.

[Figure: Temperature · pick the zone that matches the job. 0 → 0.3: code, classify · 0.4 → 0.7: most work · 0.8 → 1.0: brainstorm. Deterministic on the left, creative on the right.]

Setting | Use for
Low (0 – 0.3) | Code, factual extraction, classification: anywhere the same input should produce the same output.
Mid (0.4 – 0.7) | Most everyday work. The default for chat products.
High (0.8 – 1.0) | Brainstorming, creative writing: where you want variety across runs.
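
Temperature is commonly described as a divisor applied to the model's raw scores before they are turned into probabilities. The sketch below shows that standard softmax-with-temperature calculation on made-up scores; it illustrates the idea and is not Claude's actual sampling code.

```python
import math


def token_probabilities(scores, temperature):
    """Softmax with temperature over a dict of token -> raw score."""
    if temperature <= 0:
        # Treat 0 as "always take the top-scoring token".
        best = max(scores, key=scores.get)
        return {tok: 1.0 if tok == best else 0.0 for tok in scores}
    scaled = {tok: s / temperature for tok, s in scores.items()}
    top = max(scaled.values())  # subtract the max for numerical stability
    weights = {tok: math.exp(s - top) for tok, s in scaled.items()}
    total = sum(weights.values())
    return {tok: w / total for tok, w in weights.items()}


# Made-up scores for the word after "The opposite of hot is".
scores = {"cold": 4.0, "cool": 2.0, "freezing": 1.0}

for t in (0.2, 0.7, 1.0):
    probs = token_probabilities(scores, t)
    print(t, {tok: round(p, 3) for tok, p in probs.items()})
```

At low temperature nearly all of the probability lands on the top candidate; as temperature rises, the runner-ups get a real chance of being picked, which is where run-to-run variety comes from.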

Try the 3 model tiers.

You’ll do
Run one identical prompt against Opus, Sonnet, and Haiku and fill a 3-row table. No setup — use the model picker in claude.ai (top of the chat).
Steps
  1. Pick a task with a clear right answer (e.g. “Extract the 3 dates from this paragraph as ISO YYYY-MM-DD” with a paragraph you supply).
  2. Send it on Opus. Note: correct? (yes/no) and how many seconds until the reply finished.
  3. Switch the model picker to Sonnet. Send the same prompt. Same two notes.
  4. Switch to Haiku. Send the same prompt. Same two notes.
Verify
Your table has 3 rows, each with a correct-yes/no and a seconds value. Name the cheapest tier (Haiku < Sonnet < Opus) that still scored “correct” — that, not the biggest model, is the right default for this task.

Stretch. In code (after §01.07), the cost gap is explicit: Haiku is $1/$5 per million in/out tokens, Sonnet $3/$15, Opus $5/$25. Most production should be the cheapest tier that clears your quality bar.
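
The price gap is easier to feel with a worked example. The sketch below uses the per-million-token prices quoted in the text above; the tier keys and the 2,000-in / 500-out request size are arbitrary choices for illustration.

```python
# (input, output) USD per million tokens, as quoted in the text above.
PRICES = {
    "haiku": (1.00, 5.00),
    "sonnet": (3.00, 15.00),
    "opus": (5.00, 25.00),
}


def call_cost(tier, input_tokens, output_tokens):
    """Cost in USD of one call on the given tier."""
    in_price, out_price = PRICES[tier]
    return input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price


for tier in PRICES:
    one_call = call_cost(tier, input_tokens=2_000, output_tokens=500)
    print(f"{tier:<7} ${one_call:.4f} per call, "
          f"${one_call * 10_000:.2f} per 10,000 calls")
```

At this request size Opus costs five times what Haiku does, so a tier that is barely noticeable for one call becomes a real line item at production volume.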

§ 01.05 · Unit 05

Models vs products.

“Claude” is a family of models. The Claude app, Code, Cowork, and Design are products built on top of those models.

From the last unit: You now have the mental model of how the model works. The next question is which Claude product to point it at.

[Figure: One model · many surfaces · pick by the job. Claude the model sits at the center; the surfaces around it are the chat app, Cowork, Code, Design, and the API.]

Product | What it is | Reach for it when
Claude (chat) | Conversational text in, text out. | You need to think.
Claude Cowork | Agentic. Reads files, takes action. | You need to do.
Claude Code | Command-line. Automation, scale, engineering. | You need to automate.
Claude Design | Visual prototyping with AI. | You need to make.
The API | Raw access from your own code. | You need to build.

By the end of Day 04 you will have used Claude (chat) to think through a problem, Claude Code to scaffold an agent, and the API to make the agent run autonomously. Different products, same model.

Same task, two products.

You’ll do
Run one real task through two different Claude surfaces and feel the difference the table describes. No setup — the chat app, plus one other surface you can reach.
Steps
  1. Pick a task that touches a file or your screen — e.g. “summarize the key risks in this document” (have the file ready).
  2. Do it in Claude chat (claude.ai): paste the text, get the answer.
  3. Do the same task in a do-it surface — Cowork or Claude Code if you have access; if not, in chat enable a connector or upload the file directly so Claude reads it itself instead of you pasting.
  4. Note which surface needed you to fetch and paste, and which one reached the material on its own.
Verify
You can name, in one sentence each, which surface you’d pick “to think” versus “to do” — and point to the concrete step that made the difference (who fetched the file).

Stretch. Map the other three rows of the table to a task you actually have this week: when would you reach for Code (automate), Design (make), or the raw API (build)?

§ 01.06 · Unit 06

Anatomy of a prompt.

Almost every prompt that disappoints is missing one of three ingredients: context, goal, or constraint.

From the last unit: Whichever product you reach for, the input is a prompt. The same three ingredients make a prompt useful in any of them.

[Figure: Three ingredients · all present = a useful answer. Context (who you are), goal (what you want), and constraint (what NOT to do) combine into a good prompt.]
  1. Context — who you are, what you’re working on.
  2. Goal — what you want the response to do.
  3. Constraint — what NOT to do, or the shape of the answer.

Missing all three

Write me an email to my team.

The model has to guess who the team is, what news to deliver, what tone you want. So it returns the most generic possible email — and you blame the model.

All three present

I lead a 12-person product team and we just lost our biggest customer. I want to write the team an email today that's honest about the news, doesn't catastrophize, and ends with a concrete next step. Three short paragraphs, no jargon. Don't use the word "journey."

Now the model has a real job. Same model, same minute, twenty times the value.

Quick check before you hit enter: If you can’t answer “what is this for?” out loud in a sentence, the prompt isn’t ready yet.
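
The three ingredients can also be treated as a checklist in code. A minimal sketch, assuming a hypothetical `build_prompt` helper that simply joins the pieces and complains when one is empty; the example text is the email prompt from this unit.

```python
def build_prompt(context="", goal="", constraint=""):
    """Join the three ingredients, refusing to build if any is missing."""
    parts = {"context": context, "goal": goal, "constraint": constraint}
    missing = [name for name, text in parts.items() if not text.strip()]
    if missing:
        raise ValueError("Missing ingredient(s): " + ", ".join(missing))
    return " ".join(text.strip() for text in parts.values())


prompt = build_prompt(
    context=(
        "I lead a 12-person product team and we just lost our "
        "biggest customer."
    ),
    goal=(
        "I want to write the team an email today that's honest about the "
        "news, doesn't catastrophize, and ends with a concrete next step."
    ),
    constraint=(
        "Three short paragraphs, no jargon. "
        "Don't use the word \"journey.\""
    ),
)
print(prompt)
```

Calling `build_prompt(goal="Write me an email to my team.")` raises an error naming the missing context and constraint, which is the same diagnosis the "missing all three" example makes by hand.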

Pick the one prompt you’ll keep iterating.

You’ll do
Choose a task you’ll use Claude for repeatedly. Commit to refining its prompt over Days 2–4.
Steps
  1. Pick a real task from your week (not a hypothetical).
  2. Write the first-pass prompt as you’d normally type it.
  3. Save it in a file: prompts/<name>.md.
  4. You’ll come back to this in Day 2 (patterns), Day 3 (tools), and Day 4 (agent).
Verify
You have a working baseline saved. The next 3 days improve THIS prompt, not abstract examples.

Stretch. Pair: have a colleague also write a baseline for the same task. Compare on Day 4.

§ 01.07 · Hands-on · 15 min

Your first API call.

Every concept above becomes real when you call the model from your own code. Fifteen minutes. No frameworks.

From the last unit: Everything from Units 01–06 becomes real when you call the model from your own code. Here is the smallest possible program that does it.

[Figure: The shape of every program that talks to a model. A Python script on your laptop sends a prompt + API key to the Claude API (api.anthropic.com); tokens are streamed back.]
Get the code: You don’t have to copy anything by hand. Right-click each link and choose “Save Link As…” into one folder, then run it from there. In your folder, pip install -r requirements.txt installs everything these scripts need.

Step 1 — Set up

Install the Anthropic Python SDK and set your API key as an environment variable. Get a key at console.anthropic.com — first $5 of usage is free.

# macOS / Linux
python3 -m venv .venv
source .venv/bin/activate
pip install anthropic
export ANTHROPIC_API_KEY="sk-ant-..."
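
Before making a call, it can help to confirm that the key is visible to Python. This is an optional check, not part of the course files: `check_api_key` is a hypothetical helper, and the `sk-ant-` prefix test only mirrors the placeholder shown above rather than validating the key itself.

```python
import os


def check_api_key(env=os.environ):
    """Return a short status message about ANTHROPIC_API_KEY."""
    key = env.get("ANTHROPIC_API_KEY", "")
    if not key:
        return "ANTHROPIC_API_KEY is not set. Run the export line first."
    if not key.startswith("sk-ant-"):
        return "ANTHROPIC_API_KEY is set but does not look like sk-ant-..."
    # Never print the key itself; report only that it exists.
    return "ANTHROPIC_API_KEY is set."


print(check_api_key())
```

An environment variable set with export lasts only for the current terminal session, so a "not set" message after opening a new terminal is expected.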

Step 2 — Make the call

Save this as hello_claude.py. Then run python hello_claude.py.

from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from your env

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=400,
    messages=[
        {
            "role": "user",
            "content": (
                "You are a senior product reviewer at a hard-news "
                "publication. You care about clarity and the absence "
                "of marketing language.\n\n"
                "Critique this sentence: \"We leverage AI to unlock "
                "transformative outcomes for our customers.\""
            ),
        }
    ],
)

print(response.content[0].text)

What just happened

  • You used Context + Goal + Constraint in a single message, with the reviewer role serving as the context. The model knew who to be, what to do, and what to look for.
  • The model received exactly your text and nothing else. No memory of past conversations. No web. Just your prompt and its training.
  • You got back text — assembled token-by-token, like the animation at the top of this page, only faster.
Commitment card: Before Day 02, run this script three more times, each time with a real piece of writing from your week. Notice when the critique is sharp, when it is generic. That gap is what Day 02 is for.
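
The same call can carry the role in the `system` parameter and set `temperature` from Unit 04; both are parameters of `client.messages.create` in the Anthropic Python SDK. The sketch below only builds the request, so it runs without a key. `build_request` is a hypothetical helper, and 0.2 is an illustrative low-temperature choice for a critique task, not a recommended value.

```python
def build_request(role, task, temperature=0.2):
    """Return keyword arguments for client.messages.create()."""
    return {
        "model": "claude-sonnet-4-6",  # same model as hello_claude.py
        "max_tokens": 400,
        "temperature": temperature,    # low = more consistent across runs
        "system": role,                # who the model should be
        "messages": [{"role": "user", "content": task}],
    }


request = build_request(
    role=(
        "You are a senior product reviewer at a hard-news publication. "
        "You care about clarity and the absence of marketing language."
    ),
    task=(
        'Critique this sentence: "We leverage AI to unlock '
        'transformative outcomes for our customers."'
    ),
)
print(sorted(request))

# To send it (requires the anthropic package and ANTHROPIC_API_KEY):
#   from anthropic import Anthropic
#   response = Anthropic().messages.create(**request)
#   print(response.content[0].text)
```

Keeping the role in `system` means you can swap in a new piece of writing each run by changing only the `task` argument.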

Make your first API call — and print what it cost.

You’ll do
Run hello_claude.py end-to-end from a fresh virtual environment, then make it print the exact cost of the call.
Steps
  1. In the folder where you saved the files (see Get the code above), create and activate the venv, then install: run the first three lines from Step 1.
  2. Set your key: export ANTHROPIC_API_KEY="sk-ant-..." (get one free at console.anthropic.com).
  3. Run it: python hello_claude.py. Read the critique it prints.
  4. Add two lines to the bottom of the file and re-run to see the cost — Sonnet is $3 / $15 per million input / output tokens:
    u = response.usage
    print(f"cost: ${u.input_tokens/1e6*3 + u.output_tokens/1e6*15:.5f}")
Verify
Your terminal prints a multi-sentence critique and a line like cost: $0.005… — a real number, under one cent. You have called the model from your own code and measured the bill.

Stretch. Swap the prompt for a real piece of your own writing and re-run. The cost line moves with the token counts — longer input and output cost more.