What survives the session.
Claude has no memory by default. Three layers fix that: auto-memory in Claude Code, the project-scoped CLAUDE.md file, and the memory tool API for explicit programmatic memory. This practice teaches what each is for and when to reach for it.
- Configure auto-memory + CLAUDE.md + the memory tool API in the right combination
- File memories under the right type (user / feedback / project / reference)
- Run the weekly 10-minute memory hygiene practice
- Decide between memory, wiki, and RAG for a given use case
Why memory matters.
A new Claude session starts blank. It does not remember your preferences, your project conventions, your past corrections, or the project you were in five minutes ago. Memory is how you fix that — selectively, deliberately, with hygiene.
Catch one thing you re-typed this week.
- Open your last 3 Claude conversations. Find one fact, preference, or correction you typed in more than once.
- Write it on one line.
- Tag it with exactly one of the four types this practice uses:
user(who you are),feedback(how to respond),project(current work),reference(a durable external fact). The types are taught in full in § 13.02.01 — for now, pick the closest.
feedback: never add a trailing summary). That line is the first memory you’ll save in § 13.01.03.Stretch. Find five. If three or more land in feedback, you have been re-correcting Claude all week — exactly what memory is for.
The three memory layers.
Each layer has a different scope and a different audience. Knowing which to reach for is most of the skill.
Route five facts to the right layer.
- Write down five real facts, e.g. “I prefer terse replies”, “tests live in
/tests”, “our staging DB host”, “this agent must recall the user’s name across runs”, “I’m on Pacific time”. - For each, pick exactly one layer using the rule: personal-and-cross-session → L1 auto-memory; should be true for everyone on the repo → L2 CLAUDE.md; a program needs to read/write it at runtime → L3 memory tool API.
- Say the deciding reason out loud for each in one phrase (“team-wide, so L2”).
Stretch. Any fact you put in two layers is a maintenance trap — pick the narrowest scope that still reaches the audience that needs it, and delete the duplicate.
Auto-memory.
Claude Code’s built-in memory system. Lives at ~/.claude/projects/<dir>/memory/. Stores typed entries (user, feedback, project, reference). Loads on every session start.
# In any Claude Code session
> Remember: I prefer terse responses without trailing summaries.
> Remember: tests in this repo live in /tests not /__tests__.
Each entry is saved as a markdown file with frontmatter, and a one-line pointer is added to the MEMORY.md index that lives in this project’s memory folder — ~/.claude/projects/<project-dir>/memory/MEMORY.md, not your home directory. On the next session in this project, Claude reads that MEMORY.md before doing anything else.
Run the auto-memory round-trip.
- In a Claude Code session, send these three lines (one at a time is fine):
Remember: my name is Sam and I prefer terse replies with no trailing summary. Remember: this project deploys on Tuesdays and Thursdays only. Remember: the staging database host is db-staging.internal:5432.
- Confirm they landed: run
ls ~/.claude/projects/*/memory/in a terminal — you should see aMEMORY.mdand new entry files. - Fully quit and reopen Claude Code (or run
/clearto drop the conversation) so you start with an empty chat. - In the fresh session, ask: What do you remember about me and this project? List the deploy days and the staging host.
Stretch. Open the matching .md file under that memory/ folder and read its frontmatter — note which of the four types (user / feedback / project / reference) Claude filed each fact under.
CLAUDE.md as project memory.
A markdown file at the root of your repo. Everything in it is in Claude’s context at session start. The most powerful single file you can write for your team.
Treat CLAUDE.md the way you treat a new hire’s first-week onboarding doc. Tell it what it would take you a year to learn by reading the code.
Write a tight CLAUDE.md and prove it loaded.
- Open the project. List the 5 things a new hire would need to know to be productive on day 1.
- Each should be a fact, not a procedure: ‘tests live in
__tests__/’, not ‘to run tests you...’. (The Walk-away template at the bottom of this page is a ready-made skeleton.) - Save it as
CLAUDE.mdat the repo root. Keep it under 100 lines; push longer content into skills or referenced docs. - Put one distinctive, checkable fact in it — e.g. “the deploy script is
scripts/ship.sh”. Open a fresh Claude session and ask the question that fact answers.
Stretch. Track edits to CLAUDE.md. If it grows past 200 lines, refactor sections into skills.
The memory tool API.
For long-running agents that need to persist state across many turns and sessions. The Anthropic SDK provides primitives; the agent saves and recalls via tool calls.
# Pseudocode: an agent saving memory mid-conversation memory_tool = { "name": "memory", "description": "Read and write entries to long-term memory. Use save_memory when you learn something the user wants you to remember next time.", "input_schema": { "type": "object", "properties": { "op": {"type": "string", "enum": ["save", "recall", "list"]}, "key": {"type": "string"}, "value": {"type": "string"}, } } } # In the loop, the model emits: {"op": "save", "key": "user.preferred_voice", "value": "plain · no hype words"} # Your harness writes it to disk; loads at next session start.
The pseudocode above is a schema. The thing that makes it real is the harness underneath: a few lines that write the saved value to disk on one run and read it back on the next. Build that harness now — no SDK, no API key, just the standard library — so the save/recall contract is concrete before you wire it to a model.
Build the save-then-recall harness.
- Save this as
memory_harness.py:import json, os MEMORY_FILE = "memory.json" def load_memory(): if os.path.exists(MEMORY_FILE): with open(MEMORY_FILE) as f: return json.load(f) return {} def save_memory(mem): with open(MEMORY_FILE, "w") as f: json.dump(mem, f, indent=2) mem = load_memory() if "user.name" in mem: # RUN 2+: recall what a previous run stored print(f"recalled: I remember your name is {mem['user.name']}.") else: # RUN 1: the model emitted a save op; persist it mem["user.name"] = "Sam" save_memory(mem) print("saved: stored user.name. Run me again to recall it.")
- Run it twice:
python3 memory_harness.py— then exactly the same command again. - Between the runs, open
memory.jsonand confirm"user.name": "Sam"is on disk.
saved: ...; run 2 prints recalled: I remember your name is Sam. — the second process printed a fact the first process stored, with nothing in memory between them but the file.Stretch. Map this to the schema above: save_memory is the "op": "save" branch, load_memory is what runs at session start. Add a list branch that prints every key, and a second key under feedback. to prove the store holds more than one type.
The four memory types.
Claude Code organizes auto-memory into four typed categories. Knowing which type to file under is half the skill.
File twelve memories under the right type.
- Open sample_memories.txt (right-click → Save As, or read it in the browser). It holds 12 numbered one-line memories above an answer key.
- Cover the answer key. For each of the 12 lines, write down one type:
user,feedback,project, orreference. Use the tell: if it changes what you do next it’s feedback; if it goes stale when the work ships it’s project; if it’s about the person it’s user; everything else durable is reference. - Uncover the answer key at the bottom of the file and compare line by line. Mark each right or wrong.
Stretch. For every line you missed, write the one-phrase reason the key gives. The pairs people confuse most are project-vs-reference (dated milestone vs standing convention) and user-vs-feedback (who you are vs how to reply).
When to save · when to skip.
Bad memory practice fills up with noise. Good memory practice keeps the bar high. Three rules.
- Save the non-obvious. If the model would deduce it from the code, don’t save it. If the model would never guess it without you telling it, save it.
- Save the durable. “Today I’m working on X” is ephemeral. “My team prefers Y convention because last quarter we got burned by Z” is durable.
- Save the corrections. Every time you correct Claude on something you’ve corrected before, that’s a memory you should have saved earlier.
What NOT to save: code patterns (visible in the code), git history (visible in git log), debugging recipes (the fix is in the code), and conversation context for the current session (use a plan or task list instead).
Audit one of your tools for the right memory placement.
- List every piece of state the tool needs: schema, examples, prior results, user prefs.
- For each, decide: this-turn (in args), this-session (in system prompt with caching), across-sessions (in storage).
- Move anything across-sessions out of the prompt — into a
memory_recalltool or storage. - Re-measure token usage. It should drop.
Stretch. The same audit applies to every tool. Run it on the top 3 most-called tools.
Memory for agents.
For agents running 100+ turns, in-context memory hits a wall. The agent’s memory tool is what lets it span days, not just turns.
Pick a memory pattern for one new agent.
- Sketch the agent on paper: inputs, what it must remember, what it can forget.
- Pick a pattern: stateless (no memory), session (in-prompt history), store (DB/file), RAG (retrieved facts).
- Justify the pick in one sentence. If you can’t justify it, the pattern is wrong.
- Build the smallest version of that pattern. Don’t generalize early.
store — it must recall the user’s open ticket between days”). If you can’t name a fact that must survive a restart, the answer is stateless — and you just saved yourself a database.Stretch. If two patterns fit, pick the simpler one and build the smallest version — the § 13.01.05 harness is already a working store pattern you can extend. Memory complexity compounds.
Memory vs wiki vs RAG.
Three overlapping ideas. The differences matter. Pick the right one for the work.
| Memory | LLM Wiki | RAG | |
|---|---|---|---|
| Holds | Preferences, corrections, durable facts | Structured knowledge with cross-links | Arbitrary documents indexed for retrieval |
| Scope | The user / session | A domain or person | A corpus |
| Size | ~100 entries | 100s of pages | 1000s of chunks |
| Best for | “I prefer” / “don’t do X” | “What do I know about Y?” | “Search my docs for Z” |
Route four needs to memory, wiki, or RAG.
- Write these four needs down: (a) “remember I dislike emoji in replies”; (b) “answer questions across 4,000 PDFs of past contracts”; (c) “hold a cross-linked knowledge base about one client I keep adding to”; (d) “never re-run a migration I already ran”.
- For each, pick exactly one of memory / wiki / RAG, and underline the row in the table that decided it (Holds / Scope / Size / Best for).
- State the deciding row in one phrase per need (“1000s of chunks → RAG”).
a=memory, b=RAG, c=wiki, d=memory, each justified by a row in the table. If you put (b) in memory or (a) in RAG, re-read the Size row — that’s the line that separates them.Stretch. Name a fifth need from your own work where the choice is genuinely ambiguous, and write the one sentence that breaks the tie.
The hygiene practice.
Memory rots when you don’t prune. Adopt a 10-minute weekly hygiene practice.
- Every Friday, 10 minutes. Open MEMORY.md. Scan every entry.
- Three actions per entry. Keep · update · delete.
- Delete more than you think. Old project notes, old corrections that aren’t relevant anymore. Stale memory is worse than no memory.
- Promote durable entries to CLAUDE.md. If a piece of personal memory should be true for the whole team, move it to the project file.
Run the first prune now, then schedule it.
- Open your MEMORY.md (under
~/.claude/projects/<project-dir>/memory/) or your project’s CLAUDE.md. Scan every entry. - Tag each entry keep / update / delete. Actually delete the stale ones — old project state, corrections that no longer apply.
- Promote any durable personal entry that should be true for the whole team into CLAUDE.md.
- Create a recurring calendar event — “Friday · 10 min · memory prune” — with these same three actions in the notes.
Stretch. Hygiene compounds. The hour you spend pruning saves five hours of confused agents.
The memory file you keep forever.
Memory is wasted if Claude has to re-discover you every conversation. This CLAUDE.md template captures the kind of stable context that pays off every single session. Drop it in any project root; customize the bracketed parts.
# CLAUDE.md — what to remember about this project ## What this project is [One sentence. Not the README — the working description.] ## Who I am working with - [Name, role, what they care about, what they push back on] - [Repeat for 2-5 key people] ## Conventions I follow - Branch naming: [pattern] - Commit style: [imperative / conventional / freeform] - Tests: [run via X, target coverage Y] - Code review: [must have N approvals from Z] ## Files Claude should touch carefully - [path]: [why — e.g. "load-bearing config, breaking it cascades"] - [path]: [why] ## Things that look wrong but aren't - [pattern]: [why it's actually correct — saves Claude 5 explanations] - [decision]: [the reason we chose it over the obvious alternative] ## My preferences - I prefer [explicit/concise/etc] explanations - For PRs, draft messages should follow [pattern] - I dislike: [list things — emoji in code, hedge phrases, etc] ## What's currently in flight [A 2-3 line note about what you're actively working on this week. Update on Mondays. Lets Claude reason about your task without re-asking.]
The "currently in flight" section is the trick. The other sections are stable; this one rotates. Updating it once a week turns CLAUDE.md from a static config file into a living context that catches Claude up to your week in three lines.