From SKILL.md to shipped.
Claude Agent Skills are the cleanest distribution unit for AI capability: a folder with a SKILL.md, optional scripts and references, and a frontmatter description that tells Claude when to use it. This practice takes you from the empty folder to a skill running on real work by the end.
- Author a SKILL.md from frontmatter to body that loads progressively
- Decide what goes in scripts/ vs references/ vs assets/
- Iterate on a skill’s description until it triggers correctly on 9/10 test inputs
- Ship one skill to your ~/.claude/skills/ and use it on real work this week
The promise of skills.
A skill is a folder. The folder is a capability you can hand to anyone’s Claude and have it just work, without a single line of extra prompting from them.
What that buys you:
- Distribution. Drop the folder into
.claude/skills/; Claude discovers it. Git-clone, share over Slack, publish to a marketplace. - Discoverability. Claude reads the description, decides when to invoke. The user doesn’t have to type “use the X skill.”
- Composability. Skills call other skills. You build a small studio of specialists; Claude assembles them per task.
- Context efficiency. Eight skills cost ~500 tokens at startup, not 70,000 (more in Unit 02).
Progressive disclosure.
The single design idea that makes skills scale: only load what’s relevant, only when it’s relevant.
A project with 8 skills consumes 500 tokens at startup instead of 70,000. At any given moment, an agent typically has one skill body and one or two reference files loaded — perhaps 2,000 tokens instead of the full 70,000. This is what makes a library of skills cheaper than one bloated CLAUDE.md.
The mechanism: three tiers of information loading, drawn next.
The three tiers, drawn.
Frontmatter loads always. Body loads when relevant. References load only when needed.
Why this matters
Without progressive disclosure, every skill in your library would cost you full token budget on every conversation. With it, you can have hundreds of skills installed and Claude pays only for the ones that fire on the current task. This is what makes the awesome-claude-skills repos viable.
Discoverability via description.
The description field in SKILL.md frontmatter is the single most important field. It is the routing signal that decides whether Claude invokes the skill.
What a great description does
- Names the trigger. “Use when the user asks about water bills, utility statements, or sewer charges.”
- Names the artifact. “Produces a one-page summary with the four-month trend, anomalies flagged, and next-billing-cycle estimate.”
- Names the limit. “Do not invoke for credit-card statements or investment accounts.”
--- name: water-bill-analyzer description: | Use when the user provides a Phoenix-area water utility statement (PDF or CSV) and wants to understand their usage trend, flagged anomalies, or next billing estimate. Produces a one-page summary. Do NOT use for: credit card, electric, gas, or investment statements. ---
SKILL.md anatomy.
A complete worked example. Read once; understand the shape; use as your template.
--- name: water-bill-analyzer description: Use when the user provides a Phoenix-area water utility statement... allowed-tools: [Read, Bash] --- # Water Bill Analyzer You analyze Phoenix-area water utility statements and produce a one-page summary. ## Steps 1. Read the file the user provided. 2. Parse usage (gallons), period, and billing total. 3. Compute the four-month trend (call `scripts/trend.py`). 4. Flag anomalies: usage > 1.5x rolling avg, billing > $200, missing meter reads. 5. Estimate the next bill using the seasonal model in `references/seasonality.md`. 6. Produce a markdown summary in the format below. ## Output format **Usage**: X gallons (Y% vs last month, Z% vs same month last year) **Bill**: $A (broken into base + tier 1 + tier 2 + tier 3) **Trend**: rising / steady / falling **Anomalies**: bulleted list, or "none" **Next bill estimate**: $B, ±$C ## Tone Plain words. No "leverage" or "transformative". Quote the bill where useful. See `references/seasonality.md` for the Phoenix water-use seasonal pattern that informs the next-bill estimate. See `scripts/trend.py` for the deterministic four-month trend calculation.
Three rules for the body
- Target 1,500–2,000 words. Beyond that, split into references/.
- State what to do, not why. Once the skill loads, every line is a recurring token cost.
- Reference, don’t embed. Long examples, edge cases, lookup tables — in
references/, loaded only when the skill asks.
Two files to keep open while you build: the finished example above ships at code-examples/skills/water-bill-analyzer/SKILL.md (right-click → Save As), and the blank, fill-in-the-blank version is the Your first SKILL.md template at the bottom of this page. Copy the blank one; the next four labs fill it in piece by piece until you have a working skill.
Lab 1 of 4 — frontmatter + body.
SKILL.md with a triggering description and a body that lists the steps — the always-and-when-relevant tiers from Unit 03.- In any folder, make the skill directory:
mkdir -p water-bill-analyzer, thencd water-bill-analyzer. - Open the blank template at the bottom of this page; paste it into a new
SKILL.mdin that folder. - Fill the frontmatter: set
name: water-bill-analyzerand adescriptionthat names the trigger, the artifact, and the limit (model it on the one in Unit 04 — “Use when the user provides a Phoenix-area water statement… Do NOT use for credit-card or investment statements”). - Fill the body: a one-line job statement plus a numbered
## Stepssection and an## Output formatsection. Stuck? The finished body is shipped here — copy its shape, not its every word.
head -9 SKILL.md shows a frontmatter block fenced by --- on its own first and last line, containing both a name: and a description: key; below it, grep -c '^## ' SKILL.md returns at least 2 (you wrote a Steps and an Output section).Stretch. Keep the body under 2,000 words and move anything longer (rate tables, edge cases) to a one-line “see references/…” pointer — Lab 3 builds that file.
scripts/ — deterministic code.
When the work is mechanical — sorting, parsing, computing — the model should call a script, not think through it token-by-token. Scripts live in scripts/ and Claude executes them.
my-skill/ ├── SKILL.md ├── scripts/ │ ├── trend.py # 4-month rolling avg + anomaly detection │ └── parse_pdf.py # tabula-py extraction ├── references/ │ ├── seasonality.md # Phoenix seasonal water-use pattern │ └── tier_pricing.md # 2026 tier thresholds + rates └── assets/ └── report_template.md
Scripts are the part of the skill the model is bad at doing in its head. Sorting 50 rows, summing a column, joining two CSVs by ID, computing a regression — the model can describe how but burns tokens doing it. A 12-line Python script is faster, cheaper, and (this is the big one) provably correct.
Lab 2 of 4 — add the scripts/ piece.
- From your
water-bill-analyzer/folder (Lab 1):mkdir -p scripts. - Save the shipped script code-examples/skills/water-bill-analyzer/scripts/trend.py (right-click → Save As) into
scripts/trend.py. It is stdlib-only — nopip install. - Run it against its built-in sample to confirm it works:
python3 scripts/trend.py. - In your
SKILL.mdbody, make a step say to call it:python3 scripts/trend.py <path-to-history.csv>(columnsmonth,gallons,amount). Note in the body that running it with no argument uses the sample.
Trend : rising and an anomaly line ending (> 1.5x rolling avg); and grep -c 'scripts/trend.py' SKILL.md returns at least 1.Stretch. Make your own three-row history.csv (header month,gallons,amount) and run python3 scripts/trend.py history.csv — the latest-usage and trend lines should reflect your numbers, not the sample’s.
references/ — deep docs.
References are markdown files the SKILL.md body points at. Claude loads them only when the skill needs them. They live outside the always-loaded body so the body stays small.
Three uses worth knowing
- Domain knowledge. A reference page on Phoenix water tiers. Loaded only when computing the bill estimate.
- Long examples. Three or four full worked examples that illustrate edge cases. Loaded if the agent encounters something ambiguous.
- Recoverable detail. Specifications you wouldn’t memorize but might need to look up — rate tables, format specs, regulatory citations.
The body links to the reference: “see references/seasonality.md for the Phoenix water-use seasonal pattern.” The agent reads that line, decides if it needs to load it, fetches it on demand.
Lab 3 of 4 — add the references/ piece.
- From your
water-bill-analyzer/folder:mkdir -p references. - Save the shipped reference code-examples/skills/water-bill-analyzer/references/seasonality.md (right-click → Save As) into
references/seasonality.md. - In your
SKILL.mdbody, add (or keep) one line that points at it — e.g. “Estimate the next bill using the seasonal model inreferences/seasonality.md.” Do not paste the table into the body; the point is that it loads on demand.
grep -c 'Index' references/seasonality.md returns at least 1 (the seasonal-index table is there), and grep -c 'references/seasonality.md' SKILL.md returns at least 1 (the body points at it but doesn’t inline it).Stretch. Open references/seasonality.md and read the July index (1.75). Confirm the body never repeats that number — it lives in the reference, loaded only when the bill estimate needs it. That is progressive disclosure working.
assets/ — templates & binaries.
For things that aren’t text the model should reason about. Templates Claude fills in. Font files. Brand asset images.
- Templates — a report template Claude populates with computed values.
- Fonts — the brand fonts your skill embeds in generated PDFs.
- Reference images — logos, icons, watermarks.
The Anthropic skills repo has a pptx skill that is a useful reference for asset-heavy work: the SKILL.md stays compact while templates and supporting files do most of the delivery work. That’s normal.
Our water-bill skill needs no binary assets, so its assets/ folder stays empty — which means the three pieces from Labs 1–3 already make a complete, installable skill. Install it and make it fire.
Lab 4 of 4 — install it and make it fire.
- From the parent of your
water-bill-analyzer/folder, copy it user-global:mkdir -p ~/.claude/skills && cp -R water-bill-analyzer ~/.claude/skills/water-bill-analyzer. - Confirm the three pieces landed:
ls ~/.claude/skills/water-bill-analyzer ~/.claude/skills/water-bill-analyzer/scripts ~/.claude/skills/water-bill-analyzer/references— you should seeSKILL.md,trend.py, andseasonality.md. - Start a fresh Claude Code session (so it re-scans skills). Ask a triggering question, e.g.: “Here’s my Phoenix water statement — what’s my usage trend and roughly what’s next month’s bill?” (No real file handy? Say: “Run your trend script against its built-in sample and walk me through the output.”)
- Watch what Claude does. Does it announce the
water-bill-analyzerskill, follow the Steps you wrote, and callscripts/trend.py?
water-bill-analyzer (or visibly follows its workflow), runs scripts/trend.py, and the trend/anomaly numbers in its reply match the script’s own output. If it stays generic and never mentions the skill, your description didn’t trigger — Unit 10’s lab fixes exactly that.Stretch. Move the folder to a single project’s .claude/skills/ instead and confirm it fires there but not in an unrelated repo — the repo-local vs user-global distinction from Unit 12, observed live.
The skill-creator loop.
There's a skill called skill-creator whose job is to help you build other skills. The loop it embodies is the canonical workflow.
What “try in Claude Code” really means
Install the skill in .claude/skills/. Open Claude Code. Ask a question that should trigger the skill. Watch what happens. Does the skill fire? Does it load the right references? Does the output match the format? Each failure is a hint about what to tweak.
Description optimization.
The description is the routing signal. Tighten it iteratively, with the model’s help.
This is “LLM-as-judge” applied to your own skill discoverability. Cheap. Iterative. Catches the routing bugs you’d only notice in production. The bar is the one in this practice’s promise box: at least 9 of 10 should-fire prompts trigger, and at most 1 of 10 near-misses false-fires.
Tune the description to the 9/10 bar.
water-bill-analyzer description against 10 should-fire prompts and 10 near-misses, then edit it until it clears the bar.- Write a tally sheet: 10 prompts that should fire (e.g. “analyze my Phoenix water bill”, “why did my sewer charge jump?”, “estimate next month’s water usage”) and 10 near-misses that should not (e.g. “analyze my electric bill”, “read this credit-card statement”, “summarize this gas invoice”). One line each, with a column for expected (fire / no-fire).
- Paste your current
descriptionplus the 20 prompts into Claude and ask: “For each prompt, given only this description, would you invoke the skill? Answer fire / no-fire per line.” - Mark each answer right or wrong against your expected column. Count: should-fire hits out of 10, near-miss false-fires out of 10.
- For every miss, ask Claude: “Suggest the smallest description edit that flips this case without breaking the ones that already pass.” Apply it.
- Re-run steps 2–3 on the same 20 prompts. Loop until you clear the bar.
Stretch. Add 3 adversarial near-misses (“analyze my water park ticket receipt”, “my water heater bill”) and keep false-fires at ≤1/13 — the edge cases are where descriptions earn their keep.
Evaluating a skill.
A skill is a piece of agent capability. Like agents, skills get better when they have an eval suite. Three kinds of cases.
| Eval case | What it checks |
|---|---|
| Trigger | The description routes correctly. 10 should-fire + 10 should-not. |
| Format | Output matches the documented shape. Headings, numbers, types. |
| Correctness | The math/logic the skill is supposed to do gives the right answer. |
Use the eval harness from Practice 04 (Unit 02). Same shape: JSONL of test cases, a grader per case type, a run script. Add a regression check every time you modify the skill.
Ship it.
Three ways to distribute a skill, in increasing order of reach.
- Repo-local. Drop in
.claude/skills/of your project. Only that repo’s Claude sees it. Good for project-specific workflows. - User-global. Drop in
~/.claude/skills/. Every Claude session you start sees it. Good for personal workflows. - Published. Push to a Git repo. Bundle into a plugin (Practice 02). Submit to a marketplace. Anyone’s Claude can install it.
The public skills repo at github.com/anthropics/skills is your reference for both shape and quality. Read three skills from it before you publish your first one.
Ship your own skill, not the example.
- Name one task you did 3+ times last week (weekly status write-up, PR-description draft, meeting-notes cleanup). That’s your skill.
- Scaffold it the way Labs 1–3 built the water-bill skill: a folder, a
SKILL.md(start from the blank template) with a triggeringdescriptionand a Steps/Output body, plus ascripts/orreferences/file only if the work needs one. - Run Unit 10’s lab on its description: 10 should-fire prompts, 10 near-misses, tune until you clear the bar.
- Install it:
cp -R <your-skill> ~/.claude/skills/<your-skill>. Start a fresh session and ask a real triggering question.
ls ~/.claude/skills/<your-skill>/SKILL.md prints the path (it’s installed), and your Unit-10 tally sheet for it shows ≥9/10 trigger inputs firing. The live triggering question makes Claude announce or follow the skill.Stretch. Bundle it into a plugin (Practice 02) or push the folder to a Git repo so a teammate can install it — the “Published” tier above, done for real.
17 skills, 5 categories.
Before you build your own skill, study the seventeen reference skills Anthropic publishes. They cover common production-grade outputs — Word, Excel, PowerPoint, PDF, web UI, presentations, internal comms — and they show patterns that pair cleanly with each other. The repo to bookmark: github.com/anthropics/skills.
Each category section below documents every skill in it — what it does, when to use it, when to skip it, a real-life use case from a Pradhya practice, and which other skills it pairs with. Use the table on each unit as a reference; refer back when you’re scoping a skill of your own and want to make sure you’re not duplicating one that already exists.
Install one official skill and steal a pattern.
- Open github.com/anthropics/skills. Pick one that matches a file you actually have right now —
xlsxfor a messy CSV,pdffor a scanned receipt,docxfor a report (the cards in Units 14–18 tell you which fits). - Install it user-global: copy that skill’s folder into
~/.claude/skills/(same move as Day 2’s Lab 4). Start a fresh Claude Code session. - Run it on your real file — e.g. “clean this CSV into a pivot-ready sheet” / “OCR this scanned receipt and extract the line items.”
- Open that skill’s
SKILL.mdand find one technique to reuse: how itsdescriptionnames triggers, how it splitsscripts/vsreferences/, or how it formats its output. Write that one sentence down.
Stretch. Fork it: copy the folder under a new name, swap the specifics for your case, and re-run. You just turned a reference skill into one of your own.
Docs & files.
The four office formats. Most knowledge work eventually ends up as a Word doc, an Excel sheet, a PowerPoint, or a PDF — and these four skills produce them at a fidelity well beyond what raw model output can do.
docx Word documents
Create, read, edit, and manipulate .docx files. Tables of contents, headings, page numbers, letterheads, tracked changes, comments, image insertion, find-and-replace, conversions.
pdf PDF files
Read or extract text and tables. Combine, split, rotate. Watermark, encrypt, decrypt. Create new PDFs. Fill PDF forms. Extract images. OCR scanned PDFs so they become searchable.
pptx PowerPoint
Create, read, edit, modify .pptx files. Pitch decks, slide decks, presentations, templates, speaker notes, comments. Combine or split slide files.
xlsx Spreadsheets
Open, read, edit, or create .xlsx, .xlsm, .csv, .tsv files. Formulas, formatting, charts. Clean malformed rows, restructure misplaced headers, fix junk data into a usable sheet.
Design & visual.
Five skills that turn a description into a designed artifact. Generative art, static design, branded looks, themed styling, and Slack-ready GIFs.
algorithmic-art Generative p5.js art
Create algorithmic art via p5.js with seeded randomness and interactive parameter exploration. Particle systems, flow fields, organic systems, parametric variation.
canvas-design Static visual design
Create beautiful visual art in .png and .pdf documents. Posters, designs, static graphic pieces. Original work — does not copy existing artists’ style by name.
brand-guidelines Anthropic brand application
Apply Anthropic’s official brand colors and typography to any artifact that benefits from it. The shipped skill targets Anthropic’s look, but the pattern is the template for your own brand.
theme-factory Themed artifact styling
Toolkit for styling artifacts with a theme — slides, docs, reports, HTML pages. 10 pre-set themes with color + font pairings, or generate a new theme on-the-fly.
slack-gif-creator Animated GIFs for Slack
Create animated GIFs optimized for Slack (size constraints, validation, animation concepts).
Web & UI.
Three skills for the production web side: distinctive frontend design, complex claude.ai artifacts with React + Tailwind + shadcn, and Playwright-driven testing for what you build.
frontend-design Production-grade UI
Create distinctive, production-grade frontend interfaces with high design quality. Web components, pages, dashboards, landing pages, posters, applications. Generates creative, polished code that avoids generic AI aesthetics.
web-artifacts-builder Complex claude.ai HTML artifacts
Suite of tools for creating elaborate, multi-component claude.ai HTML artifacts using React, Tailwind, shadcn/ui. For artifacts requiring state management, routing, or shadcn components.
webapp-testing Local Playwright testing
Toolkit for interacting with and testing local web applications using Playwright. Verify frontend functionality, debug UI behavior, capture browser screenshots, view browser console logs.
Writing & comms.
Two skills for the structured-writing workflows: a guided co-authoring loop for proposals and specs, and a comms toolkit that knows the formats your company uses internally.
doc-coauthoring Structured doc workflow
Guides users through a structured workflow for co-authoring documentation. Helps transfer context, refine through iteration, and verify the doc works for readers.
internal-comms Internal company communications
Resources to help write internal communications in the formats your company uses: status reports, leadership updates, 3P updates, newsletters, FAQs, incident reports, project updates.
Dev & infra.
Three skills aimed at engineers: build with the Claude API the right way, build MCP servers that expose your systems to Claude, and the meta-skill that creates and evaluates other skills.
claude-api Build/optimize Claude API apps
Build, debug, and optimize Claude API / Anthropic SDK applications. Includes prompt caching, extended thinking, compaction, tool use, batch, files, citations, memory. Handles migrating code between Claude model versions.
mcp-builder Build MCP servers
Guide for creating high-quality Model Context Protocol servers that let Claude (and other LLMs) interact with external services through well-designed tools. Python (FastMCP) or Node/TypeScript (MCP SDK).
Validation: this visual follows the MCP architecture docs: host, client, server, JSON-RPC data layer, transports, and primitives (tools, resources, prompts): modelcontextprotocol.io/docs/learn/architecture.
skill-creator The meta-skill
Create new skills from scratch, edit and improve existing ones, measure skill performance, benchmark variance, optimize descriptions for better triggering accuracy.
Pairings & build vs reuse.
Skills compose. The high-value work is rarely one skill — it’s three or four chained together. This unit shows the chains that pay off, and ties the library back to your own skill-building from Day 1–3.
The high-leverage chains
- Report production chain.
doc-coauthoring→xlsx→docx→pdf. From structured prompt to deliverable PDF for the customer. - Deck production chain.
doc-coauthoring→canvas-design→pptx+brand-guidelines. From outline to on-brand presentation with custom visuals. - Web artifact chain.
frontend-design→web-artifacts-builder→webapp-testing. From design intent to tested interactive artifact. - Tax/expense chain.
pdf(OCR receipts) →xlsx(categorize) →docx(memo to accountant). Cowork’s R07 wired end-to-end. - Developer-tool chain.
mcp-builder(build the server) →claude-api(build the client) →skill-creator(package patterns into a skill). - Comms chain.
internal-comms(right tone) →doc-coauthoring(structure) →docx+brand-guidelines(final form).
Build vs reuse — the decision
| If the reference skill is | Do this |
|---|---|
| 80%+ match to your job | Fork it. Read the SKILL.md, copy the structure, swap the specifics. You’re writing 20% of a working skill, not 100%. |
| 50–80% match | Build alongside. Study the architecture (scripts/, references/, assets/). Build a new skill that imitates the pattern but solves your specific case. |
| < 50% match | Build clean. But cite the patterns you learned. Especially the description format — the published skills have the clearest triggering descriptions you’ll see. |
| Targeting Anthropic specifically | Use as-is. brand-guidelines, theme-factory defaults. |
| Targeting your own company | Fork & rename. brand-guidelines → pradhya-brand-guidelines. internal-comms → acme-internal-comms. |
To go further: read the actual anthropics/skills repo on GitHub. Every skill has a SKILL.md you can study in twenty minutes. The patterns repeat. The descriptions are the masterclass in trigger-tuning. Fork what fits.
Ready to fill in.
This is the structural template you copy when starting a new skill. Replace the bracketed parts with your skill's specifics. The skill-creator can take you from this to a publishable skill in 90 minutes.
--- name: [skill-name-kebab-case] description: [WHEN to use this skill, in 1-2 sentences. Start with the user-facing trigger phrases or scenarios. This field controls whether Claude reaches for the skill.] --- # [Skill name in Title Case] ## What this skill does [One paragraph. The job the skill performs. Not a tutorial — a description.] ## When to use - [Concrete trigger 1 — usually a user phrase or task pattern] - [Concrete trigger 2] - [Concrete trigger 3] ## When NOT to use - [Task that sounds similar but should use a different skill — name the other skill] - [Pattern that this skill handles poorly] ## How it works [Brief steps the skill takes. Bullet list, 5-7 items max.] ## Output [What the user gets at the end. Be concrete: file, message, transformation, decision.] ## Scripts and references - `scripts/[name].sh` — [what it does] - `references/[name].md` — [what it contains] ## Examples (optional, but helps trigger tuning) ### Example 1 User: [example request] Action: [skill's response or behavior] ### Example 2 [Different trigger pattern, same skill] ## Notes for skill maintainers - [Anything load-bearing that would surprise a future you] - [Known limitations] - [Last validated against Claude model: 4.7]
The `description` field is everything. It controls whether Claude even reaches for your skill. Write it as the trigger phrase a user would type, not as a tutorial summary. Most skill failures are description failures, not implementation failures.