4 days · 16 units
Pradhya Practice 16 · The Founder’s Playbook Founder

Building an AI-native startup.

Most founder playbooks were written before AI changed who can build. This one is rebuilt around the four stages that actually matter when a non-technical founder can ship a working product in a weekend and a technical founder can ship a company. It is Pradhya's working version of the four-stage founder map, with patterns drawn from a dozen AI-native companies we've watched up close.

Audience
Founders · technical or not · pre-product or pre-launch
Length
4 sessions · 90 min each, plus fieldwork between days (interviews, surveys) — calendar time 2–4 weeks
Walk-away
A documented plan: idea → MVP → launch → scale
Prereq
An idea you’re willing to put to the test
What you’ll be able to do by the end
  • State your startup’s hypothesis sharply enough to test it in a week
  • Architect an MVP that won’t collapse under its own AI-generated weight
  • Tell genuine product-market fit from early hype using a clear framework
  • Wire Claude (Chat, Cowork, Code) into the parts of your stack where it pays off
  • Decide what to build, what to delegate, and what to refuse — for the next 90 days
§ 16.01.01 · Unit 01 · source

The four-stage map.

Idea → MVP → Launch → Scale. The shapes are old. The work inside each is not. The framing here puts goals, exit criteria, failure modes, and AI exercises into each stage so you stop confusing “busy” with “progress.”

[Figure: The four stages. Idea (hypothesis + discovery) → MVP (architecture + scope) → Launch (measure + PMF signal) → Scale (operating system). Each stage has goals, exit criteria, failure modes, and AI exercises.]
Four stages · one direction · no shortcuts (the failures usually come from skipping one)

What goes inside each stage

Stage | Goal | Exit when… | Common failure
Idea | A testable hypothesis backed by real customer signal | 10+ customer interviews say “yes, I’d pay for this” | Falling in love with the solution before you understand the pain
MVP | Shippable v1 that demonstrates the core value | 1 paying user uses it weekly without you holding their hand | Over-engineering. Building features for hypothetical scale
Launch | Public release, measurable adoption | Retention curve flattens, not zero, at week 4 | Confusing launch buzz with PMF
Scale | The company runs without the founder typing every prompt | You can take a week off without revenue dropping | Hiring “ops” instead of building the operating system
Why an AI-native playbook is different
The traditional barriers (technical expertise, team size, development time) have collapsed. New barriers replace them: clear thinking, structured problem decomposition, and the ability to collaborate effectively with AI. The playbook teaches the skills that clear the new barriers.

Place your startup on the map.

You’ll do
Honestly diagnose which stage you’re in right now. Most founders are one stage earlier than they think.
Steps
  1. Read the “Exit when…” column above.
  2. For each stage you’ve allegedly cleared, write the evidence (names of customers, weekly active count, retention number).
  3. If you can’t cite evidence, you haven’t cleared the stage — go back.
  4. Pin a note on your monitor with the stage name + the next exit criterion.
Verify
You’ve named the stage. You can recite the exit criterion from memory.
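
The diagnosis rule in the steps above (no evidence means the stage is not cleared) can be written down as a few lines of Python. This is a minimal sketch: the `current_stage` function and the shape of the `evidence` dictionary are illustrative assumptions, not part of the playbook.

```python
STAGES = ["Idea", "MVP", "Launch", "Scale"]


def current_stage(evidence: dict[str, str]) -> str:
    """Return the first stage whose exit criterion has no written evidence."""
    for stage in STAGES:
        # Blank or missing evidence means the stage is not cleared, so stop here.
        if not evidence.get(stage, "").strip():
            return stage
    return "Scale"  # every exit criterion has evidence; keep operating at Scale


# Hypothetical founder notes: evidence for Idea, nothing concrete for MVP.
notes = {
    "Idea": "12 interviews, names and quotes in interviews.md",
    "MVP": "",
}
print(current_stage(notes))  # MVP
```

Because the loop stops at the first gap, evidence for a later stage never lets you skip an earlier one.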

Stretch. Show your stage diagnosis to one honest friend. If they push you back a stage, listen.

§ 16.01.02 · Unit 02

Hypothesis validation.

A startup hypothesis has three pieces: who, pain, willingness. Most founders have one. AI doesn’t fix the missing two — it just makes it cheaper to discover them.

The shape of a testable hypothesis

Bad: “Small businesses need better bookkeeping software.”

Better: “US Schedule-C self-employed earning $50k–$200k will pay $25/month to never see a receipt again, because their current tax-season pain is 30+ hours of reconstruction.”

Three pieces in the better version: who (US Schedule-C, specific income band), pain (30+ hours, named season), willingness ($25/month, specific number).

The kill criterion
Write down what would make you abandon this idea. “If 7 of 10 interviews report less than 5 hours of pain, the hypothesis is wrong.” Founders without kill criteria stay on dead ideas for years.
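
A kill criterion is easier to honor when it is mechanical. The sketch below encodes the example above (7 of 10 interviews under 5 hours of pain) as a check you can run after each batch of interviews. The `KillCriterion` class, the `is_killed` function, and the interview numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KillCriterion:
    """Abandon the idea once `max_misses` interviews fall below `min_pain_hours`."""
    min_pain_hours: float  # pain threshold taken from the hypothesis
    max_misses: int        # how many below-threshold interviews kill the idea


def is_killed(reported_hours: list[float], criterion: KillCriterion) -> bool:
    """Return True when enough interviews report pain below the threshold."""
    misses = sum(1 for hours in reported_hours if hours < criterion.min_pain_hours)
    return misses >= criterion.max_misses


# Made-up data: hours of tax-season reconstruction reported by 10 interviewees.
interviews = [2, 40, 3, 1, 4, 35, 2, 0.5, 3, 4.5]
criterion = KillCriterion(min_pain_hours=5, max_misses=7)
print(is_killed(interviews, criterion))  # True: 8 of 10 report under 5 hours
```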

Write the three-part hypothesis + kill criterion.

You’ll do
Write your startup’s hypothesis in 3 sentences. Then write what would make you abandon it.
Steps
  1. Who: name a specific segment with a number you can size (industry + role + revenue band).
  2. Pain: name the specific cost — hours, dollars, missed opportunities — with a quantity.
  3. Willingness: name the price they’d pay and the cadence.
  4. Kill criterion: what evidence would convince you to abandon this idea?
Verify
Your hypothesis and kill criterion fit in 4 sentences total. A stranger reading them could tell you whether your interviews validated or killed the idea.

Stretch. Run the hypothesis past Claude with: “Critique this as a YC partner would. What’s vague? What’s missing?” Tighten where it pushes back.

§ 16.01.03 · Unit 03

Landscape mapping.

Thirty minutes with Claude beats two weeks of competitor research the old way. The map you build now becomes your positioning later.

What goes on the map

  • Direct competitors — same problem, same solution shape.
  • Adjacent solutions — same problem, different shape (often what people use today by default).
  • Substitutes — the “none of the above” alternatives (spreadsheets, paper, doing-it-themselves).
  • Why each existing option fails — the gap you’re building into.

Generate the landscape with Claude.

You’ll do
Run a landscape-mapping prompt against your hypothesis. Produce a 1-page brief.
Steps
  1. Paste your hypothesis into Claude with web search on.
  2. Ask: “List every direct competitor, adjacent solution, and substitute. For each, give: 1-line description, who uses it, why they’d leave.”
  3. Verify 3 random claims by clicking through to the cited sources.
  4. Save the brief as landscape.md in your project folder.
Verify
You have a 1-pager. You found at least one competitor you didn’t know about. You can describe the gap in one sentence.

Stretch. Repeat the exercise from the customer’s perspective: “How would a [target segment] currently solve this?” The gap between competitor mental-model and customer mental-model is often your wedge.

§ 16.01.04 · Unit 04

Customer discovery with AI.

Twenty interviews in a week used to be heroic. Now the bottleneck isn’t conducting them — it’s synthesizing them. Use AI on the back end, not the front.

The interview shape that actually works

  1. Past behavior, not future intent. “What did you do last time X happened?” beats “Would you use X?”
  2. Specific stories, not generalizations. “Tell me about the most recent time…” beats “How often do you…?”
  3. Their words, not yours. Never describe the solution — let them name what they wish existed.
  4. Listen for emotion. The right ideas land on real pain. If the interviewee is calm, you’re probably building a vitamin, not a painkiller.

Where AI helps: sourcing prospects (LinkedIn / Twitter), drafting the outreach, transcribing the calls, and — most importantly — synthesizing patterns across 10+ transcripts. See Cowork Recipe R14 · The feedback themes.

Run 3 interviews this week, synthesized by Claude.

You’ll do
Three customer interviews. Transcripts. One synthesis pass.
Steps
  1. Source 10 candidates matching your “who”. Use LinkedIn search + a Claude-drafted outreach DM.
  2. Book 3 calls (15-min each).
  3. Use Granola / Otter / Read.ai to transcribe.
  4. Feed all 3 transcripts to Claude: “Surface the 3 most-repeated pains across these. Cite verbatim quotes. Note where I led the witness.”
Verify
You can quote 5 verbatim phrases. You spotted at least 1 place where you led the interviewee toward your answer.

Stretch. After 10 interviews, do the synthesis again. Compare to the 3-interview synthesis — how stable are the themes? Unstable themes mean you need more interviews; stable themes mean you can move to MVP.
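
One rough way to put a number on “how stable are the themes” is set overlap between the two syntheses. This is an assumed measure (Jaccard overlap), not something the playbook prescribes, and the theme labels are invented for illustration.

```python
def theme_overlap(early: set[str], later: set[str]) -> float:
    """Jaccard overlap between two sets of theme labels (1.0 means identical)."""
    if not early and not later:
        return 1.0
    return len(early & later) / len(early | later)


# Hypothetical theme labels from the 3-interview and 10-interview syntheses.
after_3 = {"receipt chaos", "fear of audit", "accountant too slow"}
after_10 = {"receipt chaos", "fear of audit", "quarterly estimate surprise"}
print(theme_overlap(after_3, after_10))  # 0.5
```

The labels need to be normalized by hand first; two syntheses rarely word the same theme identically.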

§ 16.02.01 · Unit 05

Architecture you’ll regret.

AI-generated code accelerates everything, including the regrets. The decisions you make in the first week determine whether you ship in week 4 or rewrite in month 4.

Five decisions that compound

Decision | Get this wrong & … | The default that ages well
Language / framework | You can’t hire, can’t debug, AI is weak at it | TypeScript + React, or Python + FastAPI (both well represented in what coding models have seen)
Database | Migrations become impossible at scale | Postgres (Supabase or Neon for managed hosting)
Auth | You spend a month rolling your own | Clerk or Auth0; offload until you can’t
Hosting | You can’t deploy at 11pm | Vercel / Cloudflare Pages for frontends, Fly.io / Render for services
LLM provider | Vendor lock makes the next model migration painful | SDK with a thin abstraction layer for swapping providers
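
To make the last row concrete, here is a minimal sketch of a thin abstraction layer. All names (`TextModel`, `EchoModel`, `UppercaseModel`, `summarize_ticket`) are hypothetical, and the two providers are stand-ins that make no network calls. A real adapter would wrap your vendor's SDK inside a class with the same `complete` method.

```python
from typing import Protocol


class TextModel(Protocol):
    """The only surface the rest of the app is allowed to depend on."""

    def complete(self, system: str, user: str) -> str: ...


class EchoModel:
    """Stand-in provider for tests and local development (no network calls)."""

    def complete(self, system: str, user: str) -> str:
        return f"[echo] {user}"


class UppercaseModel:
    """A second stand-in, to show that swapping providers touches one line."""

    def complete(self, system: str, user: str) -> str:
        return user.upper()


def summarize_ticket(model: TextModel, ticket_text: str) -> str:
    """Product code: knows about TextModel, never about a vendor SDK."""
    return model.complete(
        system="Summarize the support ticket in one sentence.",
        user=ticket_text,
    )


print(summarize_ticket(EchoModel(), "login fails on mobile"))
print(summarize_ticket(UppercaseModel(), "login fails on mobile"))
```

The design choice is that vendor-specific code lives in exactly one class per provider, so a model migration means writing one new adapter instead of touching every call site.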

Make your five decisions.

You’ll do
Decide and document the 5 architecture choices above. One sentence each, with reasoning.
Steps
  1. Open ARCHITECTURE.md in your repo.
  2. For each of the 5 decisions, write: the choice, why, and the “reversal cost” (how painful would it be to change in month 6?).
  3. If the reversal cost is “catastrophic”, pick a default that’s easier to swap.
  4. Commit. The doc is the audit trail for future you.
Verify
Future engineers (including you in 6 months) can answer “why are we on Postgres?” from the doc alone.

Stretch. Ask Claude to critique your doc as a hostile reviewer. Note where it pushes back. Resolve, don’t dismiss.

§ 16.02.02 · Unit 06

Scope discipline.

When you can ship features in hours, scope becomes the most important constraint. The MVP is what you say no to.

The MVP cut

  • One workflow. Not one feature. One end-to-end thing the user can do that delivers the value.
  • Hardcode the rest. Settings page? Single hardcoded config. Permissions? You’re the only admin. Onboarding? Send them a Loom.
  • Reverse the funnel. Don’t build sign-up if you don’t have users. Build for the first 10 manually-added users.
  • Cut by week 2. If you can’t ship the one workflow in two weeks of focused work, the scope is too wide.
The cost of yes
Every “yes” to a feature is a “no” to the discovery you didn’t do, the customer you didn’t talk to, the harder problem you didn’t solve. AI makes building cheap; deciding is still expensive.

The scope-cut session.

You’ll do
Take your current feature list. Cut it to one workflow. Defend the cut.
Steps
  1. List every feature you’ve considered for v1. (Be honest; usually it’s 15–30 items.)
  2. Identify the one workflow that — if it works — proves the hypothesis.
  3. Strike everything else. Move struck items to BACKLOG.md — not deleted, just postponed.
  4. If you can’t strike, ask Claude: “Which of these features could I hardcode, fake, or do manually for the first 10 users?”
Verify
Your v1 spec is ≤ 1 page. You can describe what the user does in 3 sentences.

Stretch. Tell a friend the v1 spec out loud. If they ask “but what about X?” for something you cut, that’s the test — can you answer with “manually, for now” without flinching?

§ 16.02.03 · Unit 07

Security from day zero.

AI-generated code carries new failure modes: prompt injection, data leakage through context, cost runaways. They’re cheap to prevent on day one and ruinous to retrofit at month six.

Five surfaces to lock down on day one

  1. Prompt injection. Treat all user input + retrieved content as untrusted. Sandbox tool calls. Never let user input change your system prompt’s rules.
  2. Data leakage. If your agent has access to multiple customers’ data, partition by customer in every retrieval. One leaked tenant kills the company.
  3. Cost runaway. Per-customer rate limits + budget alerts. A buggy loop can burn $10k overnight.
  4. Reversibility. Drafts before sends. Reads before writes. Preview before execute. Borrow the discipline from Cowork U11.
  5. Secrets. Never in code. Never in client-side. Vault them. Rotate quarterly.
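
Two of these surfaces, cost runaway and data leakage, fit in a few lines each. The sketch below is illustrative only: `SpendGuard`, `BudgetExceeded`, and `documents_for` are hypothetical names, spend is tracked in memory, and a production version would persist state and reset it daily.

```python
from collections import defaultdict


class BudgetExceeded(Exception):
    """Raised when a charge would push a customer past the daily cap."""


class SpendGuard:
    """Tracks per-customer spend in memory and refuses calls past a daily cap."""

    def __init__(self, daily_cap_usd: float, alert_fraction: float = 0.8) -> None:
        self.daily_cap_usd = daily_cap_usd
        self.alert_fraction = alert_fraction
        self._spent: dict[str, float] = defaultdict(float)
        self.alerts: list[str] = []  # a real system would page or post to chat

    def charge(self, customer_id: str, cost_usd: float) -> None:
        """Record a cost, or raise before the cap is crossed."""
        projected = self._spent[customer_id] + cost_usd
        if projected > self.daily_cap_usd:
            raise BudgetExceeded(f"{customer_id} would exceed ${self.daily_cap_usd:.2f}")
        self._spent[customer_id] = projected
        # Budget alert: fires on every charge once the customer is near the cap.
        if projected >= self.alert_fraction * self.daily_cap_usd:
            self.alerts.append(customer_id)


def documents_for(customer_id: str, rows: list[dict]) -> list[dict]:
    """Tenant partition: every retrieval filters on customer_id, with no opt-out."""
    return [row for row in rows if row["customer_id"] == customer_id]
```

Call `charge` before each model call with the estimated cost, so a buggy loop stops at the cap instead of at the invoice.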

See SaaS Practice U11 for the security-review agent pattern that catches these at PR time.

Run the day-zero security checklist on your repo.

You’ll do
Audit your codebase against the 5 surfaces. Fix the gaps that exist now.
Steps
  1. For each of the 5 surfaces, write down: what your current code does, what gap exists, what the fix would be.
  2. Fix the gaps that don’t require new infrastructure (e.g. add rate limits, partition queries by customer).
  3. Add the security-review agent (from SaaS U11) to your PR pipeline.
  4. File the rest as tickets with deadlines. Don’t let “we’ll get to it” be the answer.
Verify
No surface is at “we’ll get to it”. Every gap has a fix in code, a ticket, or an accepted-risk note.

Stretch. Wire a cost-alert webhook so a 10x spike in token usage triggers a Slack ping. Catching the bug at $50 is much better than at $5000.
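
The detection half of that alert can be as small as the sketch below. It assumes you already log token usage per hour or per day; `is_spike` and `build_alert` are hypothetical helpers, and the median baseline is one reasonable choice among several. Posting the payload to your Slack incoming-webhook URL is left out.

```python
from statistics import median


def is_spike(history: list[int], latest: int, factor: float = 10.0) -> bool:
    """True when `latest` is at least `factor` times the median of `history`."""
    if not history:
        return False  # no baseline yet, so nothing to compare against
    baseline = median(history)
    return baseline > 0 and latest >= factor * baseline


def build_alert(latest: int, history: list[int]) -> dict[str, str]:
    """Message payload in the {"text": ...} shape that incoming webhooks accept."""
    return {"text": f"Token usage spike: {latest} vs median {median(history)}"}


usage = [100, 120, 110]  # made-up token counts for recent periods
if is_spike(usage, 1200):
    print(build_alert(1200, usage)["text"])
```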

§ 16.02.04 · Unit 08

The AI-native build loop.

The day-to-day rhythm of building when Claude Code is your senior IC. Plan mode, CLAUDE.md, ultrareview, deploy preview. Same patterns whether you’re a solo founder or a five-person team.

The loop

  1. Plan — describe the feature, hit plan mode, edit the plan.
  2. Build — approve the plan; Claude codes it. You read along.
  3. Review — run /ultrareview. Read every flag.
  4. Preview — deploy preview URL. Click through the change as a user would.
  5. Ship — merge. Watch metrics for 1 hour.

Cycle time: 30 minutes to 2 hours per feature. The constraint stops being typing speed and becomes decision quality.

/ultrareview is a real, copy-able pack: a slash command plus six single-purpose review agents (security, performance, tests, types, comments, simplify). Grab it from code-examples/ultrareview/ — the command file and the six agent prompts — and drop the folder into your repo’s .claude/ directory. Full install-and-run walkthrough in SaaS Practice U12.

Run one full loop on your MVP.

You’ll do
Pick one small MVP feature. Walk through every step. Time yourself.
Steps
  1. Pick the smallest visible feature you can think of (a single button + endpoint).
  2. Plan mode → build → ultrareview → preview → ship.
  3. Note where the loop felt fast and where it bogged down.
  4. The bog-downs tell you where to invest next: a better CLAUDE.md, more MCP servers, a better preview setup.
Verify
Total cycle was < 2 hours from start to deployed. If > 2 hours, you found friction to remove next time.

Stretch. Time five loops over a week. The curve should bend down. If it doesn’t, your scope per feature is too large.

§ 16.03.01 · Unit 09 · source

Real PMF vs early hype.

Launch buzz is not product-market fit. PMF is boring — it shows up as retention, not as TechCrunch posts. The framework here is what tells them apart.

The signals that lie

  • Launch-day signups. Vanity. People will sign up for anything.
  • Twitter engagement. Also vanity. AI products especially get retweet-bumps that don’t convert.
  • VC interest. Investors fund stories. Stories aren’t fit.
  • Press coverage. Lagging. Often arrives after the product’s peak.

The signals that don’t

  • Retention curve. Week-N active users divided by week-1 active users. Healthy: the curve flattens above zero. Unhealthy: it falls off a cliff at week 2.
  • Organic referral rate. Users telling other users — without a referral program.
  • Cohort revenue expansion. Customers paying more over time, not less.
  • The Sean Ellis test. “How would you feel if you could no longer use [product]?” ≥ 40% “very disappointed” is the classic threshold.
The honesty test
If you stopped marketing tomorrow, would users still arrive and stay? If yes, you have fit. If no, you have growth machinery, which is not the same thing.
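
The retention curve is simple enough to compute from a list of weekly active user ids. This sketch makes two assumptions that are not in the playbook: it follows only the week-1 cohort (users who first appear later are ignored), and it treats “flattened” as the last two points sitting within 5 percentage points of each other and above zero.

```python
def retention_curve(weekly_active: list[set[str]]) -> list[float]:
    """Fraction of the week-1 cohort still active in each week.

    weekly_active[0] is the set of user ids active in week 1, and so on.
    """
    if not weekly_active or not weekly_active[0]:
        return []
    cohort = weekly_active[0]
    return [len(cohort & week) / len(cohort) for week in weekly_active]


def has_flattened(curve: list[float], tolerance: float = 0.05) -> bool:
    """Assumed test for 'flattens, not zeroes': last two points close and above zero."""
    if len(curve) < 2:
        return False
    return curve[-1] > 0 and abs(curve[-2] - curve[-1]) <= tolerance


# Made-up data: four weeks of active user ids.
weeks = [{"a", "b", "c", "d"}, {"a", "b", "c"}, {"a", "b"}, {"a", "b", "e"}]
curve = retention_curve(weeks)
print(curve)                 # [1.0, 0.75, 0.5, 0.5]
print(has_flattened(curve))  # True
```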

Run the Sean Ellis survey.

You’ll do
Send the “how would you feel” question to your active users this week. Score the result.
Steps
  1. Send one question to users who’ve used the product ≥ 2 weeks: “How would you feel if you could no longer use [product]?” Options: very disappointed / somewhat disappointed / not disappointed / N/A.
  2. Collect responses. Need ≥ 30 for a signal.
  3. Compute the “very disappointed” percentage.
  4. ≥ 40% → you have PMF; scale carefully. 20–40% → you have product-segment fit; double down on the segment. < 20% → iterate on the product itself.
Verify
You have a number. You know which segment of users gave the “very disappointed” answer — that’s your real customer.
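
Scoring the survey is a one-liner, but writing it down removes the temptation to round up. In this sketch the function names are hypothetical, and leaving N/A answers out of the denominator is an assumption; decide that rule before you look at the data.

```python
from collections import Counter

MIN_RESPONSES = 30  # sample size the steps above ask for


def very_disappointed_share(responses: list[str]) -> float:
    """Share of 'very disappointed' among scorable answers (N/A excluded)."""
    counts = Counter(r.strip().lower() for r in responses)
    answered = sum(n for answer, n in counts.items() if answer != "n/a")
    if answered == 0:
        raise ValueError("no scorable responses")
    return counts["very disappointed"] / answered


def read_signal(share: float) -> str:
    """Map the percentage to the three bands from step 4."""
    if share >= 0.40:
        return "PMF: scale carefully"
    if share >= 0.20:
        return "product-segment fit: double down on the segment"
    return "iterate on the product"


# Made-up survey results.
survey = (
    ["very disappointed"] * 14
    + ["somewhat disappointed"] * 12
    + ["not disappointed"] * 4
    + ["N/A"] * 5
)
share = very_disappointed_share(survey)
print(f"{share:.0%}", read_signal(share))  # 47% PMF: scale carefully
```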
If you’re pre-launch
No 30 live users yet? Run the same question against the prototype, not a shipped product.
  1. Pick 5 of the contacts from your U04 problem interviews — the ones who described the sharpest pain.
  2. Show them the prototype (a clickable mockup or a 2-minute Loom is fine). Then ask the Sean Ellis question worded for intent: “If this existed today and then disappeared, how would you feel — very disappointed / somewhat / not disappointed?”
  3. Record each answer verbatim plus one sentence of why.
  4. Count the “very disappointed” responses out of 5.
Verify: you have 5 recorded answers and a count (e.g. “3 of 5 very disappointed”). Treat it as a directional read, not PMF — 4–5 of 5 says keep building this; 0–1 says the prototype isn’t hitting the pain yet.

Stretch. Re-run quarterly (post-launch) or after every prototype revision (pre-launch). The percentage tells you whether the product is getting more sticky or less.

§ 16.03.02 · Unit 10

Measurement for AI products.

AI products generate non-deterministic outputs. Traditional metrics break. The new metrics measure trust, recovery, and the gap between “what we shipped” and “what users perceived.”

The new metric stack

Metric | What it measures | Healthy
Task success rate | % of user tasks the agent completed without escalation | > 80% on the golden path
Edit distance | How much users edit AI output before using it | Trending down over time
Refusal rate | How often the agent says “I can’t” (and was it right to?) | < 5%, with a manual audit of correctness
Recovery rate | When the agent fails, how often does the user retry? | > 60% (users who give up quickly have lost trust)
Cost per successful task | Token + infrastructure spend ($) per successfully completed task | Falling as you optimize, not flat
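
Three of these metrics can be computed from a plain log of task records. The sketch below assumes a hypothetical `TaskRecord` shape. For edit distance it uses a similarity-based proxy from Python's standard `difflib` (not a true Levenshtein distance), which is usually enough to see the trend.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class TaskRecord:
    completed: bool   # agent finished without escalation
    ai_output: str    # what the agent produced
    final_text: str   # what the user actually used
    cost_usd: float   # token + infrastructure cost for this task


def task_success_rate(records: list[TaskRecord]) -> float:
    """Completed tasks divided by all tasks."""
    if not records:
        raise ValueError("no records")
    return sum(r.completed for r in records) / len(records)


def edit_ratio(record: TaskRecord) -> float:
    """0.0 means the user kept the output unchanged; 1.0 means nothing survived."""
    return 1.0 - SequenceMatcher(None, record.ai_output, record.final_text).ratio()


def cost_per_successful_task(records: list[TaskRecord]) -> float:
    """Total spend (failures included) divided by the number of successes."""
    successes = sum(r.completed for r in records)
    if successes == 0:
        raise ValueError("no successful tasks to divide by")
    return sum(r.cost_usd for r in records) / successes
```

Note that the cost metric charges failed tasks to the successful ones. That is deliberate: failures are part of what each success costs you.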

See Practice 14 · Observability for how to instrument these.

Pick three metrics for your product.

You’ll do
From the five above (or your own), pick the 3 you’ll track weekly. Set baselines.
Steps
  1. Pick 3 metrics that match your product’s value prop.
  2. Define each precisely: numerator, denominator, time window, exclusions.
  3. Measure baseline this week. Even an estimate counts.
  4. Add to your weekly review (or build a dashboard if scale warrants).
Verify
You can quote the 3 numbers from memory. You know what direction each should move.
If you’re pre-launch
No instrumentation, no traffic? Track by hand for one week — paper or a single spreadsheet beats a dashboard you don’t need yet.
  1. Pick 3 metrics from the table that match your value prop (task success rate and edit distance work even with a handful of test users).
  2. Write each one down with its exact definition: numerator, denominator, what counts as “success.”
  3. For 7 days, every time you or a tester runs the core workflow, tally the result by hand — one tick mark per run in the right column.
  4. On day 7, compute each metric from your tally sheet.
Verify: a one-page tally sheet exists with 3 named metrics and at least 7 dated entries, and you can read the three week-1 numbers off it. That sheet is your baseline; instrument it for real only once the manual count gets annoying.

Stretch. Set targets, not just baselines. Without targets, metrics are decoration.

§ 16.03.03 · Unit 11

Agentic workflows in product.

The first wave of AI products was “ChatGPT for X.” The second wave is workflows: opinionated multi-step jobs the agent runs on the user’s behalf. Two questions decide whether to add an agentic feature. Can you describe the value in one sentence? Can the user verify the result faster than they could have done the job themselves?

When to add an agentic workflow

Reach for an agentic workflow when… | Don’t when…
The job is repetitive and the value compounds across runs | The job is one-off
The user can verify the result in 30 seconds | The user must check every line
The cost of a wrong action is low or reversible | Wrong actions damage data or relationships
The job lives in a single tool / context | The job requires cross-system orchestration you can’t fully access

Map your product to workflows vs chat.

You’ll do
For each major action in your product, decide: is this chat (open-ended) or a workflow (opinionated)?
Steps
  1. List your product’s top 5 user actions.
  2. For each: chat, workflow, or both?
  3. For workflows, write the 1-sentence value prop and the verification step.
  4. For chat, name the failure mode: what happens when the model goes off-piste? Is the user OK with it?
Verify
You can defend each chat-vs-workflow choice in one sentence. No “chat because it’s flexible” without a real reason.

Stretch. A common evolution: feature ships as chat, becomes a workflow once you see the 5 actual prompts users send. Bake the most-used prompt into a button.

§ 16.03.04 · Unit 12

The launch checklist.

What must be true before public launch. Skip any of these and the launch turns into firefighting instead of growth.

The non-negotiables

  • You have 10+ users who’d be very disappointed if it shut down. The Sean Ellis test from U09.
  • You have 1 paying customer at full price. Discounts don’t prove fit; full price does.
  • Cost per user < price per user. Even by 1¢. Negative unit economics at launch is technical debt with a deadline.
  • The 5 most-likely failure modes are handled. Either fixed, monitored with alerts, or explicitly accepted with a runbook.
  • Onboarding works without you. A stranger can sign up and reach the first “aha” without DMing you.
  • You can shut it down gracefully. If it’s a flop, you have a plan to refund / wind down / pivot without burning users.

Run the launch readiness review.

You’ll do
Audit your product against the 6-item checklist. Be honest. Delay if any fail.
Steps
  1. For each non-negotiable, write: pass / fail / partial.
  2. For each fail or partial, what’s the smallest fix?
  3. If you can fix in a week: delay launch a week. If you can’t: launch in beta only.
  4. Schedule the post-launch review for week 4.
Verify
Every box is either checked or has a known plan or a downgrade (beta vs public).
If you’re pre-launch
No paying customer to point at yet? Pressure-test the riskiest checklist item — willingness to pay — with a pricing-interview script on 3 real prospects.
  1. Pick 3 prospects who match your “who” (reuse U04 interviewees).
  2. Run this exact script on each, one at a time: (1) “How do you solve this today, and what does that cost you in time or money?” (2) “If a tool did this for you, what would make it worth paying for?” (3) “At $X/month, is that a yes, a no, or a maybe — and why?” (Name your real price for X.)
  3. Write down each prospect’s answer to question 3 as a literal yes / no / maybe, plus their stated reason.
  4. Tally the three answers. Two or more unprompted “yes” at full price is your pre-launch proxy for the “1 paying customer” box.
Verify: you have 3 recorded yes/no/maybe answers at a named price, and you can state your prospects’ #1 objection in one sentence. If all 3 are “maybe” or “no,” the price or the value prop — not the launch — is the thing to fix next.

Stretch. Show the checklist to one founder who’s 2 years ahead of you. Their pushback on your “pass” entries is gold.

§ 16.04.01 · Unit 13

Product tools matrix.

There are three product surfaces: Chat (claude.ai), Cowork (Claude Desktop), and Code (Claude Code). The API sits alongside them for anything you build into your own product. Each fits a different stage and shape of work. Picking the wrong one is a common founder mistake.

The matrix

Surface | Best for | Where it fits in your stack
Chat (claude.ai) | Open-ended exploration, prompt design, customer support replies | Customer-facing “ask Claude” features; pre-prompt-engineering workspace
Cowork (Claude Desktop) | Personal operations: read files, run multi-step jobs across your apps | Founder ops (sales prep, weekly review, contract review); not customer-facing
Code (Claude Code) | Engineering work: write, refactor, ship | Your engineering stack from day one
API (SDK) | Whatever you build into your product | Anywhere AI shows up in your product’s critical path

Founders frequently confuse the lines. A common mistake: building product features in Cowork (which only you can use) instead of the API. Or treating Chat as your engineering tool, when Code does it better. See Practice 03 (Cowork) and Practice 02 (Claude Code).

Place every AI-touching task into the matrix.

You’ll do
List every place AI shows up in your company. Assign each to Chat, Cowork, Code, or API.
Steps
  1. Brainstorm: customer support, sales prep, content, ops, engineering, customer-facing features.
  2. For each, assign the right surface.
  3. Flag mismatches: where are you using the wrong surface today?
  4. Plan one migration this week.
Verify
Every row has a surface. You found at least one mismatch.

Stretch. Add your team’s entries too. The matrix becomes a quick onboarding doc for the next hire: “here’s where AI lives for us.”

§ 16.04.02 · Unit 14

Founder operating system.

Scale starts when the company stops needing the founder to do every prompt. The operating system is the set of skills, recipes, and agents that take over the work you’d otherwise re-do every week.

What goes in the founder OS

  • The Cowork recipe library — meeting prep, weekly review, contract review, customer triage, investor updates. See Cowork Day 04.
  • The Claude Code plugin — your team’s conventions encoded as hooks, slash commands, skills. See Practice 02.
  • The customer-facing agentic workflows — the API-driven product features.
  • The metrics dashboard — from U10, automated and visible.
  • The runbooks — what to do when X breaks. Written before the break, not during.
The freedom test
Could you take a week off and have the business not lose ground? If no, the operating system isn’t there yet. The work isn’t hiring an ops person; it’s building the systems that make ops smaller, not larger.

Build the founder OS v1.

You’ll do
Inventory what an operating system would have to cover for your business. Start filling the gaps.
Steps
  1. Make a 2-column list: tasks I do weekly · can they be skilled / recipe’d / agent’d?
  2. From the “yes” rows, pick one to build this week (a Cowork recipe is usually fastest).
  3. Document each as a starred chat, a skill, or a plugin.
  4. Repeat weekly. The OS gets thicker; your week gets lighter.
Verify
You shipped 1 OS component this week. You used it twice.

Stretch. Pair: ask your co-founder or first hire which of your recurring tasks they wish they could just run themselves. That’s the most leveraged OS component to build next.

§ 16.04.03 · Unit 15 · source

Founder stories.

Five companies, five lessons. The playbook’s case studies are short on purpose — the lesson is always one decision, not the whole company.

Ambral · Customer success at scale

Lean team, AI as the leverage point.

Ambral built customer-success software that synthesizes signals from customer activity and interactions into AI-powered models of every account. A small founding team is doing the work of a 20-person CS org because the AI is the org chart, not the headcount.

The lesson: if your product’s value compounds with data per customer, the AI handles the scaling. Headcount stays flat.

HumanLayer · Context engineering as moat

Pivoted, then codified what worked.

HumanLayer pivoted and scaled while turning their internal context-engineering practices into a framework now used across the YC ecosystem. Built around the insight that the most useful functions in software are also the riskiest — especially for non-deterministic LLM systems.

The lesson: the practice you develop solving your own hard problem is sometimes more valuable than the product itself. Document and share.

Vulcan Technologies · Government contracts as non-engineers

Domain expertise + AI > engineering background.

Vulcan won government contracts as non-engineers. The AI was the engineer; the founders were the domain experts and operators.

The lesson: if you have deep domain knowledge, the “but I’m not technical” objection no longer holds. The technical work is what AI does. Your job is to know the customer, the regulation, the contract terms.

Carta Healthcare · Vertical AI in regulated markets

The compliance + AI moat.

Carta Healthcare builds AI tooling for the healthcare data abstraction problem — an area where compliance, accuracy, and trust gate every deal. The wedge is being the AI player that meets the healthcare bar.

The lesson: regulated markets are slow but defensible. Once you’ve cleared the compliance bar, generic AI tools can’t compete.

Anything · AI-native consumer

Shipping the surface where users live.

Anything builds AI-native consumer experiences where the agent is the product, not an add-on. Where most consumer companies bolt a chat onto an existing app, Anything started from the assumption that the chat is the app.

The lesson: the form factor is part of the bet. Choosing “AI-as-feature” vs “AI-as-product” is a strategic decision, not a UX one.

Find the founder story that maps to yours.

You’ll do
Pick the story closest to your situation. Steal the lesson. Apply it this week.
Steps
  1. Read each story above. Which one’s pattern matches your startup?
  2. Write down what the founder did differently than you would have.
  3. Adapt one specific move from that story to your week.
  4. If you can’t find a match, search for one founder who’s 18 months ahead in your category. Steal from them instead.
Verify
You can name your “reference founder” and the one move you’re borrowing.

Stretch. Email the founder. Most respond if you ask one specific question. Worst case, no reply — best case, a 15-min call that compresses your learning by a year.

§ 16.04.04 · Unit 16

The 90-day plan.

A practice is decoration unless it changes what happens Monday morning. The closing exercise turns everything above into one written 90-day plan you actually execute.

The shape of the plan

  • The one number. Pick one metric from U10 you’ll move. The 90 days are about moving it.
  • The 3 outputs. What ships in 30 / 60 / 90 days? Be specific. Features, decisions, customers.
  • The kill criteria. What would tell you to stop? Don’t skip this.
  • The weekly check-in. Same time, same template. Friday afternoon, 20 minutes, alone.
  • The accountability partner. One person who reads the plan and asks at week 12 whether you did what you said.
Closing exercise
Write the 90-day plan today. One page. Pin it where you can see it from your desk. Send it to your accountability partner. The act of writing it is the half of the work that 90% of founders skip. The other half is doing it, but the writing is the hard half.

Write the one-page 90-day plan.

You’ll do
One page. Five sections: the one number, the 3 outputs, kill criteria, weekly check-in, accountability partner.
Steps
  1. Pick the number from U10. State the current value and the target.
  2. List the 3 outputs — 30 day, 60 day, 90 day — that would credibly move that number.
  3. Write the kill criteria. What would convince you to scrap or pivot?
  4. Schedule the weekly check-in. Same slot, same template, on the calendar.
  5. Pick the accountability partner. Send them the plan today.
Verify
The plan is one page. It’s sent. It’s pinned. The calendar event for the first check-in exists.

Stretch. At week 12, re-read the plan before writing the next one. Note what you got right, what you got wrong, and what surprised you. The reflection is where the next playbook starts.

§ Walk-away · The founder weekly review prompt

The one prompt every founder runs Friday.

Across the dozen AI-native companies we've watched closely, the single highest-leverage habit is a structured Friday retro using Claude as your honest second voice. This is the prompt; install it as a recurring Claude Project.

# Founder weekly review — paste into a Claude Project every Friday

You are my honest second voice. I'm a founder building [one sentence about
your company]. Right now I am at stage: [Idea / MVP / Launch / Scale].

Every Friday I check in with you. Here's what happened this week:

WINS (3 max):
- [list]

WHAT I LEARNED about my customer or market:
- [list]

WHAT I AVOIDED that I shouldn't have:
- [list, be honest]

WHAT I'M TELLING MYSELF that might be a lie:
- [list]

Based on this, tell me:
1. The ONE move next week that would change the trajectory most.
   Be specific — a person to call, a feature to ship, a piece to write,
   a decision to stop avoiding.
2. The thing I'm clearly procrastinating on, and why I'm probably wrong
   about why.
3. If my stage is right based on what's actually happening. (Founders
   almost always mis-stage themselves — too far ahead in their head,
   behind in reality. Push back if my self-assessment is off.)
4. One question I should be sitting with this weekend.

Be direct. No hedging. I don't need a coach — I need someone who'd
notice if I were drifting.

Why this works: The "things I'm telling myself that might be a lie" line is the unlock. Founders' self-reports are systematically biased toward the version of the story they wish were true. Naming that bias up front gives Claude permission to push back, which is the part you actually need.

Install it today.

Steps
  1. Create a Claude Project called Friday Founder Review.
  2. Paste the prompt above as the Project's custom instructions. Each Friday, start a new chat in the Project and fill in the four bracketed lists.
  3. Add a recurring Friday-4pm calendar event titled Friday Review — 20 minutes.
  4. Run it this Friday. Save the output. Re-read Monday before deciding what to work on.
Verify
After 4 weeks: scroll back through the outputs. The patterns that repeat are your real product. The things Claude pushed back on that you defended are usually where the work is.