What is Spec-Driven Development?
Spec-Driven Development is a workflow where you write a structured specification before asking Claude to implement anything. The spec is a Markdown document — or a small set of documents — that describes what to build, the data contracts, the expected behavior, constraints, and acceptance criteria. Claude reads the spec and builds to it.
The spec is the source of truth. Not the conversation. Not the code. Not Claude’s assumptions about what you probably meant.
A good spec has four properties:
Readable by Claude. Write it in plain language. No diagrams that only exist in your head, no acronyms that aren’t defined somewhere in the file.
Specific enough to converge. If you handed the same spec to two different Claude sessions with no other context, they should make the same key design decisions. If they’d diverge significantly, the spec needs more precision.
Small enough to fit in context. A spec is not a novel. If it fills the context window, split it. Each spec should cover one coherent piece of work.
Reviewable before any code is written. The whole point is that a human reads it and signs off before implementation starts. If no one reviews the spec, you’ve skipped the alignment step that makes SDD worthwhile.
Basic Workflow vs SDD
The difference between a basic Claude session and SDD comes down to where alignment happens.
In a basic workflow, you describe what you want in a prompt, Claude builds something, you look at the code, realize it’s not quite right, and you correct it. This loop repeats until the result is close enough. The alignment is happening through the code itself — which is the most expensive place for it to happen.
In SDD, you write the spec first. A human reviews it. Only then does Claude build. You review the code against the spec rather than against a vague mental model. There’s a shared, written definition of “done” from the start.
direction: right
basic: Basic Workflow {
direction: down
a: User prompt
b: Claude builds
c: User reviews code
d: Correct?
e: Done
a -> b
b -> c
c -> d
d -> a: no
d -> e: yes
}
sdd: Spec-Driven Development {
direction: down
a: User writes spec
b: Human reviews spec
c: Claude builds to spec
d: User reviews code
e: Done
a -> b
b -> c
c -> d
d -> e
}
Correcting a spec costs minutes. Correcting code costs hours, and sometimes requires explaining to Claude why the thing it just built isn’t what you wanted, a conversation that could have been a one-line edit to a spec document.
How It’s Organized
A spec typically has five sections. None of them are optional.
Goal. One paragraph. What does this component do, and why does it exist? This is the context Claude needs to make sensible decisions when the spec doesn’t cover every edge case.
Inputs / Outputs. Data schemas, types, and formats. For a Python function: the argument types and the return type. For an API: the request body schema and the response structure. Be precise — “a DataFrame” is not specific enough; “a pandas DataFrame with columns customer_id (str), tenure_months (int), monthly_spend (float)” is.
Behavior. What the component does, step by step. This is not pseudocode, but it’s close. Describe the logic at a level where an engineer could implement it without guessing.
Constraints. What the component must not do. Performance requirements. External dependencies and their versions. Compatibility requirements with other parts of the system.
Acceptance criteria. The conditions under which you’ll call this done. Each criterion must be testable. “Fast” is not a criterion. “Processes 100,000 rows in under 3 seconds on a standard laptop” is.
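The precision asked for in the Inputs / Outputs and Acceptance criteria sections can usually be turned into executable checks. The following is a minimal sketch, assuming the three-column DataFrame from the Inputs / Outputs example; the helper names `schema_violations` and `runs_within` are hypothetical and not part of any library.

```python
import time

import pandas as pd

# Column names and types come from the Inputs / Outputs example above.
# The mapping itself and both helper functions are illustrative assumptions.
EXPECTED_COLUMNS = {
    "customer_id": pd.api.types.is_string_dtype,
    "tenure_months": pd.api.types.is_integer_dtype,
    "monthly_spend": pd.api.types.is_float_dtype,
}


def schema_violations(df: pd.DataFrame) -> list[str]:
    """Return human-readable contract violations (empty list if none)."""
    problems = []
    for column, dtype_check in EXPECTED_COLUMNS.items():
        if column not in df.columns:
            problems.append(f"missing column: {column}")
        elif not dtype_check(df[column]):
            problems.append(f"wrong dtype for {column}: {df[column].dtype}")
    return problems


def runs_within(func, df: pd.DataFrame, max_seconds: float) -> bool:
    """Time a single call of func(df) and compare it to the time budget."""
    start = time.perf_counter()
    func(df)
    return time.perf_counter() - start <= max_seconds
```

A criterion such as “processes 100,000 rows in under 3 seconds” then becomes a single assertion like `assert runs_within(my_function, big_df, 3.0)`, where `my_function` and `big_df` stand in for the component under test and a 100,000-row fixture.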
Here’s a minimal spec for a feature engineering function in a churn prediction model:
## Goal
Bin the `tenure_months` column into four labeled categories for use as a
categorical feature in the churn model. Avoids treating tenure as linear
when the relationship with churn is non-linear in our data.
## Inputs / Outputs
Input: pandas DataFrame with a `tenure_months` column (int, non-negative).
Output: the same DataFrame with a new column `tenure_bucket` (str, categorical).
## Behavior
1. Apply the following fixed bins: 0–6 months → "new", 7–24 → "growing",
25–60 → "established", 61+ → "mature".
2. Bins are closed on the right (i.e., 6 maps to "new", 7 maps to "growing").
3. The original `tenure_months` column is preserved unchanged.
4. The transformation is stateless: `fit` learns nothing, and the same bins apply on every call.
## Constraints
- Python 3.11+, pandas 2.x
- No external dependencies beyond pandas
- Must work inside a sklearn Pipeline (implements fit/transform interface)
## Acceptance Criteria
- [ ] Given a DataFrame with tenure_months [0, 6, 7, 24, 25, 60, 61, 120],
the output tenure_bucket values are
["new", "new", "growing", "growing", "established", "established", "mature", "mature"].
- [ ] Given a DataFrame with no rows, the function returns an empty DataFrame
with the tenure_bucket column present.
- [ ] The function raises ValueError if tenure_months contains negative values.
Short. Concrete. Reviewable in two minutes.
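To show how directly a spec like this maps to code, here is one possible implementation sketch. It is not the canonical answer. The class name `TenureBucketer` is hypothetical, and the sketch reads the “stateless” requirement as `fit` being a no-op. It exposes `fit`/`transform` by duck typing so that pandas stays the only dependency; whether a given scikit-learn version accepts such a class inside a Pipeline without its base classes is an assumption that would need checking.

```python
import pandas as pd

# Bin edges and labels are taken from the Behavior section of the spec.
# Right-closed intervals: (-1, 6], (6, 24], (24, 60], (60, inf].
BIN_EDGES = [-1, 6, 24, 60, float("inf")]
BIN_LABELS = ["new", "growing", "established", "mature"]


class TenureBucketer:
    """Hypothetical stateless transformer with a fit/transform interface."""

    def fit(self, X: pd.DataFrame, y=None) -> "TenureBucketer":
        # Nothing to learn: the bins are fixed by the spec.
        return self

    def transform(self, X: pd.DataFrame) -> pd.DataFrame:
        if (X["tenure_months"] < 0).any():
            raise ValueError("tenure_months contains negative values")
        out = X.copy()  # leave the caller's DataFrame untouched
        out["tenure_bucket"] = pd.cut(
            out["tenure_months"],
            bins=BIN_EDGES,
            labels=BIN_LABELS,
            right=True,
        )
        return out
```

Each acceptance criterion translates into one test: feed in `[0, 6, 7, 24, 25, 60, 61, 120]` and compare the labels, pass an empty DataFrame and check that the column exists, pass a negative value and expect `ValueError`.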
When Is SDD Better?
SDD adds overhead. Writing a spec, reviewing it, then implementing to it takes more time than just prompting Claude directly. For small tasks, that overhead isn’t worth it.
SDD pays off when one or more of these is true:
- The task is large — more than roughly two hours of implementation work
- Multiple people need to agree on what gets built before work starts
- The output has a data contract: an API response schema, a model input format, a file format that other code depends on
- You’ll need to reproduce or extend this work in a future session
- The work spans multiple Claude sessions, where context from session one won’t carry to session two
A one-liner like “@src/utils.py add a type hint to this function” doesn’t need a spec. “Build a feature store integration for our training pipeline” does.
Tip: If you’re unsure whether a task needs a spec, ask yourself: could two different engineers read this prompt and build the same thing? If the answer is no, write a spec.