Spec-Driven Agentic Development
Stop feeding vague ideas to AI agents. Collect the scattered requirements, synthesize them into a spec, review it as a team, freeze it, then build. The methodology that eliminates rework.
You have a feature to build. The requirements are in a Linear ticket. Some context is buried in a Slack thread from two weeks ago. There are notes from a meeting you half-remember. A teammate mentioned an edge case in a standup. Your tech lead has opinions about the API design that live exclusively in their head.
Now you sit down, open Claude Code, and type: "Build the user notification system."
What you get back will be technically impressive and fundamentally wrong. Not because the AI is bad — but because your input was scattered, incomplete, and ambiguous. I wrote about this in The Verification Gap: vague input, vague output. It’s the single most expensive mistake in AI-assisted development, and most teams make it every day.
There’s a better way. It’s not a new tool. It’s not a framework. It’s a discipline.
Write the spec first. Review it as a team. Freeze it. Then build.
The Scattered Input Problem
Every feature starts as distributed knowledge. Requirements live in different places, in different formats, owned by different people:
- Linear/Jira: The ticket — usually a title and two sentences
- Slack: Three threads, two channels, one DM with the actual decision
- Meetings: Verbal agreements nobody wrote down
- Your head: Technical constraints you know from experience
- Someone else’s head: Edge cases, business rules, compliance requirements
- Figma/Docs: Design mockups that may or may not match the latest decisions
This is normal. Requirements should emerge from conversations and collaboration. The problem isn’t that the information is scattered — it’s that we skip the step of assembling it into a coherent specification before we start building.
Instead, we go straight from scattered inputs to code. We treat the AI agent like a mind reader. We assume it will figure out what we meant. And then we spend three days iterating on code that should never have been written in the first place — because the foundation was wrong.
Two Approaches
The traditional approach:
Scattered info → Vague prompt → Code → Review → "This isn't what I meant" → Rewrite → Review → Rewrite → …

You iterate on code. Every cycle is expensive: the AI rewrites files, tests break, reviewers re-read everything, context shifts. Three rounds of code review later, someone realizes the API contract was wrong from the start.
The spec-driven approach:
Scattered info → Spec draft → Review spec → Iterate spec → Freeze spec → Build from spec → Done

You iterate on a Markdown file. Cheap. Fast. No code to rewrite. No tests to fix. No merge conflicts. When the spec is right — when everyone agrees on exactly what to build — you hand it to the agent and get it right the first time.
The shift is simple: move the iteration from code to spec. Don’t review implementations. Review intentions.
The Workflow
Step 1: Gather Everything
Before you write a single line of spec, collect every piece of information about the feature. Open all the sources:
- The ticket (Linear, Jira, GitHub Issue)
- Relevant Slack threads (copy the important messages)
- Meeting notes or transcripts
- Design mockups or wireframes
- Existing code that will be affected
- Your own notes and mental model
Don’t organize it yet. Don’t filter it. Just dump it all into a single context. The messier the better — that’s the raw material.
Step 2: Let the Agent Synthesize
Now feed it all to Claude Code. Not as a build instruction — as a synthesis request.
Here's everything we know about the notification system feature:
[Paste the Linear ticket]
[Paste the Slack thread]
[Paste the meeting notes]
[Your own notes about technical constraints]
Create a detailed specification in a Markdown file at specs/notification-system.md.
The spec should include:
- Problem statement (what we're solving and why)
- Proposed solution (high-level approach)
- API contract (endpoints, payloads, responses)
- Data model changes (new tables, migrations)
- Edge cases and error handling
- Security considerations
- Out of scope (what we're NOT building)
- Open questions (things we still need to decide)

Claude Code is extraordinarily good at this. It takes your scattered, messy, contradictory inputs and produces a structured, coherent document. It identifies gaps. It surfaces decisions you haven’t made yet. It turns implicit assumptions into explicit statements.
The output isn’t perfect — it’s a first draft. That’s the point. A first draft of a spec is infinitely more useful than a first draft of code.
Step 3: The Spec PR
Here’s where the magic happens. Create a pull request that contains exactly one file: the spec.
git checkout -b spec/notification-system
git add specs/notification-system.md
git commit -m "RFC: Notification system specification"
git push -u origin spec/notification-system
gh pr create --title "RFC: Notification system" --body "Spec for the notification system feature. Please review the approach, API contract, and data model before we start building."
No code. No implementation. Just a Markdown file describing what you’re going to build and how.
This is the most reviewable PR your team will ever see. A reviewer doesn’t need to understand code paths, trace execution flows, or run tests. They read a document and answer one question: "Is this what we should build?"
Step 4: Multi-Pass Review
This is where both humans and agents provide input. Multiple eyes, multiple passes:
Human reviewers catch:
- Business logic errors ("We also need to handle the enterprise tier differently")
- Missing requirements ("What about GDPR? Users need to opt out")
- Scope creep ("Let’s not build the analytics dashboard in v1")
- Political/organizational context the agent can’t know
AI agent reviewers catch:
- Technical inconsistencies ("The payload in section 3 contradicts the data model in section 5")
- Missing edge cases ("What happens if the notification queue is full?")
- API design issues ("This endpoint mixes creation and update semantics")
- Security gaps ("The webhook endpoint needs authentication")
You can even have Claude Code review its own spec from a different angle:
Review specs/notification-system.md as a senior backend engineer.
Focus on: scalability, failure modes, and operational concerns.
What's missing? What will break at scale? What will wake someone up at 3am?

Or from the frontend perspective:
Review specs/notification-system.md as a frontend engineer.
Focus on: API ergonomics, loading states, error handling from the client side.
Is this API easy to consume? What edge cases will the UI need to handle?

Each perspective surfaces different gaps. Each pass makes the spec more complete. And each iteration costs minutes, not hours, because you’re editing a Markdown file — not rewriting code.
Step 5: Freeze
When the team approves the PR — when every open question is resolved, every edge case addressed, every scope decision made — you merge it.
That merge is a contract. The spec is now frozen. It’s the single source of truth for what gets built. No more “actually, I was thinking...” after the fact. If requirements change, you update the spec first. The spec leads, the code follows.
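The "spec leads, code follows" rule can even be enforced mechanically. A minimal sketch of a pre-merge check, assuming a hypothetical convention that implementation PR descriptions mention the spec path they implement:

```python
# Hypothetical pre-merge check: an implementation PR must reference a spec
# file that actually exists in the repo. The convention (spec paths quoted
# in the PR body) is an example, not a Claude Code or GitHub feature.
import re
from pathlib import Path

SPEC_REF = re.compile(r"specs/[\w\-/]+\.md")

def referenced_specs(pr_body: str) -> list[str]:
    """Extract every specs/*.md path mentioned in a PR description."""
    return SPEC_REF.findall(pr_body)

def check_pr(pr_body: str, repo_root: Path) -> bool:
    """True if the PR references at least one spec and all of them exist."""
    refs = referenced_specs(pr_body)
    return bool(refs) and all((repo_root / ref).is_file() for ref in refs)
```

Wired into CI, a check like this rejects implementation PRs whose spec was never written (or never merged), which is exactly the failure mode the freeze is meant to prevent.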
Step 6: Build From the Frozen Spec
Now — and only now — do you start coding. And the prompt to Claude Code is trivially simple:
Implement the notification system exactly as described in specs/notification-system.md.
Follow the API contract, data model, and edge case handling specified in the doc.
Do not deviate from the spec without asking.

The difference is night and day. Instead of a vague “build notifications,” you’re handing the agent a precise, reviewed, team-approved blueprint. The agent knows the endpoints, the payloads, the error codes, the data model, the security requirements, and the explicit scope boundaries.
Better input. Better output. Every time.
Anatomy of a Good Spec
A spec doesn’t need to be a novel. It needs to be clear, complete, and unambiguous. Here’s the structure that works:
# Feature: Notification System
## Problem Statement
What problem are we solving? Why now? Who is affected?
## Proposed Solution
High-level approach. Architecture decisions. Why this approach over alternatives.
## API Contract
Exact endpoints, methods, payloads, responses, status codes.
This is the part both frontend and backend engineers will reference daily.
## Data Model
New tables, columns, indexes, migrations.
Include the schema — don't leave it to interpretation.
## Business Rules
- Rule 1: Users receive max 10 notifications per hour
- Rule 2: Email notifications are batched every 15 minutes
- Rule 3: Push notifications are immediate for @mentions only
## Edge Cases
- What happens when the queue is full?
- What if a user has notifications disabled?
- What about users in timezones where it's 3am?
## Security Considerations
- Authentication on webhook endpoints
- Rate limiting on notification triggers
- PII handling in notification payloads
## Out of Scope (v1)
- Analytics dashboard
- Custom notification templates
- Third-party integrations (Slack, Teams)
## Open Questions
- [ ] Do we need read receipts?
- [ ] Should notifications persist after 90 days?
- [ ] What's the SLA for notification delivery?
## Dependencies
- Requires: Redis for queue (already provisioned)
- Requires: SendGrid for email (new integration)
- Blocked by: Auth service v2 migration (ETA: next sprint)

The Out of Scope section is just as important as the requirements. It’s an explicit fence that says “don’t go here.” Without it, agents (and engineers) will inevitably gold-plate features nobody asked for.
The Open Questions section forces honesty. If you don’t know something, say so. It’s better to surface an unresolved question during spec review than to discover it during code review when someone’s already built the wrong thing.
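The Open Questions checklist is also machine-checkable: before freezing, count the unresolved checkboxes. A small sketch, assuming the `## Open Questions` heading and `- [ ]` checkbox syntax from the template above:

```python
# Collect unresolved "- [ ]" checkboxes from a spec's Open Questions section.
# A spec with open questions arguably shouldn't be frozen yet.
def open_questions(spec_text: str) -> list[str]:
    """Return unresolved checkbox items under the '## Open Questions' heading."""
    unresolved, in_section = [], False
    for line in spec_text.splitlines():
        if line.startswith("## "):
            # Entering a new section: track whether it's Open Questions.
            in_section = line.strip() == "## Open Questions"
        elif in_section and line.lstrip().startswith("- [ ]"):
            unresolved.append(line.lstrip()[len("- [ ]"):].strip())
    return unresolved
```

A one-line CI step can then fail the merge while `open_questions(...)` is non-empty, turning "every open question is resolved" from a social norm into a gate.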
Why This Works
Iteration is Cheap
Rewriting a paragraph in a Markdown file takes seconds. Rewriting a service takes hours. When you iterate on the spec, every correction is a text edit. When you iterate on code, every correction is a refactor, a test update, a migration change, and a re-review.
Alignment Happens Before Execution
The most expensive bug in software is a misunderstanding. "That’s not what I meant" costs more than any runtime error. Spec review forces the team to align before anyone writes code. Disagreements surface at the cheapest possible stage.
Agents Get Better Input
An AI agent with a frozen spec produces dramatically better code than an agent with a Slack thread. The spec eliminates ambiguity. The agent doesn’t have to guess what you meant — it’s written down, reviewed, and approved.
The Spec is Documentation
When the feature ships, the spec stays in the repo. It’s not a throwaway artifact — it’s living documentation. Six months from now, when someone asks "Why does the notification system batch emails every 15 minutes?" the answer is in the spec, with the PR discussion that led to that decision.
It Scales to Teams
In a team of five, the spec PR becomes a coordination mechanism. Frontend knows the exact API contract. Backend knows the data model. The PM confirms the business rules. QA uses the edge cases section as a test plan. Everyone works from the same document.
The Agent-Human Review Loop
The spec review is where this approach becomes uniquely powerful in the age of AI. You’re not just doing human code review — you’re combining human judgment with AI analysis.
Here’s what a typical review cycle looks like:
- Agent drafts the spec from scattered inputs
- You review and add business context the agent couldn’t know
- Agent re-reviews from a different persona (security engineer, frontend dev, SRE)
- Teammate reviews and catches organizational/political blind spots
- Agent addresses all feedback and updates the spec
- Final human approval — merge
Each pass through the loop makes the spec tighter. The agent handles the tedious work — checking consistency, identifying gaps, formatting — while humans focus on judgment calls the agent can’t make: priorities, politics, product intuition.
The result is a spec that’s been reviewed by more “eyes” than any traditional design doc. And every one of those reviews happened on a Markdown file, not on a 2,000-line PR full of code.
Common Objections
"This slows us down"
It feels slower because you’re doing work upfront. But measure the total time from idea to shipped feature. Teams that skip the spec spend 60% of their time in rework loops: code review comments, miscommunication, wrong assumptions, features nobody asked for. The spec eliminates most of that rework.
Writing the spec takes an hour. Reviewing it takes a day. But building from a frozen spec takes a fraction of the time it takes to iterate code through five rounds of “that’s not what I meant.”
"We don't know enough to write a spec yet"
Good. That’s exactly what the Open Questions section is for. A spec with ten open questions is better than no spec. It makes the unknowns visible. The team can discuss them before someone builds the wrong thing based on an assumption.
If you truly can’t write any spec, you don’t have a feature — you have a research task. Spec that instead: “Spike: investigate notification delivery options, output a recommendation by Friday.”
"The spec will get outdated"
It will — if you don’t treat it as a living document. The rule is simple: if the code changes, the spec changes first. The spec PR becomes the proposal. The code PR follows. This takes discipline, but it’s the same discipline as keeping your API docs in sync. And it’s far cheaper than the alternative: code that nobody understands six months later.
"This is just waterfall"
It’s not. Waterfall means months of specs nobody reads followed by months of code that doesn’t match. Spec-driven development means one focused spec per feature, reviewed in a day or two, then built immediately. The feedback loop is tight. The spec is small. You’re still shipping frequently — you’re just shipping the right thing.
Setting It Up
The tooling is trivial. You probably have everything you need.
Repository Structure
your-repo/
├── specs/ # All feature specs live here
│ ├── notification-system.md
│ ├── payment-refunds.md
│ └── user-onboarding-v2.md
├── src/
├── tests/
└── ...

PR Naming Convention
# Spec PRs
RFC: Notification system
RFC: Payment refund flow v2
RFC: User onboarding redesign
# Implementation PRs (reference the spec)
Implement notification system (specs/notification-system.md)

The RFC: prefix makes it instantly clear: this is a spec review, not a code review. Different reviewers, different expectations, different review criteria.
CLAUDE.md Integration
Add this to your project’s CLAUDE.md so the agent always respects specs:
## Spec-Driven Development
- Before implementing any feature, check if a spec exists in `specs/`
- If a spec exists, follow it exactly. Do not deviate without asking.
- If no spec exists for a non-trivial feature, create one first.
- Implementation PRs must reference the spec they implement.

A Spec-Writer Skill
Create a reusable Claude Code skill that automates spec generation:
# .claude/skills/write-spec/SKILL.md
---
name: write-spec
description: Create a feature specification from scattered requirements
disable-model-invocation: true
argument-hint: [feature-name]
---
Create a detailed specification for: $ARGUMENTS
Save to: specs/$1.md
Include these sections:
1. Problem Statement
2. Proposed Solution
3. API Contract (if applicable)
4. Data Model Changes (if applicable)
5. Business Rules
6. Edge Cases
7. Security Considerations
8. Out of Scope (v1)
9. Open Questions
10. Dependencies
Ask me for any missing context before drafting.
Use existing code in the repo to inform technical decisions.

Now /write-spec notification-system triggers the entire spec creation workflow.
Closing the Verification Gap
In The Verification Gap, I argued that the biggest risk in AI-assisted development isn’t bad code — it’s the inability to verify that what the agent built is what you actually wanted. The gap between intention and implementation.
Spec-driven development attacks that gap directly. The spec is the intention, made explicit. When the spec says “POST /notifications returns 201 with the notification ID” and the code does exactly that, verification is trivial. You’re comparing implementation against a written contract, not against a vague idea in someone’s head.
The verification gap doesn’t disappear. But it shrinks from "Did the agent understand what I was thinking?" to "Does the code match the spec?" — and that’s a much easier question to answer.
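In code, "does the code match the spec?" becomes a direct assertion. An illustration of the contract quoted above (POST /notifications returns 201 with the notification ID); the handler here is a stand-in, not a real implementation:

```python
# Illustration: the spec's written contract becomes an executable check.
# create_notification is a stub standing in for the POST /notifications handler.
import uuid

def create_notification(payload: dict) -> tuple[int, dict]:
    """Stub handler: returns (status_code, response_body)."""
    if "user_id" not in payload:
        return 400, {"error": "user_id required"}
    return 201, {"id": str(uuid.uuid4())}

def test_contract():
    # Verification shrinks to: does the response match the frozen spec?
    status, body = create_notification({"user_id": "u_1", "message": "hi"})
    assert status == 201
    assert "id" in body
```

The test reads like the spec line it verifies, which is the whole point: the reviewer compares code to a sentence everyone already approved.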
The Discipline That Pays
Spec-driven agentic development isn’t complicated. It’s a Markdown file and a PR review. The tools are trivial.
The hard part is the discipline: resisting the urge to skip straight to code. It feels productive to see an agent generating files immediately. It feels slow to write a document about what you’re going to build.
But every team that adopts this pattern reports the same thing: total feature delivery time goes down. Not because the coding is faster — but because the rework disappears. The misunderstandings vanish. The “that’s not what I meant” conversations stop.
Better input, better output. It was always that simple.