I'm Ryuta Hamamoto from TIMEWELL Inc.
In 2025, when Andrej Karpathy posted the term "Vibe Coding" on X (formerly Twitter), the development world changed overnight. Tools like Cursor, Cline, Devin, and GitHub Copilot Agent arrived in rapid succession, and the experience of "type a prompt, get working code" has become part of many engineers' daily routines. I use these tools constantly in TIMEWELL's development work myself. Productivity has genuinely gone up. But honestly, I've also been burned more than once trying to build an entire product through Vibe Coding alone. Code that AI generates looks like it works, only for a cascade of bugs to bring everything down three days later. Sprint ahead with vague specs and you end up with code that's completely incompatible with what your teammates wrote. After falling into those "AI development traps" over and over, I arrived at an idea. That idea is AI Spec-Driven Development (AI-SDD).
TDD and BDD — Tests First, or Behavior First?
Before getting into AI Spec-Driven Development, I want to take a brief look at the history of software development. Talking about new concepts without understanding how development methodologies have evolved leads nowhere solid.
Kent Beck formalized Test-Driven Development (TDD) in 2003. "Write a failing test first, implement the minimum code to make it pass, then refactor" — the Red-Green-Refactor cycle, as it became known, sent shockwaves through the software industry at the time. In a world where "write code first, test later" was the standard, it reversed the order entirely. By writing tests first, developers were forced to think about what the code needed to satisfy before writing a single line of implementation. The result was better design and earlier bug detection.
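As a minimal sketch of one Red-Green-Refactor pass in Python: the `slugify` function and its test are hypothetical names chosen for illustration, not code from any real project. The test is written first and would fail until the implementation below it exists.

```python
import re

# Step 1 (Red): the test is written first. Run before slugify() exists,
# it fails with a NameError.
def test_slugify_replaces_spaces_and_lowercases():
    assert slugify("Hello World") == "hello-world"
    assert slugify("  AI   Spec Driven ") == "ai-spec-driven"

# Step 2 (Green): the minimum implementation that makes the test pass.
# Step 3 (Refactor): tidy up while the test stays green; here the
# whitespace handling is folded into a single regular expression.
def slugify(title: str) -> str:
    words = re.split(r"\s+", title.strip())
    return "-".join(word.lower() for word in words)

# Calling the test directly stands in for a test runner such as pytest.
test_slugify_replaces_spaces_and_lowercases()
```

The point of the ordering is that the two assertions pin down what "done" means before any implementation decision is made.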
TDD does have a structural limitation, however. Test code is written in programming languages, which means no one outside of engineering can read it. If a product manager wants to verify that a feature meets business requirements, staring at JUnit or pytest code is not a realistic option. TDD ensures technical correctness, but business correctness falls outside its scope.
Behavior-Driven Development (BDD) emerged around 2006 to address this gap. Introduced by Dan North, the approach lifted TDD up a level. In BDD, "system behavior" is described in a near-natural-language format called Gherkin notation. The Given (precondition) → When (action) → Then (expected result) structure can be read not just by engineers, but by QA teams, product owners, and business stakeholders as well. Tools like Cucumber and SpecFlow convert Gherkin notation into executable tests, bringing the ideal of "specs that are also tests" one step closer to reality.
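To make the Given/When/Then shape concrete, here is a small sketch that mirrors it in plain Python comments. It deliberately uses no BDD library; tools like Cucumber read scenarios from separate Gherkin files, which this does not attempt to reproduce. The `Cart` class and the scenario itself are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Cart:
    """Hypothetical shopping cart used only for this illustration."""
    items: dict[str, int] = field(default_factory=dict)

    def add(self, sku: str, quantity: int = 1) -> None:
        self.items[sku] = self.items.get(sku, 0) + quantity

    def total_quantity(self) -> int:
        return sum(self.items.values())

def scenario_adding_the_same_item_twice_increases_quantity() -> None:
    # Given an empty cart
    cart = Cart()
    # When the shopper adds the same item twice
    cart.add("SKU-1")
    cart.add("SKU-1")
    # Then the cart holds one line with a quantity of two
    assert list(cart.items) == ["SKU-1"]
    assert cart.total_quantity() == 2

scenario_adding_the_same_item_twice_increases_quantity()
```

The three comment lines are the part a product owner or QA engineer can read and challenge; the code between them is the part only engineers need to care about.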
Research from Veriserve suggests that projects adopting BDD saw an average 30-40% reduction in rework caused by misaligned understanding between developers and business stakeholders [1]. In other words, BDD also functions as a communication tool for aligning the entire team on what needs to be built.
BDD has its own weaknesses, though. Writing scenarios in Gherkin notation is harder than it sounds, especially when complex business logic needs to fit into the Given/When/Then mold — scenarios either become bloated or remain too abstract to guide implementation. And more fundamentally, both TDD and BDD assume that humans write the code. In an era where AI generates code, is capturing tests and behavior enough?
Spec-Driven Development (SDD) — Making Specs the North Star
Starting in the second half of 2025, the concept of Spec-Driven Development (SDD) began spreading rapidly as a reaction to Vibe Coding. Erik Hanchett's talk at GitKon 2025, "Stop Vibe Coding Everything," was emblematic. His message was simple: unless you tell AI precisely what to build, AI cannot build something precise.
The core of SDD is thoroughly defining specs before writing any code. The "specs" in question are not the heavy, multi-hundred-page requirements documents of waterfall projects. They are user stories, acceptance criteria, API contracts, screen transitions, and enumerated edge cases — structured at a level of granularity that AI can interpret. In a case study from a KDDI engineering team implementing a message delivery platform, adopting SDD reportedly reduced the rework rate on AI-generated code by over 60% [2].
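One way to picture a spec "structured at a level of granularity that AI can interpret" is as a small record with named sections. The sketch below is an assumption about what such a record might look like, not a format prescribed by SDD or used in the KDDI case study; the field names, the pause-notifications story, and the API path are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Spec:
    """Hypothetical spec record; field names are illustrative only."""
    user_story: str
    acceptance_criteria: list[str]
    api_contract: str
    edge_cases: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Flatten the spec into plain text that can be pasted into a prompt."""
        lines = ["USER STORY", self.user_story, "", "ACCEPTANCE CRITERIA"]
        lines += [f"{i}. {c}" for i, c in enumerate(self.acceptance_criteria, 1)]
        lines += ["", "API CONTRACT", self.api_contract, "", "EDGE CASES"]
        lines += [f"- {e}" for e in self.edge_cases]
        return "\n".join(lines)

spec = Spec(
    user_story="As a subscriber, I want to pause notifications so that I am not disturbed at night.",
    acceptance_criteria=[
        "Paused users receive no messages until the pause ends.",
        "Pausing twice does not extend the original pause.",
    ],
    api_contract='POST /subscribers/{id}/pause with body {"until": ISO-8601 timestamp}',
    edge_cases=[
        "The 'until' timestamp is in the past.",
        "The subscriber does not exist.",
    ],
)
print(spec.render())
```

Keeping the sections separate makes gaps visible: an empty edge-case list is obvious in a way that a missing paragraph in free-form prose is not.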
The SDD development flow breaks into three broad phases. In the Design phase, user stories and acceptance criteria are defined, and architecture and API design are locked in. In the Build phase, AI generates code based on the defined specs, and tests auto-generated from those specs are used for validation. In the Refine phase, specs are updated whenever changes arise, keeping them alive as a living document.
An important point here is that SDD does not negate TDD or BDD. On the contrary, SDD is their direct successor. Just as TDD ensured code quality through tests-first and BDD aligned with business requirements through behavior-first, SDD governs the entire development trajectory through specs-first [3].
The State of AI-Driven Development — The Illusion and Reality of Full Automation
In parallel, the term "AI-driven development" gained widespread traction throughout 2025. According to Nikkei Cross Tech, 2026 is shaping up to be the year when the seeds of "fully automated development," in which AI covers the entire development lifecycle from requirements definition through design, implementation, testing, and operations, begin to germinate [4]. Major Japanese system integrators have been accelerating their investment in AI-driven development, and at AI Engineering Summit Tokyo 2025, organized by Findy, Lee, Cursor's VP of Developer Experience, noted that "the best tool changed seven times in 2025 alone" [5].
From my own experience, though, the idea that "AI-driven development means handing everything to AI" is dangerous. TIMEWELL has integrated multiple AI development tools into our work, but AI delivers its true value only when given clear instructions. Code that AI writes from vague prompts may appear to work, but edge case handling is often missing and consistency with the existing codebase tends to break down. AI excels at following detailed instructions but struggles to infer intent from ambiguous ones [3].
This is not a flaw in AI — it is a characteristic. LLMs have learned vast patterns of code, so when given clear input they return remarkably accurate output. But given an instruction like "make it work nicely," without a definition of "nicely," the only possible response is something average and safe — boring and ultimately unusable code. This is the fundamental limitation of Vibe Coding.
AI Spec-Driven Development (AI-SDD) — Why the Combination of Specs and AI Is Essential
Let me bring the threads together. TDD ensures technical quality through tests. BDD aligns with business requirements through behavior. SDD defines the overall direction of development through specs. AI-driven development integrates AI's generative capabilities into the development process. All of these are correct approaches. But each one alone is insufficient.
AI Spec-Driven Development (AI-SDD), as I define it, is a methodology that intentionally fuses SDD with AI-driven development. In a single sentence: a development approach that maximizes spec quality in order to get the most out of AI.
Comparing it with prior methodologies: in TDD, test code was the primary artifact that drove development. In BDD, Gherkin scenarios played that role. In SDD, the spec document sits at the center. In AI-SDD, the spec document remains central, but the specs themselves are created and evolved through collaboration with AI, and AI references those specs throughout every stage of implementation, testing, and review. In other words, specs become the highest-priority information source in AI's context window.
Here is how the concrete workflow plays out. First, humans and AI collaborate to draft specs. At this stage, tools like Cursor, Claude, or ChatGPT are used to surface user stories, define acceptance criteria, design APIs, detail screen specifications, and enumerate edge cases. The key is not to have AI "write the specs," but to use AI as a sounding board, with humans ultimately authorizing the final specification. Even a simple step helps: hand the AI a draft and ask, "Are there any contradictions in this spec? Are there edge cases we've missed?" That alone dramatically sharpens spec quality.
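The sounding-board step can be as simple as wrapping the draft in the two review questions quoted above. The sketch below only builds the prompt text; it calls no AI service, and the surrounding instruction wording and delimiter lines are assumptions rather than a recommended template.

```python
REVIEW_QUESTIONS = (
    "Are there any contradictions in this spec?",
    "Are there edge cases we have missed?",
)

def build_review_prompt(spec_text: str) -> str:
    """Wrap a draft spec in the two review questions.

    The instruction wording is an assumption; the returned string is meant
    to be pasted into whichever assistant you use.
    """
    questions = "\n".join(f"{i}. {q}" for i, q in enumerate(REVIEW_QUESTIONS, 1))
    return (
        "Review the draft spec below. Do not rewrite it; list findings only.\n\n"
        f"--- SPEC START ---\n{spec_text.strip()}\n--- SPEC END ---\n\n"
        f"Questions:\n{questions}\n"
    )

# Hypothetical draft with a built-in contradiction for the reviewer to find.
draft = "Users can pause notifications. Paused users still receive urgent alerts."
print(build_review_prompt(draft))
```

Asking for findings only, rather than a rewritten spec, keeps the human in the authorizing role: the AI points at problems and the person decides how the spec changes.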
Next, the finalized specs are handed to an AI agent, which simultaneously generates implementation code and test code. This is the biggest difference from conventional SDD. In SDD, humans often wrote the code from the specs themselves, but in AI-SDD, AI reads the specs directly and generates both code and tests in one pass. When specs are sufficiently clear, AI generation accuracy improves dramatically. In measurements from an internal TIMEWELL project, raising the level of spec detail to "user story + acceptance criteria + at least five edge cases" pushed the first-pass pass rate of AI-generated code — the rate at which tests pass without any modification — from 35% to 78%.
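The detail threshold and the first-pass pass rate described above can both be written down in a few lines. This is a sketch under stated assumptions, not the tooling used for the TIMEWELL measurement: the `Spec` fields, the function names, and the sample inputs are hypothetical, and only the "at least five edge cases" threshold comes from the text.

```python
from dataclasses import dataclass, field

MIN_EDGE_CASES = 5  # threshold taken from the paragraph above

@dataclass
class Spec:
    """Hypothetical minimal spec record for this illustration."""
    user_story: str = ""
    acceptance_criteria: list[str] = field(default_factory=list)
    edge_cases: list[str] = field(default_factory=list)

def missing_detail(spec: Spec) -> list[str]:
    """Return the reasons a spec is not yet detailed enough; empty means ready."""
    problems = []
    if not spec.user_story.strip():
        problems.append("user story is missing")
    if not spec.acceptance_criteria:
        problems.append("acceptance criteria are missing")
    if len(spec.edge_cases) < MIN_EDGE_CASES:
        problems.append(
            f"only {len(spec.edge_cases)} of {MIN_EDGE_CASES} edge cases listed"
        )
    return problems

def first_pass_rate(results: list[bool]) -> float:
    """Share of generated changes whose tests passed with no manual edits."""
    if not results:
        raise ValueError("no results recorded")
    return sum(results) / len(results)

thin = Spec(user_story="As a user, I want to reset my password.")
print(missing_detail(thin))
print(first_pass_rate([True, False, True, True]))  # 3 of 4 passed: 0.75
```

Running a check like `missing_detail` before handing a spec to an agent turns "sufficiently clear" from a judgment call into something a team can agree on and automate.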
Any changes or realizations that emerge during development are reflected in the specs immediately. This "continuous spec updating" is another core pillar of AI-SDD. If you modify only the code and leave the specs untouched, the next time AI generates code it will reference outdated specs and inconsistencies will appear. Specs must remain the project's Single Source of Truth at all times.
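A lightweight guard against code drifting away from its spec is to flag any change set that touches code but no spec file. The sketch below assumes a hypothetical repository layout with specs under `specs/` and code under `src/`; how the list of changed paths is obtained (for example from version control) is left out.

```python
from pathlib import PurePosixPath

SPEC_DIR = "specs"  # hypothetical location of spec documents
CODE_DIR = "src"    # hypothetical location of implementation code

def code_changed_without_spec(changed_paths: list[str]) -> bool:
    """True when a change set edits code but leaves every spec untouched."""
    parts = [PurePosixPath(path).parts for path in changed_paths]
    touched_code = any(p and p[0] == CODE_DIR for p in parts)
    touched_spec = any(p and p[0] == SPEC_DIR for p in parts)
    return touched_code and not touched_spec

print(code_changed_without_spec(["src/pause.py"]))                    # True
print(code_changed_without_spec(["src/pause.py", "specs/pause.md"]))  # False
print(code_changed_without_spec(["README.md"]))                       # False
```

A check this coarse will produce false alarms, such as a pure refactor that needs no spec change, so it works better as a review prompt than as a hard gate.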
A Bird's-Eye View of the Four Methodologies — What Drives What?
At this point I want to revisit the relationship between TDD, BDD, SDD, and AI-SDD.
In TDD, a developer writes test code, and that test drives implementation. The focus is on whether code is technically correct, and the verification cycle runs in minutes to tens of minutes through Red → Green → Refactor. It depends heavily on individual engineering skill, and the connection to the business side is thin.
In BDD, the entire team — including product owners and QA — writes Gherkin scenarios, and those behavioral definitions drive implementation. The focus is whether the system behaves as users expect, with alignment to business value at the center of verification. It has a strong dimension as a communication tool and is well-suited for team development.
In SDD, the spec document drives the entire development process. Every step from requirements definition through testing and deployment circles back to the spec. The difference from BDD is that SDD specs are not limited to Gherkin scenarios — they comprehensively cover API definitions, data models, and non-functional requirements.
In AI-SDD, the spec document drives both humans and AI. Humans read the spec to make decisions; AI reads the spec to generate code. AI participates in the creation of specs, but final authorization rests with humans. This division of labor — humans decide What, AI executes How — is the essence of AI-SDD.
My commitment to this methodology comes from direct experience. TIMEWELL provides AI-powered business support services, and building products with AI ourselves has made one fact undeniably clear: AI output quality is bounded by input quality. No matter how powerful an LLM you use, vague specs produce vague output. Conversely, with solid specs in place, AI generates code of remarkable fidelity and quality. The key to unlocking AI's capabilities lies not in model performance, but in spec precision.
Making AI Spec-Driven Development the Common Language of Development Teams
Finally, I want to be candid about why I want to spread the concept of AI Spec-Driven Development.
As of 2026, AI development tools are evolving at a blistering pace. As Cursor's "the best tool changed seven times in a year" story illustrates, tools keep turning over. But something remains constant regardless of which tool you use. That something is specs. Whether you use Cursor, Copilot, or Claude Code, good specs produce good code. Bad specs produce bad code no matter which tool you pick.
Nikkei Cross Tech expects 2026 to be the year AI begins to cover every phase of the development lifecycle [4]. That is precisely why now is the moment to shift focus from "how to use AI" to "what to tell AI." Just as TDD changed how we write tests and BDD changed how teams communicate, AI-SDD changes how we write specs. And I am convinced that writing specs is the most fundamental skill in collaborating with AI.
The day when AI Spec-Driven Development becomes the common language of development teams is not far off.
References
[1] Veriserve. "What is Behavior-Driven Development (BDD)? A Comprehensive Guide from Differences with TDD to Gherkin Notation." https://www.veriserve.co.jp/helloqualityworld/media/20251008001/ (2025-10-08)
[2] KDDI Tech note. "Specs Lead, AI Implements — Practical Enterprise Development with Spec-Driven Development (SDD)." https://tech-note.kddi.com/n/ne68f4f243f19 (2025)
[3] TestCollab. "From Vibe Coding to Spec-Driven Development." https://testcollab.com/blog/from-vibe-coding-to-spec-driven-development (2025)
[4] Nikkei Cross Tech. "AI-Driven System Development Moves Toward Full Automation: The Revaluation of Engineering." https://xtech.nikkei.com/atcl/nxt/column/18/03401/120200007/ (2025-12)
[5] Findy. "Engineering Hiring and Org-Building Trend Forecast 2026." https://note.com/yuichiro826/n/n6338ef1df6b5 (2025)
