
Mastering Claude Code Sub-agents | Practical Patterns for Parallel Task Dispatch and Multiplied Coding Speed [2026 Edition]

2026-04-24 · Hamamoto

A practical guide to getting the most out of Claude Code Sub-agents. We cover how to pick between Explore, Plan, and general-purpose, how the Agent tool dispatches work in parallel, and three concrete invocation patterns, all based on the latest information as of April 2026.


Hello, this is Hamamoto from TIMEWELL.

This is the fifth installment in the Claude Code series, and the theme is Sub-agents. Having written about skills and configuration files, the last frontier I have left to cover is this territory of "running Claude in parallel." Get this right, and a single-purpose coding AI turns into a small development team on the spot.

Up through last year, the dominant approach was a brute-force one: opening several terminals and juggling multiple sessions by hand. Since the start of 2026, automatic dispatch via the Agent tool has stabilized, and secondary tasks like code exploration, test execution, and documentation generation can now be cleanly separated from the parent session. What you actually feel is that while a sub-agent is off reading a book-sized codebase, your own chat window stays clean. That difference matters more than it sounds.

In this article, I will work through the basics of Sub-agents based on the official specification as of April 2026, explain how to use Explore, Plan, and Code in combination, walk through three real-world patterns, and close with cost management. Reading it alongside a summary of Claude Code skills should give you the full picture of the .claude/ folder.

The Basics of Sub-agents and the Three Built-in Agents

Sub-agents are child agents that the parent Claude Code session spawns on demand. Each one has its own independent 200K-token context window, its own system prompt, and a tightly scoped set of tool permissions. The parent calls them when it decides "this task should be delegated," and when they finish they return only a summary back to the parent. The biggest upside is that logs and search results never flood back into the parent's conversation window.

Anthropic ships three built-in Sub-agents. Explore is a read-only high-speed investigator that runs on Haiku and handles codebase search and structural understanding. Plan is the researcher invoked during Plan Mode. It also has write permissions stripped out and specializes in the research phase that precedes plan drafting. And general-purpose is the all-rounder with full tool access, assigned to complex tasks where exploration and modification both need to happen. The parent matches the user's request against each sub-agent's description and decides which child to hand off to automatically.

One thing worth noting here is that in version 2.1.63 released in October 2025, the Task tool was renamed to the Agent tool. The old Task(worker) notation you find in older articles and configuration files still works as a compatibility alias, but the official notation for new code is Agent(worker). The docs have fully migrated to Agent already, so when someone on your team writes a sub-agent, this is the only rename to keep in mind.

Another important spec: sub-agents cannot spawn further sub-agents. This is a hard limit to prevent infinite nesting, so scenarios like "a Plan agent calling Explore inside itself" simply cannot happen. When you design your parallelization, you have to assume this one-level restriction from the start.


Writing Custom Sub-agents and YAML Frontmatter

The three built-ins cover fewer situations than you might expect: a code reviewer aligned with your team's coding standards, an implementer who knows a specific framework inside out, a documenter who writes release notes. Defining these role-fixed agents means the parent no longer has to write long instructions every time.

Custom Sub-agents are Markdown files placed in .claude/agents/ (project scope) or ~/.claude/agents/ (user scope). The YAML frontmatter at the top of the file declares the metadata, and the body of the file becomes the system prompt verbatim. The structure is straightforward.

---
name: code-reviewer
description: Seasoned reviewer automatically invoked after code changes, flagging quality, security, and convention issues
tools: Read, Grep, Glob, Bash
model: sonnet
memory: project
---

You are a senior code reviewer. Read the diff and return specific, actionable feedback from the perspectives of quality, security, and best practices.

The only required fields are name and description. If you omit tools, the sub-agent inherits everything the parent has; if you specify it, it is limited strictly to what you list. Narrowing aggressively here is the mark of a professional. A reviewer that only needs Read, Grep, and Glob has no business holding write permissions. A sub-agent with broad privileges is more likely to cause unintended file changes, and the resulting debugging is painful.

The model can be switched per sub-agent. Specifying haiku gives you a fast, low-cost read-only worker; writing opus or a full model ID like claude-opus-4-7 targets complex reasoning. You can optimize to the role. Simply moving every investigation-type sub-agent to Haiku cuts monthly token consumption noticeably.
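As a concrete illustration, here is a minimal sketch of frontmatter that pins a stronger model to a design-focused role while investigators stay on Haiku. The agent name and description are hypothetical; the name, description, tools, and model fields follow the structure of the code-reviewer example above.

```yaml
---
name: api-designer
description: Drafts API designs and trade-off analyses for proposed features; read-only, invoked during planning
tools: Read, Grep, Glob
model: opus
---
```

As in the earlier code-reviewer example, the Markdown body below the frontmatter would carry the system prompt.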

A surprisingly common trap is how you phrase description. According to Anthropic's engineering blog, the parent agent uses sub-agent descriptions as routing hints. Vague phrasing like description: front-end stuff degrades routing accuracy, while something concrete like description: inspects React Server Component boundaries and the placement of 'use client' gets invoked much more often. The trick is to make the name and description job-shaped, meaning the role is obvious at a glance.

The Explore-Plan-Code Pattern and Three Practical Invocation Examples

Sub-agents truly shine when a large task is broken down into "investigate," "plan," and "write," each sealed into its own context. This is the Explore-Plan-Code pattern that Anthropic recommends in its official best practices, and once I started consciously following this three-stage split, Claude Code's output stabilized noticeably.

Pattern 1: Parallel Explore + Central Plan + Sequential Code

This is for the investigation phase before a larger feature change, where you run two to four read-only Explore agents in parallel. For a request like "I want to migrate the authentication stack to OIDC," the parent dispatches multiple Explore agents via the Agent tool, for example:

  • Explore A: enumerate existing auth-related files and entry points
  • Explore B: audit test coverage and mocking status
  • Explore C: check CI settings, environment variables, and dependency library versions

Each of these runs in its own independent 200K context, and only the summaries come back to the parent. The parent bundles those summaries, hands them to a Plan sub-agent, and has the design document drafted in think mode. Finally, general-purpose writes the code. Because investigation and implementation contexts never contaminate each other, the parent's chat window stays clean throughout.
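Conceptually, the parent's dispatch in this pattern can be sketched as follows. This is pseudocode: the Agent(...) notation follows the rename noted earlier, and the prompts are illustrative.

```
Agent(Explore): "Enumerate existing auth-related files and entry points."
Agent(Explore): "Audit test coverage and mocking status around auth."
Agent(Explore): "Check CI settings, env vars, and dependency versions."
        |  (three independent 200K contexts; only summaries return)
        v
Agent(Plan): "From the three summaries, draft the OIDC migration plan."
        v
Agent(general-purpose): "Implement step 1 of the approved plan."
```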

Pattern 2: Destructive Experiments Using isolation worktree

When you specify isolation: worktree, that sub-agent runs inside a temporary git worktree. The parent's working directory is untouched no matter what, which suits refactoring experiments or major library version bumps where you want the freedom to roll back after a failure.

---
name: migration-explorer
description: Experimentally performs major version upgrades of dependencies and reports diffs and impact scope
tools: Read, Edit, Write, Bash, Grep, Glob
isolation: worktree
model: sonnet
---

If the sub-agent leaves no changes behind, the worktree itself is auto-deleted. Being able to say "let's just try it, and throw it away if it does not work" in a physically safe way genuinely lowers the psychological cost of experimentation.
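For intuition, the mechanics resemble plain git worktree usage. The sketch below is an assumption about what happens under the hood, not Claude Code's actual implementation:

```shell
# Create a throwaway repo, then a disposable worktree beside it.
git init -q demo
git -C demo -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "init"

# The sub-agent would run its destructive experiment inside this directory...
git -C demo worktree add ../demo-experiment

# ...and when nothing worth keeping remains, the worktree is removed again.
git -C demo worktree remove ../demo-experiment
```

The parent repository in demo/ is never touched, which is exactly the guarantee the isolation setting is buying you.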

Pattern 3: A code-reviewer Agent with Auto-dispatch

This is the pattern of keeping a single specialized code-reviewer agent on staff, automatically triggered before a commit or a PR. Specifying memory: project has it accumulate review patterns and recurring issues under ~/.claude/agent-memory/, and the more you use it the sharper its feedback becomes.

In my own environment, I include a phrase like "invoked automatically before pushing" in the description, which gets it called whenever the parent detects a code change. The reviewer returns only feedback bucketed into three severity levels: critical, warning, and suggestion. Since implementation sub-agent logs do not bleed in, you get both review precision and parent-session readability.

Combining all three patterns means that a single /feat-style request can end up running six agents: Explore x3, Plan x1, Reviewer x1, Coder x1. It might look like overkill, but each sub-agent has a crisp role and a tight maxTurns, so runaway behavior is rare.

How Sub-agents Differ From Agent Teams, and How to Choose

Sub-agents are often confused with Agent Teams, which arrived in 2026 as an experimental feature. The official documentation treats them as distinctly separate, and Agent Teams requires the CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 flag to enable. At first glance both look like "running things in parallel," but their design philosophies differ.
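Opting in is just a matter of setting that flag when launching Claude Code. The flag name comes from the article's description of the official docs; the launch form is a sketch:

```shell
# Opt in to the experimental Agent Teams feature for this session only.
CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 claude
```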

Sub-agents form a parent-child relationship inside a single session, and the children do not directly recognize each other. Results always return to the parent, with the children fanning out independently. You could equally call this an orchestrator-worker pattern, and it overlaps with the multi-agent research system that Anthropic has published. It suits one-shot investigations and parallel implementation, and it responds quickly.

Agent Teams, by contrast, is a model where multiple Claude Code sessions coordinate through a shared task list. Member agents can inspect the progress of other members in real time, declare dependencies, and avoid collisions as work proceeds. In implementation, each member runs in its own process. Anthropic emphasizes that "unlike simply running in multiple terminals in parallel, the members recognize each other," and this is exactly the point.

My decision axis for picking one over the other is simple: "is this a short one-shot job, or coordinated work that spans days?" If I want to wrap a feature addition in 30 minutes first thing in the morning, Sub-agents is enough. For a refactoring project that spans weeks, Agent Teams. According to Finout's 2026 pricing guide, Agent Teams uses roughly 15x the tokens of normal usage, while Sub-agents use 4-7x. The cost side also lines up with this split.
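To make the cost gap concrete, here is a back-of-envelope sketch using the multipliers cited above. The base token count and task volume are hypothetical numbers for illustration, not measurements:

```python
def monthly_tokens(base_per_task: int, tasks: int, multiplier: float) -> int:
    """Estimated monthly tokens for a workflow with a given overhead multiplier."""
    return round(base_per_task * tasks * multiplier)

BASE = 50_000   # hypothetical tokens per task in a plain single session
TASKS = 100     # hypothetical tasks per month

solo      = monthly_tokens(BASE, TASKS, 1.0)   # baseline
subagents = monthly_tokens(BASE, TASKS, 5.5)   # midpoint of the 4-7x range
teams     = monthly_tokens(BASE, TASKS, 15.0)  # Agent Teams estimate

print(f"solo: {solo:,}  sub-agents: {subagents:,}  teams: {teams:,}")
# solo: 5,000,000  sub-agents: 27,500,000  teams: 75,000,000
```

Even with generous assumptions, the roughly threefold gap between Sub-agents and Agent Teams is what makes the "one-shot versus multi-day" split a sensible default.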

They can be combined, incidentally. You can have each Agent Teams member spawn its own Sub-agents, producing a two-tier design where the top tier coordinates and the lower tier fans out. Running this two-tier structure in production is heavy, so I would not aim for it on day one.

Cost Management and Operational Checkpoints

"Powerful but token-hungry" is the fate of parallel agents. Unless you bake countermeasures into your process, you end up surprised at the month-end bill. Here are four operational rules that have worked on the ground.

The first is to aggressively push investigation work onto Haiku. The fact that Explore defaults to Haiku by design reflects this thinking. Setting model: haiku on your custom investigation sub-agents significantly reduces per-token cost. Meanwhile, sub-agents that sit in the Plan position and handle decisions and design should lean on Sonnet or Opus. Splitting work across models buys you a better balance of quality and cost.

The second is granting tools permissions from the minimum upward. Start with tools: Read only, and add more when you actually hit a gap. Do not casually include Write or Edit, and hand over Bash only for narrowly scoped needs. A permissive sub-agent is more likely to cause unintended token consumption (for instance, hammering git status or leaving tests running indefinitely).

The third is setting maxTurns explicitly. If you leave it blank, it inherits the parent session's settings, which makes runaway behavior harder to notice. Capping investigation at 5-8 turns and implementation at around 15 limits any damage during failures to a few hundred tokens at most.
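Pulling the first three rules together, a cost-conscious investigation agent might be declared like this. The agent name is hypothetical, and note that maxTurns as a frontmatter field is taken from this article's description, so it may differ in your Claude Code version:

```yaml
---
name: log-inspector
description: Read-only investigator that inspects recent CI logs and summarizes failure causes in ten bullets or fewer
tools: Read, Grep
model: haiku
maxTurns: 8
---
```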

The fourth is writing your retrospectives into CLAUDE.md. Briefly recording your go-to sub-agent combinations, dispatch decision rules, and failure patterns makes subsequent sessions smarter. I also touched on this in a piece surveying the entire .claude folder, but I believe how you cultivate memory is the single largest variable in Claude Code satisfaction.

As an aside, in my environment I have split code review and security checks into separate sub-agents, then bundled them into /review and /security slash commands following the philosophy of the superpowers plugin. Running these two first thing in the morning cuts down drastically on overlooked review perspectives. When you shift work into the tooling layer, humans free up time for judgment and decision-making.

How Parallel Sub-agents Reshape Team Development

What makes Claude Code interesting is that it is shifting from "a productivity tool for individuals" into "an environment for simulated team development." Sub-agents are more than a parallelization feature. They demand that you break down your own development process into role assignments and manage it. Coding time decreases, and time thinking about agent design increases. Opinions on that trade-off will differ, but I welcome it.

In our client work, the AI consultants on WARP help companies build internal standards for Claude Code as part of their DX initiatives. A typical engagement bundles the CLAUDE.md template, a sub-agent library, and permission and cost guardrails, all scoped to a three- to six-month rollout. For organizations that want to standardize their AI development environment while protecting internal knowledge, we often combine this with our enterprise GraphRAG platform ZEROCK, so that internal policies and design patterns are always accessible to the AI.

Looking at AI development tools heading into later 2026, I expect parallelism to keep increasing. Designing each Sub-agent deliberately, naming and describing them so they get called automatically. Building this muscle now means that when another large shift arrives next year, you will be able to absorb it without panic. Start by writing a single code-reviewer into .claude/agents/. Even that alone changes the view.


References

[^1]: Anthropic, "Create custom subagents," Claude Code Docs. https://code.claude.com/docs/en/sub-agents
[^2]: Anthropic, "How and when to use subagents in Claude Code," Claude Official Blog. https://claude.com/blog/subagents-in-claude-code
[^3]: Anthropic, "Orchestrate teams of Claude Code sessions," Claude Code Docs. https://code.claude.com/docs/en/agent-teams
[^4]: Anthropic, "How we built our multi-agent research system," Anthropic Engineering. https://www.anthropic.com/engineering/multi-agent-research-system
[^5]: claudefa.st, "Claude Code Sub-Agents: Parallel vs Sequential Patterns." https://claudefa.st/blog/guide/agents/sub-agent-best-practices
[^6]: Finout, "Claude Code Pricing 2026: Complete Plans & Cost Guide." https://www.finout.io/blog/claude-code-pricing-2026
