Claude Code Agent Teams: The Complete Guide to AI Agents Collaborating, Reviewing Each Other, and Raising the Bar on Quality
Ryuta Hamamoto, TIMEWELL.
On February 5, 2026, Anthropic released "Agent Teams" alongside Claude Opus 4.6 — a new feature with the potential to fundamentally transform how Claude Code is used. Until now, AI agents have operated on a "one agent, one task" model. Agent Teams changes that: multiple Claude sessions work together as a team, progressing through work by actively communicating with one another.
When I first tried the feature, what struck me immediately was watching AI agents debate with each other. The research agent gathered data. The analysis agent asked, "What are the assumptions behind these numbers?" The review agent pushed back: "Isn't that assumption too optimistic?" It was essentially a human team meeting — happening entirely in the world of AI.
This article covers everything: how Agent Teams works, how to set it up, practical usage patterns, and most importantly, what actually changes about output quality.
What You'll Learn
- The core concept behind Agent Teams and how it fundamentally differs from traditional sub-agents
- Step-by-step setup instructions (reproducible even for non-engineers)
- How mutual review between agents drives quality improvements
- Five practical orchestration patterns you can use today
- The real-world constraints — including cost — you need to understand
1. What Is Agent Teams? From "Disposable" to "Team"
Core Structure
Agent Teams is a mechanism within Claude Code that coordinates multiple Claude instances in parallel as a team. It has four components:
| Component | Role |
|---|---|
| Team Lead | The main Claude session. Creates the team, manages members, and oversees the work |
| Teammates | Independent Claude instances each handling specific tasks, with defined specializations |
| Task List | A shared work queue across the entire team. Members autonomously pick up and complete tasks |
| Mailbox | The messaging system between agents. Supports both direct messages and team-wide broadcasts |
How It Differs from Traditional Sub-Agents (Task Tool)
Understanding Agent Teams requires knowing how it differs from the existing sub-agent approach.
| Attribute | Sub-Agents (traditional) | Agent Teams (new) |
|---|---|---|
| Lifespan | Terminated after task completes (disposable) | Persists until explicitly shut down |
| Communication | Reports only to the main agent | Members can communicate directly with each other |
| Coordination | Main agent manages everything | Autonomous task distribution via shared task list |
| Context Retention | Resets with each task | Context is maintained throughout the session |
| Correction Instructions | Requires full restart from scratch | Corrections can be sent directly to the same agent |
| Best Used For | Focused tasks where only the result matters | Discussion, review, and iterative improvement |
The critical difference is being able to send correction instructions directly. With traditional sub-agents, once a task ends the agent literally disappears — if you want to say "fix this part," you have to launch a new agent and explain everything from the beginning. With Agent Teams, you can send corrections directly to the same team member. In any work where quality matters, this is a massive difference.
2. Setup: Agent Teams in 10 Minutes
Agent Teams is a Research Preview (experimental) feature and is disabled by default. Here's how to enable it.
Step 1: Install tmux (Recommended)
Agent Teams can assign each teammate a dedicated terminal pane. This split view requires tmux.
```shell
# macOS
brew install tmux

# Verify installation
tmux -V
```
Step 2: Edit the Configuration File
Add the following to ~/.claude/settings.json (global settings):
```json
{
  "env": {
    "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1"
  },
  "teammateMode": "tmux"
}
```
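If you already have a `~/.claude/settings.json`, merge the new keys rather than overwriting the file. A minimal sketch of the merge logic in Python (the keys come from the snippet above; the `"model"` key in the example is just a placeholder for whatever settings you already have):

```python
import json

def enable_agent_teams(settings: dict) -> dict:
    """Return a copy of settings with Agent Teams enabled in tmux mode."""
    merged = dict(settings)
    env = dict(merged.get("env", {}))  # preserve any existing env vars
    env["CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS"] = "1"
    merged["env"] = env
    merged["teammateMode"] = "tmux"
    return merged

# Existing settings keep their other keys untouched
existing = {"model": "opus", "env": {"FOO": "bar"}}
print(json.dumps(enable_agent_teams(existing), indent=2))
```

Read the file, pass the parsed dict through this function, and write the result back.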
There are three display modes to choose from:
| Mode | Description | Requirements |
|---|---|---|
"auto" (default) |
Split panes inside tmux; in-process otherwise | None |
"in-process" |
All members run within the main terminal | None |
"tmux" |
Each member gets a dedicated tmux pane | tmux required |
Note: Split panes don't display correctly in VS Code's integrated terminal, Windows Terminal, or Ghostty. If you're using tmux mode, launch from macOS Terminal or iTerm2.
Step 3: Create a tmux Session and Launch Claude Code
```shell
# Navigate to your project directory
cd your-project

# Create a tmux session
tmux new -s my-team

# Launch Claude Code
claude
```
Step 4: Enter a Prompt to Start the Team
```
Create a team of 3 agents:
- researcher: handles market research
- analyst: handles data analysis
- reviewer: handles quality review

Create a market analysis report in docs/analysis.md.
```
That's it. Claude Code becomes the team lead, launches the three teammates, and begins parallel work. At first the lead organizes its thinking alone, but before long tmux panes split and you can watch multiple agents working simultaneously.
3. How Quality Actually Changes — The Real Value of Agent Teams
The true value of Agent Teams isn't that things get faster in parallel. It's that output quality improves through dialogue between agents.
3-1. How Mutual Review Raises Quality
With a single AI, asking it to "critically review your own answer" doesn't work well — the agent is biased toward its own perspective and rarely challenges its own foundational assumptions.
With Agent Teams, a separate Claude instance with a completely independent context performs the review. This enables:
- Assumption verification: The reviewer asks "what's your basis for that?" when the analyst takes something for granted
- Independent research-backed challenges: The reviewer conducts its own web searches, finds industry benchmarks and competitor data, and uses them to validate claims
- Automated correction cycles: The loop of critique → revision → re-review runs continuously without human intervention
In one real example, the agent handling financial analysis used a 15% gross margin assumption. The reviewer flagged it: "This assumption may diverge from historical actuals" and "the repeat rate target looks optimistic against industry benchmarks" — and notably, the reviewer had done its own web search to pull that industry data before making the call. The result: additional conservative scenarios, explicit documentation of assumptions, and a dramatically more credible analysis.
3-2. CONDITIONAL-GO: Staged Quality Gates
Agent Teams naturally produces a staged review pattern like this:
| Verdict | Meaning |
|---|---|
| GO | No issues — proceed as-is |
| CONDITIONAL-GO | Conditional approval. Approved once Must Fix items are resolved |
| NO-GO | Fundamental problem. The approach needs rethinking |
The power here is that the reviewer can send feedback directly to whoever handles the revision. The critique → fix → re-review loop that previously required human intermediaries now runs autonomously between agents.
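The gate logic itself is simple to model. A hypothetical sketch (the verdict names come from the table above; the `Review` type and `next_action` function are illustrative, not part of Claude Code's API):

```python
from dataclasses import dataclass, field

@dataclass
class Review:
    verdict: str                       # "GO" | "CONDITIONAL-GO" | "NO-GO"
    must_fix: list = field(default_factory=list)

def next_action(review: Review) -> str:
    """Decide what the team does after a review round."""
    if review.verdict == "GO":
        return "ship"
    if review.verdict == "CONDITIONAL-GO":
        # Must Fix items go straight back to the same agent, then re-review
        return "revise: " + ", ".join(review.must_fix)
    return "rethink approach"          # NO-GO: the plan itself is flawed

print(next_action(Review("CONDITIONAL-GO", ["document margin assumption"])))
```

The CONDITIONAL-GO branch is the one that matters: because the same teammate persists, the Must Fix list can be sent back to it directly instead of restarting from scratch.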
3-3. Three Quality Improvement Patterns
Here are the recurring patterns that consistently produce better output:
Pattern 1: Competing Hypothesis Testing
Multiple agents each form a different hypothesis, then try to validate or disprove each other's. Like a scientific debate, the team converges on the most defensible conclusion.

Pattern 2: Specialized Layered Review
Agents with distinct areas of focus — security, performance, test coverage — each review from their own perspective, catching blind spots that any single reviewer would miss.

Pattern 3: Pipeline-Based Incremental Quality
Research → Analysis → Strategy → Review. Each phase validates and builds on the previous phase's output; task dependencies create a natural sequencing mechanism.
4. Messaging: write vs. broadcast
There are two types of communication between teammates.
write (Direct Message)
Send a message directly to a specific teammate. Used for one-to-one exchanges — like passing data from the researcher to the analyst.
broadcast (Team-Wide Notification)
Send a message to all teammates simultaneously. Note: token costs scale with team size, so use this sparingly. Reserve it for situations where everyone genuinely needs to know immediately — critical direction changes, urgent problem discoveries.
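The cost difference is easy to see with back-of-the-envelope arithmetic (the token counts here are illustrative, not measured):

```python
def delivery_tokens(message_tokens: int, recipients: int) -> int:
    """Each recipient's context absorbs its own copy of the message."""
    return message_tokens * recipients

team_size = 5
msg = 400  # tokens in one message (illustrative)

write_cost = delivery_tokens(msg, 1)              # write: one reader
broadcast_cost = delivery_tokens(msg, team_size)  # broadcast: every teammate reads it
print(write_cost, broadcast_cost)  # 400 vs 2000
```

A broadcast to a five-agent team costs roughly five times what a direct message does, which is why write should be the default.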
Message Types
The system also uses several internal message types:
| Message Type | Purpose |
|---|---|
| Plain text | General dialogue between agents |
| `shutdown_request` | Leader requests a member to terminate |
| `idle_notification` | Member signals it is idle and awaiting new work |
| `task_completed` | Task completion notification |
| `plan_approval_request` | Member in plan mode requests leader approval |
5. Five Practical Orchestration Patterns
Pattern 1: Parallel Expert Review
```
Create a team of 3 agents to review PR #142:
- Security review specialist
- Performance impact reviewer
- Test coverage verifier
Each should surface issues from their specific perspective.
```
Pattern 2: Research → Analysis Pipeline
```
Create a team to conduct a market analysis:
1. researcher: collect industry data first
2. analyst: run quantitative analysis based on researcher's data
3. strategist: design strategic options from the analysis
4. red-team: critically review the whole output
Set dependencies and progress in order.
```
Pattern 3: Competing Hypothesis Debugging
```
Users are reporting WebSocket connections dropping after one message.
Create a team of 5 agents, each investigating a different hypothesis.
Validate and challenge each other's hypotheses, then identify the most likely root cause.
```
Pattern 4: Plan-Approval Refactoring
Teammates can be required to submit a plan before implementing. This ensures leader approval before any changes are made, preventing unintended modifications.
```
Create a team to refactor the authentication module.
Each member must submit a plan and receive approval before beginning implementation.
```
Pattern 5: New Product Planning
```
Create a team of 3 agents and begin.
Create a new product proposal in docs/product-plan.md.
Make reasonable assumptions where information is missing, and document all assumptions at the top.
Share progress every 5 minutes and converge on a single proposal within 30 minutes.
```
6. Constraints and Considerations
Agent Teams is still in Research Preview. Before putting it into production workflows, understand these limitations.
Cost
Each teammate is an independent Claude instance, so API costs scale roughly linearly with team size. In one reported internal test at Anthropic, a 16-agent parallel run on a large project reached approximately $20,000 in API costs. Start small, with teams of 2–3 agents.
File Conflicts
If multiple teammates edit the same file simultaneously, overwrites can happen. When designing tasks, clearly partition which files each member owns — the same thinking as branching in human team development.
Technical Constraints
| Constraint | Details |
|---|---|
| No session restoration | Teammates are not restored with /resume or /rewind |
| No nested teams | Teammates cannot create their own sub-teams |
| Fixed team lead | The team leader cannot be transferred |
| One team per session | Multiple concurrent teams are not supported |
| Heartbeat | Members inactive for 5 minutes are automatically marked idle |
When to Use Each Approach
Agent Teams isn't the right tool for everything. Use this to decide:
Agent Teams is the right choice when:
- The work requires discussion or review between members
- Multiple perspectives directly affect output quality
- You anticipate a cycle of critique → revision → re-review
Traditional sub-agents are the right choice when:
- The work is focused and only the result matters — no discussion needed
- Tasks involve heavy edits to shared files (high conflict risk)
- You need to minimize token costs
7. Keyboard Shortcuts
Useful shortcuts for working with Agent Teams:
| Shortcut | Action |
|---|---|
| `Shift+↑/↓` | Select a teammate (in-process mode) |
| `Enter` | Show the selected teammate's session |
| `Escape` | Interrupt the current teammate's turn |
| `Ctrl+T` | Toggle the task list display |
| `Shift+Tab` | Switch to delegate mode (prevents the leader from implementing directly) |
8. Enterprise Considerations and the Road Ahead
Agent Teams is powerful, but there are important considerations for serious enterprise adoption.
Security and Governance: Teammates inherit the leader's permission settings. If the leader has file system access, every teammate does too. For projects involving sensitive information, permission scopes need to be set carefully.
Quality Hooks: Agent Teams provides two hook events — TeammateIdle and TaskCompleted. These can be used to build custom quality gates: automatically running tests when a task completes, or rejecting outputs that don't meet defined standards.
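As a sketch, a quality gate that runs the test suite whenever a task completes might be wired up like this in `settings.json`. This assumes the `TaskCompleted` event uses the same hooks shape as Claude Code's other hook events, and `npm test` stands in for whatever check your project uses — verify the exact schema against the current documentation:

```json
{
  "hooks": {
    "TaskCompleted": [
      {
        "hooks": [
          { "type": "command", "command": "npm test" }
        ]
      }
    ]
  }
}
```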
Cost Management: You can specify which model each teammate uses (Opus / Sonnet / Haiku). Assigning Opus to review roles and Sonnet to research roles, for example, lets you match model capability to task importance and optimize costs accordingly.
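The savings from mixing models are easy to estimate. A rough sketch — the per-million-token prices below are placeholders for illustration, not Anthropic's actual rates:

```python
# Illustrative placeholder prices in $ per 1M tokens — not actual rates
PRICE = {"opus": 15.0, "sonnet": 3.0, "haiku": 0.8}

def run_cost(assignments: dict, tokens_per_agent: int) -> float:
    """Estimate a run's cost given each teammate's assigned model."""
    return sum(PRICE[m] * tokens_per_agent / 1_000_000 for m in assignments.values())

all_opus = {"researcher": "opus", "analyst": "opus", "reviewer": "opus"}
mixed    = {"researcher": "sonnet", "analyst": "sonnet", "reviewer": "opus"}

print(run_cost(all_opus, 2_000_000), run_cost(mixed, 2_000_000))  # 90.0 vs 42.0
```

Under these assumed prices, reserving the strongest model for the review role cuts the run cost by more than half while keeping the quality gate strong.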
For running AI agents at this enterprise scale — safely and efficiently — TIMEWELL offers ZEROCK. ZEROCK is an enterprise AI platform built with the foundations you need: GraphRAG-powered knowledge control, data management on domestic AWS servers, and a prompt library. As you bring cutting-edge capabilities like Agent Teams into business operations, ZEROCK provides the secure foundation to do it on.
Summary
The most significant shift Agent Teams brings is AI quality control being performed by AI itself.
- Mutual review: Independent-context agents validate each other's assumptions
- Automated correction cycles: Critique → revision → re-review loops run without human intervention
- Staged quality gates: GO / CONDITIONAL-GO / NO-GO verdicts emerge naturally
- Competing hypothesis testing: Multiple perspectives converge on the most defensible conclusion
Human roles shift from "worker" to "final decision-maker." AI teams handle analysis and quality management; humans make decisions from the options AI surfaces. With Agent Teams, this division of labor is finally becoming genuinely practical.
This is still Research Preview — but the direction is clear. From the era of single AI agents working alone, to the era of AI agents working as teams. Start with a small task, try it, and feel the difference in quality for yourself.
