
Automating Code Review with Claude Code: A Practical Guide to Superpowers and Hooks [2026 Edition]

2026-04-24 · Ryuta Hamamoto

A practical walkthrough of automating AI-assisted code review by combining Claude Code's code-reviewer agent, Superpowers' requesting-code-review, Hooks, and /security-review. Covers the latest capabilities and operational design as of April 2026.


Hello, this is Hamamoto from TIMEWELL.

Ten PRs piled up going into the weekend; more PRs landing before the previous ones are reviewed. Every engineering team has lived this. And in 2026, with the volume of AI-written code spiking, human reviewers have become an even more obvious bottleneck.

Following Claude Code's updates, I get the sense that very concrete answers to this problem are finally lining up. The official Code Review plugin, the community-built Superpowers, Hooks, and /security-review. Combine these four and you can realistically build a setup where AI handles more than half of the review work and humans hold the final decision.

This piece walks through what each of those does and then shows two concrete workflows you can actually run. If you're already letting Claude write your PRs, treat this as preparation for the "how do we review all this?" problem that is inevitably coming next.

What is Anthropic's official Code Review plugin actually doing?

In April 2026, Anthropic released the code review infrastructure it had been using internally[^1]. Install the code-review plugin on a repository and, every time a PR opens, every time there is a push, or on demand, multiple specialized agents run in parallel on Anthropic's infrastructure.

The angles are split across agents. One chases logic errors, another hunts security vulnerabilities, another looks for edge-case breakage, and another compares against past commits to surface regressions. The results are not dumped straight into comments — they go through a verification step that cross-checks each candidate issue against actual code behavior and filters out false positives. That one extra stage is why the output avoids the tell-tale "AI-sounding, generic" review comments[^2].

What I particularly like is that Anthropic explicitly says "we do not auto-approve." Comments land on the PR, but the merge decision stays with the human reviewer. This sounds obvious, but AI review products skip it surprisingly often. Fully automating approval creates a different risk: humans defer to the AI's conclusions and glance at the surface before merging. Anthropic clearly wanted to avoid that. In practice, teams that preserve final human judgment report fewer "it passed review but broke production" incidents compared to teams that let AI merge unilaterally.

One note on cost. Review pricing scales with PR size and complexity, and the average PR finishes in about 20 minutes[^1]. It is only available on Team and Enterprise plans, but at a few minutes of compute per PR, it is comfortably cheaper than an hour of a human reviewer's time. And because the review runs on Anthropic's infrastructure rather than your CI runner, you don't pay twice for compute and build minutes.


Where Superpowers' requesting-code-review shines

The official plugin handles macro review at the PR level. The requesting-code-review skill in obra/superpowers does micro review at the task level[^3]. The two are not competitors — they are complementary.

In the Superpowers workflow, you break development down ahead of time with the writing-plans skill, then implement each small task with a different subagent through subagent-driven-development[^4]. Each time a task finishes, requesting-code-review fires automatically. It checks the implementation against the original plan — has the spec drifted, are there security gaps, is readability holding up — and tags issues by severity. If anything is marked critical, progression to the next task is blocked.

Whether this "between-task review" is in the loop completely changes the end quality when you delegate a long piece of work to Claude. Without it, Claude tends to quietly reinterpret the spec mid-stream, and by the time you look up the output has drifted far from the original plan. Adding a small checkpoint between tasks course-corrects that drift early.

What I personally appreciate is that this skill also takes the less glamorous angles seriously — SOLID principles, readability, naming conventions. Review often gets framed around vulnerabilities and logic bugs, but in real work most of what rots a PR is "code that gets harder to read the more you stare at it." Having that caught automatically is genuinely helpful, and over weeks it noticeably lifts the baseline quality of everything the team ships.

Another practical benefit: because the review fires at each task boundary, the findings stay small and contextual. It is far easier to act on "this helper function violates single-responsibility, here's why" than on a 40-comment dump on a 2,000-line PR. The feedback loop is tight enough that fixes usually happen in the same session, not three days later in a separate follow-up PR.

I've also written about how Claude Code skills work — if you want to dig into why the "skill" abstraction is such a useful invention, that piece goes deeper.

Using PreToolUse and Stop hooks for different purposes

Anthropic's Code Review plugin and Superpowers are both strong, but on their own they tilt toward "review after the AI finishes writing." Stopping mistakes mid-write, or refusing to touch certain directories at all, needs a different mechanism. That's where Hooks come in.

Claude Code exposes 18 event hooks[^5]. PreToolUse fires immediately before a tool runs. Since v2.0.10, it can not only block execution but also rewrite the JSON input passed to the tool[^5]. You can force --dry-run onto a bash call that contains rm -rf, auto-prepend an internal prefix to git commit messages, or redact anything that looks like an API key. The granularity is fine enough for real operational needs.
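To make the blocking half concrete: a PreToolUse hook is just an executable that receives the tool call as JSON on stdin and can veto it by exiting with code 2, with stderr fed back to Claude. Here is a minimal Python sketch; the `DANGEROUS` deny-list is purely illustrative, not a recommended policy.

```python
import json
import sys

# Illustrative deny-list; a real policy would be more careful about
# word boundaries, quoting, and shell tricks.
DANGEROUS = ("rm -rf", "git push --force", "chmod 777")

def verdict(tool_name, tool_input):
    """Return a human-readable reason to block this tool call, or None."""
    if tool_name != "Bash":
        return None
    cmd = tool_input.get("command", "")
    for pattern in DANGEROUS:
        if pattern in cmd:
            return f"PreToolUse hook: '{pattern}' is not allowed here"
    return None

if __name__ == "__main__":
    raw = sys.stdin.read()                  # Claude Code pipes the event JSON in
    if raw.strip():
        event = json.loads(raw)
        reason = verdict(event.get("tool_name", ""),
                         event.get("tool_input", {}))
        if reason:
            print(reason, file=sys.stderr)  # stderr is shown back to Claude
            sys.exit(2)                     # exit code 2 = block the tool call
```

Wire it in under `PreToolUse` with a `Bash` matcher and every dangerous command is refused before it ever runs.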

Stop hooks fire at the moment Claude wraps a sequence of work and returns control to you. Drop an auto-review shell script there and, every time work finishes, a separate subagent can diff what changed and leave a short summary. An O'Reilly case study showed calling a subagent via claude-cli from a Stop hook to catch re-emergence of patterns banned in CLAUDE.md — the kind of thing lint can't detect[^6].
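A sketch of that Stop-hook idea, assuming a hypothetical `BANNED` list mirroring rules you might keep in CLAUDE.md. A Stop hook that prints a JSON `{"decision": "block"}` response asks Claude to keep working on the stated reason instead of stopping; the `stop_hook_active` check guards against re-trigger loops.

```python
import json
import subprocess
import sys

# Hypothetical ban list mirroring conventions from CLAUDE.md.
BANNED = ["console.log(", "except Exception: pass"]

def find_banned(diff_text, patterns):
    """Return banned patterns that appear on lines the diff adds."""
    added = [line[1:] for line in diff_text.splitlines()
             if line.startswith("+") and not line.startswith("+++")]
    return [p for p in patterns if any(p in line for line in added)]

if __name__ == "__main__":
    raw = sys.stdin.read()
    if raw.strip():
        event = json.loads(raw)
        if not event.get("stop_hook_active"):  # avoid infinite block loops
            diff = subprocess.run(["git", "diff"], capture_output=True,
                                  text=True).stdout
            hits = find_banned(diff, BANNED)
            if hits:
                # "block" asks Claude to keep working and fix the findings.
                print(json.dumps({
                    "decision": "block",
                    "reason": f"banned patterns reintroduced: {hits}",
                }))
```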

The rule of thumb is "use Hooks for mechanical checks, subagents for semantic judgment." Sending rules you could express in a linter or a test to AI review inflates cost and makes results unstable. Drop mechanical rules in PreToolUse first. Then send only semantic issues to AI review. That separation is the heart of the design.

A concrete example: one of our engagements had a recurring problem where engineers kept committing database migration files directly to main. A ten-line PreToolUse hook that refused writes to migrations/*.sql outside of a designated branch prefix eliminated the issue completely, at zero ongoing cost. That is the kind of problem where throwing AI at it would be wasteful.
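A guard along those lines can be sketched in a few lines of Python. The `migration/` branch prefix is a hypothetical team convention standing in for whatever our client actually used.

```python
import json
import re
import subprocess
import sys

# Hypothetical convention: migration files may only change on migration/* branches.
ALLOWED_BRANCH_PREFIX = "migration/"

def is_blocked(file_path, branch):
    """True if this write touches migrations/*.sql from a disallowed branch."""
    if re.fullmatch(r"migrations/.+\.sql", file_path or ""):
        return not branch.startswith(ALLOWED_BRANCH_PREFIX)
    return False

def current_branch():
    out = subprocess.run(["git", "rev-parse", "--abbrev-ref", "HEAD"],
                         capture_output=True, text=True)
    return out.stdout.strip()

if __name__ == "__main__":
    raw = sys.stdin.read()
    if raw.strip():
        event = json.loads(raw)
        path = event.get("tool_input", {}).get("file_path", "")
        if is_blocked(path, current_branch()):
            print("migrations/*.sql may only change on migration/* branches",
                  file=sys.stderr)
            sys.exit(2)  # block the write and tell Claude why
```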

Continuous vulnerability monitoring with /security-review and GitHub Actions

One thing that cannot slip through the cracks when automating code review is security review. Logic and readability can be absorbed by internal conventions, but a missed vulnerability turns into an incident the moment it escapes. It deserves a dedicated line of defense.

Claude Code ships with a /security-review slash command[^7]. Run /security-review locally and it checks pending changes for SQL injection, cross-site scripting, missing authentication, mishandled data, and vulnerable dependencies, then reports findings tagged by severity.

Anthropic also publishes claude-code-security-review as a GitHub Action[^8]. Wire it into a workflow and every newly opened PR triggers an automatic security review, with comments posted on the relevant lines. False-positive filtering keeps the noise down, matching the design of the official plugin. Copy .claude/commands/security-review.md into the repository and you can customize the behavior by appending organization-specific security requirements or a list of acknowledged known issues.
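For reference, a minimal workflow sketch. The step names and the `claude-api-key` input are from memory of the action's README and may have changed, so verify against the anthropics/claude-code-security-review repository before copying.

```yaml
# .github/workflows/security-review.yml (sketch; verify input names upstream)
name: Security Review
on: pull_request

jobs:
  security-review:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write   # needed to post review comments on the PR
    steps:
      - uses: actions/checkout@v4
      - uses: anthropics/claude-code-security-review@main
        with:
          claude-api-key: ${{ secrets.ANTHROPIC_API_KEY }}
```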

As an aside, requests to teach AI internal security review criteria — reflecting each organization's policies — have been rising sharply. For cases that must stay fully on-premises, combining it with the ZEROCK GraphRAG architecture for knowledge management lets you feed internal documents, past incidents, and audit requirements into the review process. It's a way to stack internal context on top of Claude Code's general-purpose review.

Two practical tips when you roll this out. First, keep the list of acknowledged known issues in version control alongside the rest of the security configuration. Drift between "what the tool flags" and "what the team has already decided to accept" is the fastest way to make developers ignore security warnings wholesale. Second, review the false-positive filter's behavior quarterly. As the codebase and dependencies evolve, yesterday's "obviously safe" pattern can become genuinely risky, and you want the filter tuned to current reality rather than frozen at the time of initial setup.

Two implementation flows: solo developer and team

So how do you actually assemble these pieces? Here are two patterns split by scale.

The first is a lightweight flow for individual developers and small teams.

# .claude/settings.json (abridged)
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Write|Edit",
        "hooks": [
          { "type": "command", "command": ".claude/hooks/block-secrets.sh" }
        ]
      }
    ],
    "Stop": [
      {
        "hooks": [
          { "type": "command", "command": ".claude/hooks/run-review.sh" }
        ]
      }
    ]
  }
}

The flow looks like this. You start Claude Code locally with Superpowers installed and hand it a feature request. While it's writing, the PreToolUse hook mechanically blocks secret leakage and dangerous rewrites. Every time Claude finishes a task, requesting-code-review fires automatically and corrects any spec drift. Finally, the Stop hook runs a full-diff review. At that point, only "things a human should check" are left on your plate.
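A hook command can be any executable, so block-secrets.sh could just as well be a small Python script. Here is a sketch of the kind of check it might implement; the regexes are crude heuristics for illustration, and a real deployment would lean on a dedicated scanner such as gitleaks or trufflehog.

```python
import json
import re
import sys

# Heuristic secret-shaped patterns, for illustration only.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                       # AWS access key ID
    re.compile(r"-----BEGIN (RSA |EC )?PRIVATE KEY-----"),
    re.compile(r"(?i)(api[_-]?key|secret)\s*[:=]\s*['\"][A-Za-z0-9/+=_-]{16,}['\"]"),
]

def looks_secret(text):
    """True if the text matches any secret-shaped pattern."""
    return any(p.search(text or "") for p in SECRET_PATTERNS)

if __name__ == "__main__":
    raw = sys.stdin.read()
    if raw.strip():
        event = json.loads(raw)
        tool_input = event.get("tool_input", {})
        # Write passes "content"; Edit passes "new_string".
        payload = tool_input.get("content", "") or tool_input.get("new_string", "")
        if looks_secret(payload):
            print("possible secret in this write; redact it and retry",
                  file=sys.stderr)
            sys.exit(2)  # block the Write/Edit
```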

The second is a full-stack setup for company teams, split across three phases.

Phase A is during development. Hooks and Superpowers live locally, automating first-pass review at the individual developer level. Phase B is at PR creation. Both claude-code-security-review and the official Code Review plugin run as GitHub Actions, with security and logic review flowing through separate pipelines[^1][^8]. Results land as comments on the PR. Phase C is pre-merge. Human reviewers use the AI comments as input and focus exclusively on things AI cannot easily handle: architectural choices, alignment with product requirements, and non-functional impact.

Atlassian's internal case study reports a 30.8 percent improvement in PR throughput after implementing this exact role split[^9]. The total volume of review grows, but the human time spent on review drops. That is the realistic upside you can expect from AI review in 2026.

If you're wondering where to start, my recommendation is Phase A. Hooks alone have low adoption cost and immediately improve the individual developer experience. Phases B and C usually need organizational buy-in, which goes more smoothly if you can already point at results from phase A.

Wrapping up: AI writes, AI reads, humans decide

Before you sketch your rollout, three commonly missed points.

First: design for human-in-the-loop. Anthropic itself explicitly says "we do not auto-approve," and for good reason — you should not let AI take final judgment away from humans. Once reviewers get pulled by AI comments and become rubber-stampers, quality actually drops. Keep the posture of "humans supply the context AI misses." The major 2026 guidelines emphasize this point[^10].

Second: standardize PR descriptions. The culture of making PR bodies state "how much of this was AI-written" and "what prompts were used" is starting to spread[^10]. It helps reviewers catch up on context faster and, later, helps you trace why an implementation ended up the way it did. Bake these fields into the Claude Code commit template and the practice sticks.
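One way to bake those fields in is a pull-request template fragment; the field names below are hypothetical, so adapt them to your team's vocabulary.

```markdown
## AI involvement
- AI-written share: <!-- e.g. "~70%, Claude Code" -->
- Prompts / plan used: <!-- link to the plan file or paste the key prompts -->
- Human-verified areas: <!-- what you actually read line by line -->
```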

Third: the cost model. The official Code Review plugin charges by PR size, so monorepos or migration PRs with unpredictable volume can produce unexpected bills. Mitigate this by keeping PRs small, or by adding a custom PreToolUse hook that warns when a change looks likely to become huge. This is operational design territory — something you either build out with a hands-on partner like WARP AI consulting or tackle over a six-month cycle with your internal developer productivity team.

As of April 2026, automated code review with Claude Code is well and truly practical. The official Code Review plugin, Superpowers' requesting-code-review, Hooks' PreToolUse and Stop, and /security-review. Each handles a different layer, and combining them makes the pattern "AI writes, AI does first-pass review, humans make the final call" actually achievable.

Writing this piece, I was reminded that code review is not a "cost to be minimized" — it is an organization's learning device. If AI takes care of the initial findings, human review time shifts toward more meaningful debates: architectural choices, product strategy alignment, and the unspoken design instincts your team carries. When human reviewers can concentrate on those, the ROI of AI adoption goes well beyond raw development speed.

Start by dropping a single Hook into your own project. You'll probably experience the shift in your review workflow within the weekend.

References

[^1]: Code Review - Claude Code Docs
[^2]: Anthropic Introduces Agent-Based Code Review for Claude Code - InfoQ
[^3]: obra/superpowers - GitHub
[^4]: Superpowers – Claude Plugin | Anthropic
[^5]: Automate workflows with hooks - Claude Code Docs
[^6]: Auto-Reviewing Claude's Code - O'Reilly Radar
[^7]: Automated Security Reviews in Claude Code | Claude Help Center
[^8]: anthropics/claude-code-security-review - GitHub
[^9]: 30.8% Faster PRs: How AI-Driven Rovo Dev Code Reviewer Improved the Developer Productivity at Atlassian
[^10]: Code Reviews in the Age of AI: Best Practices for 2026 Teams - JavaWorld
