
[2026-May Update] AI Operator Guideline Version 1.2 in Plain English: AI Agent Regulation, Human-in-the-Loop Mandate, and What It Means for Enterprises

2026-05-02 Ryuta Hamamoto

A practitioner-friendly walkthrough of Japan's AI Operator Guideline Version 1.2, published by METI and MIC on March 31, 2026. Ryuta Hamamoto unpacks the new definitions of AI Agents and Physical AI, the scope of the Human-in-the-Loop mandate, the steps to build internal governance, and how WARP consulting supports each phase.


Hello, this is Ryuta Hamamoto from TIMEWELL. The AI Operator Guideline was revised to Version 1.2 on March 31, 2026[^1]. The week after publication, my inbox filled up with one repeated question: "So what exactly do we need to fix in our AI operations?" This piece is my answer, written for the people who actually have to update the documents and the runbooks.

The full text runs more than a hundred pages once printed. Few practitioners have time to read it cover to cover. So I will narrow the scope to two things first: the diff against v1.1, and what the Human-in-the-Loop (HITL) mandate now demands. From there I will move into how to translate the changes into internal governance, how the new text lines up with overseas regulation, and what we cover in WARP consulting. By the end you should have a concrete picture of which documents to rewrite and which operational rules to add next week.

A side note before we start. Every time the guideline is revised, a chunk of practitioners shrug and say "it is soft law, so we can skip it." My read is the opposite. Soft law is exactly the kind of standard that gets thrown back at you retrospectively during contract negotiations, procurement reviews, and IPO prep — which means the cost of catching up later is brutal. The companies that quietly get ahead of these revisions are the ones that pay the smallest tax later.

What the AI Operator Guideline Version 1.2 Is, and How It Differs from v1.1

To set the stage: the AI Operator Guideline is a self-regulatory framework jointly maintained by Japan's Ministry of Economy, Trade and Industry (METI) and the Ministry of Internal Affairs and Communications (MIC) for organizations that develop, provide, or use AI. From the v1.0 launch in April 2024[^1], it moved to v1.1 in March 2025[^2] and now to v1.2 in March 2026 — roughly an annual cadence. There is no statutory penalty for ignoring it, but Japanese ministries reference it as the industry baseline, so it ends up functioning as a de facto specification in procurement criteria and security assessments.

The v1.2 revision was driven by the explosion of AI-agent deployments through 2025 and into 2026. PwC's 2025 survey found 64.4% of Japanese companies using generative AI and 29.7% already in pilots or production deployments of AI agents[^9]. McKinsey's 2026 report frames the same shift more bluntly: "Enterprises have crossed from the chatbot era into the era of autonomous agents."[^7] When the existing rulebook assumes a human approves every output before the AI acts, it cannot keep up with the way these systems are actually being built. v1.2 is the patch.

The diff comes down to three big items.

Change | v1.1 and earlier | v1.2
Definition of AI Agents | Not defined as a formal term | Defined as "AI systems that autonomously decompose and execute tasks and act on external systems"
Definition of Physical AI | Not defined as a formal term | Carved out as a separate category covering robotics, autonomous vehicles, and industrial machinery: AI that acts directly on the physical world
Human-in-the-Loop (HITL) | A recommended practice | Stated as a de facto requirement whenever the system takes an "external action"

The third one is the heavy hitter. v1.1 phrased HITL as something organizations were "encouraged to ensure." v1.2 names "AI agents that take external actions" explicitly and positions a human-confirmation step in front of those actions as an implementation-level requirement. That is the single biggest practical impact of the revision, in my reading.

Why now? The backdrop is a cluster of recent moves: AI made the IPA's annual "Top 10 Information Security Threats 2026" list at number three for the first time[^4]; the IPA's AI Safety Institute (AISI) published a healthcare-AI safety evaluation guide in April[^3]; and the Personal Information Protection Commission (PPC) released its "12-item institutional reform direction" in January 2026, which begins to redraw the line between AI and personal-data handling[^8]. Regulatory signals that used to land as isolated dots are now visibly converging into a surface.

One more note. v1.2 keeps the structural skeleton of v1.1 in the main text, but the appendices were heavily expanded. A new "Risk Scenarios" volume and an "Accountability Templates" volume were added, with checklists matched to the developer / provider / user roles. Reading only the main text is not enough; the right way to use this revision is to read the appendices alongside it and copy the relevant items into your own operations.


The Scope of AI Agent Regulation and the HITL Mandate

This is the section I want to walk through carefully, because the phrase "external action" is now floating around without much shared definition. Some concrete examples will pin down where HITL actually applies.

The "external actions" v1.2 has in mind cover roughly the following:

  • Sending email or chat messages to customers and partners
  • Writing to external APIs (CRM updates, SaaS record creation, social media posts)
  • Pushing code to production, changing infrastructure, updating databases
  • Executing payments, transfers, withdrawals, or any movement of money or points
  • Issuing instructions to physical devices (robotics, IoT switches, factory lines)
  • Publishing irreversible information (press releases, IPO-related disclosures, posts from corporate social accounts)

Reversible, internal-only actions — dropping a draft into an internal Slack channel, writing a local text file — are outside the direct scope. Anything that, once it leaves the building, cannot be pulled back is in scope. That is the safe mental model.

Where do you place the human checkpoints? The structure I usually recommend on the ground is three layers:

  1. Prompt-level checkpoint: a human defines the initial instruction given to the agent
  2. Plan-level checkpoint: a human reviews the execution plan (the sequence of steps) the agent has produced
  3. Action-level checkpoint: a human approves the action one more time, immediately before it touches anything external

Running all three layers every time will grind operations to a halt. So we calibrate the depth by reversibility and blast radius: three layers for high-risk actions (money, production environments, outbound communications), two for medium-risk (CRM updates, email send), one for low-risk (internal document generation). Forcing the choice between "approve everything" and "approve nothing" is the wrong frame; varying approval density by the irreversibility of the action is the realistic frame.
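
To make that calibration concrete, here is a minimal sketch of mapping risk tiers to required checkpoint layers. The tier names and the mapping are my own illustration of the idea, not something v1.2 prescribes.

```python
from enum import Enum

class RiskTier(Enum):
    HIGH = "high"      # money movement, production changes, outbound communications
    MEDIUM = "medium"  # CRM updates, email send
    LOW = "low"        # internal document generation

# Hypothetical mapping from risk tier to the human checkpoints that must pass.
# The layer names mirror the three-layer structure above; the mapping itself
# is an operational choice for illustration, not a requirement stated in v1.2.
REQUIRED_CHECKPOINTS = {
    RiskTier.HIGH: ["prompt", "plan", "action"],
    RiskTier.MEDIUM: ["prompt", "action"],
    RiskTier.LOW: ["prompt"],
}

def checkpoints_for(tier: RiskTier) -> list[str]:
    """Return the approval layers required before the agent may act."""
    return REQUIRED_CHECKPOINTS[tier]

print(checkpoints_for(RiskTier.MEDIUM))  # ['prompt', 'action']
```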

Engineering teams will recognize the design pattern from coding-agent operations. The deny rules in our internal Claude Code settings.json and pre-commit hook conventions translate almost directly into general-purpose agent guardrails: explicit allowlists, denied paths for secret material, and bounded write scopes. The same principles map cleanly to a HITL design for any agentic system that touches external resources.
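
As an illustration of how those conventions generalize, here is a small sketch of a path-based write guardrail. It is not the Claude Code settings schema; the patterns and the function are hypothetical, expressing the same allowlist-plus-denylist idea as plain code.

```python
from fnmatch import fnmatch

# Hypothetical policy: explicit allowlist for write scopes, explicit denylist
# for secret material. Patterns are illustrative, not a real configuration.
ALLOWED_WRITE_SCOPES = ["workspace/drafts/*", "workspace/reports/*"]
DENIED_PATHS = ["*/.env", "secrets/*", "*credentials*"]

def write_permitted(path: str) -> bool:
    """Deny rules win over allow rules; anything not allowlisted is denied."""
    if any(fnmatch(path, pattern) for pattern in DENIED_PATHS):
        return False
    return any(fnmatch(path, pattern) for pattern in ALLOWED_WRITE_SCOPES)

assert write_permitted("workspace/drafts/memo.md")
assert not write_permitted("secrets/api_key.txt")
```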

For the orchestration side — wiring multiple agents together — see Claude Code Agent Team Operating Guide. Once agents start chaining, where you place the HITL checkpoint becomes a direct quality lever.

One point that gets misread on the ground: HITL does not mean "a human makes every decision." It means "a human always has the ability to intervene immediately before the final action." A workflow where the agent generates drafts and candidates at scale and the human only stamps approve or reject still satisfies the intent. In fact, if you do not exploit that asymmetry, the human becomes the bottleneck and most of the productivity gains evaporate.

A second nuance worth stating plainly: real-time per-action approval is not the only HITL pattern. The guideline accepts variants such as (a) batched review at the start and end of a job, (b) allowing autonomous execution for a fixed window with mandatory next-morning log review, or (c) escalating to a human only when an anomaly is detected. The essence is being able to demonstrate after the fact that human judgment was embedded in the loop. The design space is wider than people initially assume.
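
A compact way to express that design space is a dispatcher that picks the review mode per job. The mode names below are my own labels for the variants above, not terminology from the guideline.

```python
def review_mode(action_risk: str, anomaly_detected: bool) -> str:
    """Illustrative choice of HITL pattern for one job."""
    if anomaly_detected:
        return "escalate_to_human"           # variant (c): human pulled in only on anomaly
    if action_risk == "high":
        return "per_action_approval"         # real-time approval before each external action
    return "windowed_autonomy_with_log_review"  # variants (a)/(b): batched or next-morning review

print(review_mode("medium", anomaly_detected=False))
```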

Three Steps to Build Internal Governance

This is the "start moving next week" section. The minimum response to v1.2 boils down to three steps.

Step 1: Sort out where your company sits

The guideline classifies actors into developers, providers, and users (with users further split into business and non-business). Inside one company, different services usually fall into different roles.

If you self-host your own model for internal use, you are a developer. If you ship a SaaS built on top of OpenAI or Anthropic APIs, you are a provider. If you only consume ChatGPT Enterprise as an end user, you are a user. Most enterprises end up being all three depending on the service.

The first deliverable is a simple inventory: every AI system in the company, mapped to which role you play for it. A spreadsheet is fine. Six columns will get you a workable v1: service name, owning team, our role, whether personal data is processed, type of external action, current HITL level.
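
If it helps to bootstrap that spreadsheet, here is a sketch that writes the six columns as a CSV. The field names are my own shorthand for the columns listed above, and the sample row is invented.

```python
import csv

# Field names are shorthand for the six inventory columns described above.
FIELDS = [
    "service_name",
    "owning_team",
    "our_role",              # developer / provider / user
    "personal_data",         # yes / no
    "external_action_type",
    "hitl_level",            # e.g. number of approval layers in place today
]

with open("ai_inventory.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerow({
        "service_name": "Customer support drafting agent",
        "owning_team": "CS",
        "our_role": "user",
        "personal_data": "yes",
        "external_action_type": "email send",
        "hitl_level": "2",
    })
```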

Step 2: Document AI usage policy and operating rules

Once the inventory is done, the document set I usually recommend, in this order, is:

  • AI Usage Principles (a one-page declaration issued under the name of an executive)
  • AI Usage Guidelines (employee-facing: permitted uses, prohibited uses, escalation contact)
  • HITL Operating Rules (per business unit: which actions need how many layers of approval, where authority sits)
  • Data Handling Standards (a decision tree for handing personal data, trade secrets, or IP to AI)
  • Vendor / Model Evaluation Criteria (the checklist for adopting a new external AI service)

You do not need to nail it on the first pass. Start by writing down what you already do, then strengthen the obviously weak parts. That ordering ships fastest in practice.

A common failure mode is a guideline that lists only prohibitions. That freezes the front line and quietly pushes everyone into shadow AI (unmanaged personal ChatGPT accounts, etc.). Always pair prohibitions with examples of permitted use, so the official path is the path of least resistance.

Step 3: Build out audit logs and an incident-response playbook

v1.2 leans hard on being able to demonstrate accountability. Concretely, you need a log design that lets you reconstruct what an AI agent did, who approved it and when, and which model and prompt produced which decision.

The minimum fields to retain:

  • The input prompt (retaining this only for material business actions is fine)
  • Model name and version used
  • A summary of the output (or the output itself)
  • The content of the external action (recipient, write target, amount)
  • The approver and the approval timestamp
  • The reason, if there was an error or a rejection

For the retention period, aligning with your other operational logs (access logs, command logs) is easiest to maintain. One to three years is the typical band.
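
As one possible shape for that log, here is a sketch that appends each agent action as a JSON line carrying the minimum fields above. The field names and values are illustrative, not a schema taken from the guideline.

```python
import json
from datetime import datetime, timezone

def append_audit_record(path: str, record: dict) -> None:
    """Append one agent-action record as a JSON line (illustrative schema)."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")

# Example record; field names mirror the minimum fields listed above.
append_audit_record("agent_audit.jsonl", {
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "input_prompt": "Draft a renewal reminder for the ACME contract",
    "model": "example-model-2026-01",          # model name and version used
    "output_summary": "One-paragraph renewal reminder email",
    "external_action": {"type": "email_send", "recipient": "ops@example.com"},
    "approver": "t.suzuki",
    "approved_at": datetime.now(timezone.utc).isoformat(),
    "error_or_rejection_reason": None,
})
```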

Think of the incident-response playbook as "a bulleted checklist of what to do in the first hour, organized by incident type." Typical AI incidents cluster into four shapes: confidential-information leakage, mistaken outbound communications, business-decision errors caused by hallucinations, and abuse (prompt injection, etc.). For each, prepare initial points of contact, the procedure for preserving evidence, and a draft of the external explanation. The day something actually happens, the cost of confusion drops dramatically.
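
One way to keep that playbook versionable next to your other runbooks is to hold it as structured data. The entries below are illustrative fragments keyed by the four incident shapes above, not a complete playbook.

```python
# Illustrative first-hour checklists, keyed by the four incident shapes above.
FIRST_HOUR_PLAYBOOK = {
    "confidential_information_leakage": [
        "Notify the security lead and legal (initial points of contact)",
        "Freeze the agent's credentials; preserve prompts, outputs, and logs",
        "Start the draft of the external explanation",
    ],
    "mistaken_outbound_communication": [
        "Identify recipients; send a correction or recall where the channel allows",
        "Preserve the approval record for the action",
    ],
    "hallucination_driven_decision_error": [
        "List decisions made on the flawed output and flag them for re-review",
    ],
    "abuse_prompt_injection": [
        "Isolate the affected agent; preserve the injected inputs as evidence",
    ],
}

for incident_type, steps in FIRST_HOUR_PLAYBOOK.items():
    print(incident_type, "->", len(steps), "first-hour steps")
```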

Aligning with the EU AI Act, NIST AI RMF, and ISO/IEC 42001

It is no longer enough to satisfy domestic Japanese guidelines alone. Even companies with no overseas operations get asked about international standards through customer audits, due diligence from global investors, and vendor security assessments from overseas SaaS providers.

The three frameworks worth knowing:

Framework / Regulation | Nature | Key Milestone
EU AI Act | Statute (penalties apply); extraterritorial scope reaches non-EU operators | High-risk system obligations fully in force on August 2, 2026[^5]
NIST AI RMF | Voluntary US-government framework | Critical Infrastructure Profile concept released April 7, 2026[^6]
ISO/IEC 42001 | International standard open to third-party certification | Issued December 2023; certified organizations growing globally[^10]

The EU AI Act's scope is "operators placing AI systems on the EU market." A single EU user is enough to bring you in. Japanese companies are not exempt by default. In particular, areas classified as high-risk — HR evaluation, hiring, education, healthcare, financial credit — deserve equivalent governance even for purely domestic deployments.

NIST AI RMF carries no statutory penalty, but it is referenced often in US federal procurement and in critical-infrastructure self-regulation, and it is becoming a de facto global standard. With the Critical Infrastructure Profile concept released in April 2026[^6], compliance momentum is likely to accelerate in energy, telecom, finance, water, and healthcare.

ISO/IEC 42001 is the international standard for AI Management Systems (AIMS). The structure mirrors ISO/IEC 27001 for information security, so any company already running an ISMS shares a common foundation. The practical benefit of getting certified: the cost of explaining your governance posture to customers drops dramatically[^10].

My read is that over the next one to two years, "AI Operator Guideline v1.2 + ISO/IEC 42001 + the Personal Information Protection Act" will settle in as the Japanese baseline, with the EU AI Act and NIST AI RMF layered on as needed. Treating them as three separate compliance programs is wasteful — the most cost-efficient move is to draft one unified governance document that maps the shared control items across all of them.

Building "Organizational Standard" AI Governance with WARP

If you have read this far, you are probably sitting in a state that looks something like this:

  • ChatGPT, Claude, and Gemini are being used actively on the front lines
  • But there is no unified internal rulebook, and operations vary by team
  • You want to deploy AI agents in earnest but lack confidence on HITL and audit-log design
  • You have heard about the guideline revision but cannot articulate what your company should do, and to what depth
  • An executive asked "is our AI governance actually fine?" and you could not answer immediately

This is the textbook situation our WARP consulting practice walks alongside. WARP is a monthly retainer engagement where AI program owners, executives, legal, and security teams can talk to us directly. A team of former large-enterprise DX and data-strategy specialists comes in and runs an end-to-end program from compliance assessment to internal documentation, training, and incident-response playbooks.

The main pillars of support:

  • Compliance assessment against AI Operator Guideline v1.2 (per-service role mapping, gap analysis)
  • Drafting support for AI usage policy, HITL operating rules, and audit-log standards
  • Designing and delivering training programs for developers and end users
  • Building incident-response playbooks and running tabletop exercises
  • Alignment checks with the EU AI Act, NIST AI RMF, and ISO/IEC 42001
  • Executive-level reporting on governance maturity, residual risk, and next-quarter actions

For companies in the "we want to use AI to the hilt, but we are nervous about regulation and governance" stage, this is the engagement that fits best — and I say that as the person who designs them. For details, see the WARP consulting page. To start a conversation, use this form. The first 30-minute online session covers a current-state interview and a prioritized set of next moves.

Summary

AI Operator Guideline v1.2 is the first version that elevates HITL from "recommended" to an implementation-level requirement, in response to the arrival of the AI agent era. It is not a statute, but it bites quietly during contracts, procurement, and IPO prep. If you start now, the order is straightforward: inventory, documentation, then logs and playbooks. Build one unified governance document that aligns with international standards, and you will have a meaningful weapon for the next several years.

[^1]: "AI Operator Guideline (Version 1.2)" — Ministry of Economy, Trade and Industry; Ministry of Internal Affairs and Communications — 2026-03-31 — https://www.meti.go.jp/shingikai/mono_info_service/ai_shakai_jisso/index.html
[^2]: "AI Operator Guideline (Version 1.1)" — Ministry of Economy, Trade and Industry; Ministry of Internal Affairs and Communications — 2025-03-28 — https://www.meti.go.jp/shingikai/mono_info_service/ai_shakai_jisso/pdf/20250328_1.pdf
[^3]: "Evaluation Perspective Guide on AI Safety in the Healthcare Domain" — Information-technology Promotion Agency, Japan AI Safety Institute (IPA AISI) — 2026-04-03 — https://www.ipa.go.jp/pressrelease/2026/press20260403.html
[^4]: "Top 10 Information Security Threats 2026" — Information-technology Promotion Agency (IPA) — 2026-01-29 — https://www.ipa.go.jp/security/10threats/10threats2026.html
[^5]: "EU AI Act Implementation Timeline" — Future of Life Institute — High-risk system obligations fully in force on August 2, 2026 — https://artificialintelligenceact.eu/implementation-timeline/
[^6]: "AI Risk Management Framework — Critical Infrastructure Profile (Concept)" — National Institute of Standards and Technology (NIST) — 2026-04-07 — https://www.nist.gov/itl/ai-risk-management-framework
[^7]: "State of AI trust in 2026: Shifting to the agentic era" — McKinsey & Company — 2026 — https://www.mckinsey.com/capabilities/tech-and-ai/our-insights/tech-forward/state-of-ai-trust-in-2026-shifting-to-the-agentic-era
[^8]: "Personal Information Protection Act — Institutional Reform Direction (12 items)" — Personal Information Protection Commission (PPC) — 2026-01-09 — https://www.ppc.go.jp/
[^9]: "Generative AI Survey 2025 Spring" — PwC Japan Group — 2025 — https://www.pwc.com/jp/ja/knowledge/thoughtleadership/generative-ai-survey2025.html
[^10]: "ISO/IEC 42001:2023 Information technology — Artificial intelligence — Management system" — International Organization for Standardization (ISO) — https://www.iso.org/standard/81230.html
