
Installing AI Agents into Your Organization: A Five-Phase Playbook | SKILL.md Sharing, In-House Marketplaces, Top-Down x Bottom-Up Operations [2026 Edition]

2026-04-24 | Ryuta Hamamoto

Installing AI agents is not procurement; it is a swap of the organization's operating system. This article walks through the full five phases (infrastructure, skill sharing, operational adoption, executive integration, continuous improvement) using SKILL.md, in-house marketplaces, and a deliberately balanced top-down x bottom-up operating model.


Hello, this is Hamamoto from TIMEWELL.

A common question in AI agent conversations is, "In the end, which tool should we adopt?" But over the past six months, talking with CAIOs (Chief AI Officers) at large enterprises, I have become convinced that tool selection is no longer the issue. Claude, ChatGPT, Gemini, Copilot, custom in-house agents — the options are already there. What remains is the much messier question of how to "install" them into the organization.

KPMG's Q4 AI Pulse Survey, published in February of this year, reports that 54% of organizations now deploy AI agents in core business functions, up more than 20 points from 33% in mid-2024[^1]. At the same time, Gartner forecasts coldly that "more than 40% of agent projects will fail by 2027"[^2]. More than half have started, but nearly half will not stick. The gap is not in software; it is in the organizational OS.

This article is the sixth installment of the "AI-Agent-First Management" series. It shifts the lens from tools to organizations and shares the five-phase installation playbook I actually use in WARP engagements. Infrastructure, skill sharing, operational adoption, executive integration, continuous improvement. I will write honestly about the concrete mechanics of skill sharing and the operational design that runs top-down and bottom-up as a single loop.

"Tool deployment" and "organizational installation" are different things

Talking with executives, many still treat AI agent adoption as something resembling a Salesforce rollout. Sign the contract, distribute it to employees, run training, done. Unfortunately, that model does not work. Stanford's March analysis of 51 successful deployments reports that the tool-layering approach (placing agents on top of existing operations) yields meaningfully lower ROI than the process-first approach (redesigning the workflow before implementing agents)[^3].

The reason is simple: agents do not "substitute" for work; they "reconfigure" it. For example, when a sales team shifts to running its weekly MEDDPICC check through an agent every Thursday, the timing of SFDC entries, the format of meeting notes, and the order of manager review all change in lockstep. Add an agent without moving the process and employees end up doing duplicate work, and the agent stops being used within three months. This is exactly the failure pattern I personally watched unfold across multiple client engagements in 2025.

KPMG's report includes another telling number. As the largest blockers of AI agent ROI, 65% of respondents cite "difficulty scaling use cases" and 62% cite "skill shortages"[^1]. The real issue with scale is not a lack of tools; it is the absence of a clear answer to who, with what authority, approves, distributes, and decommissions agents. The skill shortage is not a freshman training issue; it is the absence of "procedure docs that can be handed to agents" inside the organization.

My stance is unambiguous. AI agent adoption only succeeds when top-down and bottom-up run as two wheels of the same vehicle. Executives commit budget and authority, the front line grows skills and community, and the two mesh through KPIs. Without intentionally engineering this structure, six months in you land in the textbook defeat of "in the end, only email summarization is being used."

That is why I propose phases. Infrastructure, skill sharing, operational adoption, executive integration, continuous improvement. Each phase has its lead actors and tools, and skipping any one of them creates a jam later. Let's walk through them in order.

Phase 1: Infrastructure (API contracts, security, permissions, governance)

The first phase is dull, but doing it sloppily breaks every later phase. There are four concrete tasks: organizing AI API contracts, integrating SSO and identity providers, designing permissions and data boundaries, and building audit logging and governance.

For API contracts, compare at minimum the enterprise contracts of Anthropic, OpenAI, and Google. Look at training-data clauses, region, SLA, and committed-use discounts. In my experience, locking the entire company to one vendor up front is unrealistic. A more practical path is to let teams trial both Claude and ChatGPT for three months, then narrow down based on measured cost and output. For Japanese companies, whether everything can be completed within the AWS Tokyo region surprisingly matters. ZEROCK is designed on a domestic AWS infrastructure assumption precisely because of these data sovereignty requirements.

SSO and identity provider integration means linking Okta, Azure AD, or Google Workspace. Anthropic plans to implement OAuth 2.1 enterprise identity federation for MCP (Model Context Protocol) in Q2 2026, after which MCP server access permissions can be synchronized with HR systems[^4]. Letting departed employees retain access to internal agents indefinitely is unacceptable from an internal control standpoint. Treat SSO integration not as "nice to have" but as "the auditor will absolutely flag this if it is missing."

Permissions and data boundary design is more granular. Can the sales department's agent read accounting data? Can engineering's Claude Code write to the production database? How far can outside vendors reach? The number of public MCP servers Anthropic publishes exceeded 10,000 as of March 2026, with monthly SDK downloads reaching 97 million[^4]. Convenient, but careless connections leak information. At minimum, separate ReadOnly and ReadWrite per server, default-approve Read, and require explicit approval for Write.
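The read-by-default, write-by-approval rule above can be sketched as a small policy check. This is a minimal illustration in Python, not an actual MCP configuration format; the server names and policy fields are hypothetical.

```python
# Hypothetical sketch of a per-server permission policy:
# reads are allowed by default, writes need an explicit grant.
from dataclasses import dataclass


@dataclass
class ServerPolicy:
    name: str
    write_granted: bool = False  # Write requires explicit approval


def is_allowed(policy: ServerPolicy, operation: str) -> bool:
    """Default-approve Read; require an explicit grant for Write."""
    if operation == "read":
        return True
    if operation == "write":
        return policy.write_granted
    return False  # unknown operations are denied


sfdc = ServerPolicy("sfdc-connector")                      # read-only by default
accounting = ServerPolicy("accounting-db", write_granted=True)

print(is_allowed(sfdc, "read"))         # True
print(is_allowed(sfdc, "write"))        # False
print(is_allowed(accounting, "write"))  # True
```

In a real deployment the grant flag would come from an approval workflow, not a hard-coded constructor argument, but the default-deny posture for writes is the part worth copying.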

Audit logging and governance refers to recording everything an agent observed, output, and executed. The WEF organizational transformation report from this year also notes that organizations that designed governance, people, and process as a set from the start are the ones producing results[^5]. It is unglamorous, but if this is loose, when an "AI did this without permission" incident happens, accountability is unclear and the front line freezes. The moment people freeze, operational adoption plummets. Infrastructure is both defense and the foundation of offense.
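As a rough sketch of what "record everything an agent observed, output, and executed" means in practice, the following Python example wraps a tool call and appends an audit record. The in-memory list and the placeholder tool are stand-ins for durable log storage and real agent tools; this is an assumption-laden illustration, not a production design.

```python
# Minimal audit-trail sketch: each record captures what the agent
# observed (inputs), what it output (results), and what it executed
# (the tool name), with a UTC timestamp.
import json
from datetime import datetime, timezone

audit_log: list[dict] = []


def audited(tool_name, func, *args, **kwargs):
    """Run a tool call and append an audit record for it."""
    result = func(*args, **kwargs)
    audit_log.append({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "tool": tool_name,
        "observed": {"args": args, "kwargs": kwargs},
        "output": result,
    })
    return result


def summarize(text):
    """Placeholder tool standing in for a real agent action."""
    return text[:20]


audited("summarize", summarize, "Quarterly revenue grew 12% YoY")
print(json.dumps(audit_log[0], indent=2, default=str))
```

The point of keeping inputs alongside outputs is that when the "AI did this without permission" incident arrives, you can reconstruct not just what the agent did but what it saw before doing it.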


Phase 2: Skill sharing mechanics (SKILL.md, connectors, in-house marketplace)

Once infrastructure is up, build the mechanics for sharing skills. This is the biggest topic of 2026, and it barely existed six months ago.

At the center are Claude Code Skills and SKILL.md, which Anthropic has been promoting since the end of last year. SKILL.md is a document composed of YAML frontmatter (metadata) and a body that tells the agent "when to use this skill and in what order to execute it." Priority is split into a three-tier hierarchy of enterprise > personal > project, allowing organizational administrators to enforce company-wide skills from the top[^6]. When writing this very column I had Claude Code load our internal article-writing and human-tone skills before I began. By turning a process into a skill, quality becomes reproducible in anyone's hands. That is the essence of SKILL.md.
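As a concrete illustration, a minimal SKILL.md might look like the following. The skill name and steps are hypothetical examples made up for this article; only the `name` and `description` frontmatter fields follow the format Anthropic documents for Claude skills.

```markdown
---
name: weekly-meddpicc-check
description: Run the weekly MEDDPICC review on sales meeting notes. Use when asked for a deal-health check.
---

# Weekly MEDDPICC check

1. Load this week's meeting notes from the shared folder.
2. Score each deal against the MEDDPICC criteria.
3. Flag deals missing an Economic Buyer or a Champion.
4. Output a table sorted by risk, ready for manager review.
```

The description field matters more than it looks: it is what the agent reads to decide when the skill applies, so write it as a trigger condition, not a summary.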

What kinds of skills should you actually share? From WARP client examples: "a skill to extract churn risk from sales meeting notes," "a skill to perform a first-pass classification check against METI's export control criteria," "a skill to produce a quarterly revenue report," "a skill to draft an initial incident report for a compliance issue." All of these are written as SKILL.md files and consolidated in an internal Git-shared repository. When new hires onboard, a single init command syncs all skills to their local Claude Code.
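The "single init command" pattern can be sketched as a small sync script, assuming skills live in a shared checkout as one folder per skill, each containing a SKILL.md. The paths and layout here are illustrative assumptions, not the actual WARP tooling.

```python
# Hypothetical sketch: mirror every skill folder from a shared repo
# checkout into the local skills directory. Adapt the paths to
# wherever your repo and Claude Code skills actually live.
import shutil
from pathlib import Path


def sync_skills(repo_dir: Path, local_dir: Path) -> int:
    """Copy each folder containing a SKILL.md into local_dir; return count."""
    count = 0
    for skill_md in repo_dir.rglob("SKILL.md"):
        dest = local_dir / skill_md.parent.name
        shutil.copytree(skill_md.parent, dest, dirs_exist_ok=True)
        count += 1
    return count
```

Running this on day one gives a new hire the same skill set as the most senior engineer, which is exactly the reproducibility argument made above.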

Then connectors. The work of wiring MCP servers to internal knowledge bases, SFDC, Notion, kintone, and internal databases. MCP is a specification Anthropic announced in November 2024, but in just a year and a half OpenAI, Google DeepMind, Microsoft, and Cloudflare have all declared support, making it an industry standard[^4]. By creating MCP connectors for every internal system, you can swap agents without breaking the knowledge layer. ZEROCK is built around the same idea, designing GraphRAG (retrieval-augmented generation that uses a graph structure) so that its outputs can be invoked from multiple agents.

Another big move arrived this year: the Claude Plugins Marketplace, which Anthropic opened up in February 2026. Plugins are a mechanism for packaging skills and sub-agents for distribution, and enterprise administrators can build private internal marketplaces[^6]. Auto-distribute to new hires, expose only to specific departments, distribute through approval workflows — everything is finished in a single management screen. I expect that within this year, "sharing the agents themselves," meaning internal agent marketplaces, will become standard equipment in enterprises. Readers who want to go deeper into the design philosophy of Claude Code Skills can also see the related article Claude Code Skills Custom Build Guide.

The skill-sharing mechanism is, in essence, the question of whether the organization owns "reusable units of knowledge." Without this, brilliant agents built individually by senior engineers never become company assets. With it, even a new hire can execute at expert-level cadence on day one. The gap shows up there.

Phase 3: Operational adoption (top-down KPIs x bottom-up community)

Skills and agents being deployed does not mean people use them. I can say that with conviction. Even at our clients, for the first three months most employees had Claude access but used it just twice a month. Lifting operational adoption requires intent and engineered mechanisms.

The top-down lever is simply hitting it with KPIs. McKinsey's State of AI 2025, published in January, reports that 88% of CEOs say "deployment velocity" matters more than "model accuracy"[^7]. If the CEO says so, frontline KPIs should reflect it. What I recommend is monitoring three numbers each week: the number of operating agents, SKILL.md execution counts, and the citation count from knowledge sources. Compare week-over-week and across teams. Review them at the start of every executive committee meeting for the first five minutes. This single move changes operational adoption dramatically.
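A minimal sketch of the week-over-week comparison for those three numbers, with made-up figures standing in for real agent telemetry:

```python
# Week-over-week deltas for the three monitored KPIs.
# The numbers are invented for illustration.
def wow_change(this_week: dict, last_week: dict) -> dict:
    """Week-over-week delta for each monitored KPI."""
    return {k: this_week[k] - last_week.get(k, 0) for k in this_week}


last_week = {"active_agents": 14, "skill_runs": 310, "knowledge_citations": 95}
this_week = {"active_agents": 16, "skill_runs": 284, "knowledge_citations": 120}

print(wow_change(this_week, last_week))
# {'active_agents': 2, 'skill_runs': -26, 'knowledge_citations': 25}
```

The dashboard itself is trivial; what matters is that a drop like the skill_runs figure above becomes a question someone asks out loud in the first five minutes of the executive committee meeting.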

Why does it work? The largest reason employees do not open Claude is "I don't feel forced to." Once KPIs are visible, managers start asking, "Zero this week — any reason?" Once asked, the employee starts using it. Primitive, but effective. I call this monitoring layer the "AI operations dashboard," and we provide it as standard equipment in WARP.

Top-down alone, however, makes employees feel forced and kills initiative. That is where the bottom-up mechanisms come in. Dedicated Slack channels, internal hackathons, study groups, technical office hours. A monthly ritual at some WARP clients is the "AI Lightning Talk," where anyone can share "the skill I built this month" in five minutes. No rankings, no prizes, just applause. Even so, attendance grows.

In an interview published this April, KPMG's global AI leader Steve Chase plainly stated that "the biggest barrier is not tools but culture"[^8]. Culture cannot be built by top-down decree. Only when employees have a venue in which they describe agents as interesting in their own words does culture take root. Top-down for budget and authority, bottom-up for practice and energy. The intentional design of this two-wheeled rotation is the heart of Phase 3.

Incidentally, the broader blueprint for AI talent development is laid out in AI Talent Development Strategy. Reading it alongside this piece will make the operational adoption levers stand out in three dimensions.

Phase 4: Executive integration (embedding into workflows and organizational structure)

By this point, agents shift from "useful tools" to "parts of the organization." Phase 4 is where executives accept that change and formally embed agents into business processes and organizational structure.

Process embedding looks like this. Have an agent run the monthly sales review first so that humans compress a two-hour meeting into 30 minutes. Have an agent run anomaly detection on monthly close journal entries so accounting staff only review. Have an agent perform first-pass legal contract review so attorneys focus only on judgment calls. Walmart embedded its in-house Wallaby LLM into supply chain and store operations to automate decisions like payment authorization and inventory replenishment[^1]. In Japan, examples of rewriting the workflow itself are finally starting to multiply.

Embedding into organizational structure is heavier. Where do you place the AI agent operations lead, what do you add to the evaluation system, how do you change the HR system. What I recommend is establishing a Chief AI Officer (CAIO) separate from the CIO or CTO and centralizing accountability for AI agent KPIs. KPMG's survey also found that 57% of leaders expect a structure in which "humans manage and direct agents"[^1]. This management role is a new species of manager.

On evaluation systems, companies are starting to add metrics like "number of agents used," "number of SKILL.md files authored," and "number of connectors shared" to the employee evaluation criteria. This is simple and powerful. Tied to compensation and promotion, people will reliably do it. Disconnected from compensation, training programs run and nothing changes.

Unavoidable here is organizational conflict. A divide always forms between "people who wield AI" and "people whom AI replaces." Left untended, it becomes the kindling for office politics. The WEF report points out that organizations redesigning HR systems, training programs, and career paths in parallel are the ones bridging this gap[^5]. My own stance: prioritize re-education for employees most exposed to replacement risk. Companies that coldly cut them lose the trust of the survivors as well. The big-picture strategy of agent-first management is also organized in the series cornerstone, Three Strategic Options for AI-Agent-Driven Management.

Executive integration is, in essence, the act of upgrading agents from standalone tools to "the driver's seat of management." Only after this step does AI agent adoption become irreversible.

Phase 5: Continuous improvement (a quarterly PMF re-check and organizational conflict response)

The final phase is keeping the wheel turning. The AI agent market changes scenery every three months. Best practices from six months ago are routinely outdated today. Continuous improvement is a learning cycle built into the organization.

The basic motion I recommend is a quarterly Product-Market Fit (PMF) re-check. For each agent in the organization, evaluate operational frequency, user satisfaction, hours saved, and new use cases, then decide whether to keep, refactor, or retire. Most of the "more than 40% of agent projects fail by 2027" risk that Gartner warns about stems from holding on to zombie agents because no one has the courage to retire them[^2]. The courage to stop matters as much as the courage to keep going.
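The keep/refactor/retire decision can be sketched as a simple triage over the quarterly metrics. The thresholds below are illustrative assumptions for this article, not recommendations:

```python
# Hypothetical quarterly triage over three of the re-check metrics.
# Thresholds are made up for illustration; calibrate to your own data.
def triage(runs_per_month: int, satisfaction: float, hours_saved: float) -> str:
    """Decide an agent's fate from quarterly metrics (satisfaction on a 1-5 scale)."""
    if runs_per_month < 4 and hours_saved < 1:
        return "retire"      # effectively unused: a zombie agent
    if satisfaction < 3.0:
        return "refactor"    # used, but users are unhappy with it
    return "keep"


print(triage(runs_per_month=2, satisfaction=4.0, hours_saved=0.5))  # retire
print(triage(runs_per_month=30, satisfaction=2.5, hours_saved=12))  # refactor
print(triage(runs_per_month=30, satisfaction=4.2, hours_saved=12))  # keep
```

Encoding the rule, even crudely, removes the "no one has the courage" problem: retirement becomes the default outcome of a metric, not a personal judgment someone must defend.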

Retirement sounds negative but is actually a signal of organizational health. If an agent has run for six months and is not being used, the tool is not at fault — the business need was misaligned. It is faster to rebuild on a fresh hypothesis. Putting a quarterly cadence in place keeps the organizational metabolism healthy.

Organizational conflict response is also a major theme of this phase. On top of the "users vs. replaced" conflict noted in Phase 4, you will see power struggles between IT and business units, and between governance-first and speed-first camps. From experience, these are not solved by argument. The realistic move is for executives to put "what we prioritize" in writing and revisit it every six months.

Keeping up with the latest developments is also non-negotiable. Anthropic plans to release the MCP Registry, an audited catalog of MCP servers, in Q4 2026[^4]. OpenAI's Responses API and the evolution of Agent Builder announced at Google Cloud Next move on a near-weekly basis. The latest from Google Cloud is also organized in Google Cloud Next 2025: Enterprise AI Agent Announcements Recap. Build an information intake line directly under the CAIO and a cycle that reflects intelligence into your strategy every quarter.

Another tip for continuous improvement is to keep an external sparring partner. Debate purely internally and in six months your topics calcify. WARP, which I lead, runs monthly agent operations retrospectives with our clients. When a third-party perspective enters, blind spots invisible from the inside surface, often one or two each session. This is not consulting sales talk; it is a cognitive science phenomenon.

Summary: dual-wheel design is the key to durability

I have written the five phases of installing AI agents into the organization with the texture of the actual field. Compressing the takeaways:

  • Phase 1 (Infrastructure) locks down API contracts, SSO, permissions, and audit logs as the foundation
  • Phase 2 (Skill sharing) puts SKILL.md, MCP connectors, and the in-house marketplace in place
  • Phase 3 (Operational adoption) rotates the dual wheels of top-down KPIs and bottom-up community
  • Phase 4 (Executive integration) embeds agents into workflows, organizational structure, and the CAIO role
  • Phase 5 (Continuous improvement) habituates a quarterly PMF re-check, retirement decisions, and external sparring

If there is one thing I want to emphasize across these five phases, it is this: installation only succeeds when top-down and bottom-up move as a single loop. Executive resolve alone, or frontline energy alone, cannot deliver durable adoption. Engineering the mechanism that meshes the two is, I believe, the actual job of a CAIO or AI deployment lead.

At TIMEWELL, ZEROCK provides a secure enterprise AI foundation, and WARP runs alongside clients across strategy, implementation, operations, and organizational transformation. WARP NEXT in particular is designed so a CAIO-class strategic partner stands beside the executive team monthly, supporting the in-house operationalization of the five phases. If you want to treat AI agents not as "tool deployment" but as "redesign of the organizational OS," please reach out.

The "AI-Agent-First Management" series wraps for now, but the issues on the ground are far from exhausted. Going forward I plan to publish case studies, post-mortems, and prescriptions by organizational type. I hope every reader's organization is one step further along six months from now.

References

[^1]: KPMG, "AI at Scale: How 2025 Set the Stage for Agent-Driven Enterprise Reinvention in 2026" https://kpmg.com/us/en/media/news/q4-ai-pulse.html

[^2]: Gartner, "AI Agent Adoption 2026: What the Data Shows" https://joget.com/ai-agent-adoption-in-2026-what-the-analysts-data-shows/

[^3]: Stanford Digital Economy Lab, "The Enterprise AI Playbook: Lessons from 51 Successful Deployments" https://digitaleconomy.stanford.edu/app/uploads/2026/03/EnterpriseAIPlaybook_PereiraGraylinBrynjolfsson.pdf

[^4]: Model Context Protocol Blog, "The 2026 MCP Roadmap" https://blog.modelcontextprotocol.io/posts/2026-mcp-roadmap/

[^5]: World Economic Forum, "Organizational Transformation in the Age of AI" https://reports.weforum.org/docs/WEF_Organizational_Transformation_in_the_Age_of_AI_How_Organizations_Maximize_AI%27s_Potential_2026.pdf

[^6]: Anthropic, "Extend Claude with skills" https://code.claude.com/docs/en/skills

[^7]: McKinsey, "The state of AI in 2025: Agents, innovation, and transformation" https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai

[^8]: AI News, "KPMG: Inside the AI agent playbook driving enterprise margin gains" https://www.artificialintelligence-news.com/news/kpmg-inside-ai-agent-playbook-enterprise-margin-gains/
