
Work Classification in the AI Agent Era | Redesigning the Business with the Three Axes of Stop, Reduce, and Automate [2026 Edition]

2026-04-24 | Ryuta Hamamoto

The companies that fail at AI deployments share one habit: they skip the process audit. This article walks through a four-way classification—stop, reduce, automate, and let humans do it—and explains how to position AI agents as the last resort rather than the first move.


Hello, this is Hamamoto from TIMEWELL.

Inquiries about AI agent deployments have surged over the last twelve months. Every email opens with some version of the same line. "We want to deploy AI agents too. Where do we start?" Before I answer, I always ask one question. "Are the workflows you want to deploy AI on actually workflows that need to exist?" The reply is almost always silence.

Gartner forecasts that more than 40 percent of agentic AI projects will be canceled by the end of 2027[^1]. McKinsey's data shows that only 6 percent of companies see AI contributing more than 5 percent to enterprise-wide EBIT (earnings before interest and taxes)[^2]. The technology has progressed further than ever, yet the outcomes are extraordinarily lopsided. The differentiator is not model selection or prompt engineering. It is whether the company decided up front which work to keep and which to drop.

In this piece I want to lay out a work-classification framework you must run before agentifying anything. Sort all work into "stop," "reduce," "automate," and "humans handle it," and aim AI agents only at the automate bucket. Just by enforcing that order, most of the wasted investment evaporates.

The thread that runs through every failed AI deployment is "no process audit"

Let me start with a real example. A mid-sized manufacturer recently asked me to "deploy AI agents in the sales department." When I dug in, they had listed 14 candidate workflows—quote generation, meeting minutes, CRM input, competitor research, and so on. I asked, "What is the purpose of each of these 14 tasks?" They answered for about half. The other half came back with "we've always done it this way" or "the boss requires the document."

This is the actual state of the field. NTT Data's April 2026 report likewise warns that simply "deploying" generative AI fails[^3]. The sales domain is heavily reliant on individual judgment, and many companies never managed to embed even SFA (sales force automation). Layering an agent on top does not magically structure the implicit judgment underneath. Dynatrace recorded that 50 percent of companies running agentic AI PoCs (proof of concept) cannot measure ROI, and 74 percent do not even have a measurement plan[^4]. Apply unmeasurable technology to work whose purpose nobody can state, and any positive ROI would be a miracle.

Generative AI adoption among Japanese companies has climbed to 57.7 percent. Even so, only 6 percent of companies are realizing impact at the enterprise level[^3]. The split is not driven by whether you deployed; it is driven by what you kept and what you stopped.

My position is unambiguous. AI is the last resort. The first thing to consider is retirement. Next, reduction. Only the work that survives those two filters is a candidate for automation. Reverse the order, and you are simply using high-performance AI to extend the life of wasteful work.


The four-way framework: stop, reduce, automate, humans handle it

The classic frame underlying work classification is ECRS—Eliminate, Combine, Rearrange, Simplify. Asana's explainer notes that of the four steps, Eliminate produces the largest improvement[^5]. Toyota's "Seven Wastes" begins from the same instinct: find waste first and throw it out[^6].

What I add on top of that classic is one judgment axis built for the AI agent era. The result is the four-way classification below.

| Class | Decision criterion | Action |
| --- | --- | --- |
| Work to stop | Not tied to customer value, executive judgment, or regulatory compliance | Retire. Do not build a replacement. |
| Work to reduce | Necessary, but the frequency or scope is excessive | Cut down: monthly to quarterly, all targets to priority targets, and so on |
| Work to automate | Routine, repetitive, high volume, with clear rules | AI agents, RPA, SaaS integrations |
| Work humans handle | Customer relationships, judgment, creativity | Concentrate human investment; AI is a sidekick |

The way to use the framework is to open the work inventory in a spreadsheet and judge it row by row. You will be surprised how many tasks fall cleanly into "stop." On the engagements I have run, 15 to 30 percent of the audited workflows land in "stop." Deciding whether the rest should be automated or handled by humans comes after that.

Dropping an AI agent onto "work to stop" is the worst possible move, for a straightforward reason: you simply keep the unnecessary work running at higher speed. Worse, you destroy the chance to retire it. The same logic applies to "work to reduce": if you do not first narrow frequency or scope, you end up automating every instance of work you should not be doing in the first place. Apply AI to "work humans handle" and the quality of relationships and judgment degrades. What remains is "work to automate", and only that.

The boundary that gets misread most often is between "humans handle it" and "automate it." Here is how I think about it: leave four things with humans—first-touch customer interactions, final decisions, creative planning, and trust-building dialogue inside the company. Everything else—routine processing, data aggregation, document formatting, search, summarization—goes into the automation candidates. The line shifts by organization, but when in doubt, leave it with humans. Pulling work back to humans later is harder than pushing it to automation later.

The mechanics of running a process audit: department interviews, task inventory, classification

The theory is clean, but the actual audit work is grinding. Here is the procedure I run in the field.

The first stage is department interviews. Interview three to five people per department and walk them chronologically through "work you did last week." Anything that does not surface weekly is captured separately under monthly, quarterly, and annual buckets. The critical move here is asking about "actual actions performed," not "task names." A row labeled "sales activity" frequently breaks down into "90 minutes of meeting minutes," "30 minutes of follow-up email," "40 minutes of CRM entry," and "60 minutes of competitor research."

The second stage is the task inventory. Organize the items collected in interviews under four columns: who, for what purpose, at what frequency, and how much time. A typical department surfaces 200 to 400 tasks. Manage it in a spreadsheet or Notion, and always include a column titled "Who is inconvenienced if we stop this work?" If the answer is no one, it is a retirement candidate.
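If you keep the inventory in a spreadsheet exported to CSV, the schema can be made machine-readable from day one. Below is a minimal sketch in Python; the `Task` type, the column names, and the file name are my own illustration, not a prescribed format. Note how the "who is inconvenienced if we stop this?" column doubles as an automatic retirement filter.

```python
import csv
from dataclasses import dataclass

@dataclass
class Task:
    """One row of the task inventory (column names are illustrative)."""
    department: str
    owner: str                       # who performs the task
    purpose: str                     # for what purpose
    frequency_per_month: float       # how often
    hours_per_instance: float        # how much time each run takes
    inconvenienced_if_stopped: str   # empty string = retirement signal

def load_inventory(path: str) -> list[Task]:
    with open(path, newline="", encoding="utf-8") as f:
        return [
            Task(
                department=row["department"],
                owner=row["owner"],
                purpose=row["purpose"],
                frequency_per_month=float(row["frequency_per_month"]),
                hours_per_instance=float(row["hours_per_instance"]),
                inconvenienced_if_stopped=row["inconvenienced_if_stopped"].strip(),
            )
            for row in csv.DictReader(f)
        ]

# Retirement candidates: nobody is inconvenienced if the task stops.
# stop_candidates = [t for t in load_inventory("inventory.csv")
#                    if not t.inconvenienced_if_stopped]
```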

The third stage is classification. Map each task into the four-way framework. The ideal cast at the decision meeting is the field operator, the division head, and a representative of the executive layer. The field tends to argue "this is necessary"; viewed from the top, however, the inventory always includes work with no business reason to exist. Conversely, work the executive layer dismisses sometimes turns out to be the backbone of customer satisfaction. Only when both sides sit at the same table can you make the right calls.

Ninety days is a useful yardstick. For one department's classification: two weeks of interviews, three weeks of inventory, two weeks of decision meetings, and four weeks for downstream decisions and execution of retirements—roughly three months in total. Plans shorter than that have skipped a step somewhere. As I wrote in the related piece Three Strategic Options for Management in the AI Agent Era, the audit precedes the strategy. Companies that skip it end up with a strategy built on sand.

Separate the decision criteria and the decision-makers for each class

Even after the audit, you will hit cases where classification is genuinely hard. Without crisp decision criteria and named decision-makers, the meeting collapses into an "everything is essential" faction battling an "eliminate everything" faction. Here are the criteria I use in practice.

"Stop" has three tests. First, does the work tie to customer value? Second, is it used as input for executive judgment? Third, is it required by regulation, law, or contract? Anything that fails all three is, in principle, a retirement candidate. The decision-maker is the executive or the business owner. Never give the field veto rights over retirement; the "we're busy, let's keep it" instinct kicks in every time.

"Reduce" tests whether the work is needed but its frequency, scope, or granularity is excessive. Producing a monthly meeting deck on a weekly cadence; sending a DM to all customers when the top 20 percent would do; turning a five-page report into one page—these reductions can be driven at the middle-management level.

"Automate" multiplies four checks: routinization, repetition, volume, and rule clarity. Drop any one of them and the cost-effectiveness of automation collapses. Agentifying a workflow that fires three times a month, for example, costs more in development and operations than it saves. The realistic decision-makers are IT and the business unit, jointly.

"Humans handle it" tests whether the work involves customer relationships, creativity, final judgment, or trust-building. Items in this class become investment targets for talent. It is not that AI cannot replace them; it is that AI must not. Where you redirect the freed-up human hours is where executive skill shows up.

In my experience, the four classes settle into roughly 15 to 30 percent stop, 20 to 30 percent reduce, 30 to 40 percent automate, and 15 to 25 percent humans. The mix shifts by industry, but if your tally puts more than 80 percent into automation, you have almost certainly missed candidates for "stop" or "reduce." If "humans handle it" is over 50 percent, your judgment is probably too lenient.
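Those ranges translate directly into a crude sanity check over the classification output. A sketch, with the thresholds from above hard-coded:

```python
from collections import Counter

def sanity_check(labels: list[str]) -> list[str]:
    """Warn when the class mix falls outside the ranges seen in practice."""
    total = len(labels) or 1  # guard against an empty inventory
    share = Counter(labels)
    warnings = []
    if share["automate"] / total > 0.80:
        warnings.append("over 80% 'automate': 'stop'/'reduce' candidates were likely missed")
    if share["humans"] / total > 0.50:
        warnings.append("over 50% 'humans': the judgment is probably too lenient")
    return warnings
```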

Prioritizing AI agent work once the automation set is locked

Only now does the AI agent conversation begin in earnest. You do not have to agentify every task in the automation bucket at once. Use three axes to prioritize.

The first axis is reach. Automating work where one person is stuck has lower returns than work where 20 people repeat the same routine. The second axis is time scale. Aim at five hours per week of work, not five minutes per week. The third axis is risk. For work where a faulty judgment lands directly with the customer—issuing invoices, sending contracts, external communications—keep a human-in-the-loop checkpoint. As of 2026, the realistic scope for fully autonomous agents is internal intermediate processing.
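As a back-of-the-envelope illustration, the three axes collapse into a single score. The 0.5 discount for customer-facing work is an assumption of mine, standing in for the extra cost of the human-in-the-loop checkpoint, not a validated weight.

```python
def priority_score(people_affected: int,
                   hours_per_week: float,
                   customer_facing: bool) -> float:
    """Reach x time scale = weekly hours at stake; risk discounts the score."""
    score = people_affected * hours_per_week
    if customer_facing:
        score *= 0.5  # illustrative penalty: output must pass a human gate
    return score

# 20 people x 5 h/week of internal processing outranks one stuck person:
# priority_score(20, 5.0, False) -> 100.0
# priority_score(1, 5.0, False)  -> 5.0
```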

A note on technology selection. Agents handling internal confidential data should not be sent straight to a public API. An enterprise AI like ZEROCK, which runs on AWS infrastructure inside Japan, lets you agentify without sending the knowledge outside the company. Combine it with GraphRAG (a retrieval-augmented generation technique that handles knowledge as a graph structure) and you can search, summarize, and generate while preserving the relationships between internal documents, which means the implicit know-how survives the move into automation.
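To make the "knowledge as a graph" idea tangible, here is a conceptual sketch using the networkx library. This is not ZEROCK's API and not a full GraphRAG pipeline; the document names, edge labels, and one-hop retrieval rule are invented purely for illustration.

```python
import networkx as nx

# Documents are nodes; the relationships between them are edges.
G = nx.Graph()
G.add_edge("quote_template.docx", "pricing_policy.pdf", relation="cites")
G.add_edge("pricing_policy.pdf", "discount_approval_flow.xlsx", relation="governed_by")

def retrieve_with_context(doc: str, hops: int = 1) -> set[str]:
    """Return the document plus everything within `hops` edges of it."""
    return set(nx.single_source_shortest_path_length(G, doc, cutoff=hops))

# retrieve_with_context("pricing_policy.pdf") pulls in both linked files,
# so a summary of the pricing policy keeps its approval-flow context
# instead of arriving as an isolated chunk.
```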

Once prioritization and tech selection are done, you enter the PoC. Let me reuse the Dynatrace number: 50 percent of companies running PoCs cannot measure ROI[^4]. To avoid that, lock down "what counts as success" to a single metric before the PoC starts. Time saved, error rate, throughput, satisfaction—any one of them is fine. Hedge across multiple metrics and the PoC closes with "I think it might have worked." For the build of each phase, I have laid out the full sequence in The Five Phases of Implementing AI Agents in the Organization. Read both together and the flow gets easier to grasp.
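"Lock down the metric" can be as literal as committing it to the PoC repository before day one. A minimal sketch; the metric name and threshold are hypothetical examples, not recommendations.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PocSuccessCriterion:
    """Declare exactly ONE metric and its threshold before the PoC starts."""
    metric: str      # e.g. "hours_saved_per_week"
    threshold: float

def evaluate(criterion: PocSuccessCriterion, measured: float) -> bool:
    # One pre-declared number decides; no post-hoc "I think it worked".
    return measured >= criterion.threshold

# criterion = PocSuccessCriterion(metric="hours_saved_per_week", threshold=5.0)
# evaluate(criterion, measured=6.2)  # -> True: the PoC succeeded
```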

At WARP (our AI consulting arm), engagements that arrive without a completed audit and classification spend the entire first month doing exactly that. Tech selection comes after. Clients sometimes ask me to flip the order. I refuse. Flipping it always ends in a redo.

A 90-day roadmap to complete one department's classification

Let me close with an executable roadmap. The premise is completing one department's classification in 90 days.

Month one is interviews and inventory. Weekly site interviews, weekly updates to the task ledger, and a complete department-wide work list by month-end. By this point, you should have around 30 candidates for "stop."

Month two is decision meetings and approvals. Sort the 200 to 400 tasks per department into the four classes. Hold one two-hour meeting per week, roughly eight sessions in total. Report to the executive layer at month-end and secure approvals on retirements. If approvals slip here, every downstream step slips, so set up a fast-decision route to the CEO or an officer-level sponsor at the start. That is non-negotiable.

Month three is execution and validation. Stop the workflows in the "stop" bucket within the month. Switch the workflows in "reduce" to their new operating cadence. The automation candidates enter tech selection and PoC design. Production AI agent build kicks off in month four and beyond.

Once one department has cycled through this, the company has a shared "classification language" internally. "This is a 'stop,' right?" "Let's 'reduce' this first, then think about automation." That kind of language enters the daily conversation. From there, the second and third departments roll out in roughly half the time.

One caveat. As I covered in the Google Cloud Next 2025 report[^7], agent technology is moving fast. Tasks judged "not yet automatable" six months ago may be automatable today. Without an annual review of the classification, the work you did once goes stale. My related piece The AI Agent Trends at Google Cloud Next 2025 traces the technology trajectory and is worth a read when you revisit the classification.

Let me circle back to the starting point. The thread that runs through failed AI deployments is "no process audit." Conversely, do the audit thoroughly and tool selection and vendor choice can be sorted out later. Skip nothing here, and you stay out of the 40 percent of projects Gartner predicts will be canceled. Before deploying AI agents, open one spreadsheet. Stop, reduce, automate, humans handle it. Sorting every workflow into those four columns is where real AI-agent management begins.

References

[^1]: Gartner "Gartner Predicts Over 40% of Agentic AI Projects Will Be Canceled by End of 2027" https://www.gartner.com/en/newsroom/press-releases/2025-06-25-gartner-predicts-over-40-percent-of-agentic-ai-projects-will-be-canceled-by-end-of-2027

[^2]: Fidex Inc. "[2026 Edition] AI Agent Deployment Guide: Comprehensive Coverage of the Latest from Gartner, McKinsey, and METI" https://www.fidx.co.jp/

[^3]: NTT Data "Why simply 'deploying' generative AI fails: the design sequence sales organizations overlook" https://www.nttdata.com/jp/ja/trends/data-insight/2026/0401/

[^4]: ITmedia @IT "'Most agentic AI today does not deliver ROI'—the basis for Gartner's harsh assessment" https://atmarkit.itmedia.co.jp/ait/articles/2506/27/news030.html

[^5]: Asana "What is the ECRS principle? Explaining the 'four principles of improvement' for process optimization" https://asana.com/ja/resources/what-is-ecrs

[^6]: Nikken Tsunagu "What are the seven wastes? The core philosophy of the Toyota Production System and how to remove waste" https://www.nikken-totalsourcing.jp/business/tsunagu/column/2503/

[^7]: TIMEWELL Column "AI Agent Trends at Google Cloud Next 2025" /en/columns/google-cloud-next-2025-ai-agents-enterprise
