Social Dolphin Services
SDS · Field notes

AI's hidden tax: why 73% of executives are missing ROI, and how to land in the 27%

The companies whose AI is paying back are not running better models. They are running better workflows.

Type: Field note
Date: 12 May 2026
Audience: Executives evaluating their AI program

Globalization Partners released its 2026 AI at Work Report this week, surveying 2,850 leaders across six global markets. The headline: 73% of executives say at least some of their AI investments fell short of expectations over the past 12 months. The share of leaders describing their organizations as "aggressively using AI to innovate" dropped from 60% to 42% in a single year. Nearly seven in ten say they are prepared to cut AI budgets if goals are not met this year.

Read the G-P numbers alongside PwC's 2026 AI Performance Study, which finds that 20% of firms are capturing roughly three-quarters of AI's economic value, and the picture is not "AI does not work." It is that a small minority of companies are getting outsized returns while most of the field is stuck in pilot purgatory. The boardroom mood has shifted from "let's try things" to "show me what we got."

SDS exists to serve the 27% who are getting it right and to move more companies into that group. This piece is how we think about the move.

The hidden tax most teams are paying

The G-P report names a friction that does not show up in the AI vendor pitch deck: 69% of executives say the time employees spend monitoring, reviewing, and updating AI-generated work has gone up over the past year. The model produces output. A person checks it, rewrites parts of it, decides which parts to trust. The time saved on the first draft gets eaten by the time spent on the rework.

It gets worse. 88% of executives in the same study are concerned that employees are using AI to "perform productivity," with 47% very or extremely concerned this is already happening. And 82% of executives admit that AI has lowered the value they place on human employees, which is a quiet way of saying the social contract underneath the work is slipping.

None of these are model problems. They are deployment problems. The model is doing what it was asked to. The mistake was asking it without designing the workflow it would live inside.
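The rework tax described above is simple arithmetic, and it helps to write it down. The sketch below is illustrative only: the `TaskTiming` fields and every number in the example are assumptions made up for this note, not figures from the G-P report or from any client.

```python
from dataclasses import dataclass


@dataclass
class TaskTiming:
    """Minutes per task. All values used below are hypothetical."""
    manual_minutes: float    # doing the task with no AI involved
    ai_draft_minutes: float  # prompting and waiting for the AI draft
    review_minutes: float    # checking the AI draft
    rework_minutes: float    # rewriting the parts that are not trusted


def net_minutes_saved(t: TaskTiming) -> float:
    """Time saved per task after review and rework are paid for."""
    ai_total = t.ai_draft_minutes + t.review_minutes + t.rework_minutes
    return t.manual_minutes - ai_total


def rework_tax_share(t: TaskTiming) -> float:
    """Fraction of the gross first-draft saving eaten by review and rework."""
    gross_saving = t.manual_minutes - t.ai_draft_minutes
    if gross_saving <= 0:
        return 1.0  # there was no gross saving to tax
    return min(1.0, (t.review_minutes + t.rework_minutes) / gross_saving)


# Hypothetical task: one hour by hand, five minutes to draft with AI,
# then 45 minutes of checking and fixing.
example = TaskTiming(manual_minutes=60, ai_draft_minutes=5,
                     review_minutes=20, rework_minutes=25)
print(net_minutes_saved(example))           # 10
print(round(rework_tax_share(example), 2))  # 0.82
```

In this made-up case the vendor deck would show 55 minutes saved per task, while the team experiences 10. The gap between those two numbers is the tax.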

The measurement problem under the measurement problem

Most AI programs are being scored on the wrong axis. The CFO asks: how many hours did this save us? The team reports a number. The number is small relative to what was promised, the program looks expensive, the budget gets cut.

The miss is that the early returns on AI are mostly not labor savings. They show up in decision quality, predictive power, cycle time on judgment-heavy work, and revenue protected by catching things humans would have missed. Katy George at Time put it sharply this week: leaders are measuring AI in the wrong places. Labor cost is a lagging indicator. Decision quality is the leading one. By the time the cost-savings number is big enough to make the spreadsheet land, the company is already three quarters behind whichever competitor scored their AI program on the right axis.

When we work with executives on this, the first move is almost never "build more." It is "define 1 to 3 outcomes the AI program is actually responsible for moving, and stop scoring it on everything else."

From experimentation to accountability

Most AI programs live in what we call "experiment land." The orgs in the 27% have crossed into "accountability land." The difference is not the budget or the model. It is the operating posture.

Experiment land | Accountability land
Pilots with vague goals and tool-first experiments | Outcome-first, with a roadmap tied to concrete business metrics
Measured by usage and hours saved | Measured by revenue, risk, learning, and decision quality
Hidden human labor fixing AI outputs | Designed workflows where AI surfaces exceptions and the human reviews those
Human value quietly devalued, trust eroded | Human expertise explicitly elevated as the arbiter and trainer of the AI system
Stuck in pilot purgatory with no scale | Repeatable patterns that compound across teams and geographies

Each row in that table is a design decision. None of them is a model upgrade. None of them requires you to bet the company on a new technology. They require someone to sit down and answer four questions for each AI use case in the portfolio, with the discipline to act on the answers.
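To make the third row of the table concrete, here is a minimal sketch of exception-driven review: material decisions always go to a person, recoverable ones are auto-approved unless the model is unsure. The names (`AIOutput`, `route`), the `confidence` field, and the 0.9 threshold are all assumptions for illustration; a real workflow would define "material" and pick thresholds per use case.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    AUTO_APPROVE = "auto_approve"
    HUMAN_REVIEW = "human_review"


@dataclass
class AIOutput:
    case_id: str
    confidence: float  # hypothetical score in [0, 1] attached to the output
    material: bool     # True if a wrong call is costly or hard to undo


def route(output: AIOutput, min_confidence: float = 0.9) -> Route:
    """Send only the exceptions to a human; let the rest flow through."""
    if output.material:
        return Route.HUMAN_REVIEW  # material decisions always get a reviewer
    if output.confidence < min_confidence:
        return Route.HUMAN_REVIEW  # recoverable, but the model is unsure
    return Route.AUTO_APPROVE      # recoverable and confident


# Hypothetical batch of three AI outputs.
batch = [
    AIOutput("case-1", confidence=0.97, material=False),
    AIOutput("case-2", confidence=0.99, material=True),
    AIOutput("case-3", confidence=0.62, material=False),
]
review_queue = [o.case_id for o in batch if route(o) is Route.HUMAN_REVIEW]
print(review_queue)  # ['case-2', 'case-3']
```

The point of the design is that the reviewer sees two cases instead of three here, and a far smaller share at volume, without any change to the model.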

The four questions we work through with clients

This is the work, condensed.

  1. What 1 to 3 outcomes is this AI program responsible for moving? Not "usage." Not "hours saved." Concrete business outcomes: time to decision on a class of cases, revenue per rep, churn, forecast accuracy, exception rate, customer retention, regulatory rework. If a use case cannot map to one of those, it should not be on the roadmap.
  2. Which current and proposed use cases actually move those outcomes? This is the kill list. Most AI portfolios have eight to fifteen pilots in flight. If the concentration PwC found across firms also holds inside a portfolio, fewer than four of them are likely to matter. The rest are running because nobody has been asked to stop them. We help clients ask.
  3. How do humans add leverage in the workflow, instead of doing rework? Scoped tool wrappers, not raw model access. Approval queues on the material decisions, auto-approve on the recoverable ones. Exception-driven review, not output-by-output babysitting. The 69% rework number is the symptom of a workflow that was never designed; building the workflow is the work.
  4. What leading indicators tell us whether this is working, before the P&L does? Cycle time on the workflow the AI lives inside. Rework rate on AI-generated output. Decision-quality scores on the cases the AI touched. Forecast accuracy where the AI is feeding a forecast. These are the metrics that change in week six, not quarter four.

If the answers to those four questions are honest, the AI portfolio shrinks by a third, the survivors get instrumented, and the rework tax starts to come down inside one quarter. That is the move from experiment land to accountability land. It is not a model purchase.
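Two of the leading indicators from question 4, rework rate and cycle time, can be computed from a plain log of cases the AI touched. The sketch below assumes such a log exists; the `CaseRecord` fields and the sample data are hypothetical, chosen only to show the shape of the calculation.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class CaseRecord:
    week: int           # week of the rollout in which the case closed
    cycle_hours: float  # start-to-finish time for the workflow
    reworked: bool      # True if a person had to rewrite the AI output


def weekly_indicators(records):
    """Group cases by week and report rework rate and median cycle time."""
    by_week = {}
    for r in records:
        by_week.setdefault(r.week, []).append(r)
    return {
        week: {
            "rework_rate": sum(r.reworked for r in rs) / len(rs),
            "median_cycle_hours": median(r.cycle_hours for r in rs),
        }
        for week, rs in sorted(by_week.items())
    }


# Hypothetical log: four cases in week 1, four in week 6.
log = [
    CaseRecord(1, 10, True), CaseRecord(1, 12, True),
    CaseRecord(1, 14, True), CaseRecord(1, 16, False),
    CaseRecord(6, 6, False), CaseRecord(6, 7, True),
    CaseRecord(6, 8, False), CaseRecord(6, 9, False),
]
for week, stats in weekly_indicators(log).items():
    print(week, stats)
# 1 {'rework_rate': 0.75, 'median_cycle_hours': 13.0}
# 6 {'rework_rate': 0.25, 'median_cycle_hours': 7.5}
```

Neither number needs the P&L to move first, which is why they are useful as early signals.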

What this article is not

  • Not a critique of any specific AI vendor or model. The capability is real and is improving fast. The deployment posture is the leverage point.
  • Not a promise that every AI program can clear its ROI bar with the right scaffolding. Some pilots should be killed, not rescued. We will tell you which.
  • Not a pricing or timeline commitment. Every engagement gets a discovery call before we say what something costs.
  • Not a prescription for a specific company. We do not know your stack, your decision-making cadence, or which of your AI pilots is actually doing useful work yet. We know the shape of the questions, and we know which answers correlate with companies in the 27%.

One-sentence takeaway

The 27% of executives whose AI investments are paying back are not running better models; they are running better workflows: outcome-first goals, scoped tools, human expertise routed through approval queues, and leading indicators on decision quality instead of headcount math.

Talk to us

If you are reading this and the 73% number lands a little too close to home, the next move is a 30-minute conversation. Bring the list of AI initiatives you have in flight. We will walk through which ones are pointed at real outcomes, where the hidden rework tax is showing up in your team, and what would have to change for the program to land you in the 27%.

We do not take every engagement, and we will tell you on the call whether we are the right partner. Either way, you will leave the conversation with a sharper read on the portfolio than you came in with.

Sources