

by Michael R. Wade and Massimo Marcolivio • Published May 28, 2026 in Artificial Intelligence • 12 min read
In 2021, McDonald’s embarked on what seemed like a definitive step into the future: deploying AI-powered voice ordering at its drive-throughs. By automating routine interactions, the company aimed to speed up service, reduce labor costs, and streamline peak operations.
After three years of testing across more than 100 locations, the system achieved an impressive 85% order accuracy and was declared a technical success. Yet in real-world conditions, the remaining 15% proved disastrous. Viral videos showed the system adding hundreds of chicken nuggets to single orders, confusing audio streams from adjacent lanes, and suggesting culinary absurdities such as bacon on ice cream.
Operational disruption and customer frustration wiped out efficiency gains. Worse, McDonald’s faced legal exposure, including a class-action lawsuit alleging the collection of voice biometrics without proper consent. By July 2024, the company terminated the program.
This episode captures a defining tension in today’s business landscape: AI systems often work as designed but fail to deliver business value. Since generative AI burst into the mainstream in late 2022, executives have imagined sweeping transformation across every function. Reality has been more sobering. McKinsey’s 2025 State of AI report found that only 6% of organizations generated 5% or more EBIT impact from AI. PwC’s 2026 Global CEO Survey showed that just 12% of CEOs saw both cost and revenue benefits.
A large part of the problem, in our view, comes down to a lack of value discipline.
In many organizations, AI initiatives escape the scrutiny normally applied to capital investments. Because AI is framed as experimentation or innovation, decisions about value are quietly delegated to technical teams, success is defined in technical terms, and no senior executive wants to be the one who stops the project. The result is predictable: AI projects emerge without a clear theory of value, without explicit trade-offs, and without credible kill criteria.
Based on a systematic analysis of more than 30 verified AI implementations across more than a dozen industries, we worked backward from verified outcomes to the operational choices that produced them. What distinguished success from failure was not model sophistication, data volume, or technical ambition, but whether leaders made explicit choices about where value would come from, how it would be created, and how it would be measured. This insight underpins what we call the AI ROI (Return on Investment) Framework.
Successful AI initiatives begin with a clear theory of value. Our research identified four distinct drivers of AI value, organized along two dimensions: the focus of value (internal or external) and the mechanism of value (enhancing existing activities or creating new ones).
Together, these dimensions form a simple 2×2 framework that answers a fundamental strategic question: What kind of value are we creating with AI?
Once a value driver is selected, organizations can translate intent into action by choosing operational levers that align with that driver and linking them to quantitative KPIs. When companies skip this step, they drift into what we observed repeatedly: strategic limbo. Under the assumption that AI can solve many problems at once, leaders activate conflicting levers and track incoherent metrics.

The framework identifies four fundamental ways in which AI generates ROI. Their importance and feasibility depend on different factors and circumstances. Leaders should prioritize and focus based on organizational needs, and should not assume that any one quadrant is inherently superior to the others.
The most ambitious and risky driver involves using AI to create entirely new sources of revenue.
Singapore’s DBS Bank offers a compelling example. Beginning in 2019, the bank deployed more than 2,000 AI models across hundreds of use cases. Its largest returns, however, came from reimagining financial planning through a “phygital” model that embedded AI into both branch interactions and digital channels.
Relationship managers received AI-generated insights during in-person meetings, while customers received personalized digital nudges to save or invest. Over 18 months, DBS delivered 1.2 billion nudges to 13 million customers. Customers using these tools saved twice as much and invested five times more than non-users. By 2025, DBS attributed more than SGD 1 billion ($780mn) in economic value to its AI initiatives.
We identified four levers through which AI measurably creates new value.

Internal efficiency gains tend to deliver faster, more predictable returns than market-facing initiatives, particularly in asset-intensive industries.
Jubilant Ingrevia, an Indian chemical manufacturer, integrated AI and machine learning across its plants using digital twins and predictive analytics. By reducing process variability by 63% and cutting downtime by more than half, the company lowered costs while improving quality. AI-driven energy optimization reduced Scope 1 emissions by 20%, aligning productivity with sustainability goals.
The lesson is not that AI guarantees efficiency, but that efficiency emerges when AI is tightly linked to operational decisions and measured against concrete baselines.
We categorized four levers through which AI quantifiably improves internal efficiency.

Another driver focuses on strengthening existing offerings through more relevant, engaging, or efficient customer interactions.
In India, Mondelēz faced a challenge familiar to many global brands: how to support thousands of small retailers during the Diwali shopping season. The company launched a generative AI campaign featuring a digital avatar of Bollywood star Shah Rukh Khan. Shopkeepers entered their store names, and AI generated personalized video ads in which the avatar invited customers by name.
The campaign produced over 130,000 customized ads and drove more than 35% business growth for participating retailers. Here, AI did not create a new product; it amplified existing relationships through personalization at scale.
We identified four levers by which AI measurably improves customer experience.

This driver focuses on enhancing employee experience, challenging the assumption that AI inevitably undermines the workforce.
Consider Adore Me, a direct-to-consumer lingerie company later acquired by Victoria’s Secret. The firm faced a severe content bottleneck: producing thousands of SEO-optimized product descriptions drained creative teams and slowed growth. Adore Me deployed generative AI to draft descriptions trained on brand guidelines and historical content. Writers retained editorial control, but the system handled the repetitive work.
The results were measurable. Each writer saved roughly 35 hours per month, while click-through rates increased by 23%. Crucially, management started with highly structured, low-risk content to build confidence before expanding use cases. The value driver was explicit: free up human creativity by automating low-value tasks.
Our research identified four levers linked to KPIs through which AI improves employee experience.
Collaboration tackles coordination issues. Verizon used AI to coordinate complex campaigns and facilitate real-time collaboration.

If the AI ROI Framework explains how organizations generate returns, real-world failures reveal what happens when value discipline breaks down. Across industries, our analysis shows that most AI setbacks do not stem from immature models or insufficient data. Instead, they arise from a recurring pattern of misalignment between strategic intent, operational choices, and measurement.
Consider Klarna. After announcing that AI had replaced hundreds of customer-service agents, the company reversed course and resumed hiring. The chatbot reduced handling time and operating costs, but customer satisfaction deteriorated. Klarna implicitly prioritized internal efficiency gains while customers experienced a degraded service. The technology performed as intended; the value driver did not.
IBM’s Watson for Oncology illustrates a more severe form of misalignment. IBM invested billions to commercialize the system and tracked progress through technical metrics such as natural-language-processing accuracy and speed. Yet independent evaluations later found agreement with expert oncologists as low as 12%, and hospitals reported unsafe recommendations. The failure was not technical sophistication, but the absence of direct measurement against the true value driver: improved patient outcomes.
Similar dynamics appeared in fast food and aviation. Taco Bell’s AI drive-through pilots and Air Canada’s customer-service chatbot both worked technically, yet produced reputational damage and legal exposure. In each case, leaders attempted to pursue multiple objectives at once, such as cost reduction, speed, and customer experience, without clearly prioritizing a single driver. Strategic ambiguity cascaded into inconsistent operational levers and weak governance.
These cases reveal a consistent anatomy of AI failure: unclear value priorities, mismatched operational mechanisms, and metrics that reward technical success while masking commercial, reputational, or regulatory risk.
Our analysis identified three recurring measurement errors:
The technology-first fallacy. Companies track vanity metrics such as data processed or model accuracy without linking them to business outcomes. While IBM’s Watson could read medical journals faster than any clinician, it could not reliably recommend safe treatments.
The aggregate trap. Average performance masks catastrophic failures. McDonald’s 85% accuracy rate obscured systematic breakdowns for accented speech and complex orders, which was precisely where customer frustration was highest.
The baseline void. Without control groups or pre-AI baselines, organizations cannot isolate AI’s contribution. Many failed initiatives never rigorously compared AI-assisted outcomes to non-AI alternatives.
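The aggregate trap and the baseline void both come down to simple arithmetic, so a short sketch may help. The Python below is illustrative only: the segment names, order counts, and handling times are hypothetical figures invented for the example (chosen so that overall accuracy lands at 85%), and the function names are assumptions rather than part of any published tool. It is not data from McDonald’s or from any case in our analysis.

```python
from statistics import mean

# Hypothetical order log: (segment, orders_handled, orders_correct).
# All numbers are invented for illustration.
ORDERS = [
    ("simple order, clear audio", 700, 665),
    ("complex order", 200, 130),
    ("accented speech", 100, 55),
]


def overall_accuracy(rows):
    """Share of all orders handled correctly (the headline number)."""
    total = sum(handled for _, handled, _ in rows)
    correct = sum(ok for _, _, ok in rows)
    return correct / total


def accuracy_by_segment(rows):
    """Accuracy per customer segment, which the average can hide."""
    return {name: ok / handled for name, handled, ok in rows}


def uplift_vs_baseline(control, treated):
    """Difference in mean outcome between an AI-assisted group and a
    non-AI control group. A negative value means the treated mean is lower."""
    return mean(treated) - mean(control)


if __name__ == "__main__":
    print(f"Overall accuracy: {overall_accuracy(ORDERS):.0%}")
    for segment, acc in accuracy_by_segment(ORDERS).items():
        print(f"  {segment}: {acc:.0%}")

    # Hypothetical handling times in minutes (lower is better).
    control_minutes = [10, 12, 11, 13]   # pre-AI baseline or control group
    treated_minutes = [9, 10, 8, 11]     # AI-assisted group
    change = uplift_vs_baseline(control_minutes, treated_minutes)
    print(f"Change in mean handling time: {change:+.1f} min")
```

In this made-up log, the headline figure looks respectable while two of the three segments fall far below it, which is the pattern the aggregate trap describes. The second function only means something if the control group exists in the first place, which is the point of the baseline void.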
For most large organizations today, AI represents a material capital allocation decision with strategic, reputational, and regulatory implications. It should no longer be governed as an exploratory technology initiative. The roadmap below reflects how leaders treat AI when it is managed with the same discipline as major investments or acquisitions.
Organizations that consistently generate returns from AI follow a disciplined, business-led sequence. While individual implementations vary, successful leaders tend to progress through five phases that impose value discipline before technological enthusiasm takes over.
High-performing initiatives begin with a precise articulation of the performance gap to be closed. Leaders should ask “What outcome matters most right now?” rather than “What can this technology do?” Expanding market reach, reducing unit costs, improving employee effectiveness, and creating new revenue streams require fundamentally different AI strategies. Selecting a single dominant value driver creates focus and prevents diffusion of effort.
Common pitfall: approving AI initiatives that promise to improve efficiency, experience, and growth simultaneously. When everything is a priority, nothing is.
Once the value driver is clear, managers move from what to how. Each driver activates a distinct set of operational levers, such as automation, personalization, prediction, or ecosystem integration. Misalignment at this stage is costly. Deploying AI to reduce costs while simultaneously expecting premium customer experience is a common and predictable error. Discipline requires choosing levers that serve the chosen driver, even if that means deferring others.
Common pitfall: deploying a single AI system to satisfy conflicting objectives, and then being surprised when it underperforms on all of them.
Measurement is where most AI initiatives quietly fail. Effective metrics are quantitative, anchored to baselines, and sensitive to downside risk.
Common pitfall: declaring success based on technical performance while business risk accumulates off the balance sheet.
Common pitfall: scaling pilots independently, without understanding how they compete for talent, data, and executive attention.
The final and most neglected phase is governance. Value discipline requires predefined continuation and termination criteria. Leaders must specify which metrics trigger review, redesign, or shutdown. Without such thresholds, AI initiatives become faith-based investments. Stewardship transforms experimentation into accountable capital allocation.
Common failure mode: allowing AI initiatives to persist because stopping them feels like admitting failure, even when value creation has stalled.
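One way to picture predefined continuation and termination criteria is as a small decision rule that is written down before launch. The sketch below is a hypothetical illustration, not a prescribed policy: the `KillCriteria` fields, the threshold values, and the rule that a breach of the risk limit forces shutdown are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KillCriteria:
    """Thresholds agreed before launch (hypothetical values and names)."""
    review_below: float      # KPI uplift vs. baseline below this triggers review
    redesign_below: float    # ... below this triggers redesign
    shutdown_below: float    # ... below this triggers shutdown
    max_risk_incidents: int  # e.g. complaints or legal flags per quarter


def governance_decision(kpi_uplift: float, risk_incidents: int,
                        criteria: KillCriteria) -> str:
    """Map measured results to continue / review / redesign / shutdown.

    Assumption for this sketch: downside risk is checked first, so a
    project with strong KPI uplift can still be stopped.
    """
    if risk_incidents > criteria.max_risk_incidents:
        return "shutdown"
    if kpi_uplift < criteria.shutdown_below:
        return "shutdown"
    if kpi_uplift < criteria.redesign_below:
        return "redesign"
    if kpi_uplift < criteria.review_below:
        return "review"
    return "continue"


if __name__ == "__main__":
    criteria = KillCriteria(review_below=0.05, redesign_below=0.02,
                            shutdown_below=0.0, max_risk_incidents=3)
    # Hypothetical quarterly readings: (KPI uplift, risk incidents).
    for uplift, incidents in [(0.08, 0), (0.03, 1), (0.01, 1), (0.10, 5)]:
        print(uplift, incidents, governance_decision(uplift, incidents, criteria))
```

The design choice worth noting is that the thresholds are fixed in advance and the decision is mechanical. That removes the temptation to keep a stalled initiative alive simply because nobody wants to be the one to stop it.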
Debates about AI often oscillate between boundless optimism and deep skepticism. Both miss the point. AI does not create value on its own. Value emerges when leaders impose strategic discipline, namely clarifying where returns should come from, how they will be generated, and how success will be measured.
In the next phase of AI adoption, advantage will not belong to organizations with the most advanced models, but to those with the courage to say no, the rigor to measure what matters, and the discipline to shut down initiatives that do not earn their keep.
This article is based on a systematic analysis of more than 30 documented AI implementations across North America, Europe, and Asia. Cases were included only if outcomes were publicly reported, independently verifiable, and directly attributable to AI-enabled decisions.

Professor of Strategy and Digital
Michael R. Wade is Professor of Strategy and Digital at IMD and Director of the Global Center for Digital and AI Transformation. He directs a number of open programs, including Leading Digital and AI Transformation, Digital Transformation for Boards, Leading Digital Execution, Digital Transformation Sprint, Digital Transformation in Practice, and Business Creativity and Innovation Sprint. He has written 10 books and hundreds of articles, and has hosted popular management podcasts including Mike & Amit Talk Tech. In 2021, he was inducted into the Swiss Digital Shapers Hall of Fame.

Massimo D. Marcolivio is an IMD alumnus based in Switzerland. He has worked in different industries and organizations of various sizes and has 25+ years of professional experience, including 12 years at Dell Technologies. Leveraging his expertise in the business aspects of digital transformation, Marcolivio has co-developed, with Michael Wade, The Digital Transformation KPI project, an original framework to measure the impact of AI and digital technologies.
