95% of AI Pilots Deliver No Economic Return

The report that generated that headline and caused a stir is this one:
https://mlq.ai/media/quarterly_decks/v0.1_State_of_AI_in_Business_2025_Report.pdf

This headline stirred some controversy a week ago, when outlets like Forbes gave it an interesting twist:
https://www.forbes.com/sites/jaimecatmull/2025/08/22/mit-says-95-of-enterprise-ai-failsheres-what-the-5-are-doing-right/

They turned it into a headline like this:

“95% of enterprise AI fails”

The way it’s phrased changes things quite a lot.

Because what is actually being discussed is a PILOT, not a product launched into production and operating with real users 24/7.

In fact, going back to what the MIT report’s executive summary says:

“Despite $30–40 billion in enterprise investment into GenAI, this report uncovers a surprising result in that 95% of organizations are getting zero return. The outcomes are so starkly divided across both buyers (enterprises, mid-market, SMBs) and builders (startups, vendors, consultancies) that we call it the GenAI Divide. Just 5% of integrated AI pilots are extracting millions in value, while the vast majority remain stuck with no measurable P&L impact. This divide does not seem to be driven by model quality or regulation, but seems to be determined by approach.”

So there are a couple of interesting takeaways behind this story, patterns that have recurred in the IT world with other technologies we’ve since normalized:


1. 95% of pilots are worthless, but the 5% generate millions.

This is nothing new; the same thing happened with apps in the Android and iOS stores.

At the beginning of the app store era, there was a gold rush to create apps for everything, even with absurd use cases. But people on their phones (check yours if you want) only really keep a handful of useful apps, or ones that provide satisfaction or connection:

WhatsApp/Telegram/Signal, TikTok, X, Instagram/Facebook, LinkedIn, YouTube, Spotify, Duolingo, Tinder, banking app, Uber, Maps, web browser(s), a game or two, notes app, and maybe one specific work-related app.

And that’s it. You don’t need more.

With AI, and Generative AI in particular, the same thing is happening. And since Generative AI makes it easy to spin up pilots with minimal effort, the percentage of “failures” obviously rises.

This leads to a classic truth of crappy software projects that has existed since the beginning of this industry:

Low cost, low stakes, low commitment, low quality, zero results.


2. Tangible profit depends on the approach.

Or, in business jargon: there must be a business case with a positive ROI that justifies the investment.

Or, in plain language: build something useful.
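To make that concrete, here is a minimal back-of-the-envelope sketch in Python of the arithmetic a pilot should survive before scaling. Every figure is a made-up assumption for illustration, not data from the report:

```python
# Toy business case for an AI pilot. All numbers are hypothetical
# placeholders; plug in your own before drawing any conclusion.

pilot_cost = 50_000          # one-off build cost (EUR)
annual_run_cost = 24_000     # inference, hosting, maintenance per year (EUR)
hours_saved_per_week = 30    # staff time the tool actually frees up
hourly_cost = 35             # fully loaded cost of that staff time (EUR/hour)

annual_benefit = hours_saved_per_week * hourly_cost * 52
first_year_cost = pilot_cost + annual_run_cost
first_year_roi = (annual_benefit - first_year_cost) / first_year_cost

print(f"Annual benefit:  {annual_benefit:,} EUR")   # 54,600 EUR
print(f"First-year cost: {first_year_cost:,} EUR")  # 74,000 EUR
print(f"First-year ROI:  {first_year_roi:.0%}")     # -26%
```

With these invented numbers the pilot is underwater in year one and only breaks even later, which is precisely the conversation most of the 95% never have before declaring victory or quietly shelving the thing.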

I often scroll through LinkedIn to stay up to date, and I’m honestly sick of the automation hype around tools like n8n and similar systems sold as a panacea. They are genuinely useful for automating repetitive task chains with some variability, something that was nearly impossible before LLMs. But if you really did have someone employed solely for those kinds of tasks, it’s your entire business model you need to rethink.
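To be fair to the underlying use case, here is a minimal sketch of the kind of “repetitive chain with variability” step these tools wrap: routing free-text support tickets into fixed categories. It assumes the official openai Python client; the model name, category list, and example ticket are all illustrative assumptions, not a recommendation:

```python
# Sketch: one LLM step inside an automation chain, classifying messy
# free-text tickets into fixed categories. The variability (typos,
# phrasing, mixed languages) is what made this brittle before LLMs.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

CATEGORIES = ["billing", "technical", "cancellation", "other"]  # hypothetical

def route_ticket(text: str) -> str:
    """Return exactly one category label for a free-text ticket."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        temperature=0,        # we want a stable label, not creative prose
        messages=[
            {"role": "system",
             "content": "Classify the user's ticket into exactly one of: "
                        + ", ".join(CATEGORIES)
                        + ". Reply with the category name only."},
            {"role": "user", "content": text},
        ],
    )
    label = response.choices[0].message.content.strip().lower()
    # Guard against the model improvising a category outside the list.
    return label if label in CATEGORIES else "other"

print(route_ticket("hi, got charged twice last month?? pls fix"))  # billing
```

Note that the guard on the last line of the function is the unglamorous part the hype skips: reliability checks and human review are still on you, which is exactly the warning that comes up again in the table further down.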

That said, the good thing about this wave of Generative AI digitalization is that it’s forcing businesses to formalize and document how they work, something that would have been useful anyway, but that they now see as necessary to avoid being left behind.


How does this affect us?

Some friends asked me recently:

“Will we go back to an AI winter?”

I doubt it. Rather, we’re in an AI bubble that will burst sooner rather than later. And with the global economy in terrible shape, a bubble bursting would actually be a welcome correction.

The irony is that even the person who did the most to inflate the bubble, Sam Altman (CEO of OpenAI and former president of Y Combinator, one of the world’s best-known startup accelerators), admits it:
https://arstechnica.com/information-technology/2025/08/sam-altman-calls-ai-a-bubble-while-seeking-500b-valuation-for-openai/


So where does this leave those of us who work in this field?

There are several types of users coexisting in the AI world, each approaching it differently. Here’s a summary:


AI Adoption Groups

Group: Researchers and cutting-edge foundational model developers
Description: Develop the scientific and technical foundation.
Education required: Very high (PhD, research).
Data governance & privacy: Very strict.
ROI of investment: Long-term.
Challenges:
1. Sustained funding.
2. Already thinking about the next architectures. Transformers are “old” (almost 10 years: https://arxiv.org/abs/1706.03762), and the field is now working on World Models to fix known LLM issues. As Turing Award winner Yann LeCun argues, LLMs are “boring”: https://youtu.be/YLDUYm_46n0?si=a3k5IzzVZ_NiUD_H&t=24295.

Group: Cutting-edge AI startups
Description: Build innovative solutions that try to apply the latest AI.
Education required: High (technical + entrepreneurial).
Data governance & privacy: Intermediate.
ROI of investment: High risk, high reward.
Challenges:
1. Fierce competition.
2. Lack of proprietary data.
3. Must find viable business models. Will users pay 20x current costs to sustain them if model prices don’t drop? Example: Cursor: https://youtu.be/cMLqa7cJ64I?si=ymBxBWOceJJRaRQi&t=247.

Group: Mid-to-large enterprises
Description: Co-create data solutions, classical AI, and Generative AI. (This is actually the space where we work on Galde, by the way.)
Education required: Mixed (technology + business).
Data governance & privacy: High, and necessary to deploy AI effectively.
ROI of investment: Medium-high.
Challenges:
1. Legacy system integration.
2. Organizational culture change.
3. Regulatory barriers (e.g., the AI Act).
4. Creating technically and economically valid prototypes that pass the business case beyond the pilot.
5. Keeping up with trends in data products, classical AI, and Generative AI.

Group: Micro and small businesses
Description: AI-powered automation.
Education required: Medium.
Data governance & privacy: Low, though they must have documented processes to automate.
ROI of investment: Quick but limited.
Challenges:
1. Limited resources.
2. Identifying the critical processes where the investment pays off.
3. Reliability issues in automation: don’t “cut corners” by skipping professional review (lawyers, designers, etc.).
4. Lack of training in using Generative AI and in setting realistic expectations.

Group: Individual users
Description: Personal use with free tools.
Education required: Variable.
Data governance & privacy: Practically none.
ROI of investment: Personal, not economic.
Challenges:
1. Privacy risks with uploaded documents, especially in free versions.
2. Result reliability: don’t skip professional review (lawyers, designers, etc.).
3. Lack of training in using Generative AI and in setting realistic expectations.

As you can see, two concerns cut across every group: education and privacy.

That’s why I’ll leave you with two videos from people who explain this far better than I do: Andrew Ng, on education, and Meredith Whittaker, on privacy.

Expanded reflection in The Batch: https://www.deeplearning.ai/the-batch/issue-292/

[Embedded video: Andrew Ng on education]

[Embedded video: Meredith Whittaker on privacy]

Oh, and while you’re at it, don’t forget to subscribe to the Bennytacora newsletter in the box above. Let’s keep in touch.
