Companies may be betting on a world where they replace humans with artificial intelligence (AI), but that bet is failing 95% of the time, according to MIT’s NANDA Program. OpenAI’s much-hyped GPT-5 model, billed as having ‘PhD-level intelligence’, proved so underwhelming that paying subscribers revolted to get the older GPT-4o model back. At the same time, CEO Sam Altman went about shifting the definition of, and the goalposts for, the very foundation of the company’s existence: a still elusive artificial general intelligence, or AGI.

Apple’s AI deficit may seem the most visible one to thread addicts on X, but look beyond the hype and you’ll realise AI models cannot even paraphrase information from their training sets with any sort of reliability. The problem is partly the technology, which I just touched upon, and partly businesses across verticals overestimating what AI can do within their workflows. The fact that a ChatGPT or a Grok may work well for you in personal usage doesn’t mean it’ll understand organisational nuances, including legacy systems, real-world workflows and, most importantly for the accountant, whether AI is translating into a profit.

For the past few weeks, OpenAI’s GPT-5 release, disappointing in the context of months of pre-launch hype, has seemed to provide the perfect metaphor for the overestimation of AI. Altman’s Death Star social media post ahead of the reveal was perhaps ironic (and of course, didn’t age well). Those familiar with the Star Wars movie know that the Rebel Alliance destroys the Death Star. That moment came in the Battle of Yavin during the Galactic Civil War, when Luke Skywalker fired two proton torpedoes into its vulnerable reactor core via an exhaust port, creating a chain reaction.

In a way, OpenAI didn’t just blow up GPT-5’s perceived proposition; it has blown a massive hole in the grand perception that generative AI has enjoyed over the past couple of years, and in which it had found a safe cocoon. Overestimated capabilities, big pitches, grand valuations and the intended destination: funding. I’ve long said AI works only when there is a sensible enough individual prompting it, one who sees through the absolute balderdash generative AI keeps throwing up from time to time. Unsupervised, it tends to have a mind of its own, which ideally shouldn’t be leading decision making.

One of the reasons I say this is a rather worrying statistic that emerged from digital marketing platform Semrush’s AI Mode Study, released this July. It suggests that the sources most frequently cited by large language models (LLMs), including ChatGPT, Gemini (including AI Overviews), Perplexity and so on, are Reddit (40.11% citation frequency), Wikipedia (26.33%) and YouTube (23.52%). Google Search, our traditional way to hunt for information on the internet (and by no means foolproof; that’s where human instincts and context are necessary), comes in only fourth. The top three tend to be rife with misinformation. If that is the foundation for AI’s knowledge, we aren’t exactly best placed for grand artificial intelligence dreams.

If we are still looking for metaphors, Meta provides one. Just this week, word emerged of a hiring freeze at the company, after it had reportedly thrown money at the problem with recent high-profile hirings. This is the same company that once told us to dream about a metaverse (tech companies began to do ‘metaverse press conferences’, which were basically shambolic 3D affairs) and even changed its name from Facebook to Meta. The metaverse isn’t happening. The hiring freeze might well be the closest we’ll come to any recognition that the AI emperor might be wearing fewer clothes than Silicon Valley would like to admit.

Back to the MIT study I referenced earlier. The numbers from MIT’s NANDA Program paint a rather sobering picture: 95% of companies attempting AI pilots are failing to achieve meaningful results. This isn’t a minor stumble. It’s a systematic failure that suggests something fundamental is broken in how we’re approaching AI implementation. Companies across industries are discovering that the gap between AI’s promise and its practical application is far wider than anyone anticipated.

The other part of the problem is that businesses are systematically overestimating what AI can do within their existing workflows. What compounds this is AI models struggling with basic tasks such as reliably paraphrasing information from their training data, let alone handling complex, nuanced business operations. That applies to businesses of every kind, including the ones that make AI and the ones that fall for the tall claims made by AI companies and deploy ‘copilots’ at the cost of human employment.

Investors have poured unprecedented amounts of money into AI startups, often based on little more than algorithmic promise, a glittering presentation, and vague market potential laced with complicated terminology. Reality may well be settling in, and the discontent could well stem from what has unfolded in the past few weeks. Altman admitted this week that there may well be an AI “bubble” and, coupled with the MIT report putting reality in the spotlight, tech stocks have been sliding since.

The great AI bubble isn’t just an economic phenomenon. It’s a cultural one and, most of all, a human one. No matter how incompetent a human employee may be, they’ll likely still be smarter than a bumbling chatbot. It is a question of when and not if: this bubble of hype will eventually burst. The only question is whether the landing will be soft enough for the industry (and I count enterprises that happily replace humans with AI experiments within this) to learn from its mistakes, be pragmatic, and find a more sustainable use for AI. One that doesn’t overestimate machines and undervalue humans.

Vishal Mathur is the Technology Editor at HT. Tech Tonic is a weekly column that looks at the impact of personal technology on the way we live, and vice versa. The views expressed are personal.