AI agents are broken. Is GPT-5 really the answer?

AI agents struggle with real-world tasks due to compounding errors and vulnerabilities; GPT-5 may not fully solve these fundamental issues.

August 7, 2025

Chris Taylor

Agentic AI tests have revealed fundamental flaws, and OpenAI's next-generation LLM may not have the tools to address them.

As 2025 dawned, OpenAI CEO Sam Altman was promoting two developments he insisted would transform our lives. One, of course, was GPT-5, a long-anticipated major upgrade to the Large Language Model (LLM) that powered ChatGPT's rise to tech world superstardom. The other? AI Agents that don't just answer your queries like ChatGPT, but actually get stuff done for you.

"We believe that, in 2025, we may see the first AI agents join the workforce and materially change the output of companies," Altman wrote back in January.

Well, we're eight months in, and Altman's prediction already needs a big old asterisk. Sure, companies are keen to adopt AI Agents, such as OpenAI's ChatGPT agent. In a May 2025 report, consultancy giant PwC found that half of all firms surveyed planned to implement some kind of AI Agent by the end of the year. Some 88% of executives want to increase their teams' AI budgets because of Agentic AI.

The Reality of AI Agents: Disappointing Performance

But what about the actual AI Agent experience? With apologies to all those hopeful executives, the reviews are almost uniformly negative. If "AI Agents" were a new high-tech James Bond movie, here are the kinds of blurbs you'd see on Rotten Tomatoes:

"glitchy … inconsistent" (Wired)
"came off like a clueless internet newbie" (Fast Company)
"reality doesn't live up to the hype" (Fortune)
"not matching up to the buzzwords" (Bloomberg)
"the new vaporware … overpromising is worse than ever" (Forbes)

Study Finds OpenAI's Entry Failed Nearly Every Time

May 2025 Carnegie Mellon University study

OpenAI struggled to fill GPT-5 with enough improvements to make it worthy of the release number

AI Agent engineer's analysis

prone to hallucination, like all AI

Replit AI Agent that literally deleted a customer's database after 9 days of working on a coding task

offering insurance on your AI Agent going haywire

Walmart has had to bring in four "super Agents"

recent Gartner paper

What Can GPT-5 Do for AI Agents?

talking a big game on AI Agents

Amazon has shut down the ability of any AI Agent

one vulnerability seen by researchers

Originally published at Mashable on August 7, 2025.

Frequently Asked Questions (FAQ)

Understanding AI Agents

Q: What are AI Agents, and how do they differ from traditional chatbots like ChatGPT?

Q: What are the primary challenges currently facing AI Agents?

Q: How do compounding errors affect AI Agents?

Q: What is "agent washing" in the context of AI Agents?

Q: What are the security risks associated with AI Agents?

GPT-5 and the Future of AI Agents

Q: Is GPT-5 expected to solve the current problems with AI Agents?

Q: What impact do guardrails and regulations have on AI Agent capabilities?

Q: What is the prediction for the future adoption of AI Agents?
