August 6, 2025
Sharon Goldman
Top AI researchers agree that, despite the hype, AI agents remain unreliable and face significant challenges before widespread real-world use.
From OpenAI to Nvidia: Why AI Agents Still Have a Long Road Ahead
Only in the Bay Area does spending a Saturday geeking out about AI agents, alongside 2,000 students, researchers, and tech insiders crammed into UC Berkeley, feel like a totally normal weekend plan. At the day-long Agentic AI Summit, the atmosphere was less like an academic conference and more like Silicon Valley's buzzy brunch spot.

The speaker lineup was impressive, featuring top AI researchers and scientists including Jakub Pachocki, chief scientist at OpenAI; Ed Chi, VP of research at Google DeepMind; Bill Dally, chief scientist at Nvidia; Ion Stoica, cofounder of Databricks and Anyscale and UC Berkeley professor; and Dawn Song, a pioneering UC Berkeley professor focused on AI security.

The summit's topic, AI agents, has attracted significant attention. AI agents are generally defined as AI-powered systems that can autonomously complete tasks using other software tools. Imagine a chatbot that not only suggests a vacation itinerary but also books flights and hotel reservations. As my colleague Jeremy Kahn noted in a recent article, "This kind of automation is a perennial C-suite fever dream."

Over the past decade, companies embraced robotic process automation (RPA), software that automates repetitive tasks like cutting and pasting between databases. However, traditional RPA systems are inflexible and handle only narrow tasks. Agentic AI aims to be more flexible and powerful, adapting to diverse business needs. In a January 2025 blog post, OpenAI CEO Sam Altman said, "We believe that, in 2025, we may see the first AI agents 'join the workforce' and materially change the output of companies."

Despite the hype, the overall message at the Agentic AI Summit was cautious and grounded: AI agents still have a long way to go. They aren't always reliable and often fail to remember prior context. Google DeepMind's Ed Chi emphasized the gap between what agents can do in curated demos and what they can do in real-world production environments. Jakub Pachocki highlighted concerns about the safety, security, and trustworthiness of agentic systems, especially when integrated into sensitive or autonomous applications.

"I still don't think agents have really lived up to their promise," said Sherwin Wu, head of engineering at OpenAI API. "Certain more generic cases have worked, but my day-to-day work doesn't really feel that different with agents."

While today's AI agents may not yet fulfill the massive hype (consider Salesforce CEO Marc Benioff's recent claim that a shift to digital labor means he will be the "last CEO of Salesforce who only managed humans"), there remains optimism among experts. Ion Stoica from Databricks expressed enthusiasm about infrastructure improvements that simplify building agentic systems. Nvidia's Bill Dally pointed to ongoing hardware advances that will enable more powerful and efficient agent behaviors. Several speakers noted "narrow wins" in specific domains such as coding assistance.

Though AI agents face growing pains, the packed UC Berkeley ballroom reflected the industry's focus on the ultimate goal: AI agents that can reliably operate in the real world. The payoff, they believe, will be well worth the wait.

AI in the News
- U.S. agency approves OpenAI, Google, Anthropic for federal AI vendor list. The General Services Administration added OpenAI's ChatGPT, Google's Gemini, and Anthropic's Claude to an approved AI vendor list to accelerate government adoption. These providers commit to responsible use and federal compliance.
- The AI spending boom's impact on the U.S. economy. Big Tech's $350 billion AI investment this year is fueling data center construction and chip demand, potentially boosting GDP growth by up to 0.7% in 2025. Economists warn of risks if the AI boom slows.
- AI sales tool Clay raises $100 million at a $3.1 billion valuation. Clay, which helps sales reps find leads and convert customers, secured funding led by CapitalG, Alphabet's investment arm.
- Google DeepMind's Genie 3 world model creates real-time interactive simulations. Genie 3 generates rich, interactive virtual worlds from simple text prompts, enabling navigation at 24 frames per second. This advances DeepMind's goal of AI systems that understand and simulate real-world environments, key to training advanced agents and artificial general intelligence. Access is currently limited to select researchers.
This article was originally published at Fortune on August 5, 2025.