August 6, 2025
5 min read
Sharon Goldman
Top AI researchers agree AI agents hold promise but face major challenges before becoming reliable and autonomous in real-world tasks.
From OpenAI to Nvidia: Why AI Agents Still Have a Long Road Ahead
Only in the Bay Area does spending a Saturday geeking out about AI agents, alongside 2,000 students, researchers, and tech insiders crammed into UC Berkeley, feel like a totally normal weekend plan. At the day-long Agentic AI Summit, the atmosphere was less like an academic conference and more like Silicon Valley's buzzy hotspot. The speaker lineup was impressive, featuring top AI researchers and scientists including Jakub Pachocki, chief scientist at OpenAI; Ed Chi, VP of research at Google DeepMind; Bill Dally, chief scientist at Nvidia; Ion Stoica, cofounder of Databricks and Anyscale and a UC Berkeley professor; and Dawn Song, a pioneering UC Berkeley professor focused on AI security.
The summit's focus was AI agents: AI-powered systems that can complete tasks mostly autonomously by using other software tools. Imagine a chatbot that not only suggests a vacation itinerary but also books flights and hotel reservations. As my colleague Jeremy Kahn recently noted, this kind of automation has long been a C-suite dream. Over the past decade, companies embraced robotic process automation (RPA), software that automates repetitive tasks like cutting and pasting between databases. However, traditional RPA is inflexible and limited to narrow tasks. Agentic AI aims to be more flexible and powerful, adapting to various business needs.
OpenAI CEO Sam Altman expressed optimism in a January 2025 blog post, stating, "We believe that, in 2025, we may see the first AI agents 'join the workforce' and materially change the output of companies."
Despite this hype, the overall message at the summit was cautious and grounded. AI agents remain far from fully reliable or autonomous. They often struggle to remember past interactions and can fail in real-world environments. Google DeepMind's Ed Chi emphasized the gap between polished demos and what is needed for production-ready agents. Jakub Pachocki raised concerns about the safety, security, and trustworthiness of agentic systems, especially when deployed in sensitive or autonomous settings. Sherwin Wu, head of engineering at OpenAI API, admitted, "I still don't think agents have really lived up to their promise. Certain more generic cases have worked, but my day-to-day work doesn't really feel that different with agents."
Today's AI agents don't yet live up to the massive hype, which includes claims like Salesforce CEO Marc Benioff's recent statement that he might be the "last CEO of Salesforce who only managed humans." Still, there was plenty of optimism at the summit. Ion Stoica highlighted infrastructure improvements that simplify building agentic systems. Bill Dally pointed to ongoing hardware advances that will enable more powerful and efficient agent behaviors. Several speakers noted "narrow wins" in specific domains such as coding assistance.
Though AI agents face growing pains, the packed UC Berkeley ballroom showed the industry's commitment to advancing these systems. The ultimate goal remains clear: AI agents capable of reliably operating in the real world. The payoff, researchers believe, will be worth the wait.
AI in the News
- U.S. agency approves OpenAI, Google, Anthropic for federal AI vendor list. The General Services Administration added OpenAI's ChatGPT, Google's Gemini, and Anthropic's Claude to an approved AI vendor list to accelerate government adoption. Providers commit to responsible use and compliance with federal standards.
- The AI spending boom's impact on the U.S. economy. Big Tech's $350 billion AI investment in 2025 is fueling data center construction and chip demand, potentially boosting GDP growth by up to 0.7%. Economists warn of risks if the AI boom slows.
- AI sales tool Clay raises $100 million at a $3.1 billion valuation. Clay helps sales reps find leads and convert customers. The funding round was led by CapitalG, Alphabet's investment arm.
- Google DeepMind's Genie 3 creates real-time interactive simulations. Genie 3 generates dynamic virtual worlds from text prompts, allowing navigation and interaction in AI-generated environments. Access is currently limited to select researchers as responsible deployment is explored.
Eye on AI Research
This article was originally published at Fortune on August 5, 2025.