August 6, 2025
Sharon Goldman
Top AI researchers agree AI agents show promise but still face reliability, safety, and real-world deployment challenges.
From OpenAI to Nvidia, researchers agree: AI agents have a long way to go
Only in the Bay Area does spending a Saturday geeking out about AI agents, alongside 2,000 students, researchers, and tech insiders crammed into UC Berkeley, feel like a totally normal weekend plan. At the day-long Agentic AI Summit, the atmosphere was less like an academic conference and more like Silicon Valley's buzzy hotspot. The speaker lineup was impressive, featuring Jakub Pachocki, chief scientist at OpenAI; Ed Chi, VP of research at Google DeepMind; Bill Dally, chief scientist at Nvidia; Ion Stoica, cofounder of Databricks and Anyscale and a UC Berkeley professor; and Dawn Song, a UC Berkeley professor specializing in AI security.
The summit's focus was on AI agents: AI-powered systems that can autonomously complete tasks using other software tools. Imagine a chatbot that not only suggests a vacation itinerary but also books flights and hotel reservations. As my colleague Jeremy Kahn noted in a recent article, this kind of automation has long been a C-suite dream. Over the past decade, companies have used robotic process automation (RPA) to automate repetitive tasks like data entry. However, traditional RPA is inflexible and limited to narrow tasks. Agentic AI aims to be more flexible and powerful, adapting to complex business needs.
OpenAI CEO Sam Altman expressed optimism in a January 2025 blog post, saying, "We believe that, in 2025, we may see the first AI agents 'join the workforce' and materially change the output of companies."
However, despite the hype, the overall message at the summit was cautious. AI agents remain unreliable and often fail to remember previous interactions. Ed Chi from Google DeepMind emphasized the gap between impressive demos and real-world production readiness. Jakub Pachocki highlighted concerns about safety, security, and trustworthiness, especially when agents operate autonomously in sensitive areas. Sherwin Wu, head of engineering for the OpenAI API, remarked, "I still don't think agents have really lived up to their promise. Certain more generic cases have worked, but my day-to-day work doesn't really feel that different with agents."
Though today's AI agents may not yet fulfill the massive expectations (a contrast with Salesforce CEO Marc Benioff's recent claim that digital labor will transform leadership), the summit speakers remained optimistic. Ion Stoica pointed to infrastructure improvements easing agent development. Bill Dally from Nvidia noted that hardware advances will enable more powerful and efficient agents. Several experts cited "narrow wins" in specific domains such as coding. Despite current limitations, the packed UC Berkeley ballroom reflected the industry's focus on achieving AI agents capable of reliable real-world operation. The potential payoff, they believe, will be worth the wait.
AI in the News
- U.S. agency approves OpenAI, Google, Anthropic for federal AI vendor list. The General Services Administration added OpenAI's ChatGPT, Google's Gemini, and Anthropic's Claude to an approved AI vendor list to accelerate government adoption. These providers commit to responsible use and federal compliance.
- AI spending boom impacts U.S. economy. Big Tech's $350 billion AI investment in 2025 is fueling data center construction and chip demand, potentially boosting GDP growth by up to 0.7%. Economists warn of risks if the AI boom slows.
- Sales tool Clay raises $100 million at $3.1 billion valuation. Clay, which helps sales reps find leads, raised funds led by CapitalG, Alphabet's investment arm.
- Google DeepMind's Genie 3 'world model' creates real-time interactive simulations. Genie 3 generates dynamic virtual worlds from text prompts, enabling navigation at 24 frames per second. It represents a step toward AI systems that understand and simulate real-world environments, crucial for advanced agents and general AI.
This article was originally published by Sharon Goldman at Fortune on August 5, 2025.