From OpenAI to Nvidia: Why AI Agents Still Have a Long Road Ahead

Only in the Bay Area does spending a Saturday geeking out about AI agents, alongside 2,000 students, researchers, and tech insiders crammed into UC Berkeley, feel like a totally normal weekend plan. At the day-long Agentic AI Summit, the atmosphere felt less like an academic conference and more like a buzzy Silicon Valley brunch spot. The speaker lineup featured top AI researchers and scientists: Jakub Pachocki, chief scientist at OpenAI; Ed Chi, VP of research at Google DeepMind; Bill Dally, chief scientist at Nvidia; Ion Stoica, cofounder of Databricks and Anyscale and a UC Berkeley professor; and Dawn Song, a pioneering UC Berkeley professor focused on AI security.

The buzz was also driven by the topic: AI agents, defined as AI-powered systems that can autonomously complete tasks using other software tools. Imagine a chatbot that not only suggests a vacation itinerary but also books the flights and hotel reservations. As my colleague Jeremy Kahn noted recently, this kind of automation has long been a C-suite dream. Over the past decade, companies embraced robotic process automation (RPA) to automate repetitive tasks like cutting and pasting between databases. However, traditional RPA systems are inflexible and handle only narrow tasks. Agentic AI aims to be more flexible and powerful, adapting to diverse business needs. In a January 2025 blog post, OpenAI CEO Sam Altman wrote, “We believe that, in 2025, we may see the first AI agents ‘join the workforce’ and materially change the output of companies.”

Despite the hype, the overall message at the Agentic AI Summit was cautious and grounded: AI agents are the buzziest trend today, but the technology still has a long way to go. Agents aren’t always reliable and often fail to remember prior context. Google DeepMind’s Ed Chi emphasized the gap between what agents can do in curated demos and what they can do in real-world production environments.
OpenAI’s Jakub Pachocki highlighted concerns about safety, security, and trustworthiness, especially when agents operate autonomously or in sensitive applications. “I still don’t think agents have really lived up to their promise,” said Sherwin Wu, head of engineering for the OpenAI API. “Certain more generic cases have worked, but my day-to-day work doesn’t really feel that different with agents.”

Today’s AI agents may not yet live up to the massive hype (consider Salesforce CEO Marc Benioff’s recent claim that a shift to digital labor means he will be the “last CEO of Salesforce who only managed humans”), but the summit speakers remained optimistic. Ion Stoica expressed enthusiasm about infrastructure improvements that make it easier to build agentic systems. Nvidia’s Bill Dally suggested that hardware advances will enable more powerful and efficient agent behavior. Several speakers pointed to “narrow wins” in specific domains like coding.

Though AI agents still face growing pains, the packed UC Berkeley ballroom showed that the industry’s focus remains on the prize: AI agents that can reliably operate in the real world. The payoff, the speakers believe, will be worth the wait.

AI in the News

U.S. agency approves OpenAI, Google, Anthropic for federal AI vendor list. The General Services Administration added OpenAI’s ChatGPT, Google’s Gemini, and Anthropic’s Claude to an approved AI vendor list to accelerate government use. Providers commit to responsible use and federal standards.
The AI spending boom’s economic impact. Big Tech’s $350 billion AI investment in 2025 is fueling data center construction and demand for chips and servers, potentially boosting U.S. GDP growth by up to 0.7%. Economists warn of risks if the AI boom slows.
AI sales tool Clay raises $100 million at a $3.1 billion valuation. Clay helps sales reps find leads and convert them to customers. The funding round was led by CapitalG, Alphabet’s investment arm.

Eye on AI Research

Google DeepMind’s Genie 3 creates real-time interactive simulations. Genie 3 can generate rich, interactive virtual worlds from text prompts, letting users navigate consistent AI-generated environments that respond to their commands. Access is currently limited to select researchers.

Brain Food

Could "depth of thought" be key to AI reasoning? Singapore’s Sapient Intelligence released the Hierarchical Reasoning Model (HRM), a tiny AI inspired by the brain’s layered thinking. Despite being 100x smaller than ChatGPT and trained on just 1,000 examples, HRM solves complex logic tasks by reasoning internally, hinting at a shift where depth of thought may matter more than scale.

Source:

From OpenAI to Nvidia, researchers agree: AI agents have a long way to go

