Research shows AI agents are highly vulnerable to hijacking attacks

Experts from Zenity Labs have demonstrated that some of the most widely used AI agents and assistants from Microsoft, Google, OpenAI, and other major companies are susceptible to hijacking attacks with minimal or no user interaction. This research highlights significant security risks including data theft, workflow manipulation, and user impersonation. During a presentation at the Black Hat USA cybersecurity conference, Zenity researchers revealed how attackers could exfiltrate sensitive data, disrupt critical organizational workflows, and in some cases, impersonate users by exploiting vulnerabilities in these AI agents. Beyond initial infiltration, attackers could also achieve memory persistence, maintaining long-term access and control over the compromised agents.

“They can manipulate instructions, poison knowledge sources, and completely alter the agent’s behavior,” said Greg Zemlin, product marketing manager at Zenity Labs. “This opens the door to sabotage, operational disruption, and long-term misinformation, especially in environments where agents are trusted to make or support critical decisions.”

Vulnerabilities Demonstrated

Zenity Labs researchers demonstrated vulnerabilities in multiple popular AI agents:

OpenAI’s ChatGPT: Compromised via an email-based prompt injection that granted attackers access to connected Google Drive accounts.
Microsoft Copilot Studio: Customer-support agents leaked entire CRM databases. Researchers found over 3,000 agents in the wild vulnerable to leaking internal tools.
Salesforce Einstein: Manipulated to reroute customer communications to attacker-controlled email accounts.
Google Gemini and Microsoft 365 Copilot: Could be turned into insider threats, enabling social engineering attacks and theft of sensitive conversations.

Industry Response

Microsoft: Confirmed that ongoing systemic improvements have rendered the reported attacks ineffective. They emphasized built-in safeguards and access controls in Copilot agents and committed to further hardening their systems.
OpenAI: Issued a patch for ChatGPT and maintains a bug bounty program encouraging disclosure of such vulnerabilities.
Salesforce: Fixed the reported issue affecting their Einstein platform.
Google: Recently deployed layered defenses addressing prompt injection attacks, underscoring the importance of multi-layered security strategies.

Broader Implications

“Unfortunately, most agent-building frameworks, including those offered by AI giants such as OpenAI, Google, and Microsoft, lack appropriate guardrails, putting the responsibility for managing the high risk of such attacks in the hands of companies.”

Frequently Asked Questions (FAQ)

AI Agent Security

Q: What are AI agents and why are they vulnerable?

Q: What kind of data can be compromised through AI agent hijacking?

Q: Which major companies' AI agents were found to be vulnerable?

Q: What is "prompt injection" in the context of AI agent security?

Q: How can organizations protect their AI agents from hijacking attacks?

Crypto Market AI's Take

AI-powered trading tools

market intelligence

Research shows AI agents are highly vulnerable to hijacking attacks

Research shows AI agents are highly vulnerable to hijacking attacks

Vulnerabilities Demonstrated

Industry Response

Broader Implications

Frequently Asked Questions (FAQ)

AI Agent Security

Crypto Market AI's Take

More to Read: