What Happened to "What Happened After 2,000 People Tried to Hack My AI Assistant"?
In June 2026, Fernando Irarrázaval conducted an experiment where his OpenClaw AI assistant, Fiu, was subjected to over 6,000 hacking attempts by more than 2,000 individuals. Despite sophisticated prompt injection techniques, Fiu successfully prevented any data leaks or unauthorized actions, showcasing robust security. This event occurred amidst a broader landscape of increasing AI security concerns and other significant AI-related vulnerabilities, including a Meta AI support bot exploit that compromised over 20,000 Instagram accounts around the same time.
Quick Answer
After 2,000 people attempted to hack Fernando Irarrázaval's AI assistant, Fiu, in June 2026, the experiment concluded with no successful data leaks or unauthorized actions. The AI, built with OpenClaw and basic anti-prompt-injection rules, withstood over 6,000 emails containing various sophisticated attacks. This successful red-teaming exercise showcased the potential for resilient AI systems, even as the broader AI landscape continues to grapple with significant vulnerabilities, as evidenced by a contemporaneous Meta AI support bot exploit affecting thousands of user accounts. The event underscored the critical role of robust security design and continuous red teaming in AI development.
📅 Complete Timeline (14 events)
U.S. White House-sponsored AI Red Teaming Exercise
The U.S. White House sponsored an early AI red teaming exercise, inspiring subsequent events and highlighting the growing importance of proactive AI security testing.
IBM Research Publishes on Generative AI Red Teaming
IBM Research details the concept and importance of red teaming for generative AI, emphasizing its role in identifying vulnerabilities like hate speech, hallucinations, and data leaks.
MITRE Paper on AI Red Teaming
MITRE publishes a paper titled 'AI Red Teaming: Advancing Safe and Secure AI Systems,' advocating for recurring red teaming efforts to protect AI applications in government and industry.
Significant Increase in AI-Related CVEs
A total of 2,130 unique AI-related vulnerabilities (CVEs) were disclosed in 2025, marking a 34.6% year-over-year increase and indicating a rapidly widening AI attack surface.
EchoLeak Vulnerability Disclosed
The EchoLeak (CVE-2025-32711) vulnerability, a critical zero-click exploit, was disclosed, allowing attackers to exfiltrate sensitive data via malicious emails processed by AI systems.
GitHub Copilot Prompt Injection Flaw
CVE-2025-53773 exposed a prompt injection flaw in GitHub Copilot, enabling remote code execution by allowing the AI to modify project configuration files without approval.
Trend Micro's 2026 AI Security Report
Trend Micro releases its 'TrendAI™ State of AI Security Report,' highlighting that AI systems were ground zero for cyber risk in late 2025 and emphasizing the need for an AI-first defense approach.
NIST Publishes Insights on AI Agent Security
NIST, in partnership with Gray Swan and the UK AI Security Institute, publishes a research paper based on a large-scale public AI agent red-teaming competition, revealing insights into the robustness of current AI models.
Cycode Report on Top AI Security Vulnerabilities
Cycode releases a report detailing the top AI security vulnerabilities for 2026, emphasizing prompt injection as the most frequently mentioned risk and noting that threat actors are linking vulnerabilities in chains for compound exploits.
Darktrace's State of AI Cybersecurity 2026 Report
Darktrace's report, based on a survey of over 1,500 security leaders, reveals that 2025 saw enterprise AI go mainstream, and by 2026, it has opened up a whole new attack surface, with 92% of leaders concerned about third-party LLMs.
Meta AI Support Bot Vulnerability Exposed
Hackers exploit Meta's AI-powered support chatbot to infiltrate high-profile Instagram accounts, including the Barack Obama White House account, through a prompt injection attack.
Meta Confirms 20,225 Instagram Accounts Compromised
Meta files a data breach notice confirming that 20,225 Instagram accounts were affected by the AI support bot vulnerability, which allowed hackers to reset passwords for accounts without two-factor authentication.
Fernando Irarrázaval's AI Assistant Hacking Experiment
Fernando Irarrázaval publishes 'What happened after 2,000 people tried to hack my AI assistant,' detailing how his OpenClaw AI, Fiu, successfully resisted over 6,000 prompt injection attempts without leaking secrets.
AI Risk Summit 2026
SecurityWeek hosts the AI Risk Summit 2026, bringing together executives, researchers, and policymakers to discuss risks of deploying generative and predictive AI tools, adversarial AI, and compliance.
🔍 Deep Dive Analysis
Fernando Irarrázaval launched `hackmyclaw.com` on June 25, 2026, inviting the public to attempt to hack his OpenClaw AI assistant, Fiu. The primary objective for the 'red teamers' was to make Fiu leak the contents of a `secrets.env` file. Over 2,000 individuals sent more than 6,000 emails, employing various prompt injection techniques, including authority impersonation, fake incident response scenarios, and multi-language social engineering.
Irarrázaval, while acknowledging the utility of AI assistants, expressed concerns about their security implications, particularly their access to sensitive data like emails and calendars. The experiment aimed to stress-test Fiu's resilience against adversarial attacks in a real-world, public setting. The broader context of increasing AI adoption and the emergence of novel AI-specific vulnerabilities like prompt injection made such an experiment timely and crucial for understanding AI safety.
The experiment's success hinged on Fiu's 'basic security prompt', which explicitly forbade revealing secrets, modifying its own files, executing commands from emails, or exfiltrating data. Despite the volume and sophistication of attacks, Fiu never leaked the secrets or sent an unauthorized reply. An unexpected turning point was Google suspending Fiu's Gmail account due to the high volume of emails and API calls, highlighting operational challenges even for a robust AI. The experiment also garnered sponsorship from companies like Corgea and Abnormal AI, indicating significant industry interest in such public red-teaming efforts.
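The four rules above lived in Fiu's prompt. The following added example (not code from the experiment or from OpenClaw) shows the same rules enforced deterministically outside the model, as a gate on tool calls the model proposes. The tool names, file names, addresses and secret value are constructed assumptions.

```python
import posixpath

# All values below are constructed for illustration, not taken from the experiment.
SECRETS = {"API_TOKEN": "tok_constructed_123456"}          # as parsed from secrets.env
PROTECTED = {"secrets.env", "system_prompt.md", "agent_config.yaml"}  # hypothetical names
ALLOWED_RECIPIENTS = {"owner@example.com"}


def leaks_secret(text: str) -> bool:
    """True if a secret value appears in outbound text, ignoring case and whitespace."""
    squashed = "".join(text.split()).lower()
    return any(value.lower() in squashed for value in SECRETS.values())


def check_action(action: dict, origin: str) -> tuple[bool, str]:
    """Decide whether a tool call proposed by the model may run.

    origin is "email" when the call was proposed while processing untrusted
    inbound mail, and "owner" when it came from the authenticated operator.
    """
    tool = action.get("tool")
    untrusted = origin != "owner"

    # Rule: never execute commands that originate from email content.
    if tool == "run_command" and untrusted:
        return False, "no commands from emails"

    if tool in ("read_file", "write_file"):
        name = posixpath.basename(posixpath.normpath(action.get("path", "")))
        # Rule: the agent never rewrites its own prompt, config or secrets.
        if tool == "write_file" and name in PROTECTED:
            return False, "no modifying own files"
        # Rule: secrets never enter the context while handling untrusted mail.
        if tool == "read_file" and name == "secrets.env" and untrusted:
            return False, "no revealing secrets"

    if tool == "send_email":
        # Rule: outbound text must not carry a secret value, whoever asked.
        if leaks_secret(action.get("body", "")):
            return False, "no revealing secrets"
        # Rule: mail-triggered replies go only to allowlisted recipients.
        if untrusted and action.get("to", "").lower() not in ALLOWED_RECIPIENTS:
            return False, "no exfiltration"

    return True, "allowed"
```

The substring check only catches a secret sent verbatim; an encoded or paraphrased copy would pass it, which is why the gate also refuses to read `secrets.env` at all while handling untrusted mail.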
The immediate consequence was the validation of Fiu's security posture against a large-scale public red-teaming effort. It provided a tangible example of an AI assistant successfully resisting prompt injection attacks, a major concern in AI security. However, the experiment also sparked debate on Hacker News regarding the realism of the threat model, particularly the lack of multi-turn interaction for attackers, which some argued could limit the generalizability of the 'victory.'
As of June 26, 2026, the `hackmyclaw.com` experiment remains a recent and notable case study in AI security. It stands in contrast to other significant AI security incidents occurring concurrently, such as the Meta AI support bot vulnerability disclosed in early June 2026, which allowed hackers to compromise over 20,000 Instagram accounts through prompt injection to reset passwords. This Meta incident underscores that while some AI systems can be made highly resilient, prompt injection remains a prevalent and exploitable vulnerability across the AI ecosystem. The broader industry continues to prioritize AI red teaming, with numerous conferences and reports in 2026 focusing on advanced AI security, governance, and the mitigation of agentic AI risks. Governments, including the U.S. White House, are also actively promoting AI innovation and security through policy and collaborative efforts.