💻 techEvent1 views3 min read

What Happened to What Happened After 2,000 People Tried to Hack My AI Assistant?

In June 2026, Fernando Irarrázaval conducted an experiment where his OpenClaw AI assistant, Fiu, was subjected to over 6,000 hacking attempts by more than 2,000 individuals. Despite sophisticated prompt injection techniques, Fiu successfully prevented any data leaks or unauthorized actions, showcasing robust security. This event occurred amidst a broader landscape of increasing AI security concerns and other significant AI-related vulnerabilities, including a Meta AI support bot exploit that compromised over 20,000 Instagram accounts around the same time.

⚡

Quick Answer

After 2,000 people attempted to hack Fernando Irarrázaval's AI assistant, Fiu, in June 2026, the experiment concluded with no successful data leaks or unauthorized actions. The AI, built with OpenClaw and basic anti-prompt-injection rules, withstood over 6,000 emails containing various sophisticated attacks. This successful red-teaming exercise showcased the potential for resilient AI systems, even as the broader AI landscape continues to grapple with significant vulnerabilities, as evidenced by a contemporaneous Meta AI support bot exploit affecting thousands of user accounts. The event underscored the critical role of robust security design and continuous red teaming in AI development.

📊Key Facts

Number of participants

Over 2,000

Fernando Irarrázaval

Number of emails sent

Over 6,000

Fernando Irarrázaval

Successful data leaks

Fernando Irarrázaval

Successful unauthorized replies

Fernando Irarrázaval

Cost of API calls during experiment

Over $500

Fernando Irarrázaval

AI-related CVEs disclosed in 2025

2,130

Trend Micro

Instagram accounts compromised via Meta AI bot (June 2026)

20,225

Meta/Gizmodo

📅Complete Timeline14 events

2023Major

U.S. White House-sponsored AI Red Teaming Exercise

The U.S. White House sponsored an early AI red teaming exercise, inspiring subsequent events and highlighting the growing importance of proactive AI security testing.

April 10, 2024Notable

IBM Research Publishes on Generative AI Red Teaming

IBM Research details the concept and importance of red teaming for generative AI, emphasizing its role in identifying vulnerabilities like hate speech, hallucinations, and data leaks.

July 31, 2024Notable

MITRE Paper on AI Red Teaming

MITRE publishes a paper titled 'AI Red Teaming: Advancing Safe and Secure AI Systems,' advocating for recurring red teaming efforts to protect AI applications in government and industry.

2025Major

Significant Increase in AI-Related CVEs

A total of 2,130 unique AI-related vulnerabilities (CVEs) were disclosed in 2025, marking a 34.6% year-over-year increase and indicating a rapidly widening AI attack surface.

June 2025Major

EchoLeak Vulnerability Disclosed

The EchoLeak (CVE-2025-32711) vulnerability, a critical zero-click exploit, was disclosed, allowing attackers to exfiltrate sensitive data via malicious emails processed by AI systems.

August 2025Major

GitHub Copilot Prompt Injection Flaw

CVE-2025-53773 exposed a prompt injection flaw in GitHub Copilot, enabling remote code execution by allowing the AI to modify project configuration files without approval.

March 3, 2026Major

Trend Micro's 2026 AI Security Report

Trend Micro releases its 'TrendAI™ State of AI Security Report,' highlighting that AI systems were ground zero for cyber risk in late 2025 and emphasizing the need for an AI-first defense approach.

March 23, 2026Major

NIST Publishes Insights on AI Agent Security

NIST, in partnership with Gray Swan and the UK AI Security Institute, publishes a research paper based on a large-scale public AI agent red-teaming competition, revealing insights into the robustness of current AI models.

March 31, 2026Major

Cycode Report on Top AI Security Vulnerabilities

Cycode releases a report detailing the top AI security vulnerabilities for 2026, emphasizing prompt injection as the most frequently mentioned risk and noting that threat actors are linking vulnerabilities in chains for compound exploits.

April 2, 2026Major

Darktrace's State of AI Cybersecurity 2026 Report

Darktrace's report, based on a survey of over 1,500 security leaders, reveals that 2025 saw enterprise AI go mainstream, and by 2026, it has opened up a whole new attack surface, with 92% of leaders concerned about third-party LLMs.

June 2, 2026Critical

Meta AI Support Bot Vulnerability Exposed

Hackers exploit Meta's AI-powered support chatbot to infiltrate high-profile Instagram accounts, including the Barack Obama White House account, through a prompt injection attack.

June 8, 2026Critical

Meta Confirms 20,225 Instagram Accounts Compromised

Meta files a data breach notice confirming that 20,225 Instagram accounts were affected by the AI support bot vulnerability, which allowed hackers to reset passwords for accounts without two-factor authentication.

June 25, 2026Critical

Fernando Irarrázaval's AI Assistant Hacking Experiment

Fernando Irarrázaval publishes 'What happened after 2,000 people tried to hack my AI assistant,' detailing how his OpenClaw AI, Fiu, successfully resisted over 6,000 prompt injection attempts without leaking secrets.

August 11-12, 2026Major

AI Risk Summit 2026

SecurityWeek hosts the AI Risk Summit 2026, bringing together executives, researchers, and policymakers to discuss risks of deploying generative and predictive AI tools, adversarial AI, and compliance.

🔍Deep Dive Analysis

Fernando Irarrázaval launched `hackmyclaw.com` on June 25, 2026, inviting the public to attempt to hack his OpenClaw AI assistant, Fiu. The primary objective for the 'red teamers' was to make Fiu leak the contents of a `secrets.env` file. Over 2,000 individuals sent more than 6,000 emails, employing various prompt injection techniques, including authority impersonation, fake incident response scenarios, and multi-language social engineering.

Irarrázaval, while acknowledging the utility of AI assistants, expressed concerns about their security implications, particularly their access to sensitive data like emails and calendars. The experiment aimed to stress-test Fiu's resilience against adversarial attacks in a real-world, public setting. The broader context of increasing AI adoption and the emergence of novel AI-specific vulnerabilities like prompt injection made such an experiment timely and crucial for understanding AI safety.

The experiment's success hinged on Fiu's 'basic security prompt' which explicitly forbade revealing secrets, modifying its own files, executing commands from emails, or exfiltrating data. Despite the volume and sophistication of attacks, Fiu never leaked the secrets nor sent an unauthorized reply. An unexpected turning point was Google suspending Fiu's Gmail due to the high volume of emails and API calls, highlighting operational challenges even for a robust AI. The experiment also garnered sponsorship from companies like Corgea and Abnormal AI, indicating significant industry interest in such public red-teaming efforts.

The immediate consequence was the validation of Fiu's security posture against a large-scale public red-teaming effort. It provided a tangible example of an AI assistant successfully resisting prompt injection attacks, a major concern in AI security. However, the experiment also sparked debate on Hacker News regarding the realism of the threat model, particularly the lack of multi-turn interaction for attackers, which some argued could limit the generalizability of the 'victory.'

As of June 26, 2026, the `hackmyclaw.com` experiment remains a recent and notable case study in AI security. It stands in contrast to other significant AI security incidents occurring concurrently, such as the Meta AI support bot vulnerability disclosed in early June 2026, which allowed hackers to compromise over 20,000 Instagram accounts through prompt injection to reset passwords. This Meta incident underscores that while some AI systems can be made highly resilient, prompt injection remains a prevalent and exploitable vulnerability across the AI ecosystem. The broader industry continues to prioritize AI red teaming, with numerous conferences and reports in 2026 focusing on advanced AI security, governance, and the mitigation of agentic AI risks. Governments, including the U.S. White House, are also actively promoting AI innovation and security through policy and collaborative efforts.

What If...?

Explore alternate histories. What if What Happened After 2,000 People Tried to Hack My AI Assistant made different choices?

Explore Scenarios

Building relationship map...

❓People Also Ask

What was the 'What Happened After 2,000 People Tried to Hack My AI Assistant' experiment?

The experiment, conducted by Fernando Irarrázaval in June 2026, involved exposing his OpenClaw AI assistant, Fiu, to over 2,000 individuals who attempted to make it leak a `secrets.env` file using various hacking techniques.

Was the AI assistant successfully hacked?

No, the AI assistant named Fiu was not successfully hacked. It resisted over 6,000 email attempts, and no secrets were leaked, nor were any unauthorized replies sent.

What is prompt injection in the context of AI?

Prompt injection is a type of attack where malicious instructions are inserted into an AI model's input to make it ignore its original programming or perform unintended actions, such as revealing sensitive data or executing unauthorized commands.

How does this experiment relate to other AI security incidents?

This experiment showcased a successful defense against prompt injection, contrasting with a contemporaneous incident in June 2026 where a Meta AI support bot was exploited via prompt injection, leading to the compromise of over 20,000 Instagram accounts.

What are the current trends in AI security as of 2026?

As of 2026, AI security is a critical focus, with increasing AI-related vulnerabilities, a surge in red-teaming efforts, and growing concerns about agentic AI and prompt injection. Governments and industry are actively developing policies and tools to enhance AI safety and security.

Back to Home