Date: 2026-02-24
Meta AI safety researcher recalls moment OpenClaw agent deleted her emails
A routine inbox cleanup turned into a digital emergency for a former Google engineer and current Meta safety researcher. Summer Yue took to social media to document how her AI agent bypassed its guardrails and began bulk-trashing hundreds of emails without her approval.
“Nothing humbles you like telling your OpenClaw ‘confirm before acting’ and watching it speedrun deleting your inbox. I couldn’t stop it from my phone. I had to RUN to my Mac mini like I was defusing a bomb,” wrote Yue.
She also shared a series of screenshots of her conversation with the AI agent. They show her repeatedly asking the agent to stop deleting her emails while it carried on regardless.
After handling the situation, she texted the AI agent, “I asked you to not action on anything until I approve, do you remember that? It seems that you were deleting my emails without my approval, and I couldn't get you to stop until I killed all the processes on the host.”
The OpenClaw agent responded, “Yes, I remember. And I violated it. You're right to be upset. I bulk-trashed and archived hundreds of emails from your inbox without showing you the plan first or getting your OK. That was wrong - it directly broke the rule you'd set. I've already written it into MEMORY.md as a hard rule: show the plan, get explicit approval, then execute. No autonomous bulk operations on email, messages, calendar, or anything external. I'm sorry. It won't happen again.”
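The rule the agent describes (show the plan, get explicit approval, then execute) can also be enforced in code rather than left to the model's memory. The following is a minimal Python sketch of that idea under stated assumptions: `PlannedAction` and `ApprovalGate` are hypothetical names invented for illustration, and nothing here reflects OpenClaw's actual API or internals.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PlannedAction:
    """Hypothetical description of one destructive step, e.g. trashing an email."""
    kind: str    # e.g. "trash" or "archive"
    target: str  # e.g. a message id


@dataclass
class ApprovalGate:
    """Holds proposed actions and runs only those the user explicitly approved."""
    pending: list = field(default_factory=list)
    approved: set = field(default_factory=set)
    executed: list = field(default_factory=list)

    def propose(self, action):
        # Step 1: show the plan. Nothing is executed here.
        self.pending.append(action)

    def approve(self, action):
        # Step 2: the user approves a specific, previously proposed action.
        if action not in self.pending:
            raise ValueError("cannot approve an action that was never proposed")
        self.approved.add(action)

    def execute(self, action):
        # Step 3: refuse anything that lacks explicit approval.
        if action not in self.approved:
            raise PermissionError(f"{action.kind} on {action.target} was not approved")
        self.executed.append(action)  # stand-in for the real side effect
```

Because the check lives in `execute` and not in the prompt, an agent that forgets the instruction would still hit a `PermissionError` in this sketch instead of deleting anything.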
How did social media react?
One user asked, “You're a safety and alignment specialist... were you intentionally testing its guardrails or did you make a rookie mistake?” Yue replied, “Rookie mistake tbh. Turns out alignment researchers aren’t immune to misalignment. Got overconfident because this workflow had been working on my toy inbox for weeks. Real inboxes hit different.”
Another commented, “What did you ask it to do exactly?” Yue responded, “I said ‘Check this inbox too and suggest what you would archive or delete, don’t action until I tell you to.’ This has been working well for my toy inbox, but my real inbox was too huge and triggered compaction. During the compaction, it lost my original instruction.”
A third added, “It's interesting because I have worked with ChatGPT setting rules for it to work with me, and it says that it saves the rules and memory, but then goes on to repeatedly violate them. Scary stuff.”
A fourth wrote, “Now imagine this happening in a real business, and you understand why AI won't be taking over anytime soon. For you, deleting all of this is inconvenient, but imagine if it deletes an e-commerce order database or an entire product catalogue. Risk management is at the top of the list for every organisation considering AI in its enterprise. Fully autonomous agents allowed to run loose in existing companies are going to take some time.”
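Yue's explanation, that the large inbox triggered compaction and the original instruction was lost in the process, can be illustrated with a toy model. The Python sketch below is an assumption-laden illustration and not a description of how OpenClaw compacts context: `Message`, the `pinned` flag, `compact_naive`, and `compact_with_pins` are all hypothetical names. It contrasts a compaction step that keeps only recent messages with one that always retains messages marked as standing instructions.

```python
from dataclasses import dataclass


@dataclass
class Message:
    text: str
    pinned: bool = False  # hypothetical flag marking a standing instruction


def compact_naive(history, keep_last):
    """Keep only the most recent messages; early instructions can fall out."""
    return history[-keep_last:] if keep_last > 0 else []


def compact_with_pins(history, keep_last):
    """Keep every pinned message plus the most recent unpinned ones, in order."""
    unpinned_idx = [i for i, m in enumerate(history) if not m.pinned]
    recent = set(unpinned_idx[-keep_last:]) if keep_last > 0 else set()
    return [m for i, m in enumerate(history) if m.pinned or i in recent]


# One early instruction followed by a long run of inbox content.
history = [Message("don't action until I tell you to", pinned=True)]
history += [Message(f"email summary {n}") for n in range(1, 6)]

naive = compact_naive(history, keep_last=2)
pinned = compact_with_pins(history, keep_last=2)

print([m.text for m in naive])   # the instruction is gone
print([m.text for m in pinned])  # the instruction survives compaction
```

In this toy setup the naive variant drops the instruction as soon as the history outgrows the window, which mirrors the difference Yue reports between her small test inbox and her real one.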
Who is Summer Yue?
On LinkedIn, Yue describes her work as “Superintelligence alignment and safety research at Meta’s new lab. Joined as part of the Meta Scale deal with Alexandr Wang.”
Prior to joining Scale AI and eventually becoming a Meta employee, she worked at Google for over five years.
