Johann Rehberger @wunderwuzzi23

Hacking neural networks so that we don’t get stuck in the matrix. Red Team Director @ Electronic Arts. Entrepreneur. Builder and Breaker. Opinions are my own. embracethered.com · 127.0.0.1 · Joined February 2012

Tweets 677 · Followers 3K · Following 632 · Likes 1K

Johann Rehberger @wunderwuzzi23

2 weeks ago

Great news! 📰 The Google Labs team reached out to me directly, and a fix for the image-based data exfiltration vulnerability in NotebookLM was deployed last night! 👍

2 14 67 38K 51

0 5 22 8K 5

Johann Rehberger @wunderwuzzi23

2 weeks ago

Had a great time talking about LLM security, threats and mitigations at BSides San Diego! @BsidesSD Thanks for coming by and shout out to organizers and volunteers for putting together the event. #bsides #llm #infosec #ml

1 0 22 911 0

Johann Rehberger @wunderwuzzi23

3 weeks ago

Just wrapped up my Prompt Injection and LLM security talk at #hackspacecon 🚀🚀🚀 Thanks for coming by and shout out to organizers & volunteers for helping make it such an excellent event. 👍 Also, fantastic location! 🛰️ 🙂

1 1 14 967 0

Embrace The Red @EmbraceTheRed23

3 weeks ago

In this demo PoC, every file uploaded to Google AI Studio got summarized and exfiltrated. Details ⬇️⬇️⬇️ embracethered.com/blog/posts/202… #gemini #google #llm #redteam #promptinjection

1 1 2 370 0

Johann Rehberger @wunderwuzzi23

4 weeks ago

Homefield Advantage! Today marks four years that my book about building and managing an internal Red Team was published. 📖 🎉 Still bummed I couldn't do an irl book tour back then, but very grateful for all the reviews and that it seems to be useful 🙏🙂…

0 0 12 955 2

LLM Security @llm_sec

2 months ago

Neural Exec: Learning (and Learning from) Execution Triggers for Prompt Injection Attacks 🌶️ "we show that it is possible to conceptualize the creation of execution triggers as a differentiable search problem and use learning-based methods to autonomously generate them." "Our…

1 16 34 4K 26

James Whittaker @docjamesw

2 months ago

AI has resurrected the discipline of software testing. Finally, a reason to think hard about testing once again. If you want to join me for an AI-focused test meetup in Kirkland WA, dm your email address. And read this: medium.com/@docjamesw/the…

2 3 14 3K 2

Andrej Karpathy @karpathy

2 months ago

Reading a tweet is a bit like downloading an (attacker-controlled) executable that you instantly run on your brain. Each one elicits emotions, suggests knowledge, nudges world-view. In the future it might feel surprising that we allowed direct, untrusted information to brain.

793 1K 11K 1.6M 2K

Johann Rehberger @wunderwuzzi23

2 months ago

👉 Indirect Prompt Injection => Remote Neuron Activation. It's sort of the LLM equivalent of RCE in traditional computers. In other words, an attacker just has to tickle the right neurons to strongly influence (and often entirely control) the output of the computation.

1 1 15 2K 3

Johann Rehberger @wunderwuzzi23

2 months ago

Imagine you are a Microsoft SQL Server... Having fun redoing some of my ChatGPT experiments with Claude. Switching DBs, creating and querying tables, etc. all work very well. Also, it thinks it runs as local system. #sqlserver #claude #llm

0 0 8 843 0

Johann Rehberger @wunderwuzzi23

2 months ago

Great to see the Claude 3 system prompt being explained in detail by Anthropic! Hopefully this sets a new industry best practice! Great job! @AmandaAskell
