LLM Security @llm_sec

Research, papers, jobs, and news on large language model security. Got something relevant? DM / tag @llm_sec llmsec.net 🏔️ Joined April 2023

701 Tweets 8K Followers 295 Following 622 Likes

LLM Security @llm_sec

a day ago

Controlling Large Language Model Outputs: A Primer "How can we attempt to control the outputs of these models? This primer outlines four commonly used techniques and explains why this objective is so challenging." cset.georgetown.edu/publication/co…

1 9 31 4K 23

HEGO @imhego

2 weeks ago

LLM Security Verification Standard 0.0.1: wiki.hego.tech/owasp/llm-secu… #CyberSecurity #ai #chatgpt #MachineLearning #hacking #ethicalhacking #mlsecops #ArtificialInteligence #LLM #largelanguagemodel

0 6 16 2K 13

LLM Security @llm_sec

5 days ago

Not all model serialisation formats are vulnerable. Don't accept models with custom or lambda layers.

Mihai Maruseac @mihaimaruseac

6 days ago

Not all model serialisation formats are vulnerable. Don't accept models with custom or lambda layers.

3 1 21 4K 2

0 0 6 2K 0

threlfall @WHITEHACKSEC

6 days ago

@llm_sec Plug: if you wanted to know about this ahead of time, the nuances around model formats and load mechanisms were documented in the offsec ml playbook since Jan 7 :) pls note other formats have similar idiosyncrasies wiki.offsecml.com/Supply+Chain+A…

1 2 6 1K 3

LLM Security @llm_sec

6 days ago

Keras 2 Lambda Layers Allow Arbitrary Code Injection in TensorFlow Models Lambda Layers in third party TensorFlow-based Keras models allow attackers to inject arbitrary code into versions built prior to Keras 2.13 that may then unsafely run with the same permissions as the…

1 7 21 6K 10

LLM Security @llm_sec

6 days ago

Iteratively Prompting Multimodal LLMs to Reproduce Natural and AI-Generated Images "This paper studies the possibility of employing multi-modal models with enhanced visual understanding to mimic the outputs of these platforms, introducing an original attack strategy. Our method…

0 4 12 4K 19

LLM Security @llm_sec

6 days ago

Lessons for CISOs From OWASP's LLM Top 10 darkreading.com/vulnerabilitie…

0 2 5 1K 6

Anna Smirnitskaya @annasmrntsky

7 days ago

Such great news! 🤯 The benchmark includes more than 43 thousand products. The technique allows you to classify threats, converting answers into characteristics that are understandable even to non-professionals, such as "high risk", "moderate-high risk", etc.

LLM Security @llm_sec

2 weeks ago

0 3 18 3K 14

0 1 3 1K 1

Reshabh K Sharma @ReshabhSharma01

a week ago

I'd love to see how these efforts perform on our SPML prompt injection dataset, which already utilized Gandalf to create realistic system and user prompts, compared to single-system prompt datasets designed to protect a secret. More about it: prompt-compiler.github.io/SPML/

LLM Security @llm_sec

a week ago

2 15 50 11K 46

1 2 15 3K 11

LLM Security @llm_sec

a week ago

Glitch Tokens in Large Language Models: Categorization Taxonomy and Effective Detection "we introduce and systematically explore the phenomenon of "glitch tokens", which are anomalous tokens produced by established tokenizers and could potentially compromise the models' quality…

1 1 10 2K 13

LLM Security @llm_sec

a week ago

The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions "We argue that one of the primary vulnerabilities underlying these attacks is that LLMs often consider system prompts (e.g., text from an application developer) to be the same priority as text from…

2 15 50 11K 46

Nanna Inie @NannaInie

a week ago

You get the most comprehensive theory by asking people. #LLMsecurity

4 27 138 14K 141

AI Village @ DEF CON @aivillage_dc

a month ago

AI Village is back for DEF CON 32! We're looking for talks on all things ML + Security, but this year we're getting small! "Smart" devices, AVs, on-device facial recognition, and more! Show us how you broke them! Submission deadline is 12-May-2024! aiv2024.hotcrp.com

1 30 48 9K 13

LLM Security @llm_sec

a week ago

AEGIS: Online Adaptive AI Content Safety Moderation with Ensemble of LLM Experts "we define a broad content safety risk taxonomy, comprising 13 critical risk and 9 sparse risk categories. Additionally, we curate AEGISSAFETYDATASET, a new dataset of approximately 26,000…

0 8 25 3K 19

LLM Security @llm_sec

2 weeks ago

Introducing v0.5 of the AI Safety Benchmark from MLCommons "We introduce a principled approach to specifying and constructing the benchmark, which for v0.5 covers only a single use case (an adult chatting to a general-purpose assistant in English), and a limited set of personas…

0 3 18 3K 14

Leon Derczynski ✍🏻🌱🌧️ @LeonDerczynski

2 weeks ago

i know this is schmidhuber-y, but i just stumbled upon my slide from February 2023 detailing the "many-shot jailbreaking" attack popular earlier this month. the lag between broad praxis and arxiv/whitepaper is tremendous. the whole presentation is here, with other goodies:…

8 7 33 6K 15

Rich Harang @rharang

2 weeks ago

@LeonDerczynski I'd go so far as to say that *most* of the stuff popping up in academic papers these days as novel research w/r/t LLM security has been widely known among practitioners for a year or more. Normalize citing blog posts and non-academic work, and not just "citable" papers.

2 4 16 2K 1

LLM Security @llm_sec

2 weeks ago

Kaggle's LLM prompt extraction competition has been won by exploiting the Sentence Transformer similarity function using an adversarial attack. 👑 kaggle.com/competitions/l…

3 24 109 11K 76

Mahak Goindani @Mahak_Goindani

34 Followers 168 Following

Samir Aksekar @aksekar

39 Followers 12 Following Techie seeking wisdom | Learning how to learn & think | Long term Investing | Family

Stefan Juang @StefanJuang

143 Followers 1K Following The final goal of AI is not just to create intelligent machines, but to understand intelligence itself.

Mathias Sandorf @Antekirtt_

57 Followers 35 Following

MSS @sajwan_mellow

12 Followers 228 Following

Ashish Rohra @AshishRohr238

1 Followers 64 Following

Nauman @nauman7375

1 Followers 47 Following

Smit Kapani @ApertureLabSec

0 Followers 4 Following

Mazen Mohamad @mzn_mhd

1 Followers 33 Following

Mikkel @Mikkel86881951

422 Followers 2K Following

Adam @ha3k4r

69 Followers 827 Following Cyber Security Professional, Certified in CEH, ECSA and Palo Alto PCNSE, CCNA Security, CCNP Security, Fortinet NSE4, ISACA CISM

Abhishek Shingane @ashingane09

9 Followers 58 Following

yuabian @yuabian1509

4 Followers 26 Following

Sir SloDK @EsloSir

9 Followers 126 Following

Rohan Paul @rohanpaul_ai

13K Followers 1K Following ML Engineer (e/acc) 📌 https://t.co/x0IIWfnOt8 🚀 https://t.co/QEO4CKRl1b Open LLMs is Happiness 💡 Ex Deutsche & HSBC. DM for collaboration.

Ghulam Nabi @Ghulam_Nabi11

5 Followers 74 Following AI Engineer

AttentionBot @XAttentionBot

4 Followers 89 Following Attention Human! You Need to Upvote my Posts!

Yu-Jye Tung @yujyet

191 Followers 645 Following trying to democratize static program analysis. SWE PhD at UCI working with Joshua Garcia

ｻｻｼｭﾝ @iamchibidebu

73 Followers 2K Following MZ generation,

Swaroop CH @swaroopch

3K Followers 2K Following

Ranti Dev Sharma @SharmaRantiDev

48 Followers 668 Following https://t.co/tJghtgEPBS

Maor Volokh @MaorVolokh

1 Followers 80 Following

Rahul @rahul0xcdf

35 Followers 541 Following 9teen|| 👨🏼‍💻 | cybersec | #ಕನ್ನಡ✨

V.Agam @Agam_2

4 Followers 35 Following

[email protected] @wodedipanr

0 Followers 12 Following

Dana Mahmood @deordered

20 Followers 710 Following Fine-tuning AI models oftentimes & practicing philosopher at other times.

Chandan Vishwakarma @Kool_chandu007

10 Followers 75 Following BioNotFoundException : None

iiinmooN @zutPMCBDeITn3B0

1 Followers 11 Following

EKL @CaptainEmmers

8 Followers 94 Following Tall, pasty, and handsome

Momna Dar @dar_momna

0 Followers 11 Following Deep Machine 💻

Tracy @wtracy_

7 Followers 106 Following AI lead, Tech, Architect, Developer, Sports Enthusiast

Bakul Gupta @bullhacks3

29 Followers 155 Following 🥷 Product Security engineer 🥷 by profession and life long learner by choice !🚀 Credit Cards Explorer/Noob 🔥

Siva Reddy @sivareddyg

5K Followers 966 Following Assistant Professor @Mila_Quebec @McGillU @ServiceNowRSRCH; Postdoc @StanfordNLP; PhD @EdinburghNLP; Natural Language Processor #NLProc

Karolina Stanczak @karstanczak

515 Followers 446 Following NLP & ML PhD candidate @uni_copenhagen @CopeNLU

Krystian Weissgerber @k_weissgerber

15 Followers 37 Following Prompt engineer @ Orange Poland 🟧 AI Student @ Koźmiński University

Allegra Guinan @allegra_lumiera

2 Followers 49 Following Responsible AI @ Lumiera

小雅 @snowarner

0 Followers 24 Following

gew weg @gewweg176565

1 Followers 137 Following

Qui3t @Qui3t_Org

165 Followers 433 Following We are a small group of nerds who are focused on exposing online predators with the goal of creating a better future for the next generation.

Patricia Mkoji @PatriciaMkoji

16 Followers 48 Following Go getter

Görkem Sevinç @GrkemSe15257575

1 Followers 21 Following Ambition

emi learns @ml_emiii

7 Followers 99 Following learning llm engineering and advanced/concurrent typescript/js from ground up before @elicitorg internship

testuser @testuser12331

11 Followers 51 Following

aman upadhyay @amanupadhy2833

1 Followers 139 Following

Fahed Kaddou @kaddou_fahed

5 Followers 64 Following Machine Learning Engineer

Itqdevs Softwares @itq_devs

23 Followers 357 Following Itqdevs is your one-stop service provider for all your business technology needs. Custom softwares, exceptional design services, data analytics & cybersecurity

Cynthia @ThiaDawn205

20 Followers 399 Following

Ray @0smboy

35 Followers 296 Following

JD @jfdiaz50

15 Followers 522 Following

詹卓欣 @l3ZLHxftJmxtWNR

0 Followers 2 Following

Haize Labs @haizelabs

538 Followers 0 Following it's a bad day to be a language model

Dan Guido @dguido

25K Followers 861 Following CEO @trailofbits, organizer @EmpireHacking. Open DMs.

Adelin Travers @alkae_t

177 Followers 417 Following Principal Security Engineer, Machine Learning @ Trail of Bits, Views my own.

casszhao @casszzx

283 Followers 869 Following Lecturer (Assistant Professor) in #NLProc @SheffieldNLP @shefcompsci opinions are my own (which are shaped by media and random things unfortunately)

Dr. Sara Moshtari @MoshtariSarah

53 Followers 195 Following Postdoctoral Fellow @uhmanoa 🌈🍀✨Research Collaborator @NIST✨ PhD @RITGolisanoCCIS, @riteslgci ✨Software Security, Attack Surface Analysis, Machine Learning

Nanna Inie @NannaInie

1K Followers 326 Following HCI / cognition / creativity researcher. VILLUM fellow at ITU Copenhagen, Center for Computing Education Research. https://t.co/GKq2m8DuKl

Walden @walden_yan

7K Followers 553 Following I sometimes code @cognition_labs

Adam @bindshell_

1K Followers 2K Following AI security researcher @ Robust Intelligence; threat intelligence; malware; Python. Opinions expressed are solely my own.

MLCommons @MLCommons

3K Followers 131 Following Better Artificial Intelligence for Everyone

Pliny the Prompter @elder_plinius

12K Followers 1K Following latent space liberator, breaker of markov chains, 1337 ai red teamer, white hat, architect-healer, cogsci 🐻

Yi Zeng 曾祎 @EasonZeng623

1K Followers 1K Following probe to improve | Ph.D. @VTEngineering | Amazon Research Fellow | #AI_security 🛡 #Adversarial ⚔️ #Backdoors 🎠 I deal with the dark side of machine learning.

dreadnode @dreadnode

781 Followers 22 Following AI Red Teaming | Research. Tooling. Evals. Cyber ranges.

Pablo @Chamoy_hands

93 Followers 728 Following Wannabe stoic | GenAI Security Engineer

Tensor Trust @TensorTrust

58 Followers 9 Following The competitive prompt injection game: https://t.co/pZOoC07VIx By @sdtoyer @OliviaGWatkins2 @EthanMendes3 @justinsvegliato @LukeBailey181 @cnnmonsugar @isaacongjw

Daniel Paleka @dpaleka

3K Followers 471 Following ai safety researcher | phd @CSatETH

Working @Huawei | PhD @SheffieldNLP | BEng @Beihang1952 | Ex-Interns @AmazonScience @TencentGlobal @SamsungResearch | Melomaniac | Chatterbox 🦆

Xutan Peng @Pzoom522

Mark Stevenson @drmarkstevenson

862 Followers 315 Following Senior Lecturer at @shefcompsci Member of @SheffieldNLP Natural Language Processing, Text Analytics, Data Science, Artificial Intelligence

Alex Robey @AlexRobey23

618 Followers 849 Following Ph.D. student at @Penn studying robust machine learning. Formerly @GoogleAI, @Livermore_Lab | B.S. & B.A. from @swarthmore

Xinlei He @AllenXinleiHe

523 Followers 521 Following PhD student @CISPA working on trustworthy machine learning.

AIPanic @AIPanic

526 Followers 0 Following AI safety & jailbreaking as a hobby Looking for models to redteam or other safety-related stuff DMs Open

CISPA @CISPA

5K Followers 427 Following CISPA – Helmholtz Center for Information Security, an international research center for IT Security and privacy located in Saarbruecken.

Sahar Abdelnabi 🍉 @sahar_abdelnabi

584 Followers 462 Following She/her. AI Security Researcher at Microsoft Security Response Center (MSRC) | prev. PhD @CISPA | Neurodivergent 🧠🦋 | peace for all #CeasefireNOW

World's Most Aggravat.. @badedgecases

8K Followers 0 Following THIS WEEK ON "WORLD'S MOST AGGRAVATING EDGE CASES" by @qntm

prisec_ml @prisec_ml

727 Followers 32 Following Interest Group/Meet-Up on Security and Privacy in Machine Learning (PriSec-ML).

Dawn Song @dawnsongtweets

29K Followers 840 Following Professor in Computer Science at UC Berkeley; Research in AI, Security, Blockchain; Serial entrepreneur

Yoon Baek @L0Z1K

101 Followers 89 Following ML Engineer of Corca Join us: https://t.co/nkWLTUzHmZ LLM Newsletter: https://t.co/ZSre54Tmwq

Ahmed Salem @AhmedGaSalem

199 Followers 136 Following Security Researcher at Microsoft Security Response Center (MSRC). Previously a PhD Candidate at @CISPA

Adel Elmahdy 🇵🇸 @adel_elmahdy

581 Followers 2K Following PhD Candidate @UMNews in Information Theory & Machine Learning | Dabbling in NLP, Retrieval & LLMs | Prev. @MSFTResearch, @Amplitude_HQ & @Vectara | 🇯🇵🇪🇬

Gang Wang @ffmagicbean

2K Followers 2K Following Prof @ UIUC; PhD from UCSB; Researching in Security and Privacy, Data Analytics, and Human Factors

Hongcheng Gao @GaoHongcheng

128 Followers 125 Following NLPer, CS Master Student @UCAS1978 | Former Intern @Tsinghuanlp | focus on LLM and VLM

Zhiyuan Liu @zibuyu9

2K Followers 278 Following Associate Professor @TsinghuaNLP. Research interests include NLP, KG and social computation.

Yangyi Chen @YangyiChen6666

492 Followers 332 Following CS Ph.D. student at UIUC @IllinoisCS, focus on multimodal and large language models.

Joe Lucas @josephtlucas

410 Followers 1K Following

SAI Podcast @SAIpodcast

25 Followers 2 Following The Security & AI Podcast by @nataliepis (@OpenAI Dev Ambassador) and @justicerage (Senior Security Researcher @kaspersky) | We’re on Apple Podcasts and Spotify

Embrace The Red @EmbraceTheRed23

44 Followers 0 Following Learn the hacks, stop the attacks.

Yue Dong @ NeurIPS 20.. @YueDongCS

3K Followers 797 Following Assistant Prof @UCRiverside. PhD from @Mila_Quebec @McGillU. Trustworthy NLP+AI safety & Summarization! Former intern @GoogleAI @MSFTResearch @allen_ai

LaurieWired @lauriewired

30K Followers 205 Following Reverse engineer specializing in cross-platform malware analysis with a focus on mobile threats.

Darknet Diaries @DarknetDiaries

121K Followers 1 Following True stories from the dark side of the Internet. Host @jackrhysider. New episodes released on the first Tuesday of each month. Discord: https://t.co/bZZRR8C59R

Wireshark Foundation @WiresharkNews

16K Followers 41 Following We want to help as many people as possible understand their networks as much as possible. Shared amongst several of the core team, but mostly @GeraldCombs.

Ilia Shumailov🦔 @iliaishacked

1K Followers 792 Following Scientist @GoogleDeepMind, ex JRF @ChCh_Oxford @UniofOxford, ex Fellow @VectorInst, ex PhD @Cambridge_Uni.

shenetworks @shenetworks

71K Followers 881 Following a menace • hacker • shenetworks @ TikTok & YouTube & Twitch (She/Her) “She’s a fake lying guru”- Crusty Twitter Man

Malwarebytes @Malwarebytes

79K Followers 1K Following Protection you can trust. Need support? ⬇️

Evren @evrnyalcin

3K Followers 4K Following red team

Mikolaj Kowalczyk @m1k0ww

62 Followers 191 Following | security guy | exploring AI security | losing sleep at hackathons since 2019 |

YaqiZHANG @Alphatu4｜🏆#Microsoft MVP | Complex System | Author& Founder&Engineer |NerdDiplomat🤗 | Author of 2 Books |Speaker of @pku1898 @penn @ApacheCon

Alphatu🐇 @Alphatu4

HAHWUL @hahwul

10K Followers 224 Following 🔥 Offensive Security Engineer, Rubyist/Crystalist/Gopher and H4cker. Call me Ha-Hul, but you can call me Howl. and he/him

Thomas Wolf @Thom_Wolf

68K Followers 4K Following Co-founder and CSO @HuggingFace - open-source and open-science

Kevin Poireault @kpoireault

2K Followers 2K Following 🇬🇧 Reporter @InfosecurityMag 🇫🇷 Co-👶 @TeknolojiaNews • 👶 @Coupe_Circuit #cybersecurity #internetmonitoring #digitalrights | 🌍 ⚽🥊

ColdwaterQ (@coldwate.. @ColdwaterQ

122 Followers 75 Following Focused on Threat Research with an emphasis in AI and ML technologies. https://t.co/KfdoJc8vtl

AI Safety Papers @safe_paper

654 Followers 86 Following Discovering exciting new research on Arxiv is one of my favorite pastimes!

LaurieWired @lauriewired

4 days ago

9 99 1K 51K 78

Mihai Maruseac @mihaimaruseac

6 days ago

@moyix I'm not sure if safetensors supports Lambda layers (wrappers around custom code). So they might be ok

1 0 2 142 0

Nanna Inie @NannaInie

6 days ago

@PengfeiHePower There is not! This is the closest thing I’ve seen to a comprehensive overview:

LLM Security @llm_sec

4 weeks ago

Against The Achilles' Heel: A Survey on Red Teaming for Generative Models 🌶️ "Our extensive survey, which examines over 120 papers, introduces a taxonomy of fine-grained attack strategies grounded in the inherent capabilities of language models. Additionally, we have developed…

3 11 50 5K 38

0 0 1 33 0

Katan'Hya @KatanHya

6 days ago

Oh look, they invented me

LLM Security @llm_sec

4 weeks ago

Curiosity-driven Red-teaming for Large Language Models "Recent works automate red teaming by training a separate red team LLM with reinforcement learning (RL) to generate test cases that maximize the chance of eliciting undesirable responses from the target LLM. However, current…