Post-doctoral Researcher at BIFOLD / TU Berlin interested in interpretability and analysis of language models. Guest researcher at DFKI Berlin.
nfelnlp.github.io
Berlin, Germany
Joined June 2017
Happy to share that our PRISM paper has been accepted at #NeurIPS2025 🎉
In this work, we introduce a multi-concept feature description framework that can identify and score polysemantic features.
📄 Paper: arxiv.org/abs/2506.15538 #NeurIPS #MechInterp #XAI
🔍 When do neurons encode multiple concepts?
We introduce PRISM, a framework for extracting multi-concept feature descriptions to better understand polysemanticity.
📄 Capturing Polysemanticity with PRISM: A Multi-Concept Feature Description Framework
arxiv.org/abs/2506.15538
🧵
Presenting my poster at @inlgmeeting today on political bias evaluation assessing sycophancy in (German-language) LLMs:
ACL Anthology: aclanthology.org/2024.inlg-main…
This paper resulted from the great Bachelor's thesis of Maximilian Bleick, co-supervised with @albu and Sebastian Möller.
The submission deadline (15 Aug) for BlackboxNLP is slowly approaching! We're very excited to see your approaches to open up the black box 🤩
The submission portal has now been opened on OpenReview:
openreview.net/group?id=EMNLP…
If you haven't been invited to review for ARR 2024 June but are interested in helping us, please fill out this form by June 19: forms.office.com/pages/response…
Excited to share our paper, "Symmetric Dot-Product Attention for Efficient Training of BERT Language Models," accepted at #ACL2024 Findings. This is joint work with Malte Ostendorff, Leonhard Hennig, and Georg Rehm.
arXiv: arxiv.org/abs/2406.06366
Github: github.com/mcrts/ACL2024-…
@InseqLib v0.6 is out now on PyPI! 🔥
New CLI command for context attribution (@gsarti_), new perturbation-based methods by @hmohebbi75 & @casszzx, and optimizations incl. multi-GPU support! ⚡️
Huge shoutout to our contributors! ❤️
Release notes ⬇️
github.com/inseq-team/ins…
New open #phd position: Contribute to the "FakeXplain - Development of transparent and meaningful explanations in the disinformation detection context" project. Research Assistant, salary grade E 13 TV-L Berliner Hochschulen
jobs.tu-berlin.de/en/job-posting…
Thanks a lot to all emergency reviewers who helped fill in the gaps for the #ARR February 2024 cycle! 🫶
We're good to go for the author response period.
Looking for potential emergency reviewers for submissions in Interpretability and Model Analysis/NLP Applications! Topics include: LLM Hallucination, Alignment, Privacy.
Please reach out if you have the bandwidth to help! 🙏 #NLProc #ACL2024
@InseqLib v0.5 is finally out! 🐛 New tutorial, distributed and 4-bit quantized models, easier & better contrastive attribution, and more! 🎉 Thanks to @daniel_sc4, @peppeatta and all other contributors!
Find out more in the release notes 👀 github.com/inseq-team/ins…