Nils Feldhus (@nfelnlp)'s Twitter Profile
Nils Feldhus

@nfelnlp

Post-doctoral Researcher at BIFOLD / TU Berlin interested in interpretability and analysis of language models. Guest researcher at DFKI Berlin.

ID: 876490374642106374

Website: https://nfelnlp.github.io/

Joined: 18-06-2017 17:22:44

88 Tweets

235 Followers

387 Following

Inseq (@inseqlib)

Inseq v0.5 is finally out! 🐛 New tutorial, distributed and 4-bit quantized models, easier & better contrastive attribution, and more! 🎉 Thanks to Daniel Scalena, Giuseppe Attanasio, and all other contributors! Find out more in the release notes 👀 github.com/inseq-team/ins…

Inseq (@inseqlib)

Value Zeroing, a faithful approach for analyzing context mixing in Transformers, is now available on Inseq main branch for all Hugging Face text generation models! 🔀 🔍Paper introducing VZ: aclanthology.org/2023.eacl-main… 🐛VZ in Inseq: tinyurl.com/inseq-vz

Abhilasha Ravichander (@lasha_nlp)

Looking for potential emergency reviewers for submissions in Interpretability and Model Analysis/NLP Applications! Topics include: LLM Hallucination, Alignment, Privacy. Please reach out if you have the bandwidth to help!🙏 #NLProc #ACL2024

Nils Feldhus (@nfelnlp)

Thanks a lot to all emergency reviewers who helped fill in the gaps for the #ARR February 2024 cycle! 🫶 We're good to go for the author response period. x.com/nfelnlp/status…

BIFOLD (@bifoldberlin)

New open #phd position: Contribute to the "FakeXplain - Development of transparent and meaningful explanations in the disinformation detection context" project. Research Assistant - salary grade E 13 TV-L Berliner Hochschulen jobs.tu-berlin.de/en/job-posting…

Inseq (@inseqlib)

Inseq v0.6 is out now on PyPI! 🔥 New CLI command for context attribution (Gabriele Sarti), new perturbation-based methods by Hosein Mohebbi & Cass Zhixue, and optimizations incl. multi-GPU support! ⚡️ Huge shoutout to our contributors! ❤️ Release notes ⬇️ github.com/inseq-team/ins…

ACLRollingReview (@reviewacl)

If you haven't been invited to review for ARR 2024 June but are interested in helping us, please fill out this form by June 19: forms.office.com/pages/response…

BlackboxNLP (@blackboxnlp)

The submission deadline (15 Aug) for BlackboxNLP is slowly approaching! We're very excited to see your approaches to open up the black box 🤩 The submission portal has now been opened on OpenReview: openreview.net/group?id=EMNLP…

Nils Feldhus (@nfelnlp)

Presenting my poster at INLG 2025 today on political bias evaluation assessing sycophancy in (German-language) LLMs: ACL Anthology: aclanthology.org/2024.inlg-main… This paper resulted from the great Bachelor thesis of Maximilian Bleick co-supervised with Aljoscha Burchardt and Sebastian Möller.

NAACL HLT 2025 (@naaclmeeting)

📢 NAACL needs Reviewers & Area Chairs! 📝 If you haven't received an invite for ARR Oct 2024 & want to contribute, sign up by Oct 22nd! ➡️AC form: forms.office.com/r/8j6jXLfASt ➡️Reviewer form: forms.office.com/r/cjPNtL9gPE Please RT 🔁 and help spread the word! 🗣️ #NLProc ACLRollingReview

Laura Kopf (@lkopf_ml)

🔍 When do neurons encode multiple concepts? We introduce PRISM, a framework for extracting multi-concept feature descriptions to better understand polysemanticity. 📄 Capturing Polysemanticity with PRISM: A Multi-Concept Feature Description Framework arxiv.org/abs/2506.15538 🧵

Laura Kopf (@lkopf_ml)

Happy to share that our PRISM paper has been accepted at #NeurIPS2025 🎉 In this work, we introduce a multi-concept feature description framework that can identify and score polysemantic features. 📄 Paper: arxiv.org/abs/2506.15538 #NeurIPS #MechInterp #XAI