Itay Itzhak (@itay_itzhak_)'s Twitter Profile
Itay Itzhak

@itay_itzhak_

NLProc, deep learning, and machine learning. Ph.D. student @TechnionLive and @HebrewU

ID: 1195653141934542848

Link: http://itay1itzhak.github.io
Date: 16-11-2019 10:41:51

133 Tweets

282 Followers

220 Following

Itay Itzhak (@itay_itzhak_)

In Vienna for #ACL2025, and already had my first (vegan) Austrian sausage!

Now hungry for discussing:
– LLMs behavior
– Interpretability
– Biases & Hallucinations
– Why eval is so hard (but so fun)
Come say hi if that’s your vibe too!
BlackboxNLP (@blackboxnlp)

📝 Technical report guidelines are out!

If you're submitting to the MIB Shared Task at #BlackboxNLP, feel free to take a look to help you prepare your report: blackboxnlp.github.io/2025/task/
Tal Haklay (@tal_haklay)

Had my oral presentation at ACL 2025 today!
Big thanks to my collaborators, advisor, parents, and partner - and a special thanks to the “Goodbye Stress” gummies I picked up at the supermarket. Couldn’t have done it without any of you 🙈
Itay Itzhak (@itay_itzhak_)

At #ACL2025 and not sure what to do next? GEM 💎² is the place to be for awesome talks on the future of LLM evaluation. Come hear Gabriel Stanovsky, Eliya Habba, Leshem (Legend) Choshen 🤖🤗 and others rethink what it means to actually evaluate LLMs beyond accuracy and vibes. Thursday @ Hall C!

Sebastian Gehrmann (@sebgehr)

This year's GEM workshop is happening *today* starting at 9am in Vienna at #acl2025 in Hall C. I am looking forward to a day of evaluations.
Enrico Santus (@enricosantus)

I swear I warned all the romantics in the room — especially after the #Coldplay scandal! 😄🎶

If you were there (or wish you had been), tag yourself and your friends in the comments 👇

Bye bye from the #Gem organizers and speakers!

#ACL2025 #ACL2025NLP #GEM2 #LLMs #NLP #Vienna
Tomer Ashuach (@tomerashuach)

🚨 New preprint out!

CRISP: Persistent Concept Unlearning via SAEs
LLMs often encode knowledge we want to remove.

CRISP enables persistent, interpretable, precise unlearning while keeping models useful & coherent - tested on bio & cyber safety tasks🧵👇
📄arxiv.org/abs/2508.13650
Adi Simhi (@adisimhi)

Very pleased that "Trust me I'm Wrong" was accepted to EMNLP 2025 findings!

Trust me I'm Wrong shows that LLMs can hallucinate with high certainty even when they know the correct answer!

Check our latest work with Itay Itzhak, Fazl Barez, Gabriel Stanovsky, and Yonatan Belinkov.
Noam Dahan (@dahan_noam)

Old news: Single-prompt eval is unreliable🤯
New news: PromptSuite🌈 - an easy way to augment your benchmark with thousands of paraphrases ➡️ robust eval, zero sweat!
- Works on any dataset!
- Python API + web UI
Eliya Habba, Gili Lior, Gabriel Stanovsky
eliyahabba.github.io/PromptSuite/

Eliya Habba (@eliyahabba)

Proud to share PromptSuite! 🌈 A flexible framework for generating thousands of prompt variations per instance, enabling robust multi-prompt LLM evaluation across diverse tasks. Python API & web UI included. Check it out: eliyahabba.github.io/PromptSuite/

Dana Arad 🎗️ (@dana_arad4)

Next Tuesday I’ll be giving a talk at MIT CSAIL about two of our recent papers on Sparse Autoencoders for Content Control 🧠✨ If you’re around, come by and say hi! csail.mit.edu/event/saes-con…

Yonatan Belinkov (@boknilev)

Opportunities to join my group in fall 2026:
* PhD applications direct or via ELLIS (ellis.eu/news/ellis-phd…)
* Post-doc applications direct or via the Azrieli Foundation (azrielifoundation.org/fellows/intern…) or the Zuckerman STEM Leadership Program (zuckermanstem.org/ourprograms/po…)