Gabriel Stanovsky
@gabistanovsky
Assistant Professor at @CseHuji
ID: 792053594
https://gabrielstanovsky.github.io 30-08-2012 17:36:20
242 Tweets
752 Followers
262 Following
Ever wondered how Transformers refine their top-k predictions over their layers? 📊 Is there an order to the madness? Come find out at my poster presentation tomorrow at ICML Conference 📍East Exhibition Hall E-2512, 11:00-13:30
At #ACL2025 and not sure what to do next? GEM 💎² is the place to be for awesome talks on the future of LLM evaluation. Come hear Gabriel Stanovsky, Eliya Habba, Leshem (Legend) Choshen 🤖🤗 and others rethink what it means to actually evaluate LLMs beyond accuracy and vibes. Thursday @ Hall C!
Very pleased that "Trust me I'm Wrong" was accepted to EMNLP 2025 findings! Trust me I'm Wrong shows that LLMs can hallucinate with high certainty even when they know the correct answer! Check our latest work with Itay Itzhak, Fazl Barez, Gabriel Stanovsky, and Yonatan Belinkov.
Happy to share that our Image Captioning evaluation survey was accepted to TACL! I will be presenting the paper at EMNLP 2025.
Old news: Single-prompt eval is unreliable🤯 New news: PromptSuite🌈 - an easy way to augment your benchmark with thousands of paraphrases ➡️ robust eval, zero sweat! - Works on any dataset! - Python API + web UI Eliya Habba, Gili Lior, Gabriel Stanovsky eliyahabba.github.io/PromptSuite/
Our 🌈 PromptSuite paper has been accepted to #EMNLP2025 🇨🇳 (System Demonstrations)! 🎉 🌈 PromptSuite is a flexible framework for generating thousands of prompt variations per instance - enabling robust, task-agnostic evaluation of LLMs. Noam Dahan, Gili Lior, Gabriel Stanovsky