Noy Sternlicht (@noysternlicht)'s Twitter Profile
Noy Sternlicht

@noysternlicht

PhD candidate at @nlphuji | Using NLP to help scientists 📚

ID: 1174288985016999937

Joined: 18-09-2019 11:48:15

5 Tweets

67 Followers

308 Following

Dana Arad 🎗️ (@dana_arad4)

Tried steering with SAEs and found that not all features behave as expected? Check out our new preprint - "SAEs Are Good for Steering - If You Select the Right Features" 🧵

Iddo Yosha (@iddoyosha)

1/5 🚨 New paper alert! StressTest: Can YOUR Speech LM Handle the Stress? Sentence stress = emphasis on words to signal intent, contrast, or new info. We built StressTest, a benchmark for testing stress reasoning. 🗣️💬 Then, meet StresSLM, who finally gets it! Insights & Links 👇

Kevin Lu (@kevinlu4588)

When we "erase" a concept from a diffusion model, is that knowledge truly gone? 🤔 We investigated, and the answer is often 'no'! Using simple probing techniques, the knowledge traces of the erased concept can be easily resurfaced 🔍 Here is what we learned 🧵👇

Esther Shizgal (@esthershizgal)

🇵🇹 Spoke at #DH2025 about Religious Journeys in Holocaust Testimonies (arXiv link in thread). Connecting with researchers using novel computational tools on real-world challenges in the humanities was inspiring! Excited to build on these interdisciplinary methods!

Eliya Habba (@eliyahabba)

Presenting my poster: 🕊️ DOVE - A large-scale multi-dimensional predictions dataset towards meaningful LLM evaluation, Monday 18:00 Vienna, #ACL2025. Come chat about LLM evaluation, prompt sensitivity, and our 250M COLLECTION OF MODEL OUTPUTS!

Asaf Yehudai (@asafyehudai)

🚨 Benchmarks tell us which model is better, but not why it fails. For developers, this means tedious, manual error analysis. We're bridging that gap. Meet CLEAR: an open-source tool for actionable error analysis of LLMs. 🧵👇