Arianna Bisazza (@ariannabisazza)'s Twitter Profile
Arianna Bisazza

@ariannabisazza

Associate Prof #NLProc | Find me on the other platform

ID: 882521793239691264

Link: http://www.cs.rug.nl/~bisazza

Joined: 05-07-2017 08:49:26

375 Tweets

1.1K Followers

266 Following

Slator (@slatornews)

👉 slator.ch/WordLevelQEAIT… Word-level quality estimation promises to help post-editors work more efficiently, but does it deliver? 🤔 A new study finds that while highlights may improve quality, ✅ they don’t always speed up editing ⏳ — and many post-editors find them

Arianna Bisazza (@ariannabisazza)

RAG is a powerful way to improve LLMs' answering abilities across many languages. But how do LLMs deal with multilingual contexts? Do they answer consistently when the retrieved info is provided to them in different languages? Joint work w/ Jirui Qi @EMNLP25 ✈️ & Raquel Fernández. See thread! ⤵️

Arianna Bisazza (@ariannabisazza)

Applying interpretability techniques to speech LMs is far from being a solved problem! Read why in Gaofei Shen’s paper, fruit of a nice collaboration w/ Hosein Mohebbi afra alishahi Grzegorz Chrupała 🇪🇺🇺🇦 where I keep learning interesting stuff about speech and SLMs! :-)

Arianna Bisazza (@ariannabisazza)

Large Reasoning Models are raising the bar for answer accuracy & transparency, but how does that work in multilingual settings? Can LRMs reason in your language, and what does that entail? See new preprint led by Jirui Qi & Shan Chen!

Arianna Bisazza (@ariannabisazza)

One step further in our quest to bring interpretability techniques to the service of MT end users: Are uncertainty & model-internals based metrics a viable alternative to supervised word-level quality estimation? New paper w/ Gabriele Sarti Vilém Zouhar #EMNLP Malvina Nissim!

Jirui Qi (@jirui_qi)

[1/2] Heading to #EMNLP2025 to present our work on multilingual reasoning. (Fri Nov 7, 12:30-13:30) We analyze the trade-off between controlling reasoning languages and accuracy. We also explore mitigations like prompt hack, post-train (and GRPO🤩) for this issue. Come say hi!

Jirui Qi (@jirui_qi)

1/ Multilinguality & RL folks: Previously, we found LMs often fail to produce reasoning traces in the user's language; prompting/SFT helps, but hurts accuracy. (To be presented on Fri Nov 7, 12:30-13:30 #EMNLP2025 ) ⚠️ More importantly, we already tested an RL fix! Thread below.

Gabriele Sarti (@gsarti_)

Presenting today our work "Unsupervised Word-level Quality Estimation Through the Lens of Annotator (Dis)agreement" at the #EMNLP2025 Machine Translation morning session (Room A301, 11:45 China time). See you there! 🤗