Joachim Baumann (@joabaum)'s Twitter Profile
Joachim Baumann

@joabaum

PhD student @UZH_en visiting @MilaNLProc | algorithmic fairness | NLP

ID: 1360270456985690113

Link: https://www.ifi.uzh.ch/en/scg/people/Baumann.html | Joined: 12-02-2021 17:08:15

0 Tweets

70 Followers

675 Following

MilaNLP (@milanlproc)

🎉 The MilaNLP lab is excited to present 15 papers and 1 tutorial at #ACL2025 & workshops! Grateful to all our amazing collaborators, see everyone in Vienna! 🚀

fly51fly (@fly51fly)

[CL] Large Language Model Hacking: Quantifying the Hidden Risks of Using LLMs for Text Annotation
J Baumann, P Röttger, A Urman, A Wendsjö... [Bocconi University & University of Zurich] (2025)
arxiv.org/abs/2509.08825
Sayash Kapoor (@sayashk)

📣New paper: Rigorous AI agent evaluation is much harder than it seems.

For the last year, we have been working on infrastructure for fair agent evaluations on challenging benchmarks.

Today, we release a paper that condenses our insights from 20,000+ agent rollouts on 9
Joachim Baumann (@joabaum)

Cool paper by Eddie Yang, confirming our LLM hacking findings (arxiv.org/pdf/2509.08825):
✓ LLMs are brittle data annotators
✓ Downstream conclusions often flip: *LLM hacking risk* is real!
✓ Bias correction methods can help but have tradeoffs
✓ Use human experts whenever possible

Manoel (@manoelribeiro)

The debate over “LLMs as annotators” feels familiar: excitement, backlash, and anxiety about bad science. My take in a new blogpost is that LLMs don’t break measurement; they expose how fragile it already was.

doomscrollingbabel.manoel.xyz/p/labeling-dat…