Jared Moore (@jaredlcm)'s Twitter Profile
Jared Moore

@jaredlcm

@jaredlcm.bsky.social
AI Researcher, Writer
Stanford

ID: 874693103306846209

Link: http://jaredmoore.org · Joined: 13-06-2017 18:21:01

78 Tweets

172 Followers

295 Following

Harvey Yiyun Fu (@harveyiyun)'s Twitter Profile Photo

LLMs excel at finding surprising “needles” in very long documents, but can they detect when information is conspicuously missing?

🫥AbsenceBench🫥 shows that even SoTA LLMs struggle on this task, suggesting that LLMs have trouble perceiving “negative space” in documents.

paper:
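The AbsenceBench task described above can be illustrated with a minimal sketch: delete a few lines from a document, then ask whether a model's guesses recover the omitted lines. This is my own illustrative construction, assuming a line-level omission setup; the function names and recall-style scoring are hypothetical, not taken from the paper.

```python
import random


def make_absence_example(lines, n_omit=3, seed=0):
    """Build an absence-detection example: remove n_omit lines from a
    document and keep the removed lines as the gold answer."""
    rng = random.Random(seed)
    omitted_idx = set(rng.sample(range(len(lines)), n_omit))
    modified = [line for i, line in enumerate(lines) if i not in omitted_idx]
    gold = [lines[i] for i in sorted(omitted_idx)]
    return modified, gold


def absence_recall(predicted, gold):
    """Fraction of truly omitted lines that the model identified."""
    return len(set(predicted) & set(gold)) / len(gold)
```

In use, `modified` (plus the original document) would be shown to the model, and its list of suspected-missing lines scored with `absence_recall`; the tweet's claim is that even SoTA models score poorly on exactly this kind of "negative space" query.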
Jared Moore (@jaredlcm)'s Twitter Profile Photo

I'll be presenting this paper as a poster (number 56) next Wednesday from 4:30 to 6:30 at Conference on Language Modeling. Please reach out if you'd like to chat about this or any of my other work in Montreal!

Taylor Sorensen (@ma_tay_)'s Twitter Profile Photo

Did you know that LLMs suffer from serious mode collapse?

For example, if you ask models to tell you a joke, they almost always tell you the same joke. This is true across samples and even across model families!

Why does this happen? Can we improve it?

x.com/artetxem/statu…
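Mode collapse of the kind described above is easy to quantify once you have repeated samples: count how many distinct outputs appear and how much probability mass the single most common output captures. The sketch below is a hypothetical diversity summary of my own, not the measurement used in the linked thread.

```python
from collections import Counter


def mode_collapse_stats(samples):
    """Summarize diversity of repeated model samples.

    Returns the fraction of distinct outputs and the share of samples
    taken by the single most frequent output (1.0 = total collapse).
    """
    counts = Counter(samples)
    n = len(samples)
    return {
        "distinct_frac": len(counts) / n,
        "top_share": counts.most_common(1)[0][1] / n,
    }
```

Sampling "tell me a joke" many times from a collapsed model would yield a `top_share` near 1.0; a diverse sampler would drive `distinct_frac` toward 1.0 instead.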
Taylor Sorensen (@ma_tay_)'s Twitter Profile Photo

🤖➡️📉 Post-training made LLMs better at chat and reasoning—but worse at distributional alignment, diversity, and sometimes even steering(!)

We measure this with our new resource (Spectrum Suite) and introduce Spectrum Tuning (method) to bring them back into our models! 🌈

1/🧵