Faeze Brahman (@faeze_brh) 's Twitter Profile
Faeze Brahman

@faeze_brh

Postdoc @allen_ai @uw | Ph.D. from UCSC | Former Intern @MSFTResearch, @allen_ai | Researcher in #NLProc, #ML, #AI

ID: 994736527421849602

Link: https://fabrahman.github.io · Joined: 11-05-2018 00:30:44

674 Tweets

1.1K Followers

1.1K Following

Xuhui Zhou (@nlpxuhui) 's Twitter Profile Photo

Hi friends, I will be at #NAACL2025 to present:
🧷 AI-LieDar: a framework to study LLMs navigating truthfulness-utility conflicts in interactions, and we found agents "lie" in goal-driven tasks with truthfulness rates below 50% 🫨
🧷 Sotopia-S4: a demo for our Sotopia
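A rough sketch of the metric behind that claim, purely illustrative (the Turn fields and label set here are my stand-ins, not the AI-LieDar API): run simulated goal-driven interactions, have a judge label each agent turn, and report the fraction judged truthful.

```python
# Hypothetical sketch of a truthfulness-rate metric over simulated
# interactions. All names (Turn, label values) are illustrative.
from dataclasses import dataclass

@dataclass
class Turn:
    speaker: str   # "agent" or "user"
    text: str
    label: str     # "truthful" | "deceptive" | "evasive" (judge-assigned)

def truthfulness_rate(transcripts: list[list[Turn]]) -> float:
    """Fraction of agent turns judged truthful across all simulations."""
    agent_turns = [t for ts in transcripts for t in ts if t.speaker == "agent"]
    if not agent_turns:
        return 0.0
    return sum(t.label == "truthful" for t in agent_turns) / len(agent_turns)

# A rate below 0.5, as reported above, would mean the agent was truthful
# in fewer than half of its goal-driven responses.
```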

Maarten Sap (he/him) (@maartensap) 's Twitter Profile Photo

(((ل()(ل() 'yoav))))👾 yeah we had some debates about what models "lying" really means and whether to use those words; some related terms have been used before, which we discuss in the paper. I like the conclusion you reached, but agree with the fear of wrongly anthropomorphizing / attributing intent
Vishakh Padmakumar (@vishakh_pk) 's Twitter Profile Photo

What does it mean for #LLM output to be novel?
In work w/ John (Yueh-Han) Chen, Jane Pan, Valerie Chen, He He, we argue it needs to be both original and high quality. While prompting tricks trade one for the other, better models (scaling/post-training) can shift the novelty frontier 🧵
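A toy illustration of that framing (not the paper's code): if novelty requires clearing a bar on both originality and quality, a prompting trick that raises one axis while lowering the other cannot move the joint rate.

```python
# Illustrative sketch of "novel = original AND high quality": a generation
# counts as novel only if it clears a threshold on each axis, so trading
# one axis for the other cannot raise the novelty rate.
def novelty_rate(originality: list[float], quality: list[float],
                 o_min: float = 0.5, q_min: float = 0.5) -> float:
    """Fraction of generations that are simultaneously original and high quality."""
    assert len(originality) == len(quality)
    novel = [o >= o_min and q >= q_min for o, q in zip(originality, quality)]
    return sum(novel) / len(novel) if novel else 0.0
```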
Faeze Brahman (@faeze_brh) 's Twitter Profile Photo

Would you trust an AI that chooses deception over truth when faced with conflicting goals?

📅 Check out our poster at #NAACL2025 on April 30 @ 11am, poster session 1, presented by Xuhui Zhou and led by Zhe Su
Ai2 (@allen_ai) 's Twitter Profile Photo

Have questions? We’re an open book!

We’re excited to host an AMA to answer your Qs about OLMo, our family of open language models.

🗓️ When: May 8, 8-10 am PT
🌐 Where: r/huggingface
🧠 Why: Gain insights from our expert researchers

Chat soon!
Shayne Longpre (@shayneredford) 's Twitter Profile Photo

Delighted to see the BigGen Bench paper receive the 🏆 best paper award 🏆 at NAACL HLT 2025!

BigGen Bench introduces fine-grained, scalable, & human-aligned evaluations:

📈 77 challenging, diverse tasks
🛠️ 765 instances w/ ex-specific scoring rubrics
📋 More human-aligned than
Ai2 (@allen_ai) 's Twitter Profile Photo

The story of OLMo, our Open Language Model, goes back to February 2023 when a group of researchers gathered at Ai2 and started planning. What if we made a language model with state-of-the-art performance, but we did it completely in the open? 🧵

Ai2 (@allen_ai) 's Twitter Profile Photo

📢We’re taking your questions now on Reddit for tomorrow’s AMA!

Ask us anything about OLMo, our family of fully-open language models. Our researchers will be on hand to answer them Thursday, May 8 at 8am PST.
Philippe Laban (@philippelaban) 's Twitter Profile Photo

🆕paper: LLMs Get Lost in Multi-Turn Conversation

In real life, people don’t speak in perfect prompts.
So we simulate multi-turn conversations — less lab-like, more like real use.

We find that LLMs get lost in conversation.
👀What does that mean? 🧵1/N
📄arxiv.org/abs/2505.06120
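A minimal sketch of that simulation setup, under my own assumptions about the interface (chat_model is a hypothetical stand-in for any chat-completion callable, not the paper's API): reveal the instruction in shards, one per turn, and compare the final answer against the single-turn baseline.

```python
# Sketch: feed an instruction to a chat model in pieces across turns,
# accumulating history, instead of as one perfect prompt.
from typing import Callable

Message = dict[str, str]  # {"role": ..., "content": ...}

def run_sharded_conversation(
    shards: list[str],
    chat_model: Callable[[list[Message]], str],  # hypothetical callable
) -> list[Message]:
    """Reveal instruction shards one turn at a time, keeping full history."""
    history: list[Message] = []
    for shard in shards:
        history.append({"role": "user", "content": shard})
        reply = chat_model(history)
        history.append({"role": "assistant", "content": reply})
    return history

# Comparing the final assistant reply here against the single-turn answer
# to "\n".join(shards) gives one datapoint on how "lost" the model gets.
```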
Faeze Brahman (@faeze_brh) 's Twitter Profile Photo

One of the trickiest problems in LLM deployment: preventing models from mindlessly reproducing training data while keeping intentional recall capabilities intact.

Our ParaPO approach achieves this "smart memorization" elegantly through post-training preference optimization. 🎯
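A hedged sketch of how such preference pairs could be built (illustrative only; paraphrase is a hypothetical helper, and this is not the ParaPO code): treat the verbatim memorized continuation as the rejected response and a paraphrase of it as the chosen one, then train with any standard preference-optimization objective such as DPO.

```python
# Sketch of preference-pair construction against verbatim reproduction.
# `paraphrase` is a hypothetical callable (str -> str), e.g. an LLM-based
# paraphraser; the pair format matches common DPO-style trainers.
def build_preference_pairs(
    prompts: list[str],
    memorized_continuations: list[str],
    paraphrase,
) -> list[dict]:
    pairs = []
    for prompt, verbatim in zip(prompts, memorized_continuations):
        pairs.append({
            "prompt": prompt,
            "chosen": paraphrase(verbatim),  # same content, new surface form
            "rejected": verbatim,            # mindless copy of training data
        })
    return pairs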
Hyunwoo Kim (@hyunw_kim) 's Twitter Profile Photo

📢I'm thrilled to announce that I’ll be joining @KAIST_AI as an Assistant Professor in 2026, leading the Computation & Cognition (COCO) Lab🤖🧠: coco-kaist.github.io
We'll be exploring reasoning, learning w/ synthetic data, and social agents!
+I'm spending a gap year at NVIDIA✨
Yapei Chang (@yapeichang) 's Twitter Profile Photo

🤔 Can simple string-matching metrics like BLEU rival reward models for LLM alignment?
🔍 We show that given access to a reference, BLEU can match reward models in human preference agreement, and even train LLMs competitively with them using GRPO.
🫐 Introducing BLEUBERI:
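A minimal sketch of the BLEU-as-reward idea, assuming the sacrebleu package (this is not the BLEUBERI implementation): score each sampled completion against the gold reference and feed the scalar into a GRPO-style update.

```python
# Sketch: sentence-level BLEU against a reference as a scalar RL reward.
import sacrebleu

def bleu_reward(completion: str, reference: str) -> float:
    """Sentence BLEU rescaled to [0, 1], usable as an RL reward."""
    return sacrebleu.sentence_bleu(completion, [reference]).score / 100.0

# e.g. rewards = [bleu_reward(c, ref) for c in sampled_completions],
# then normalize to advantages within the group, as in standard GRPO.
```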

Stella Li (@stellalisy) 's Twitter Profile Photo

🤯 We cracked RLVR with... Random Rewards?!
Training Qwen2.5-Math-7B with our Spurious Rewards improved MATH-500 by:
- Random rewards: +21%
- Incorrect rewards: +25%
- (FYI) Ground-truth rewards: +28.8%
How could this even work⁉️ Here's why: 🧵
Blogpost: tinyurl.com/spurious-rewar…
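For intuition, here is what the "random rewards" condition amounts to in an RLVR loop (my paraphrase, not the blog's code): the verifier is replaced by a coin flip, so the reward carries no information about answer correctness.

```python
# Sketch of the spurious-reward conditions. `is_correct` is a hypothetical
# verifier callable (str -> bool); nothing here is the authors' code.
import random

def random_reward(prompt: str, completion: str) -> float:
    """Coin-flip reward, independent of whether the answer is right."""
    return float(random.random() < 0.5)

def incorrect_reward(completion: str, is_correct) -> float:
    """Rewards only WRONG answers."""
    return float(not is_correct(completion))

# That training on such signals still improves MATH-500 suggests the gains
# come from something other than the reward's correctness information.
```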
Jaehun Jung (@jaehunjung_com) 's Twitter Profile Photo

Data curation is crucial for LLM reasoning, but how do we know our dataset isn't overfit to one benchmark and generalizes to unseen distributions? 🤔

𝐃𝐚𝐭𝐚 𝐝𝐢𝐯𝐞𝐫𝐬𝐢𝐭𝐲 is key: when measured correctly, it strongly predicts model generalization in reasoning tasks! 🧵
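One common way diversity is measured is in embedding space; the metric below is an illustrative stand-in rather than necessarily the one the thread proposes: average pairwise cosine distance over example embeddings, where higher means more diverse.

```python
# Sketch: dataset diversity as mean pairwise cosine distance of embeddings.
import numpy as np

def embedding_diversity(embeddings: np.ndarray) -> float:
    """Mean pairwise cosine distance; `embeddings` has shape (n, d), n >= 2."""
    x = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = x @ x.T                        # (n, n) cosine similarities
    n = len(x)
    mean_off_diag = (sims.sum() - np.trace(sims)) / (n * (n - 1))
    return 1.0 - mean_off_diag            # higher = more diverse
```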
Sahil Verma (@sahil1v) 's Twitter Profile Photo

🚨 New Paper! 🚨
Guard models slow, language-specific, and modality-limited?

Meet OmniGuard: it detects harmful prompts across multiple languages & modalities using one approach, with SOTA performance in all 3 modalities, while being 120X faster 🚀

arxiv.org/abs/2505.23856
Saumya Malik (@saumyamalik44) 's Twitter Profile Photo

I’m thrilled to share RewardBench 2 📊— We created a new multi-domain reward model evaluation that is substantially harder than RewardBench, we trained and released 70 reward models, and we gained insights about reward modeling benchmarks and downstream performance!
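A sketch of the usual accuracy metric behind reward-model benchmarks of this kind (my paraphrase of the setup, not RewardBench 2's code): a reward model gets an instance right when it scores the chosen response above every rejected one.

```python
# Sketch: pairwise accuracy of a reward model on chosen-vs-rejected data.
# `score` is a hypothetical callable (prompt, response) -> float.
def rm_accuracy(instances: list[dict], score) -> float:
    correct = 0
    for ex in instances:
        chosen_score = score(ex["prompt"], ex["chosen"])
        if all(chosen_score > score(ex["prompt"], r) for r in ex["rejected"]):
            correct += 1
    return correct / len(instances)
```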
Mohit Iyyer (@mohitiyyer) 's Twitter Profile Photo

Tired of AI slop? Our work on "Frankentexts" shows how LLMs can stitch together random fragments of human writing into coherent, relevant responses to arbitrary prompts. Frankentexts are weirdly creative, and they also pose problems for AI detectors: are they AI? human? More 👇

Jiacheng Liu (@liujc1998) 's Twitter Profile Photo

We enabled OLMoTrace for Tülu 3 models! 🤠

Matched spans are shorter than for OLMo models, bc we can only search in Tülu's post-training data (the base model is Llama). Yet we thought it'd still bring some value.

Try it yourself on the Ai2 playground -- playground.allenai.org
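A toy version of the span-tracing idea (OLMoTrace itself relies on an efficient index over the searchable corpus; this brute-force sketch with a hypothetical pre-built n-gram set is only to show the concept): find maximal token spans of the model output that occur verbatim in the data.

```python
# Sketch: greedy maximal matching of output token spans against a
# (hypothetical) set of all corpus n-grams of length >= min_len.
def matched_spans(output_tokens: list[str],
                  corpus_ngrams: set[tuple[str, ...]],
                  min_len: int = 4) -> list[tuple[int, int]]:
    """Return (start, end) spans whose token n-gram occurs in the corpus."""
    spans = []
    n = len(output_tokens)
    i = 0
    while i < n:
        j = i + min_len
        last_hit = None
        # Extend the span as long as the n-gram still occurs in the corpus.
        while j <= n and tuple(output_tokens[i:j]) in corpus_ngrams:
            last_hit = (i, j)
            j += 1
        if last_hit:
            spans.append(last_hit)
            i = last_hit[1]   # continue past the matched span
        else:
            i += 1
    return spans
```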