Yunjae Won (@yunjae_won_) 's Twitter Profile
Yunjae Won

@yunjae_won_

MS+PhD @kaist_ai Language & Knowledge Lab
Research Interests: Preference Optimization, Continual Learning, and LLMs.
Also a huge fan of Jazz, Rock, and Fusion.

ID: 1927184020107825152

Link: https://yunjae-won.github.io/ · Joined: 27-05-2025 02:04:24

27 Tweets

38 Followers

87 Following

Yunhao (Robin) Tang (@robinphysics) 's Twitter Profile Photo

Online interaction is probably a defining property of RL. But with the rise of offline algorithms, it is not clear whether the “online” bit of RL is necessary for RLHF.

We hypothesis-test the causes of the performance gap between online and offline alignment. arxiv.org/pdf/2405.08448…

Details in🧵
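
The “online bit” in question is the data-collection loop. A minimal sketch with hypothetical helper names (not from the paper), contrasting learning from a fixed preference dataset with learning from fresh samples drawn from the current policy:

```python
# Minimal sketch (hypothetical helpers) of what separates "offline" from
# "online" preference alignment: where the training pairs come from.

def offline_alignment(policy, preference_dataset, optimize_step):
    # Offline: preference pairs were collected once, often from another policy;
    # the current policy never generates the data it learns from.
    for prompt, chosen, rejected in preference_dataset:
        optimize_step(policy, prompt, chosen, rejected)
    return policy

def online_alignment(policy, prompts, annotate, optimize_step):
    # Online: each update uses fresh samples from the *current* policy,
    # so the training distribution tracks the policy as it improves.
    for prompt in prompts:
        response_a = policy.sample(prompt)
        response_b = policy.sample(prompt)
        chosen, rejected = annotate(prompt, response_a, response_b)
        optimize_step(policy, prompt, chosen, rejected)
    return policy
```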
Ruizhe Shi (@smellycat_zzz) 's Twitter Profile Photo

Two-stage RLHF or one-stage DPO: Which one is better for learning from preferences?

Equal under strong assumptions, but representation differences break the tie. Our paper reveals their fine-grained performance gaps under various conditions.

paper: arxiv.org/abs/2505.19770
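
For context, the one-stage objective referred to here is the standard DPO loss, which replaces the reward-model-then-RL pipeline of two-stage RLHF with a single pairwise objective against a frozen reference model. A minimal generic PyTorch sketch (not the paper's code):

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_logps_chosen, policy_logps_rejected,
             ref_logps_chosen, ref_logps_rejected, beta=0.1):
    """Standard DPO objective: push up the policy/reference log-prob ratio of
    the chosen response relative to the rejected one, with no explicit reward
    model or RL stage (contrast with two-stage RLHF)."""
    chosen_ratio = policy_logps_chosen - ref_logps_chosen
    rejected_ratio = policy_logps_rejected - ref_logps_rejected
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()
```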
Séb Krier (@sebkrier) 's Twitter Profile Photo

Google DeepMind just put out a Lean 4 repo of formalized open math problems. We need more contributions to the database! github.com/google-deepmin…
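
For readers unfamiliar with the format, a contribution to such a database is typically an open problem *stated* (not proved) in Lean. A hypothetical example, not taken from the actual repo and not necessarily matching its conventions, with `sorry` standing in for the unknown proof:

```lean
-- Hypothetical example: formalizing the statement of an open problem
-- (the twin prime conjecture) in Lean 4 using Mathlib definitions.
-- Only the statement is contributed; `sorry` marks the missing proof.
import Mathlib

theorem twin_prime_conjecture :
    ∀ n : ℕ, ∃ p : ℕ, n ≤ p ∧ Nat.Prime p ∧ Nat.Prime (p + 2) := by
  sorry
```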

Sohee Yang (@soheeyang_) 's Twitter Profile Photo

🚨 New Paper 🧵
How effectively do reasoning models reevaluate their thoughts? We find that:
- Models excel at identifying unhelpful thoughts but struggle to recover from them
- Smaller models can be more robust
- Self-reevaluation ability is far from true meta-cognitive awareness
fly51fly (@fly51fly) 's Twitter Profile Photo

[LG] Probably Approximately Correct Labels
E. J. Candès, A. Ilyas, T. Zrnic [Stanford University] (2025)
arxiv.org/abs/2506.10908
Yunjae Won (@yunjae_won_) 's Twitter Profile Photo

How does the loss of external context affect a language model's ability to learn to ground its responses? Check out our latest work led by hyunji amy lee, where we introduce CINGS, a simple training method that significantly improves grounding in both text and vision-language models.

Joel Jang (@jang_yoel) 's Twitter Profile Photo

🚀 GR00T Dreams code is live! NVIDIA GEAR Lab's open-source solution for robotics data via video world models. Fine-tune on any robot, generate 'dreams', extract actions with IDM, and train visuomotor policies with LeRobot datasets (GR00T N1.5, SmolVLA). github.com/NVIDIA/GR00T-D…
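
A rough sketch of the pipeline as described in the tweet, with entirely hypothetical function names rather than the GR00T-Dreams repo's actual API:

```python
# Hedged sketch of the described workflow (hypothetical names, not the repo's API):
# 1) fine-tune a video world model on the target robot,
# 2) generate synthetic "dream" rollouts,
# 3) label them with actions via an inverse dynamics model (IDM),
# 4) train a visuomotor policy on the resulting dataset.

def build_policy_from_dreams(world_model, robot_videos, idm, policy, prompts):
    world_model.finetune(robot_videos)                    # adapt to the robot embodiment
    dreams = [world_model.generate(p) for p in prompts]   # synthetic video rollouts
    labeled = [(frames, idm.infer_actions(frames)) for frames in dreams]
    policy.train(labeled)                                 # visuomotor policy training
    return policy
```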

Scott Geng (@scottgeng00) 's Twitter Profile Photo

🤔 How do we train AI models that surpass their teachers?

🚨 In #COLM2025: ✨Delta learning ✨makes LLM post-training cheap and easy – with only weak data, we beat open 8B SOTA 🤯

The secret? Learn from the *differences* in weak data pairs!

📜 arxiv.org/abs/2507.06187

🧵 below
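
One hedged reading of “learning from differences” is that the training signal comes from the relative gap between two weak responses rather than from supervision stronger than the student. A sketch under that assumption, with hypothetical names (not the paper's code):

```python
# Hedged sketch: construct pairs where one weak response is systematically
# (slightly) better than the other, so the *delta* between them carries the
# preference signal. Model and function names are hypothetical.

def build_delta_pairs(prompts, stronger_weak_model, weaker_weak_model):
    pairs = []
    for x in prompts:
        chosen = stronger_weak_model.generate(x)    # better-but-still-weak response
        rejected = weaker_weak_model.generate(x)    # weaker response
        pairs.append((x, chosen, rejected))
    return pairs

# Such pairs can then be fed to a pairwise objective (e.g., the DPO loss
# sketched earlier), so the student learns from relative differences rather
# than from absolutely strong data.
```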
Jiwoo Hong @ NAACL 2025 (@jiwoohong98) 's Twitter Profile Photo

⁉️Why do reward models suffer from over-optimization in RLHF?

We revisit how representations are learned during reward modeling, revealing “hidden state dispersion” as the key, with a simple fix!
🧵

Meet us at ICML Conference!
📅July 16th (Wed) 11AM–1:30PM
📍East Hall A-B E-2608
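
“Hidden state dispersion” and the proposed fix are defined in the paper; as a rough illustration only, one natural way to quantify how spread out a reward model's penultimate representations are is their mean distance from the batch centroid, which could then be monitored during reward-model training:

```python
import torch

def hidden_state_dispersion(hidden_states: torch.Tensor) -> torch.Tensor:
    # Illustrative statistic only (not necessarily the paper's definition):
    # hidden_states has shape (batch, dim), e.g. final-layer features of a
    # batch of responses scored by the reward model.
    centroid = hidden_states.mean(dim=0, keepdim=True)
    return (hidden_states - centroid).norm(dim=-1).mean()
```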
Sohee Yang (@soheeyang_) 's Twitter Profile Photo

Our paper "Do Large Language Models Perform Latent Multi-Hop Reasoning without exploiting shortcuts?" will be presented at #ACL2025 today. 📍 Mon 18:00-19:30 Findings Posters (Hall X4 X5) Please visit our poster if you are interested! ✨

hyunji amy lee (@hyunji_amy_lee) 's Twitter Profile Photo

🧐 LLMs aren’t great at judging their own correctness. ❗But history across models helps! We present Generalized Correctness Models (GCMs), which learn to predict correctness based on history, outperforming model-specific correctness models and larger models' self-confidence.
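
A hedged sketch of the setup as described in the tweet: pool historical correctness records across many models and fit one predictor. The feature construction and names here are hypothetical, not the paper's:

```python
# Hedged sketch: train a single correctness predictor on historical
# (question, model, answer, was_it_correct) records pooled across models,
# then score whether a new answer is likely to be correct.
from sklearn.linear_model import LogisticRegression

def train_generalized_correctness_model(history, featurize):
    # history: iterable of (question, model_id, answer, correct) tuples
    # featurize: hypothetical function mapping (question, model_id, answer)
    #            to a numeric feature vector
    X = [featurize(q, m, a) for q, m, a, _ in history]
    y = [int(c) for _, _, _, c in history]
    return LogisticRegression(max_iter=1000).fit(X, y)

def predict_correctness(gcm, featurize, question, model_id, answer):
    # Probability that the given answer is correct, per the pooled history.
    return gcm.predict_proba([featurize(question, model_id, answer)])[0, 1]
```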