Moo Jin Kim (@moo_jin_kim)'s Twitter Profile
Moo Jin Kim

@moo_jin_kim

CS PhD student @Stanford | Research Intern @NVIDIA | AI/ML & Robotics

ID: 1518627093197692928

Link: https://moojink.com

Joined: 25-04-2022 16:24:57

43 Tweets

1.1K Followers

99 Following

Moo Jin Kim (@moo_jin_kim):

Can we train VLAs to think about what to do next—visually—before executing tasks? In this work led by Qingqing Zhao, we found that *visual* chain-of-thought reasoning enhances policy success rates + enables VLAs to leverage unlabeled video data during pretraining! #CVPR2025

Fahim Tajwar (@fahimtajwar10):

RL with verifiable reward has shown impressive results in improving LLM reasoning, but what can we do when we do not have ground truth answers? Introducing Self-Rewarding Training (SRT): where language models provide their own reward for RL training! 🧵 1/n
