Adithya Bhaskar (@adithyanlp)'s Twitter Profile
Adithya Bhaskar

@adithyanlp

Second-year CS Ph.D. student at Princeton University (@princeton_nlp), previously a CS undergrad at IIT Bombay

ID: 1669231860130660352

Link: http://adithyabh.github.io · Joined: 15-06-2023 06:34:30

39 Tweets

226 Followers

245 Following

Yu Meng @ ICLR'25 (@yumeng0818)'s Twitter Profile Photo

Introducing SimPO: Simpler & more effective Preference Optimization! 🎉

Significantly outperforms DPO w/o a reference model! 📈

Llama-3-8B-SimPO ranked among the top on leaderboards! 💪
✅ 44.7% LC win rate on AlpacaEval 2
✅ 33.8% win rate on Arena-Hard

arxiv.org/abs/2405.14734
🧵 [1/n]
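
For readers skimming the thread, here is a minimal sketch of the reference-free, length-normalized objective the tweet describes, assuming PyTorch; the function name and default values are placeholders of mine, not the authors' released code (see arxiv.org/abs/2405.14734 for the exact formulation):

import torch.nn.functional as F

def simpo_loss(logp_chosen, logp_rejected, len_chosen, len_rejected, beta=2.0, gamma=1.0):
    # logp_* are summed token log-probs of each response under the policy model;
    # dividing by response length gives SimPO's length-normalized implicit reward,
    # with no reference model involved.
    reward_chosen = beta * logp_chosen / len_chosen
    reward_rejected = beta * logp_rejected / len_rejected
    # Push the chosen reward above the rejected one by a target margin gamma.
    return -F.logsigmoid(reward_chosen - reward_rejected - gamma)
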
Zirui "Colin" Wang (@zwcolin) 's Twitter Profile Photo

🤨 Are Multimodal Large Language Models really as good at chart understanding as existing benchmarks such as ChartQA suggest? 🚫 Our CharXiv benchmark suggests NO! 🥇 Humans achieve 80+% correctness. 🥈 Sonnet 3.5 outperforms GPT-4o by 10+ points,

Sadhika Malladi (@sadhikamalladi)'s Twitter Profile Photo

My new blog post argues from first principles how length normalization in preference learning objectives (e.g., SimPO) can facilitate learning from model-annotated preference data. Check it out! cs.princeton.edu/~smalladi/blog…

Dan Friedman (@danfriedman0)'s Twitter Profile Photo

How can we understand neural chatbots in terms of interpretable, symbolic mechanisms? To explore this question, we constructed a Transformer that implements the classic ELIZA chatbot algorithm (with Abhishek Panigrahi and Danqi Chen). Paper: arxiv.org/abs/2407.10949 (1/6)

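As background on what the "classic ELIZA chatbot algorithm" involves, here is a toy, hypothetical sketch of its keyword-plus-template rule matching in Python; it is only a reminder of the symbolic mechanism being studied, not the paper's Transformer construction:

import re

# Two illustrative decomposition/reassembly rules in the spirit of ELIZA.
RULES = [
    (re.compile(r"\bI need (.*)", re.IGNORECASE), "Why do you need {0}?"),
    (re.compile(r"\bI am (.*)", re.IGNORECASE), "How long have you been {0}?"),
]

def eliza_reply(utterance):
    # Apply the first matching rule; otherwise fall back to a stock prompt.
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            return template.format(match.group(1))
    return "Please tell me more."

print(eliza_reply("I am stuck"))  # -> "How long have you been stuck?"
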
Adithya Bhaskar (@adithyanlp)'s Twitter Profile Photo

I'll be at ACL 2024! I'd love to chat about interpretability, preference optimization, the science of LMs, or any other NLP topics -- feel free to reach out! Oh, and I'll present The Heuristic Core (arxiv.org/abs/2403.03942) both as an oral (Aug 13, 10:30) and a poster (Aug 12, 14:00).

Noam Razin (@noamrazin)'s Twitter Profile Photo

Past work observed that DPO often decreases the probability of preferred responses. So where does the probability go? 🧐

We investigate the causes for this counter-intuitive phenomenon and show that it can lead to surprising failures in alignment!

📰 arxiv.org/abs/2410.08847
🧵
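
For context on why this can happen, here is a minimal sketch of the standard DPO loss (Rafailov et al., 2023), assuming PyTorch and variable names of my choosing: the objective only constrains the gap between the chosen and rejected log-ratios, so the probability of the preferred response itself is free to decrease as long as the rejected one decreases faster.

import torch.nn.functional as F

def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    # Implicit rewards: policy log-probs relative to the frozen reference model.
    chosen = beta * (logp_chosen - ref_logp_chosen)
    rejected = beta * (logp_rejected - ref_logp_rejected)
    # Only the margin is optimized; both terms can drop together.
    return -F.logsigmoid(chosen - rejected)
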
Tyler Zhu (@tyleryzhu)'s Twitter Profile Photo

Have you ever wondered why we don't use multiple visual encoders for VideoLLMs? We thought the same!

Excited to announce our latest work MERV, on using Multiple Encoders for Representing Videos in VideoLLMs, outperforming prior works with the same data. 🧵
Xindi Wu (@cindy_x_wu)'s Twitter Profile Photo

Want to train large vision-language models but drowning in data? arxiv.org/abs/2501.00654 Introducing ICONS - we demonstrate how to select only 20% of training samples while maintaining 98.6% of the performance, and 60% of training samples to achieve 102.1% of the performance.

Xindi Wu (@cindy_x_wu)'s Twitter Profile Photo

Introducing COMPACT: COMPositional Atomic-to-complex Visual Capability Tuning, a data-efficient approach to improve multimodal models on complex visual tasks without scaling data volume. 📦

arxiv.org/abs/2504.21850

1/10
Xi Ye (@xiye_nlp)'s Twitter Profile Photo

🤔 Recent mech interp work showed that retrieval heads can explain some long-context behavior. But can we use this insight for retrieval?
📣 Introducing QRHeads (query-focused retrieval heads) that enhance retrieval

Main contributions:
🔍 Better head detection: we find a