Anirudh Buvanesh (@anirudhbuvanesh)'s Twitter Profile
Anirudh Buvanesh

@anirudhbuvanesh

ID: 1580609992784113665

13-10-2022 17:23:16

4 Tweets

5 Followers

127 Following

Laurent Charlin (@lcharlin)

Introducing a framework for end-to-end discovery of data structures: no predefined algorithms or hand-tuning needed. Work led by Omar Salemohamed. More details below. arxiv.org/abs/2411.03253

Ankur Sikarwar (@sikarwar_ank)

Thrilled to share our new work EARL 🚀 1️⃣ An AR + RL image editing model that outperforms diffusion baselines w/ 5x less data. 2️⃣ First systematic SFT vs RL study in image editing → RL post-training shines on complex edits where paired data is scarce. See thread for details 👇

Milad Aghajohari (@maghajohari)

Introducing linear scaling of reasoning: The Markovian Thinker. Reformulate RL so thinking scales O(n) compute, not O(n^2), with O(1) memory, architecture-agnostic. Train R1-1.5B into a markovian thinker with 96K thought budget, ~2X accuracy 🧵

Johan S. Obando (@johanobandoc)

1/3 🥳 Excited to share our new paper ‘Simplicial Embeddings Improve Sample Efficiency in Actor–Critic Agents’! Project your features onto a product of simplices → sparse, stable reps, stronger grads, faster learning. 🧵 For more details, check out Pablo’s thread 👇

Jatin Prakash (@bicycleman15)

New paper alert 🚨 What if I told you there is an architecture that provides a _knob_ to control quality-efficiency trade-offs directly at test-time? Introducing Compress & Attend Transformers (CATs) that provide you exactly this! 🧵 (1/n) 👇

Muqeeth (@muqeeth10)

New preprint! Learning Robust Social Strategies with Large Language Models. We apply multi-agent RL finetuning to train LLMs that achieve cooperative and non-exploitable behavior in social dilemmas for the first time. 📄 arxiv.org/abs/2511.19405 🧵 ⬇️ (1/8)
