Sainbayar Sukhbaatar (@tesatory) 's Twitter Profile
Sainbayar Sukhbaatar

@tesatory

Research Scientist at FAIR @AIatMeta
Research: Memory Networks, Asymmetric Self-Play, CommNet, Adaptive-Span, System2Attention, ...

ID: 142201024

Joined: 10-05-2010 07:16:18

1.1K Tweets

2.2K Followers

316 Following

Tanishq Mathew Abraham, Ph.D. (@iscienceluvr) 's Twitter Profile Photo

SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks

New paper from Meta introduces a new multi-turn LLM agent benchmark and a novel RL algorithm for training multi-turn LLM agents with effective credit assignment over the multiple turns.
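
A minimal sketch of what turn-level credit assignment can look like in code, assuming a generic REINFORCE-style objective rather than the paper's exact algorithm; the tensor names (`turn_logprobs`, `turn_advantages`) are illustrative:

```python
import torch

# Generic sketch of turn-level credit assignment (an assumption, not
# necessarily the paper's exact objective): instead of spreading one episode
# reward uniformly over a multi-turn rollout, each turn's log-likelihood is
# weighted by its own advantage estimate, so a good turn inside a failed
# episode still gets positive credit.

def multi_turn_policy_loss(turn_logprobs: torch.Tensor,
                           turn_advantages: torch.Tensor) -> torch.Tensor:
    # turn_logprobs:   (batch, turns) summed log-probs of the agent's tokens per turn
    # turn_advantages: (batch, turns) per-turn advantage estimates (e.g. from a critic)
    return -(turn_advantages.detach() * turn_logprobs).mean()

# Toy usage: 2 episodes of 3 turns each.
logp = torch.randn(2, 3, requires_grad=True)
adv = torch.tensor([[0.7, -0.2, 1.1], [-0.5, 0.3, 0.0]])
multi_turn_policy_loss(logp, adv).backward()
```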
Sainbayar Sukhbaatar (@tesatory) 's Twitter Profile Photo

Sweet! 🍭 New paper about training multi-step LLM agents. If a DPO-based critic has extra information during training, it can train a better LLM agent.
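
A minimal sketch of the idea in this tweet, assuming a Bradley-Terry/DPO-style preference loss and toy embeddings standing in for encoded text; `TurnCritic`, the privileged `priv` input, and `beta` are illustrative names, not the paper's API:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Sketch of a turn-level critic that conditions on privileged, training-only
# information (here `priv`, e.g. an embedding of the reference solution the
# policy never sees) and is trained with a DPO / Bradley-Terry preference
# loss over pairs of turns.  All module and variable names are illustrative.

class TurnCritic(nn.Module):
    def __init__(self, dim: int = 64):
        super().__init__()
        # Scores a turn given [context ; privileged info ; candidate turn].
        self.mlp = nn.Sequential(nn.Linear(3 * dim, dim), nn.ReLU(),
                                 nn.Linear(dim, 1))

    def forward(self, ctx, priv, turn):
        return self.mlp(torch.cat([ctx, priv, turn], dim=-1)).squeeze(-1)

def critic_preference_loss(critic, ctx, priv, turn_chosen, turn_rejected, beta=0.1):
    # Prefer the turn that led to the better outcome (Bradley-Terry objective).
    margin = critic(ctx, priv, turn_chosen) - critic(ctx, priv, turn_rejected)
    return -F.logsigmoid(beta * margin).mean()

# Toy usage with random embeddings standing in for encoded text.
dim, batch = 64, 8
critic = TurnCritic(dim)
ctx, priv = torch.randn(batch, dim), torch.randn(batch, dim)
good, bad = torch.randn(batch, dim), torch.randn(batch, dim)
critic_preference_loss(critic, ctx, priv, good, bad).backward()
```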

Sainbayar Sukhbaatar (@tesatory) 's Twitter Profile Photo

Got our first "obviously" LLM-generated review. I should have known when we were working on LMs ten years ago that it would come back and bite us 😂. But seriously, reviewing feels broken beyond repair.

Sainbayar Sukhbaatar (@tesatory) 's Twitter Profile Photo

Attention operates at the token level, but sometimes what we're looking for spans multiple tokens. MTA makes it possible to condition attention on multiple tokens. Super fun work!
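
A minimal sketch of how attention could be conditioned on multiple tokens by convolving logits over the (query, key) grid, under assumptions of my own (single head, kernel size 3, one particular causal masking scheme); this is not the paper's implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Sketch: let the attention weight for a (query, key) pair also depend on
# neighbouring positions by convolving the attention logits before the
# softmax, so the model can match patterns that span several tokens.

class KeyQueryConvAttention(nn.Module):
    def __init__(self, dim: int, kernel: int = 3):
        super().__init__()
        self.q_proj = nn.Linear(dim, dim)
        self.k_proj = nn.Linear(dim, dim)
        self.v_proj = nn.Linear(dim, dim)
        self.kernel = kernel
        # Learned 2D filter applied to the grid of attention logits.
        self.conv = nn.Conv2d(1, 1, kernel_size=kernel)

    def forward(self, x):                                          # x: (B, T, D)
        b, t, d = x.shape
        logits = self.q_proj(x) @ self.k_proj(x).transpose(-2, -1) / d ** 0.5
        causal = torch.triu(torch.ones(t, t, dtype=torch.bool, device=x.device), 1)
        # Zero future keys before convolving and pad the query axis causally,
        # so convolved logits never see information from future tokens.
        z = logits.masked_fill(causal, 0.0).unsqueeze(1)            # (B, 1, T, T)
        k = self.kernel
        z = F.pad(z, (k // 2, k // 2, k - 1, 0))                    # keys: both sides; queries: past only
        logits = self.conv(z).squeeze(1)                            # back to (B, T, T)
        logits = logits.masked_fill(causal, float("-inf"))
        return F.softmax(logits, dim=-1) @ self.v_proj(x)

# Toy usage
attn = KeyQueryConvAttention(dim=32)
out = attn(torch.randn(2, 16, 32))                                  # (2, 16, 32)
```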

The AI Timeline (@theaitimeline) 's Twitter Profile Photo

🚨This week's top AI/ML research papers:

- Inference-Time Scaling for Generalist Reward Modeling
- Multi-Token Attention
- Why do LLMs attend to the first token?
- Command A
- LLMs Pass the Turing Test
- Advances and Challenges in Foundation Agents
- PaperBench
- Effectively
TuringPost (@theturingpost) 's Twitter Profile Photo

4 advanced attention mechanisms you should know:

• Slim attention — 8× less memory, 5× faster generation by storing only K from KV pairs and recomputing V.

• XAttention — 13.5× speedup on long sequences via "looking" at the sum of values along diagonal lines in the attention
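
A minimal sketch of the "store only K, recompute V" trick from the Slim attention bullet above, assuming square, invertible key/value projections; a worked check rather than the authors' code:

```python
import torch

# Sketch: K = X @ W_K and V = X @ W_V come from the same hidden states X,
# so if W_K is square and invertible, V can be recovered from the cached K
# as V = K @ (W_K^{-1} @ W_V), and only K needs to be kept in the cache.

torch.manual_seed(0)
torch.set_default_dtype(torch.float64)       # double precision for the check

d = 64
X = torch.randn(10, d)                       # hidden states of 10 cached tokens
W_K, W_V = torch.randn(d, d), torch.randn(d, d)

K = X @ W_K                                  # the only thing kept in the cache
V_true = X @ W_V                             # normally cached too; here just for checking

W_kv = torch.linalg.solve(W_K, W_V)          # W_K^{-1} @ W_V, precomputed once per layer
V_rec = K @ W_kv                             # V recomputed on the fly from K

print(torch.allclose(V_true, V_rec))         # True
```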
Jason Weston (@jaseweston) 's Twitter Profile Photo

Google friends & ex-colleagues -- Google Scholar seems pretty broken 😔. Our most-cited paper from last year, "Self-Rewarding LLMs", has disappeared! Scholar has clustered it with another paper (SPIN) and it isn't in the search results. This is bad for the PhD student & first author

Sainbayar Sukhbaatar (@tesatory) 's Twitter Profile Photo

Really excited to give a talk here after 10 years 🎉 The RAM workshop is about "Reasoning, Attention, Memory", and those topics have had a huge impact on AI over the last decade. So there will be plenty to reflect on and look forward to!