Devvrit (@devvrit_khatri)'s Twitter Profile
Devvrit

@devvrit_khatri

GradStudent@UTCompSci. Large Scale ML - Scalability and Efficiency

ID: 1211241136859172864

Link: https://www.devvrit.com/ · Joined: 29-12-2019 11:02:56

104 Tweets

211 Followers

183 Following

Devvrit (@devvrit_khatri):

Had an amazing time on the Delta Podcast about our recent Scaling RL work, future directions, and some fun broader conversation. Thanks for having me on :)

Devvrit (@devvrit_khatri):

This is a great blog explaining the progress in scaling RL and our work. Pretty clear, intuitive, and captures the key takeaways (and limitations :)). Thanks, Nathan Lambert!

Yuhui Xu (@xyh6666):

Congrats to the Meta team on ScaleRL! Interesting to see it adopt a reasoning-length control mechanism similar to what we introduced in Elastic Reasoning, using forced interruptions (e.g., “”) to improve RL training stability. Exciting to see this idea validated at scale!

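For intuition, here is a minimal, hedged sketch of the length-control idea described above: if the model's reasoning span exceeds a token budget, generation is forcibly interrupted with an end-of-thinking marker. The marker name, the budget, and the dummy sampler are illustrative assumptions, not the Elastic Reasoning or ScaleRL implementation.

```python
# Hedged toy of reasoning-length control via forced interruption.
# END_THINK, THINK_BUDGET, and dummy_sampler are illustrative assumptions,
# not the actual tokens or models used in Elastic Reasoning / ScaleRL.
import random

END_THINK = "</think>"   # assumed end-of-thinking marker
THINK_BUDGET = 16        # max "thinking" tokens before forcing the interruption

def dummy_sampler(context):
    # Stand-in for a language model: emits filler tokens, rarely stops on its own.
    return END_THINK if random.random() < 0.02 else f"tok{len(context)}"

def generate_with_budget(prompt_tokens, sampler=dummy_sampler):
    thinking = []
    while len(thinking) < THINK_BUDGET:
        tok = sampler(prompt_tokens + thinking)
        if tok == END_THINK:                  # model ended its reasoning on its own
            return thinking + [END_THINK]
        thinking.append(tok)
    # Budget exhausted: force the interruption so every trace stays bounded,
    # which is the stability benefit the tweet refers to.
    return thinking + [END_THINK]

print(generate_with_budget(["<think>"]))      # the "<think>" prompt token is also illustrative
```
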
Prateek Jain (@jainprateek_):

We are hiring Research Scientists for our Frontiers-of-AI team at Google DeepMind (Bangalore, Singapore, Mountain View). If you're passionate about cutting-edge AI research and building thinking, efficient, elastic, customized, and safe LLMs, we'd love to hear from you. We are

Devvrit (@devvrit_khatri):

As expected, data matters quite a bit. The commonly used RL data, at least in open source, is hardly given much weight. Makes me think that figuring out how to generate synthetic data that is diverse and high quality would be quite useful. Also, “the Reasoning team is almost entirely composed of

Divyat Mahajan (@divyat09):

[1/9] While pretraining data might be hitting a wall, novel methods for modeling it are just getting started! We introduce future summary prediction (FSP), where the model predicts future sequence embeddings to reduce teacher forcing & shortcut learning. 📌Predict a learned
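
To make this concrete, here is a hedged toy sketch of what predicting future sequence embeddings alongside next-token prediction could look like. The architecture, the summary target (mean embedding of the next few tokens), and the loss weighting are all illustrative assumptions, not the FSP method from the paper.

```python
# Hedged toy of the idea behind future summary prediction (FSP): besides the
# usual next-token loss, an auxiliary head predicts an embedding summarizing
# the upcoming tokens. Everything below is an illustrative assumption.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyFSPModel(nn.Module):
    def __init__(self, vocab_size=100, d_model=64, horizon=4):
        super().__init__()
        self.horizon = horizon
        self.embed = nn.Embedding(vocab_size, d_model)
        self.rnn = nn.GRU(d_model, d_model, batch_first=True)
        self.lm_head = nn.Linear(d_model, vocab_size)    # next-token logits
        self.summary_head = nn.Linear(d_model, d_model)  # predicted future summary

    def forward(self, tokens):
        h, _ = self.rnn(self.embed(tokens))              # (B, T, d)
        return self.lm_head(h), self.summary_head(h)

def fsp_loss(model, tokens, aux_weight=0.5):
    """Teacher-forced next-token CE plus an auxiliary future-summary loss."""
    logits, pred_summary = model(tokens)
    lm = F.cross_entropy(
        logits[:, :-1].reshape(-1, logits.size(-1)),
        tokens[:, 1:].reshape(-1),
    )
    # Assumed summary target: mean embedding of the next `horizon` tokens.
    with torch.no_grad():
        emb = model.embed(tokens)
        future = torch.stack(
            [emb[:, t + 1 : t + 1 + model.horizon].mean(dim=1)
             for t in range(tokens.size(1) - model.horizon - 1)],
            dim=1,
        )
    aux = F.mse_loss(pred_summary[:, : future.size(1)], future)
    return lm + aux_weight * aux

model = ToyFSPModel()
batch = torch.randint(0, 100, (2, 16))
loss = fsp_loss(model, batch)
loss.backward()
print(float(loss))
```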

Lewis Tunstall (@_lewtun):

🌟 Introducing General On-Policy Logit Distillation 🌟 Inspired by the latest from Thinking Machines, we extend on-policy distillation to enable ANY teacher to be distilled into ANY student, even if their tokenizers differ! We've added this to TRL so you can now take any pair of

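As background, here is a hedged toy of plain on-policy distillation, the baseline this work extends: the student samples its own continuation, and a reverse KL between the student's and teacher's next-token distributions is minimized along it. It assumes a shared vocabulary with tiny stand-in models, so it does not reproduce the cross-tokenizer generalization announced above.

```python
# Hedged toy of plain on-policy distillation with a shared vocabulary.
# The cross-tokenizer generalization announced in the tweet is NOT reproduced
# here; the tiny GRU models are illustrative stand-ins.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, D = 50, 32

class TinyLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D)
        self.rnn = nn.GRU(D, D, batch_first=True)
        self.head = nn.Linear(D, VOCAB)

    def forward(self, tokens):
        h, _ = self.rnn(self.embed(tokens))
        return self.head(h)                   # (B, T, VOCAB) next-token logits

teacher, student = TinyLM(), TinyLM()
teacher.requires_grad_(False)

# 1) On-policy rollout: sample a continuation from the *student*.
seq = torch.randint(0, VOCAB, (1, 4))         # toy prompt
with torch.no_grad():
    for _ in range(8):
        probs = F.softmax(student(seq)[:, -1], dim=-1)
        seq = torch.cat([seq, torch.multinomial(probs, 1)], dim=1)

# 2) Distillation: reverse KL(student || teacher) on the student's own tokens.
student_logp = F.log_softmax(student(seq)[:, :-1], dim=-1)
teacher_logp = F.log_softmax(teacher(seq)[:, :-1], dim=-1)
loss = (student_logp.exp() * (student_logp - teacher_logp)).sum(-1).mean()
loss.backward()
print(float(loss))
```
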
Arian Khorasani 🦅 (@arian_khorasani):

🚨 New In ML Workshop at the NeurIPS Conference! We're so excited to invite you to the New In ML Workshop (NewInML @ NeurIPS 2025), taking place on Tuesday, December 2nd, 2025, at the San Diego Convention Center! A great opportunity, especially for people who are new to machine learning! Details 🧵

Devvrit (@devvrit_khatri):

LessGoo (again). Free lunch, folks! That’s reason enough to sign up :) Especially given we know how NeurIPS lunches are ;)