Saurabh Garg (@saurabh_garg67) 's Twitter Profile
Saurabh Garg

@saurabh_garg67

@thinkymachines | prev/ Researcher @MistralAI; PhD @mldcmu; CS @iitbombay (undergrad); Collab @GoogleAI @awscloud @apple

ID: 2942501034

Link: https://saurabhgarg1996.github.io

Joined: 25-12-2014 05:50:26

234 Tweets

1.1K Followers

627 Following

Guillaume Lample @ NeurIPS 2024 (@guillaumelample) 's Twitter Profile Photo

Nice review (in French) of le Chat: youtu.be/PNGV9o_tsmQ?si… where Pixtral Large easily answers questions about complex PDFs (~100 pages, scanned, 90° rotated) that ChatGPT and Claude are unable to process.

Michael Oberst (@michaeloberst) 's Twitter Profile Photo

I'm recruiting PhD students for Fall 2025! CS PhD Deadline: Dec. 15th. I work on safe/reliable ML and causal inference, motivated by healthcare applications. Beyond myself, Johns Hopkins has a rich community of folks doing similar work! Come join us!

lmarena.ai (formerly lmsys.org) (@lmarena_ai) 's Twitter Profile Photo

Exciting news from Copilot Arena! The latest Codestral 25.01 release is now topping the Copilot Arena leaderboard (joint #1, +12 points over previous Codestral!). Congrats to Mistral AI🎆 Try out the new model today in the Copilot Arena VSCode extension.

Arthur Mensch (@arthurmensch) 's Twitter Profile Photo

A new model to hasten AI progress. 24B, 81% MMLU, no RL for now! We're super excited to see the latest development in international open-source AI (kudos to Deepseek!), and cannot wait to bring new contributions to it. We're renewing our commitment to using Apache licenses. AI

Devendra Chaplot (@dchaplot) 's Twitter Profile Photo

Today, we are introducing the all new Le Chat: your ultimate AI sidekick for life and work! Now live on mobile! Blog: mistral.ai/en/news/all-ne… Try it on web: chat.mistral.ai Download mobile apps here: apps.apple.com/us/app/le-chat… play.google.com/store/apps/det…

Andrej Karpathy (@karpathy) 's Twitter Profile Photo

This is interesting as a first large diffusion-based LLM. Most of the LLMs you've been seeing are ~clones as far as the core modeling approach goes. They're all trained "autoregressively", i.e. predicting tokens from left to right. Diffusion is different - it doesn't go left to

Saurabh Garg (@saurabh_garg67) 's Twitter Profile Photo

Career Update: Thrilled to share that I’ve joined Thinking Machines! Incredibly grateful and excited to be part of such an amazing team. Let’s build! 🚀

Palash Kala (@kalapolish) 's Twitter Profile Photo

Was fun hosting informal IITB CS get together in SF. We still argue about which hostel is the best 🙃 The oldest person was born in 1992 and the youngest was a decade younger

Lilian Weng (@lilianweng) 's Twitter Profile Photo

Giving your models more time to think before prediction, like via smart decoding, chain-of-thoughts reasoning, latent thoughts, etc, turns out to be quite effective for unblocking the next level of intelligence. New post is here :) “Why we think”: lilianweng.github.io/posts/2025-05-…

Ludwig Schmidt (@lschmidt3) 's Twitter Profile Photo

Very excited to finally release our paper for OpenThoughts! After DataComp and DCLM, this is the third large open dataset my group has been building in collaboration with the DataComp community. This time, the focus is on post-training, specifically reasoning data.

Fartash Faghri (@fartashfg) 's Twitter Profile Photo

Is your AI keeping Up with the world? Announcing #NeurIPS2025 CCFM Workshop: Continual and Compatible Foundation Model Updates When/Where: Dec. 6-7 San Diego Submission deadline: Aug. 22, 2025. (opening soon!) sites.google.com/view/ccfm-neur… #FoundationModels #ContinualLearning

Sukjun (June) Hwang (@sukjun_hwang) 's Twitter Profile Photo

Tokenization has been the final barrier to truly end-to-end language models. We developed the H-Net: a hierarchical network that replaces tokenization with a dynamic chunking process directly inside the model, automatically discovering and operating over meaningful units of data

Mira Murati (@miramurati) 's Twitter Profile Photo

Thinking Machines Lab exists to empower humanity through advancing collaborative general intelligence. We're building multimodal AI that works with how you naturally interact with the world - through conversation, through sight, through the messy way we collaborate. We're

Alexander Kirillov (@_alex_kirillov_) 's Twitter Profile Photo

We have been working hard for the past 6 months on what I believe is the most ambitious multimodal AI program in the world. It is fantastic to see how pieces of a system that previously seemed intractable just fall into place. Feeling so lucky to create the future with this

Rowan Zellers (@rown) 's Twitter Profile Photo

It’s really fun to work with a talented yet small team. Our mission is ambitious - multimodal AI for collaborating with humans, so the best is yet to come! Join us— or fill out the application below if interested!

Saurabh Garg (@saurabh_garg67) 's Twitter Profile Photo

Really excited about our focus on building multimodal AI that collaborates with humans the way humans collaborate with each other. It's been an amazing ~4 months building with a small, talented team. Come join us!