Vibhakar Mohta
@vib2810_
MLE @ Nuro | CMU Robotics '24 | IITKGP '22
ID: 2933740599
https://github.com/vib2810 20-12-2014 15:41:47
19 Tweets
83 Followers
87 Following
Say ahoy to SAILOR ⛵: a new paradigm of *learning to search* from demonstrations, enabling test-time reasoning about how to recover from mistakes w/o any additional human feedback! SAILOR ⛵ outperforms Diffusion Policies trained via behavioral cloning on 5-10x the data!
⛵️ Excited to share SAILOR: a method for *learning to search* with learned world + reward models to plan in the latent space at test time. Unlike behavior cloning, SAILOR recovers from mistakes without any additional data, DAgger corrections, or ground-truth rewards.
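For readers curious what this looks like mechanically, below is a minimal sketch of test-time planning with learned latent world and reward models. It is a generic random-shooting planner under assumed interfaces, not the actual SAILOR implementation; `encoder`, `world_model`, `reward_model`, and `base_policy` are hypothetical stand-ins.

```python
# Illustrative sketch only (not the official SAILOR code): score sampled action
# sequences with a learned latent world model and reward model, then execute the
# first action of the best-scoring rollout.
import torch

def plan_in_latent_space(encoder, world_model, reward_model, base_policy,
                         obs, num_candidates=64, horizon=8):
    """Random-shooting planning in a learned latent space (hypothetical interfaces)."""
    z = encoder(obs).expand(num_candidates, -1)        # latent state, one copy per candidate rollout
    total_reward = torch.zeros(num_candidates)
    first_actions = None
    for t in range(horizon):
        actions = base_policy.sample(z)                # propose actions (e.g., from a BC prior)
        if t == 0:
            first_actions = actions                    # remember the root actions
        z = world_model.step(z, actions)               # imagined transition in latent space
        total_reward += reward_model(z, actions).squeeze(-1)  # accumulate predicted reward
    best = torch.argmax(total_reward)                  # pick the highest-return rollout
    return first_actions[best]
```

Because the search happens entirely in imagination, the planner can consider recovery behaviors at test time without any extra demonstrations.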
My school friend Kaushik's son "Vibhakar Mohta" is an integral part of this team. These guys are doing some phenomenal, path-breaking work in the field of AI. Vibhakar Mohta ✈️ NeurIPS 2025
All hands on deck for SAILOR ⚓️! 📣 SAILOR nets more performance even with 10x less data than Diffusion Policies trained with behavioral cloning. Check out Gokul Swamy's post explaining how we leverage learned models to enable test-time planning and mistake recovery for robots!
Really cool work from Gokul Swamy ✈️ NeurIPS 2025 and co! Sample-efficiency gains from leveraging search in world models are a cool way to enable test-time reasoning for robotics!
Finally get to share one of the coolest codebases we built this past year! We combine search with learned reward and world models to significantly outperform diffusion policies with very little demonstration data. Check out Gokul Swamy's super deep post and download our code today!
The entire team grinded on this, but I have been blown away by the rigor that Arnav Jain ✈️ NeurIPS 2025 and Vibhakar Mohta put into running experiments. And of course Gokul Swamy ✈️ NeurIPS 2025 for masterminding this as always. Very optimistic that this, alongside our sim2real efforts, is the final playbook.
CMU and Cornell researchers released SAILOR, which allows for test-time reasoning and beats Diffusion Policies trained with 5-10x more data! Congrats Gokul Swamy ✈️ NeurIPS 2025 and team x.com/g_k_swamy/stat…
Full episode dropping soon! Geeking out with Gokul Swamy, Arnav Jain, and Vibhakar Mohta on SAILOR: Robust Imitation via Learning to Search gokul.dev/sailor/ Co-hosted by Michael Cho - Rbt/Acc and Chris Paxton
Best Self-Improving Agent: ReviveAgent (Shaurya Rohatgi @ NeurIPS 2025, Varad Pimpalkhute, Vibhakar Mohta). They built an agent that acts as a dedicated maintenance engineer for your codebase. Its custom 'self-reflection' loop allows it to learn from its own failures, getting smarter at resolving and refactoring
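As a rough illustration of the kind of 'self-reflection' loop described above, here is a minimal sketch under assumed interfaces; it is not ReviveAgent's actual code, and `run_attempt` and `reflect_on_failure` are hypothetical stand-ins for the agent's LLM calls.

```python
# Illustrative sketch only: retry a task, feeding notes from each failure back
# into the next attempt so the agent improves across retries.
def self_reflection_loop(task, run_attempt, reflect_on_failure, max_attempts=3):
    """Generic self-reflection loop (hypothetical interfaces)."""
    reflections = []                                       # accumulated lessons from past failures
    result = None
    for _ in range(max_attempts):
        result = run_attempt(task, reflections)            # attempt the task, conditioned on lessons
        if result.success:
            return result                                  # stop as soon as an attempt succeeds
        reflections.append(reflect_on_failure(task, result))  # distill why this attempt failed
    return result                                          # all attempts failed; return the last one
```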