Suvir Mirchandani (@suvir_m)'s Twitter Profile
Suvir Mirchandani

@suvir_m

PhD Student @StanfordAILab | Prev: Student Researcher @GoogleAI, AI Resident @MetaAI, BS/MS @Stanford CS

ID: 2731861316

Link: http://suvir.me | Joined: 14-08-2014 14:35:40

37 Tweets

504 Followers

490 Following

Siddharth Karamcheti (@siddkaramcheti)'s Twitter Profile Photo

Really grateful to Stanford HAI for covering our work on Vocal Sandbox - a framework for building robots that can seamlessly work with and learn from you in the real world (w/ Jenn Grannen, Suvir Mirchandani, Percy Liang, Dorsa Sadigh). In case you missed it: arxiv.org/abs/2411.02599

Suvir Mirchandani (@suvir_m)'s Twitter Profile Photo

Human video data can be easier to collect than robot demonstrations—but extracting actions for training robot policies is challenging. RAD uses language reasoning extracted from human videos to boost generalization in reasoning-based policies. Check out Jaden Clark's thread 👇

Suvir Mirchandani (@suvir_m)'s Twitter Profile Photo

Data quality can have a big impact on the performance of behavior cloning methods. But how can we measure the quality of demonstrations? One way is to score demonstrations via mutual information estimators. Check out Joey Hejna's thread 👇
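The linked thread isn't reproduced here. As a rough, self-contained illustration of the idea in the tweet (scoring a demonstration by the mutual information between its states and actions), here is a minimal sketch using a plug-in histogram estimator; the function names, the binning scheme, and the 1-D setup are all illustrative assumptions, not details from the paper:

```python
import numpy as np

def discrete_mutual_information(x, y):
    """Plug-in estimate of MI (in nats) between two discrete label arrays."""
    x, y = np.asarray(x), np.asarray(y)
    mi = 0.0
    for xv in np.unique(x):
        for yv in np.unique(y):
            pxy = np.mean((x == xv) & (y == yv))  # joint probability
            if pxy > 0:
                px = np.mean(x == xv)             # marginals
                py = np.mean(y == yv)
                mi += pxy * np.log(pxy / (px * py))
    return mi

def score_demo(states, actions, bins=8):
    """Score a (1-D) demonstration by the MI between binned states and actions.
    Higher MI = actions are more predictable from states, a crude quality proxy."""
    s = np.digitize(states, np.linspace(states.min(), states.max(), bins))
    a = np.digitize(actions, np.linspace(actions.min(), actions.max(), bins))
    return discrete_mutual_information(s, a)
```

Under this proxy, a demonstration whose actions are a consistent function of the state scores higher than one whose actions are unrelated to the state.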

Perry Dong (@perryadong)'s Twitter Profile Photo

Robotic models are advancing rapidly—but how do we scale their improvement? 🤖 We propose a recipe for batch online RL (train offline with online rollouts) that enables policies to self-improve without the complications of online RL. More: pd-perry.github.io/batch-online-rl (1/8)
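The general loop the tweet describes, alternating online data collection with offline training on the aggregated buffer, can be sketched on a toy 1-D task. Everything below (the Gaussian policy, the reward, the reward-filtered behavior cloning update) is an illustrative assumption, not the paper's recipe:

```python
import numpy as np

rng = np.random.default_rng(0)

def rollout(theta, n=64):
    """Sample actions from a Gaussian policy N(theta, 1) on a toy task
    whose reward peaks at action = 3."""
    actions = theta + rng.normal(size=n)
    rewards = -(actions - 3.0) ** 2
    return actions, rewards

def batch_online_rl(theta=0.0, rounds=10, top_frac=0.25):
    """Alternate (1) online rollouts with the frozen policy and
    (2) an offline update: fit the policy to the highest-reward actions
    in the aggregated buffer (reward-filtered behavior cloning)."""
    buffer_a, buffer_r = [], []
    for _ in range(rounds):
        a, r = rollout(theta)                  # online data collection
        buffer_a.append(a); buffer_r.append(r)
        all_a = np.concatenate(buffer_a)
        all_r = np.concatenate(buffer_r)
        k = max(1, int(top_frac * len(all_a)))
        top = np.argsort(all_r)[-k:]           # offline: keep best transitions
        theta = all_a[top].mean()              # "training" = fit mean to them
    return theta
```

The point of the structure is that each round's training step is purely offline (no gradients through the environment), while fresh rollouts between rounds keep the buffer on-distribution.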

Will Chen (@verityw_)'s Twitter Profile Photo

Embodied chain-of-thought reasoning (ECoT) is a powerful way to improve robot generalization & performance. But why is this the case, and how can that inform the design of learned robot policies? We investigate these questions in our latest work! ecot-lite.github.io 1/6

Omar Shaikh (@oshaikh13)'s Twitter Profile Photo

What if LLMs could learn your habits and preferences well enough (across any context!) to anticipate your needs? In a new paper, we present the General User Model (GUM): a model of you built from just your everyday computer use. 🧵

Jenn Grannen (@jenngrannen)'s Twitter Profile Photo

Meet ProVox: a proactive robot teammate that gets you 🤖❤️‍🔥 ProVox models your goals and expectations before a task starts — enabling personalized, proactive help for smoother, more natural collaboration. All powered by LLM commonsense. Recently accepted to IEEE RA-L! 🧵1/7

Suvir Mirchandani (@suvir_m)'s Twitter Profile Photo

So excited for kishore siddoju to begin this new chapter! Working with Sidd has been a blast -- one of the most brilliant and kind people I have had the privilege to learn from. Future PhD students, apply to work with Sidd!!

Haoyu Xiong (@haoyu_xiong_)'s Twitter Profile Photo

Your bimanual manipulators might need a Robot Neck 🤖🦒 Introducing Vision in Action: Learning Active Perception from Human Demonstrations. ViA learns task-specific, active perceptual strategies—such as searching, tracking, and focusing—directly from human demos, enabling robust

Jubayer Ibn Hamid (@jubayer_hamid)'s Twitter Profile Photo

Exploration is fundamental to RL. Yet policy gradient methods often collapse: during training they fail to explore broadly, and converge into narrow, easily exploitable behaviors. The result is poor generalization, limited gains from test-time scaling, and brittleness on tasks
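The collapse the tweet describes can be reproduced on a toy problem: running vanilla REINFORCE on a multi-armed bandit and tracking policy entropy. This sketch is a generic illustration of entropy collapse in policy gradients, not the paper's setup; all names and constants are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def entropy(p):
    return -(p * np.log(p + 1e-12)).sum()

def train_bandit(steps=2000, lr=0.5):
    """Vanilla REINFORCE (no baseline, no entropy bonus) on a 4-arm bandit
    with two near-optimal arms. Returns the policy-entropy trace, which
    typically collapses onto a single arm early in training."""
    means = np.array([1.0, 0.95, 0.0, 0.0])  # arms 0 and 1 are nearly tied
    logits = np.zeros(4)
    entropies = []
    for _ in range(steps):
        p = softmax(logits)
        a = rng.choice(4, p=p)
        r = means[a] + rng.normal(scale=0.1)
        grad = -p.copy()
        grad[a] += 1.0            # grad of log pi(a) w.r.t. logits
        logits += lr * r * grad   # REINFORCE update
        entropies.append(entropy(p))
    return entropies
```

The policy starts near uniform (entropy ≈ log 4) and quickly concentrates on one of the two near-optimal arms, never exploring the other broadly: a small-scale analogue of the narrow, easily exploitable behaviors described above.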
