Yunzhu Li (@yunzhuliyz) 's Twitter Profile
Yunzhu Li

@yunzhuliyz

Assistant Professor of Computer Science @Columbia @ColumbiaCompSci, Postdoc from @Stanford @StanfordSVL, PhD from @MIT_CSAIL. #Robotics #Vision #Learning

ID: 947911979099881472

https://yunzhuli.github.io/ | Joined: 01-01-2018 19:26:41

449 Tweets

6.6K Followers

523 Following

Yifan Hou (@yifanhou2) 's Twitter Profile Photo

Adaptive Compliance Policy just won the best paper award at the ICRA Contact-Rich Manipulation workshop! Huge thanks to the team and everyone who supported us at the workshop. adaptive-compliance.github.io contact-rich.github.io
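
For context, a rough sketch of the kind of controller an adaptive compliance policy sits on top of (not the authors' code; the gains and targets below are made up): the policy picks both a target pose and per-axis stiffness, and a Cartesian impedance law turns that into a commanded wrench.

import numpy as np

def impedance_wrench(x, v, x_target, stiffness, damping_ratio=1.0):
    """Commanded 6-DoF wrench from a spring-damper around the policy's target."""
    K = np.diag(stiffness)                   # per-axis stiffness chosen by the policy
    D = 2.0 * damping_ratio * np.sqrt(K)     # critically damped by default
    return K @ (x_target - x) - D @ v

# Example: stiff in z (pressing into contact), compliant in x/y (sliding along a surface).
x = np.zeros(6); v = np.zeros(6)
x_target = np.array([0.0, 0.0, -0.005, 0.0, 0.0, 0.0])        # press 5 mm past contact
stiffness = np.array([50.0, 50.0, 800.0, 10.0, 10.0, 10.0])   # illustrative values only
print(impedance_wrench(x, v, x_target, stiffness))

Adapting the stiffness per axis and per timestep is what would let the same policy press hard when the task needs it and stay gentle otherwise.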

Yunzhu Li (@yunzhuliyz) 's Twitter Profile Photo

Two days into #ICRA2025 IEEE ICRA—great connecting with folks! Gave a talk, moderated a panel, and got a *Best Paper Award* 🏆 at the workshops. Up next: four papers and two more workshop talks/panels. Excited to chat robot learning and the road to general intelligence! 🤖

Yunzhu Li (@yunzhuliyz) 's Twitter Profile Photo

**Steerability** remains one of the key issues for current vision-language-action models (VLAs). Natural language is often ambiguous and vague: "Hang a mug on a branch" vs "Hang the left mug on the right branch." Many works claim to handle language input, yet the tasks are often

Mandi Zhao (@zhaomandi) 's Twitter Profile Photo

I’m a big fan of this line of work from Columbia (also check out PhysTwin by Hanxiao Jiang: jianghanxiao.github.io/phystwin-web/). They really make real2sim work for very challenging deformable objects, and show it’s useful for real robot manipulation. So far it seems a bit limited to

Katherine Liu (@robo_kat) 's Twitter Profile Photo

How can we achieve both common-sense understanding that can deal with varying levels of ambiguity in language and dexterous manipulation? Check out CodeDiffuser, a really neat work that bridges Code Gen with a 3D Diffusion Policy! This was a fun project with cool experiments! 🤖
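
A hedged sketch of the split the tweet describes (not the actual CodeDiffuser API; the detection format and function names are stand-ins): a language model emits code that resolves the ambiguous reference against detected 3D objects, and the resulting target is what conditions a learned low-level controller such as a 3D diffusion policy.

def resolve_instruction(detections):
    """Code an LLM might generate for 'hang the left mug on the right branch'."""
    mugs = [d for d in detections if d["label"] == "mug"]
    branches = [d for d in detections if d["label"] == "branch"]
    left_mug = min(mugs, key=lambda d: d["xyz"][0])           # smallest x = leftmost
    right_branch = max(branches, key=lambda d: d["xyz"][0])   # largest x = rightmost
    return {"pick": left_mug, "place": right_branch}

# Stubbed 3D detections standing in for a perception module.
detections = [
    {"label": "mug", "xyz": (-0.20, 0.10, 0.00)},
    {"label": "mug", "xyz": (0.30, 0.10, 0.00)},
    {"label": "branch", "xyz": (0.00, 0.40, 0.30)},
    {"label": "branch", "xyz": (0.25, 0.40, 0.30)},
]
goal = resolve_instruction(detections)   # this grounding then conditions the diffusion policy
print(goal["pick"]["xyz"], goal["place"]["xyz"])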

Yunzhu Li (@yunzhuliyz) 's Twitter Profile Photo

We’ve been exploring 3D world models with the goal of finding the right recipe that is both: (1) structured—for sample efficiency and generalization (my personal emphasis), and (2) scalable—as we increase real-world data collection. With **Particle-Grid Neural Dynamics**
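
A hedged sketch of the particle-grid data flow (not the paper's implementation): particle states are scattered onto a regular grid, a grid-level update is applied (a learned network in the real model; a gravity placeholder here), and particles gather the result back and advect.

import numpy as np

def particle_grid_step(pos, vel, grid_res=16, dt=0.01, bounds=1.0):
    cell = np.clip((pos / bounds * grid_res).astype(int), 0, grid_res - 1)
    flat = np.ravel_multi_index(cell.T, (grid_res,) * 3)
    # Particle -> grid: average particle velocities in each occupied cell.
    grid_vel = np.zeros((grid_res ** 3, 3))
    counts = np.zeros(grid_res ** 3)
    np.add.at(grid_vel, flat, vel)
    np.add.at(counts, flat, 1.0)
    occupied = counts > 0
    grid_vel[occupied] /= counts[occupied, None]
    # Grid-level dynamics: a neural network would predict this update; gravity stands in.
    grid_vel[occupied] += dt * np.array([0.0, 0.0, -9.8])
    # Grid -> particle: gather the updated velocity and advect.
    new_vel = grid_vel[flat]
    return pos + dt * new_vel, new_vel

pos = np.random.rand(256, 3)    # particles, e.g. reconstructed from RGB-D observations
vel = np.zeros((256, 3))
pos, vel = particle_grid_step(pos, vel)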

Yichen Li (@antheayli) 's Twitter Profile Photo

How can we equip robots with superhuman sensory capabilities? Come join us at our RSS 2025 workshop on June 21, Multimodal Robotics with Multisensory Capabilities, to learn more. Featuring speakers: Jitendra Malik, Katherine J. Kuchenbecker, Kristen Grauman, Yunzhu Li, Boyi Li

Hao-Shu Fang (@haoshu_fang) 's Twitter Profile Photo

Happening tomorrow morning! Hear from Jitendra Malik, Katherine J. Kuchenbecker, Kristen Grauman, Yunzhu Li, and Boyi Li as they share their insights on Multimodal Robotics with Multisensory Capabilities at our RSS workshop

Wenlong Huang (@wenlong_huang) 's Twitter Profile Photo

Join us tomorrow in SGM 124 for the SWOMO workshop at #RSS2025! We will have 6 amazing talks and a panel at the end to discuss structured world modeling for robotics! Latest schedule and information at swomo-rss.github.io

Yunzhu Li (@yunzhuliyz) 's Twitter Profile Photo

Had a great time yesterday giving three invited talks at #RSS2025 workshops—on foundation models, structured world models, and tactile sensing for robotic manipulation. Lots of engaging conversations! One more talk coming up on Wednesday (6/25). Also excited to be presenting two

Yunzhu Li (@yunzhuliyz) 's Twitter Profile Photo

I had the opportunity to meet with the Myolab team a few months ago—they're building something particularly exciting: life-like digital twins. Congratulations to the team, and I’m looking forward to what’s coming next!

Kaifeng Zhang (@kaiwynd) 's Twitter Profile Photo

In his RSS keynote talk, Trevor Darrell predicted that 4D particles will be the most effective representation for visual pretraining. Exciting to see this vision align with the core design of our work PGND: learning (4D) particle dynamics from real-world RGB-D videos. 👀

Yunzhu Li (@yunzhuliyz) 's Twitter Profile Photo

Is VideoGen starting to become good enough for robotic manipulation? 🤖 Check out our recent work, RIGVid — Robots Imitating Generated Videos — where we use AI-generated videos as intermediate representations and 6-DoF motion retargeting to guide robots in diverse manipulation
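
A hedged sketch of the retargeting step (not the RIGVid codebase; names are illustrative): take object poses tracked from the generated video and map them to end-effector targets through the fixed object-to-gripper transform from the initial grasp.

import numpy as np

def retarget(object_poses, T_grasp):
    """Map tracked object poses (4x4 transforms) to end-effector poses: T_ee = T_obj @ T_grasp."""
    return [T_obj @ T_grasp for T_obj in object_poses]

def translation(x):
    T = np.eye(4); T[0, 3] = x
    return T

# Toy trajectory from video pose tracking: the object slides 8 cm along x over 5 frames.
object_poses = [translation(0.02 * i) for i in range(5)]
T_grasp = np.eye(4); T_grasp[2, 3] = -0.10     # gripper held 10 cm above the object
ee_targets = retarget(object_poses, T_grasp)   # would be fed to an IK solver / motion planner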

Unnat Jain (@unnatjain2010) 's Twitter Profile Photo

Research arc: ⏪ 2 yrs ago, we introduced VRB: learning from hours of human videos to cut down teleop (Gibson🙏) ▶️ Today, we explore a wilder path: robots deployed with no teleop, no human demos, no affordances. Just raw video generation magic 🙏 Day 1 of faculty life done! 😉

Homanga Bharadhwaj (@mangahomanga) 's Twitter Profile Photo

As a researcher, it is immensely satisfying when the community tackles open problems from your previous work! In Gen2Act last year we showed how video generation models can be used zero-shot for manipulation. This paper takes the idea further via richer motion cues.

Jianren Wang (@wang_jianren) 's Twitter Profile Photo

Three years ago, when we began exploring learning from video, most tasks were just pick-and-place. With PSAG, we enabled one-shot learning of deformable object manipulation from YouTube. Now, this paper pushes it further, tackling a wider range of tasks via visual FMs without demos!

Sean Kirmani (@seankirmani) 's Twitter Profile Photo

🤖🌎 We are organizing a workshop on Robotics World Modeling at Conference on Robot Learning 2025!

We have an excellent group of speakers and panelists, and are inviting you to submit your papers with a July 13 deadline.

Website: robot-world-modeling.github.io
Russ Tedrake (@russtedrake) 's Twitter Profile Photo

TRI's latest Large Behavior Model (LBM) paper landed on arxiv last night! Check out our project website: toyotaresearchinstitute.github.io/lbm1/ One of our main goals for this paper was to put out a very careful and thorough study on the topic to help people understand the state of the

Wuyang Chen (@wuyangc) 's Twitter Profile Photo

We have two workshops along with ICML next week. (Great support from SFU+UBC+Vector!)
July 14 (9am-5pm): sites.google.com/view/vancouver…
July 21 (9am-noon): sites.google.com/view/sfu-at-ic…
Please join and enjoy the talks!
Location: 515 W Hastings St, Vancouver, BC V6B 4N6
maps.app.goo.gl/kN6o9W87bbL5qC…
Yunzhu Li (@yunzhuliyz) 's Twitter Profile Photo

I was really impressed by the UMI gripper (Cheng Chi et al.), but a key limitation is that **force-related data wasn’t captured**: humans feel haptic feedback through the mechanical springs, but the robot couldn’t leverage that info, limiting the data’s value for fine-grained
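
A small illustration of the signal being pointed at (the spring constant is hypothetical, not a UMI spec): if spring deflection were logged during data collection, grip force could be recovered with Hooke's law and stored alongside the demonstrations.

SPRING_K = 500.0    # N/m, hypothetical spring constant for illustration

def grip_force_from_deflection(deflection_m):
    """Estimate grip force from measured spring compression: F = k * x."""
    return SPRING_K * deflection_m

print(grip_force_from_deflection(0.004))    # 4 mm compression -> ~2 N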