Jinkun Cao (@jinkuncao) 's Twitter Profile
Jinkun Cao

@jinkuncao

PhD student at Carnegie Mellon, working on robotics and 3D vision. Actively looking for full-time research positions.

ID: 2478547765

Link: http://www.jinkuncao.com · Joined: 05-05-2014 15:52:50

108 Tweets

523 Followers

302 Following

Han Xue (@hanxue012) 's Twitter Profile Photo

Humans can easily perform complex contact-rich tasks with vision and touch, but these tasks remain challenging for robots. How can we resolve this from both the algorithm side and the data side? Introducing Reactive Diffusion Policy (RDP), a slow-fast imitation learning algorithm

Hao-Shu Fang (@haoshu_fang) 's Twitter Profile Photo

Super fun chatting with Chris Paxton and Michael Cho - Rbt/Acc about AnyDexGrasp! 🚀 We talked about how to make robots grasp like humans — fast, efficient, and across different hands. Big thanks to both of them for the great conversation and for digging into the details! Check it out!

Yi Zhou (@papagina_yi) 's Twitter Profile Photo

🚀 Struggling with the lack of high-quality data for AI-driven human-object interaction research? We've got you covered! Introducing HUMOTO, a groundbreaking 4D dataset for human-object interaction, developed with a combination of wearable motion capture, SOTA 6D pose

Tairan He (@tairanhe99) 's Twitter Profile Photo

Excited to be at #ICRA this week! Working on humanoids, RL, or sim-to-real? Let’s grab coffee—DMs are open. See you there! Presentation for: HOVER: Versatile Neural Whole-Body Controller for Humanoid Robots 📍 Room 307 (Regular Session WeET6: Learning for Legged Locomotion 1) ⏰

Jeff Li (@jiefengli_jeff) 's Twitter Profile Photo

📣📣📣 Excited to share GENMO: A Generalist Model for Human Motion. Words can’t perfectly describe human motion—so we build GENMO. It’s everything to motion. 🔥Video, Text, Music, Audio, Keyframes, Spatial Control…🔥 -- GENMO handles it all within a single model. 📹 Two

Xun Huang (@xunhuang1995) 's Twitter Profile Photo

What exactly is a "world model"? And what limits existing video generation models from being true world models? In my new blog post, I argue that a true video world model must be causal, interactive, persistent, real-time, and physically accurate. xunhuang.me/blogs/world_mo…

Ruihan Yang (@rchalyang) 's Twitter Profile Photo

Turns out, when we discuss "humanoid robot," everyone's picturing something totally different. So I made this figure, and next time, I'll show it before the discussion.

David Park (@park_jinhyung1) 's Twitter Profile Photo

Introducing ATLAS: A high-fidelity, parametric human body model enabling precise, independent control of surface and skeletal attributes for character creation. To be presented at #ICCV2025! Learn more about ATLAS here: jindapark.github.io/projects/atlas/

Yuda Song @ ICLR 2025 (@yus167) 's Twitter Profile Photo

🤖 Robots rarely see the world's true state—they operate on partial, noisy visual observations. How should we design algorithms under this partial observability? Should we decide (end-to-end RL) or distill (from a privileged expert)? We study this trade-off in locomotion. 🧵(1/n)

Rohan Choudhury (@rchoudhury997) 's Twitter Profile Photo

Excited to release our new preprint - we introduce Adaptive Patch Transformers (APT), a method to speed up vision transformers by using multiple different patch sizes within the same image!

Zhengyi “Zen” Luo (@zhengyiluo) 's Twitter Profile Photo

Humanoids need a single, generalist control policy for all of their physical tasks, not a new one for every new chore or demo. A policy for walking can't dance. A policy for dancing can't support mowing the lawn. We need to scale up humanoid control for diverse behaviors, just

Jinkun Cao (@jinkuncao) 's Twitter Profile Photo

Hao-Shu was my first mentor when I started in AI nearly 10 years ago, and he has been one of the best senior researchers I could always learn from. Go apply!

Jinkun Cao (@jinkuncao) 's Twitter Profile Photo

The results are truly impressive. I also appreciate the research approach that seeks to answer "what is enough" rather than simply stacking one "novel" piece on top of another while leaving readers with more questions.

AI at Meta (@aiatmeta) 's Twitter Profile Photo

Today we’re excited to unveil a new generation of Segment Anything Models: 1️⃣ SAM 3 enables detection, segmentation, and tracking of objects across images and videos, now with short text phrases and exemplar prompts. 🔗 Learn more about SAM 3: go.meta.me/591040 2️⃣ SAM 3D

AI at Meta (@aiatmeta) 's Twitter Profile Photo

Introducing SAM 3D, the newest addition to the SAM collection, bringing common sense 3D understanding of everyday images. SAM 3D includes two models: 🛋️ SAM 3D Objects for object and scene reconstruction 🧑‍🤝‍🧑 SAM 3D Body for human pose and shape estimation Both models achieve

CMU Center for Perceptual Computing and Learning (@robovisioncmu) 's Twitter Profile Photo

New model from Meta, SAM 3D Body, powered by people from Smith Hall (Kris Kitani, Jinkun Cao, David Park, Jyun-Ting Song) of course! #goSmithHall Introducing SAM 3D: a New Standard for 3D Object & Human Reconstruction ... youtu.be/B7PZuM55ayc?si… via YouTube

Wildminder (@wildmindai) 's Twitter Profile Photo

ComfyUI-SAM3DBody: single-image full-body 3D human mesh recovery; uses the Momentum Human Rig (MHR) for SOTA accuracy on in-the-wild poses. github.com/PozzettiAndrea…

AI at Meta (@aiatmeta) 's Twitter Profile Photo

SAM 3D is helping advance the future of rehabilitation. See how researchers at Carnegie Mellon University are using SAM 3D to capture and analyze human movement in clinical settings, opening the doors to personalized, data-driven insights in the recovery process. 🔗 Learn more about SAM