CMU Center for Perceptual Computing and Learning (@robovisioncmu)'s Twitter Profile
CMU Center for Perceptual Computing and Learning

@robovisioncmu

The Chronicles of Smith Hall.

ID: 1139796230572105730

Joined: 15-06-2019 07:26:12

640 Tweets

1.1K Followers

138 Following

Sudeep Dasari (@sudeepdasari)'s Twitter Profile Photo

Robots need strong visuo-motor representations to manipulate objects, but it’s hard to learn these using demo data alone. Our #RSS2024 project vastly improves robotic representations, using human affordances mined from Ego4D! w/ Mohan Kumar Srirama, Shikhar Bahl, Abhinav Gupta

Mihir Prabhudesai (@mihirp98)'s Twitter Profile Photo

1/ Happy to share VADER: Video Diffusion Alignment via Reward Gradients. We adapt foundational video diffusion models using pre-trained reward models to generate high-quality, aligned videos for various end-applications. Below we generated a short movie using VADER 😀, we used…
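The reward-gradient idea above can be illustrated with a toy sketch (illustrative only, not the VADER code): a scalar "generator" parameter and a hand-written differentiable reward stand in for the video diffusion model and the pre-trained reward model, and the parameter is pushed uphill on the reward by gradient ascent.

```python
# Toy sketch of alignment via reward gradients (names are illustrative,
# not from the paper). A scalar theta plays the role of the generator's
# parameters; reward_grad is the gradient of a differentiable reward
# evaluated at the generator's output.

def align_with_reward_gradients(theta, reward_grad, lr=0.1, steps=100):
    """Gradient ascent: nudge the generator toward higher reward."""
    for _ in range(steps):
        theta += lr * reward_grad(theta)
    return theta

# Example reward r(x) = -(x - 3)^2, whose gradient is -2 * (x - 3),
# so ascent should converge to theta = 3.
theta = align_with_reward_gradients(0.0, lambda x: -2.0 * (x - 3.0))
```

In the real method the gradient flows through the reward model and the sampled video back into the diffusion weights; the toy keeps only the optimization skeleton.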

Murtaza Dalal (@mihdalal)'s Twitter Profile Photo

Can my robot cook my food, rearrange my dresser, tidy my messy table and do so much more without ANY demos or real-world training data? Introducing ManipGen: A generalist agent for manipulation that can solve long-horizon robotics tasks entirely zero shot, from text input! 1/N

Rohan Choudhury (@rchoudhury997)'s Twitter Profile Photo

Excited to finally release our NeurIPS 2024 (spotlight) paper! We introduce Run-Length Tokenization (RLT), a simple way to significantly speed up your vision transformer on video with no loss in performance!
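The run-length idea behind RLT can be sketched with classic run-length encoding (a toy, not the paper's tokenizer): video patches that repeat unchanged across consecutive frames are collapsed into a single token plus a run length, so static content costs almost nothing.

```python
# Hedged sketch: plain run-length encoding over a 1-D token stream.
# In the video setting, each element would be a patch token; repeated
# (static) patches collapse into one token with a run length.

def run_length_encode(tokens):
    """Collapse consecutive repeated tokens into (token, run_length) pairs."""
    encoded = []
    for t in tokens:
        if encoded and encoded[-1][0] == t:
            encoded[-1] = (t, encoded[-1][1] + 1)
        else:
            encoded.append((t, 1))
    return encoded

# A mostly-static stream compresses well:
pairs = run_length_encode(["sky", "sky", "sky", "ball", "ball", "sky"])
```

The transformer then attends only over the collapsed tokens, which is where the speedup comes from.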

Unnat Jain (@unnatjain2010)'s Twitter Profile Photo

Excited to share that I'll be joining University of California at Irvine as a CS faculty in '25!🌟

Faculty apps: Krishna Murthy, Zhuang Liu & I share our tips: unnat.github.io/notes/Hidden_C…

PhD apps: I'm looking for students in vision, robot learning, & AI4Science. Details👇

Tarasha Khurana (@tarashakhurana)'s Twitter Profile Photo

Excited to present new work on using diffusion priors for video amodal segmentation and content completion! With Kaihua Chen (lead author) and Deva Ramanan. arXiv: arxiv.org/abs/2412.04623 project page: diffusion-vas.github.io

Guanya Shi (@guanyashi)'s Twitter Profile Photo

ASAP learns diverse, agile, whole-body humanoid motions via learning a residual action model from the real world to align sim and real physics, enabling motions that were previously difficult to achieve. 

It has two stages: Stage 1 pretrains a phase-based motion tracking policy…
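The residual action model described above can be caricatured in a few lines (a toy sketch under strong assumptions, not the ASAP implementation): a correction term is fit from real-world transitions so that the simulator's predicted next state matches what the real robot actually did.

```python
# Illustrative residual-action sketch (names and dynamics are invented):
# fit a constant correction that closes the gap between an idealized
# simulator and real-world rollouts.
import statistics

def sim_step(state, action):
    """Idealized 1-D simulator dynamics: next state = state + action."""
    return state + action

def fit_residual(transitions):
    """transitions: (state, action, real_next_state) tuples from the
    real robot. Returns the mean sim-to-real error as a residual."""
    errors = [real_next - sim_step(s, a) for s, a, real_next in transitions]
    return statistics.mean(errors)

# The real system consistently moves 0.2 further than the simulator predicts:
data = [(0.0, 1.0, 1.2), (1.0, 0.5, 1.7), (2.0, -1.0, 1.2)]
delta = fit_residual(data)
```

In the actual method the residual is a learned, state-dependent model rather than a constant, but the alignment objective is the same: make sim physics predict real outcomes.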

Zhengyi “Zen” Luo (@zhengyiluo)'s Twitter Profile Photo

Should have recorded our reactions when the first successful siuuu happened! 🎉 Collecting and learning from real-world data will be incredibly important for humanoids moving forward, and we have just taken our first step with ASAP🫡

Mehul Agarwal (@meh_agarwal)'s Twitter Profile Photo

🎵✨Excited to share our #NeurIPS2024 paper on personalized music video generation! We combine multimodal AI with identity protection to let listeners be co-creators, generating custom music videos that reflect both music and themselves. 🎥🔒 arxiv.org/abs/2502.02610 #CreativeAI

Unnat Jain (@unnatjain2010)'s Twitter Profile Photo

✨New edition of our community-building workshop series!✨ Tomorrow at #CVPR2025, we invite speakers to share their stories, values, and approaches for navigating a crowded and evolving field, especially for early-career researchers. Cheeky title🤭: How to Stand Out in the…

Tarasha Khurana (@tarashakhurana)'s Twitter Profile Photo

Excited to share recent work with Kaihua Chen and Deva Ramanan where we learn to do novel view synthesis for dynamic scenes in a self-supervised manner, only from 2D videos! webpage: cog-nvs.github.io arxiv: arxiv.org/abs/2507.12646 code (soon): github.com/Kaihua-Chen/co…

Yishu Li (@lisayishu)'s Twitter Profile Photo

A closed door looks the same whether it pushes or pulls. Two identical-looking boxes might have different center of mass. How should robots act when a single visual observation isn't enough?

Introducing HAVE 🤖, our method that reasons about past interactions online! #CORL2025
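The door example can be made concrete with a toy Bayesian update (a sketch of the general idea of reasoning over past interactions, not the HAVE method itself; all probabilities below are invented): after each push attempt, the robot updates its belief about whether the door opens by pushing.

```python
# Hedged sketch: Bayes update over two hypotheses (door pushes vs pulls)
# from one interaction outcome. The likelihoods are made-up numbers.

def update_push_belief(prior_push, door_moved,
                       p_move_if_push=0.9, p_move_if_pull=0.1):
    """Return the posterior P(door opens by pushing) after a push attempt."""
    like_push = p_move_if_push if door_moved else 1.0 - p_move_if_push
    like_pull = p_move_if_pull if door_moved else 1.0 - p_move_if_pull
    post = like_push * prior_push
    return post / (post + like_pull * (1.0 - prior_push))

# Starting undecided (0.5), one successful push is strong evidence:
belief = update_push_belief(0.5, door_moved=True)
```

A single visual observation gives the 0.5 prior; the interaction history is what sharpens it, which is the gap the tweet points at.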

CMU Center for Perceptual Computing and Learning (@robovisioncmu)'s Twitter Profile Photo

TWO Best Paper Awards at ICCV!

"Generating Physically Stable and Buildable Brick Structures from Text" by Ava Pun*, Kangle Deng*, Ruixuan Liu*, Deva Ramanan, Changliu Liu, Jun-Yan Zhu

"Spatially-Varying Autofocus" by Yingsi Qin, Aswin C. Sankaranarayanan, Matthew O'Toole

#goSmithHall

Kris Kitani (@kkitani)'s Twitter Profile Photo

Super excited to share the release of SAM 3D. It's been a year in the making. Two models for lifting objects and people to 3D!

CMU Center for Perceptual Computing and Learning (@robovisioncmu)'s Twitter Profile Photo

New model from Meta, SAM 3D Body, powered by people from Smith Hall (Kris Kitani, Jinkun Cao, David Park, Jyun-Ting Song) of course! #goSmithHall Introducing SAM 3D: a New Standard for 3D Object & Human Reconstruction ... youtu.be/B7PZuM55ayc?si… via YouTube