Chen Geng (@gengchen01)'s Twitter Profile
Chen Geng

@gengchen01

CS Ph.D. Student @Stanford. Previously Hons. B.Eng. in CS @ZJU_China.

ID: 2257563174

Link: https://chen-geng.com · Joined: 22-12-2013 10:37:06

66 Tweets

780 Followers

773 Following

Sanjana Srivastava (@sanjana__z)'s Twitter Profile Photo

🤖 Household robots are becoming physically viable. But interacting with people in the home requires handling unseen, unconstrained, dynamic preferences, not just a complex physical domain. We introduce ROSETTA: a method to generate rewards for such preferences cheaply. 🧵⬇️

Zizhang Li (@zizhang_li)'s Twitter Profile Photo

In our #ICCV2025 WonderPlay, we study how to combine physical simulation with a video generative prior to enable 3D action interaction with the world from a single image! Check the 🧵 for more details!

Yuxi Xiao (@yuxixiaohenry)'s Twitter Profile Photo

🚀 We release SpatialTrackerV2: the first feedforward model for dynamic 3D reconstruction and 3D point tracking — all at once! Reconstruct dynamic scenes and predict pixel-wise 3D motion in seconds. 🔗 Webpage: spatialtracker.github.io 🔍 Online Demo: huggingface.co/spaces/Yuxihen…

Klemen Kotar (@klemenkotar)'s Twitter Profile Photo

📷 New Preprint: SOTA optical flow extraction from pre-trained generative video models! While it seems intuitive that video models grasp optical flow, extracting that understanding has proven surprisingly elusive.

Chen Geng (@gengchen01)'s Twitter Profile Photo

🔥 Deadline extended! The non-archival track is now open until Aug 17. Have research related to digital twins? Consider submitting it to our workshop at #ICCV2025. Previously accepted or published papers are welcome as well. #ICCV2025

Hadi AlZayer (@hadizayer)'s Twitter Profile Photo

✨ Our paper Magic Fixup is accepted to ACM TOG! We show how dynamic videos can guide photo editing across many tasks — making this a solid baseline for future research. project page: magic-fixup.github.io paper: dl.acm.org/doi/10.1145/37…

Zhaoxi Chen (@frozen_burning)'s Twitter Profile Photo

🔥Feed-Forward 4D Generative Modeling🔥 #4DNeX is a training-efficient recipe for 4D generative *world modeling* from a single image. The 10M 4D dataset is also released! - Project: 4dnex.github.io - Data: huggingface.co/datasets/3DTop… - Code: github.com/3DTopia/4DNeX

Fei-Fei Li (@drfeifei)'s Twitter Profile Photo

(1/N) How close are we to enabling robots to solve the long-horizon, complex tasks that matter in everyday life? 🚨 We are thrilled to invite you to join the 1st BEHAVIOR Challenge @NeurIPS 2025, submission deadline: 11/15. 🏆 Prizes: 🥇 $1,000 🥈 $500 🥉 $300

Klemen Kotar (@klemenkotar)'s Twitter Profile Photo

1/ A good world model should be promptable like an LLM, offering flexible control and zero-shot answers to many questions. Language models have benefited greatly from this fact, but it's been slow to come to vision. We introduce PSI: a path to truly interactive visual world

Fei-Fei Li (@drfeifei)'s Twitter Profile Photo

If you're curious about how the latest spatial intelligence model at World Labs is doing, check out this new blog! I'm so excited by how much progress has been made in 3D world generation - bigger, more consistent, and forever persistent worlds! Moreover, everyone in the world

Elliott / Shangzhe Wu (@elliottszwu)'s Twitter Profile Photo

Looking for a PhD position? Apply to the ELLIS PhD program and get the unique opportunity to work with two different research teams across Europe! Apply by 31 Oct: ellis.eu/news/ellis-phd…

Yanjie Ze (@zeyanjie)'s Twitter Profile Photo

Our humanoid now learns loco-manipulation skills that generalize across space (Stanford) & time (day & night), using egocentric vision, trained only in simulation. visualmimic.github.io

Weiyu Liu (@weiyu_liu_)'s Twitter Profile Photo

I’m at #CoRL2025 in Seoul this week! I’m looking for students to join my lab next year, and also for folks excited to build robotic foundation models at a startup. If you’re into generalization, planning and reasoning, or robots that use language, let's chat!

Kyle Sargent (@kylesargentai)'s Twitter Profile Photo

This is a cool idea, but this paper combines two bad practices on ImageNet: (1) DinoV2 features (+142 million images) on a data-constrained benchmark (ImageNet, ~1.2 million images), and (2) much better gFID (1.13) than held-out ImageNet images (~1.8), implying serious Goodharting

Ruoshi Liu (@ruoshi_liu)'s Twitter Profile Photo

Everyone says they want general-purpose robots. We actually mean it, and we'll make it weird, creative, and fun along the way 😎 Recruiting PhD students to work on Computer Vision and Robotics at the UMD Department of Computer Science for Fall 2026 in the beautiful city of Washington DC!

Hadi AlZayer (@hadizayer)'s Twitter Profile Photo

What if you could combine diffusion models instantly? You would get exponentially better control (for free!! 👀) This is exactly what we do. In ✨ coupled diffusion sampling ✨, diffusion models guide each other. The result? Diverse editing capabilities!

Chen Geng (@gengchen01)'s Twitter Profile Photo

Join us tomorrow at the #ICCV2025 workshop on generating digital twins from images and videos! Don’t miss amazing talks from Manolis Savva, Katerina Fragkiadaki, Marc Pollefeys, Jiajun Wu, Matthias Niessner, Lei Li, Yanpei Cao, and Steve Xie on cutting-edge progress! #ICCV2025

Chengshu Li (@chengshuericli)'s Twitter Profile Photo

We are excited to release MoMaGen, a data generation method for multi-step bimanual mobile manipulation. MoMaGen turns 1 human-teleoperated robot trajectory into 1000s of generated trajectories automatically. 🚀 Website: momagen.github.io arXiv: arxiv.org/abs/2510.18316