Carl Doersch (@carldoersch)'s Twitter Profile
Carl Doersch

@carldoersch

Researcher at DeepMind

ID: 852856132632797184

Website: http://carldoersch.com | Joined: 14-04-2017 12:08:42

61 Tweets

2.2K Followers

289 Following

Carl Doersch (@carldoersch):

Just in time for CVPR, we've released code to generate "rainbow visualizations" from a set of point tracks: it semi-automatically segments foreground objects and corrects for camera motion. Try our Colab demo at colab.sandbox.google.com/github/deepmin… (video source: youtube.com/watch?v=yuQFQ8…)
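
The tweet describes a two-step idea: estimate the dominant camera motion from the point tracks, then flag tracks that move against it as foreground. Below is a minimal illustrative sketch of that idea in Python, not the released code; the function name and threshold are assumptions. It fits a RANSAC homography between consecutive frames (letting the presumed-majority background points dominate) and scores each track by its residual motion.

import numpy as np
import cv2

def split_foreground(tracks, thresh=2.0):
    """tracks: float array of shape (num_points, num_frames, 2), pixel coords.

    Returns a boolean mask: True where a track moves against the dominant
    (camera) motion, i.e. a likely foreground point. Illustrative sketch only.
    """
    num_points, num_frames, _ = tracks.shape
    residuals = np.zeros(num_points)
    for t in range(num_frames - 1):
        src = tracks[:, t].astype(np.float32)
        dst = tracks[:, t + 1].astype(np.float32)
        # Robustly fit the dominant inter-frame motion with RANSAC.
        H, _ = cv2.findHomography(src, dst, cv2.RANSAC, thresh)
        if H is None:
            continue
        warped = cv2.perspectiveTransform(src[:, None, :], H)[:, 0, :]
        residuals += np.linalg.norm(warped - dst, axis=-1)
    residuals /= max(num_frames - 1, 1)
    return residuals > thresh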

Dima Damen (@dimadamen):

Can you win the 2nd Perception Test Challenge? A European Conference on Computer Vision #ECCV2026 workshop: ptchallenge-workshop.github.io. Diagnose audio-visual MLMs on their ability to model memory, physics, abstraction & semantics through 6 tasks: VQA, point tracking, box tracking, and action/sound localisation, jointly! Google DeepMind + win 💰

Skanda (@skandakoppula):

We're excited to release TAPVid-3D: an evaluation benchmark of 4,000+ real-world videos and 2.1 million metric 3D point trajectories, for the task of Tracking Any Point in 3D!
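
For context, evaluating tracking in metric 3D typically measures how close predicted trajectories are to ground truth in meters. The sketch below is a hedged illustration of such a metric, with names and thresholds that are mine rather than the benchmark's official ones: the fraction of visible points within a metric distance of ground truth, averaged over several thresholds.

import numpy as np

def average_3d_accuracy(pred, gt, visible, thresholds=(0.05, 0.1, 0.2, 0.4, 0.8)):
    """pred, gt: (num_points, num_frames, 3) metric 3D coordinates in meters.
    visible: (num_points, num_frames) boolean mask of valid ground truth.
    Illustrative metric, not necessarily TAPVid-3D's official evaluation."""
    dist = np.linalg.norm(pred - gt, axis=-1)  # per-point, per-frame error (m)
    # Accuracy at each threshold, then averaged, PCK-style.
    return float(np.mean([np.mean(dist[visible] < t) for t in thresholds]))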

Carl Doersch (@carldoersch):

Want to make a difference with point tracking? The medical community needs help tracking tissue deformation during surgery! Participate in the STIR challenge (stir-challenge.github.io) at MICCAI, deadline in September.

Carl Doersch (@carldoersch):

Want a robot to solve a task, specified in language? Generate a video of a person doing it, and then retarget the action to the robot with the help of point tracking! Cool collab with Homanga Bharadhwaj during his student researcher stint at Google.
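
One way to picture the retargeting step: track a point on the human hand through the generated video, lift the pixel track to 3D using per-frame depth, and hand the resulting waypoints to the robot's controller. The sketch below is a loose, hypothetical illustration of that geometry (standard pinhole unprojection), not the paper's method.

import numpy as np

def pixels_to_waypoints(track_uv, depth, K):
    """track_uv: (num_frames, 2) pixel track of the hand.
    depth: (num_frames,) metric depth at those pixels.
    K: 3x3 camera intrinsics. Returns (num_frames, 3) camera-frame waypoints."""
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    # Unproject each tracked pixel to a 3D point in the camera frame.
    x = (track_uv[:, 0] - cx) / fx * depth
    y = (track_uv[:, 1] - cy) / fy * depth
    return np.stack([x, y, depth], axis=-1)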

Daniel Geng (@dangengdg):

What happens when you train a video generation model to be conditioned on motion? It turns out you can perform "motion prompting," just like you might prompt an LLM! Doing so enables many different capabilities. Here are a few examples; check out this thread 🧵 for more results!
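
To make "conditioning on motion" concrete, here is one plausible encoding (illustrative only, not necessarily the paper's): rasterize sparse point tracks into per-frame spatial maps carrying each track's instantaneous motion, which the video model can then consume alongside its other inputs.

import numpy as np

def tracks_to_conditioning(tracks, height, width):
    """tracks: (num_points, num_frames, 2) pixel coordinates.
    Returns (num_frames, height, width, 2) maps of (du, dv) motion,
    zero everywhere except at the tracked points. Hypothetical encoding."""
    num_points, num_frames, _ = tracks.shape
    cond = np.zeros((num_frames, height, width, 2), dtype=np.float32)
    # Instantaneous motion per track; repeat the last frame so shapes match.
    motion = np.diff(tracks, axis=1, append=tracks[:, -1:])
    for p in range(num_points):
        for f in range(num_frames):
            u, v = tracks[p, f]
            if 0 <= u < width and 0 <= v < height:
                cond[f, int(v), int(u)] = motion[p, f]
    return cond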

Kelsey Allen (@kelseyrallen):

Humans can tell the difference between a realistic generated video and an unrealistic one – can models? Excited to share TRAJAN: the world’s first point TRAJectory AutoeNcoder for evaluating motion realism in generated and corrupted videos. 🌐 trajan-paper.github.io 🧵
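
The underlying idea: fit an autoencoder to point trajectories from real video, then score a new video by how well its trajectories reconstruct; implausible motion reconstructs poorly. TRAJAN itself is a learned model, so the toy linear (PCA-style) version below only illustrates the scoring scheme, with all names my own.

import numpy as np

def fit_linear_autoencoder(real_tracks, latent_dim=8):
    """real_tracks: (num_tracks, num_frames * 2) flattened real trajectories.
    Returns the (mean, basis) of a PCA-style linear autoencoder."""
    mean = real_tracks.mean(axis=0)
    _, _, vt = np.linalg.svd(real_tracks - mean, full_matrices=False)
    return mean, vt[:latent_dim]  # top principal directions of real motion

def realism_score(tracks, mean, basis):
    """Higher = motion better explained by the model of real trajectories."""
    centered = tracks - mean
    recon = (centered @ basis.T) @ basis  # encode then decode
    return -float(np.mean(np.sum((centered - recon) ** 2, axis=-1)))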

Carl Doersch (@carldoersch):

I "found" a few of the tasks in this video, and it's hard to convey the feeling to those who don't know the training data. Just know that a few episodes started with someone saying something like "this robot hasn't seen anything like this task; I doubt it'll work..."