Vincent Sitzmann (@vincesitzmann) Twitter Tweets • TwiCopy

Vincent Sitzmann

@vincesitzmann

Teaching AI to see, model, and interact with our 3D world. Assistant Professor @ MIT, leading the Scene Representation Group (scenerepresentations.org).

ID: 4872666044

http://vincentsitzmann.com · 07-02-2016 08:09:03

737 Tweets

15.15K Followers

305 Following

Phillip Isola

@phillip_isola

2 months ago

Vincent Sitzmann I agree, I think these initiatives are well meaning but have gone too far. To me they cross the line from useful guidelines to infantilizing dictums.

Likes: 29 | Replies: 0 | Retweets: 1

Introducing Generative View Stitching (GVS), a non-autoregressive sampling method for length extrapolation of video diffusion models. GVS enables collision-free camera-guided video generation for predefined trajectories, including Oscar Reutersvärd's Impossible Staircase (1/9).

Likes: 187 | Replies: 8 | Retweets: 38

Phillip Isola

@phillip_isola

2 months ago

Arxiv has been such a wonderful service but I think this is a step in the wrong direction. We have other venues for peer review. To me the value of arxiv lies precisely in its lack of excessive moderation. I'd prefer it as "github for science," rather than yet another journal.

Likes: 725 | Replies: 25 | Retweets: 35

George Cazenavette

@gcazenavette

a month ago

Happy to finally share our latest work on Dataset Distillation! "Dataset Distillation for Pre-Trained Self-Supervised Vision Models," set to appear at #NeurIPS 2025! We learn 1 image per class to train linear heads for pre-trained models. linear-gradient-matching.github.io 1/6

Likes: 26 | Replies: 1 | Retweets: 8

Dmytro Mishkin 🇺🇦

@ducha_aiki

a month ago

Understanding Multi-View Transformers

Michal Stary, Julien Gaubil, Ayush Tewari, Vincent Sitzmann

tl;dr: DUSt3R self-attention is it secretly a diffusion model, and cross-attention is matching.
arxiv.org/abs/2510.24907

Likes: 196 | Replies: 1 | Retweets: 39

Elliott / Shangzhe Wu

@elliottszwu

a month ago

I'm looking for two PhD students to join our team at Cambridge to work on 3D/4D modeling in various domains including generative media, robotics, and biology. Apply to the PhD in Engineering program by December 2 ⌛️: postgraduate.study.cam.ac.uk/courses/direct…

Likes: 330 | Replies: 7 | Retweets: 80
