Zeren Jiang (@codyjzr)'s Twitter Profile
Zeren Jiang

@codyjzr

PhD student @ Oxford VGG

ID: 1042335138674298881

Joined: 19-09-2018 08:50:37

10 Tweets

113 Followers

102 Following

Kwang Moo Yi (@kwangmoo_yi):

Preprint of today: Jiang et al., "Geo4D: Leveraging Video Generators for Geometric 4D Scene Reconstruction" -- geo4d.github.io. Fine-tunes a video model to estimate ray maps, point maps, and depth, then aggregates the sliding-window estimates through multi-modal alignment.

Zhenjun Zhao (@zhenjun_zhao):

Geo4D: Leveraging Video Generators for Geometric 4D Scene Reconstruction

Zeren Jiang, Chuanxia Zheng, Iro Laina, Diane Larlus, Andrea Vedaldi

tl;dr: point + disparity + ray maps -> pre-trained video diffusion model -> CLIP -> query transformer -> U-Net -> VAE decoder -> alignment

arxiv.org/abs/2504.07961

Chris make some 3D scans (@chrisatkiri):

Play 4D scenes part 2. With the same monocular video input, Geo4D (github.com/jzr99/Geo4D) can now provide a more robust and clear 4D reconstruction result. CRAZYYY. I cannot imagine what is next. 4DGS from monocular video? I think it's feasible already.