Quankai Gao (@uuuuusher)'s Twitter Profile
Quankai Gao

@uuuuusher

CS PhD student in computer vision & computer graphics @ USC | Adobe Intern | Student Researcher @ Google

ID: 1497780997693259777

Link: http://Zerg-Overmind.github.io · Joined: 27-02-2022 03:50:14

101 Tweets

169 Followers

378 Following

Gengshan Yang (@gengshany)'s Twitter Profile Photo

Sharing my recent project, agent-to-sim: From monocular videos taken over a long time horizon (e.g., 1 month), we learn an interactive behavior model of an agent (e.g., a 🐱) grounded in 3D. gengshan-y.github.io/agent2sim-www/

Jia-Bin Huang (@jbhuang0604)'s Twitter Profile Photo

Why is self-supervision in vision still not working? 🤔 When pretraining a transformer on TEXT-only data by predicting the next tokens, we see clear improvement trends as we scale the model, data, and computing. But after trying to pretrain a transformer on IMAGES-only data

Gene Chou (@gene_ch0u)'s Twitter Profile Photo

We've released our paper "Generating 3D-Consistent Videos from Unposed Internet Photos"! Video models like Luma generate pretty videos, but sometimes struggle with 3D consistency. We can do better by scaling them with 3D-aware objectives. 1/N page: genechou.com/kfcw

Hanwen Jiang (@hanwenjiang1)'s Twitter Profile Photo

We will present CoFie at #NeurIPS2024 tomorrow - a compact geometry-aware surface representation. CoFie disentangles the transformation of local patches and explicitly models it in SE(3), aligning local patches and reducing their complexity. Location: West Ballroom A-D #6900

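The core idea CoFie describes, factoring each local surface patch through its own rigid (SE(3)) coordinate frame so the network only has to represent simple, aligned local geometry, can be illustrated with a small numpy sketch. This is a generic illustration of expressing patch points in a local SE(3) frame, not CoFie's actual implementation; all names here are made up for the example.

```python
import numpy as np

def se3_inverse(T):
    """Invert a 4x4 rigid transform [R | t; 0 0 0 1]."""
    R, t = T[:3, :3], T[:3, 3]
    Ti = np.eye(4)
    Ti[:3, :3] = R.T
    Ti[:3, 3] = -R.T @ t
    return Ti

def to_local(points, T_patch):
    """Express world-space patch points in the patch's local SE(3) frame."""
    ph = np.hstack([points, np.ones((len(points), 1))])  # homogeneous
    return (se3_inverse(T_patch) @ ph.T).T[:, :3]

# Toy patch: a small planar piece placed somewhere in world space.
theta = 0.3
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])
T = np.eye(4)
T[:3, :3] = R
T[:3, 3] = [1.0, 2.0, 3.0]

local_pts = np.array([[0.1, 0.0, 0.0], [0.0, 0.1, 0.0], [-0.1, -0.1, 0.0]])
world_pts = (T @ np.hstack([local_pts, np.ones((3, 1))]).T).T[:, :3]

# Mapping back through the patch frame recovers the simple local geometry.
recovered = to_local(world_pts, T)
print(np.allclose(recovered, local_pts))  # True
```

After this change of frame, every patch lives near the origin with a canonical orientation, which is what makes the per-patch geometry low-complexity.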
Fangjinhua Wang (@fangjinhuawang)'s Twitter Profile Photo

#NeurIPS2024 In UniSDF, by unifying different neural representations, we achieved best overall performance across various scene types, ranging from object-level to unbounded scenes, with and without reflections. fangjinhuawang.github.io/UniSDF/

Chen Geng (@gengchen01)'s Twitter Profile Photo

Ever wondered how roses grow and wither in your backyard?🌹 Our latest work on generating 4D temporal object intrinsics lets you explore a rose's entire lifecycle—from birth to death—under any environment light, from any viewpoint, at any moment. Project page:

Hanwen Jiang (@hanwenjiang1)'s Twitter Profile Photo

💥 Think more real data is needed for scene reconstruction? Think again! Meet MegaSynth: scaling up feed-forward 3D scene reconstruction with synthesized scenes. In 3 days, it generates 700K scenes for training—70x larger than real data! ✨ The secret? Reconstruction is mostly

Pallav Agarwal (@pallavmac)'s Twitter Profile Photo

I was able to upload my own image to Veo 2! Here is the result when asking it to pan around Lofi Girl's room - extremely impressive result

Fangjinhua Wang (@fangjinhuawang)'s Twitter Profile Photo

Check out our new progress on large-scale visual localization with Scene Coordinate Regression (SCR)! In R-SCoRe, we close the gap between SCR and feature matching methods on challenging benchmarks with strong illumination changes. Paper: arxiv.org/pdf/2501.01421

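In scene coordinate regression, a network predicts a 3D scene coordinate for each pixel, and the camera pose is then recovered from those 2D-3D correspondences (typically with PnP + RANSAC). A minimal numpy sketch of that pose-from-correspondences step, using a plain Direct Linear Transform on noise-free synthetic data, is shown below; this illustrates the general SCR pipeline, not R-SCoRe's own solver, and all names are invented for the example.

```python
import numpy as np

def dlt_pnp(pts3d, pts2d):
    """Estimate a 3x4 projection matrix from >= 6 exact 2D-3D
    correspondences via the Direct Linear Transform."""
    A = []
    for X, (u, v) in zip(pts3d, pts2d):
        Xh = np.append(X, 1.0)  # homogeneous 3D point
        A.append(np.concatenate([np.zeros(4), -Xh, v * Xh]))
        A.append(np.concatenate([Xh, np.zeros(4), -u * Xh]))
    # Null vector of A (last right singular vector) gives P up to scale.
    _, _, Vt = np.linalg.svd(np.asarray(A))
    return Vt[-1].reshape(3, 4)

# Synthetic check: project known 3D points with a ground-truth camera,
# then recover the camera from the correspondences alone.
rng = np.random.default_rng(0)
R = np.eye(3)
t = np.array([0.1, -0.2, 2.0])                       # keeps points in front
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
P_true = K @ np.hstack([R, t[:, None]])

pts3d = rng.uniform(-1, 1, size=(10, 3))
proj = (P_true @ np.hstack([pts3d, np.ones((10, 1))]).T).T
pts2d = proj[:, :2] / proj[:, 2:3]

P_est = dlt_pnp(pts3d, pts2d)
reproj = (P_est @ np.hstack([pts3d, np.ones((10, 1))]).T).T
err = np.abs(reproj[:, :2] / reproj[:, 2:3] - pts2d).max()
print(f"max reprojection error: {err:.2e} px")
```

A real localization pipeline would wrap this in RANSAC to reject the (many) wrong correspondences a regression network produces.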
Jiageng Mao (@pointscoder)'s Twitter Profile Photo

Thanks Marco Pavone for sharing my internship work! It was a fantastic experience collaborating with you and your team at NVIDIA. DreamDrive is our preliminary exploration of driving everywhere leveraging the Internet street view images. Stay tuned for more updates!

Jiawei Yang (@jiaweiyang118)'s Twitter Profile Photo

Excited to share STORM! Unlike existing LRMs, STORM tackles dynamic scenes—reconstructing dynamic 3D scenes, estimating object velocities, and capturing different motion groups from a short video clip. Using a feedforward model, it slashes per-scene optimization time from 1000+s

Sanghyun Son (@sanghyunson)'s Twitter Profile Photo

📢 [New Paper] DMesh++: An Efficient Differentiable Mesh for Complex Shapes ✍️ Authors: Sanghyun Son, @gadelha_m, Yang, Matthew Fisher, Zexiang Xu, Yiling Qiao, Ming Lin, Yi Zhou 🔗 Arxiv: arxiv.org/abs/2412.16776 🔗 Page: sonsang.github.io/dmesh2-project/ More details👇

Jorge Condor (@arcanous98)'s Twitter Profile Photo

I'm very happy to announce that our paper "Don't Splat your Gaussians: Volumetric Primitives for Rendering Scattering and Emissive Media" (tinyurl.com/GaussVol) was finally accepted to ACM Transactions on Graphics last month! We will present it at SIGGRAPH 2025 🧵🧵🧵 (1/11)

Ziyu Chen (@ziyuchen_)'s Twitter Profile Photo

Our project OmniRe has been accepted to ICLR 2025! 🎉 Huge thanks to all my fantastic collaborators for making this happen! 🙌 #ICLR2025 Project page: ziyc.github.io/omnire DriveStudio🚗: github.com/ziyc/drivestud…

Hanwen Jiang (@hanwenjiang1)'s Twitter Profile Photo

Working on Depth Estimation? Here is a free lunch. We tune a Depth Anything ViT-B model on MegaSynth, and the performance improves a lot -- depth estimation is also very non-semantic! #CVPR2025 Accepted

Anpei Chen (@anpeic)'s Twitter Profile Photo

Too many artifacts in your GS reconstruction? Please check out GenFusion: Closing the Loop between Reconstruction and Generation via Videos 🌐 Project page: genfusion.sibowu.com 💻 Code: github.com/Inception3D/Ge… #3D #DiffusionModels #ViewSynthesis #GenFusion #CVPR2025

Xuxin Cheng (@xuxin_cheng)'s Twitter Profile Photo

Meet 𝐀𝐌𝐎 — our universal whole‑body controller that unleashes the 𝐟𝐮𝐥𝐥  kinematic workspace of humanoid robots to the physical world. AMO is a single policy trained with RL + Hybrid Mocap & Trajectory‑Opt. Accepted to #RSS2025. Try our open models & more 👉

MrNeRF (@janusch_patas)'s Twitter Profile Photo

3DGEER: Exact and Efficient Volumetric Rendering with 3D Gaussians Contributions: (i) We present the first complete, first-principle derived, closed-form solution for exact volumetric Gaussian rendering. (ii) We propose an exact and efficient ray-particle association method

spark (@sparkjsdev)'s Twitter Profile Photo

Open Sourcing Forge: 3D Gaussian splat rendering for web developers! 3DGS has become a dominant paradigm for differentiable rendering, combining high visual quality and real-time rendering. However, support for splatting on the web still lags behind its adoption in AI.

Haven (Haiwen) Feng (@havenfeng)'s Twitter Profile Photo

🚀 Introducing GenLit – Reformulating Single-Image Relighting as Video Generation! We leverage video diffusion models to perform realistic near-field relighting from just a single image—No explicit 3D reconstruction or ray tracing required! No intermediate graphics buffers,