david yan (@dzyan01)'s Twitter Profile
david yan

@dzyan01

meandering researcher @PrincetonVL

ID: 1904013727235457024

Link: https://david-yan1.github.io · Joined: 24-03-2025 03:33:52

0 Tweets

12 Followers

69 Following

Kevin Wang (@kevin_wang3290)

Excited to share that our paper "1000 Layer Networks for Self-Supervised RL: Scaling Depth Can Enable New Goal-Reaching Capabilities" has won the Best Paper Award at NeurIPS '25! Hope to see you all in San Diego :)

Princeton Vision & Learning Lab (@princetonvl)

Estimating camera intrinsics from video is key to 3D reconstruction, but most methods assume they’re fixed per video. What if the camera keeps zooming and refocusing? Meet InFlux, the first benchmark with per-frame ground truth for videos with dynamic intrinsics. 🧵1/5
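
As an aside on why dynamic intrinsics matter: under zoom, the focal length changes every frame, so no single intrinsics matrix K fits the whole video. A minimal pinhole-camera sketch of that effect (values and names here are illustrative, not InFlux's data format):

```python
import numpy as np

def intrinsics(f, cx, cy):
    """Pinhole intrinsics for focal length f and principal point (cx, cy)."""
    return np.array([[f, 0.0, cx],
                     [0.0, f, cy],
                     [0.0, 0.0, 1.0]])

def project(K, X):
    """Project a 3D point X (camera coordinates) to pixel coordinates."""
    x = K @ X
    return x[:2] / x[2]

X = np.array([0.5, -0.2, 4.0])                 # one static 3D point
for t, f in enumerate([500.0, 650.0, 800.0]):  # hypothetical zoom-in over 3 frames
    K_t = intrinsics(f, cx=320.0, cy=240.0)    # intrinsics differ per frame
    print(f"frame {t}: pixel = {project(K_t, X)}")
```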

Kyle Sargent (@kylesargentai)

Vision-language models are getting better every day. Can we use them to improve image compression? Yes! For my internship, working w/ Google DeepMind and Google Research, we designed VLIC, a diffusion autoencoder post-trained with VLM preferences. Our preprint is out today! A🧵:

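The thread does not spell out VLIC's training objective, but one common way to post-train on pairwise preferences is a Bradley-Terry style loss; a hedged sketch of that general idea (the reward scores and names are illustrative, not VLIC's actual code):

```python
import torch
import torch.nn.functional as F

def preference_loss(score_preferred, score_rejected):
    """Push the VLM-preferred reconstruction to outscore the rejected one:
    -log sigmoid(s_w - s_l), averaged over pairs."""
    return -F.logsigmoid(score_preferred - score_rejected).mean()

# Hypothetical scores a VLM-based judge assigns to two decodings of the
# same compressed image.
s_w = torch.tensor([1.8, 0.3])   # preferred reconstructions
s_l = torch.tensor([0.9, -0.4])  # rejected reconstructions
print(preference_loss(s_w, s_l))
```
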
Princeton Vision & Learning Lab (@princetonvl)

Meet WAFT (Warping-Alone Field Transforms), our new optical-flow estimator. #1 on public benchmarks (Sintel & Spring), 1.3-4.1x faster than leading methods, and 2x lower memory. Key idea: replace cost volumes with high-res feature-space warping. Code and paper:👇
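
For context on the key idea, a minimal sketch of backward-warping a feature map by a flow field with bilinear sampling, the kind of operation the tweet says replaces cost volumes (the shapes and the (dx, dy) flow convention are assumptions for illustration, not WAFT's actual code):

```python
import torch
import torch.nn.functional as F

def warp(feat, flow):
    """Backward-warp feat (B, C, H, W) by flow (B, 2, H, W), in pixels."""
    B, _, H, W = feat.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    grid = torch.stack([xs, ys], dim=0).float().to(feat)  # (2, H, W), (x, y) order
    coords = grid.unsqueeze(0) + flow                     # shifted sample points
    # Normalize to [-1, 1] as grid_sample expects.
    cx = 2.0 * coords[:, 0] / max(W - 1, 1) - 1.0
    cy = 2.0 * coords[:, 1] / max(H - 1, 1) - 1.0
    return F.grid_sample(feat, torch.stack([cx, cy], dim=-1), align_corners=True)

feat = torch.randn(1, 64, 32, 48)
flow = torch.zeros(1, 2, 32, 48)  # zero flow: warping returns the input
assert torch.allclose(warp(feat, flow), feat, atol=1e-5)
```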

Jack Zhang (@jcz42)

We made Muon run up to 2x faster for free!

Introducing Gram Newton-Schulz: a mathematically equivalent but computationally faster Newton-Schulz algorithm for polar decomposition.

Gram Newton-Schulz rewrites Newton-Schulz such that instead of iterating on the expensive […]
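
The tweet is cut off above, so for context: a sketch of the baseline Newton-Schulz iteration that Muon uses to approximate the orthogonal (polar) factor of a gradient matrix. The coefficients follow the quintic variant popularized by Muon; the Gram rewrite's own update rule is not quoted in the excerpt, so it is not reproduced here.

```python
import torch

def newton_schulz(X, steps=5, a=3.4445, b=-4.7750, c=2.0315):
    """Approximate the polar factor of X with an odd polynomial iteration;
    each step pays for large matrix products like X @ X.T."""
    X = X / (X.norm() + 1e-7)   # scale so all singular values are <= 1
    for _ in range(steps):
        A = X @ X.mT            # the expensive m x m product
        X = a * X + (b * A + c * (A @ A)) @ X
    return X

G = torch.randn(256, 512)
U = newton_schulz(G)
# Deviation from exact orthogonality; small but nonzero, since Muon's
# coefficients trade accuracy for speed.
print((U @ U.mT - torch.eye(256)).norm())
```
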
Ethan (@torchcompiled)

ML interview question: You’re training a 72B MoE MNIST classifier. Layer 53 MLP expert 7 destabilizes when the ones in the dataset are turned upside down. What happened?

Princeton Vision & Learning Lab (@princetonvl)

Stereo depth is highly useful for robots. Meet WAFT-Stereo: #1 on ETH3D (BP-0.5), Middlebury (RMSE), and KITTI (all metrics); 61% less zero-shot ETH3D BP-0.5 error; 1.8-6.7x faster than prior SOTA. Key idea: classify disparity into bins, then iterative high-res warping.🧵1/2
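
A hedged sketch of the "classify disparity into bins" idea: predict a per-pixel distribution over candidate disparities, then take a soft expectation to recover a continuous map (the bin count, range, and names are illustrative, not WAFT-Stereo's actual code):

```python
import torch

def disparity_from_bins(logits, max_disp=192.0):
    """logits: (B, D, H, W) scores over D disparity bins.
    Returns a continuous disparity map of shape (B, H, W)."""
    D = logits.shape[1]
    centers = torch.linspace(0.0, max_disp, D).view(1, D, 1, 1)  # bin centers
    probs = logits.softmax(dim=1)          # per-pixel distribution over bins
    return (probs * centers).sum(dim=1)    # expected (soft-argmax) disparity

logits = torch.randn(2, 48, 64, 128)
disp = disparity_from_bins(logits)
print(disp.shape)  # torch.Size([2, 64, 128]), values within [0, 192]
```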

Guanyu Zhou (@tmartyr4951)

It's time to systematically teach VLMs to see with synthetic images!

We built VisionFoundry, a simple and intuitive framework that generates synthetic image datasets from only a task name.

10k synthetic samples → over +10% improvement on visual perception benchmarks 👀
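
The tweet only names the interface (a task name in, a labeled image set out), so here is a hypothetical sketch of the overall shape of such a pipeline; `propose_prompts` and `generate_image` are stand-ins, not VisionFoundry's API:

```python
from dataclasses import dataclass

@dataclass
class Example:
    prompt: str      # text used to synthesize the image (doubles as a label source)
    image_path: str  # where the rendered image would be written

def propose_prompts(task_name: str, n: int) -> list[str]:
    """Hypothetical: ask an LLM for n scene descriptions exercising the task."""
    return [f"{task_name}: scene variant {i}" for i in range(n)]

def generate_image(prompt: str, out_path: str) -> None:
    """Hypothetical: call a text-to-image model and save the result."""
    ...

def build_dataset(task_name: str, n: int = 10_000) -> list[Example]:
    examples = []
    for i, prompt in enumerate(propose_prompts(task_name, n)):
        path = f"data/{task_name}/{i:06d}.png"
        generate_image(prompt, path)
        examples.append(Example(prompt, path))
    return examples
```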