Tianyuan Zhang (@tianyuanzhang99)'s Twitter Profile
Tianyuan Zhang

@tianyuanzhang99

PhD student at @MIT, working on vision and ML. M.S. from CMU, B.S. from PKU

ID: 905014077126213632

Link: http://tianyuanzhang.com · Joined: 05-09-2017 10:25:45

128 Tweets

963 Followers

784 Following

Tianyuan Zhang (@tianyuanzhang99)'s Twitter Profile Photo

Check out Tianwei's fast autoregressive video diffusion. A promising step towards real-time interactive video generation!

Tianyuan Zhang (@tianyuanzhang99)'s Twitter Profile Photo

An image of an object tells us more than the object's visual geometry; it's also a physical snapshot of the object in a state of static equilibrium. Can we use that cue to extract more information about the object? Check out Minghao's work on this topic: PhysComp!
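To make the equilibrium cue concrete, here is a toy sketch (my own illustration with a hypothetical helper, not PhysComp's actual formulation): a rigid object resting on a plane is in static equilibrium only if its center of mass projects inside the support polygon of its contacts, so a photo of a stable object already constrains where its mass can be.

```python
import numpy as np
from scipy.spatial import Delaunay

def is_statically_stable(com_xy, contact_xy):
    """Toy static-equilibrium test: the ground projection of the center
    of mass must fall inside the convex hull of the contact points."""
    return Delaunay(contact_xy).find_simplex(com_xy) >= 0

# A box on four corners is stable iff its COM projects inside them.
corners = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
print(is_statically_stable(np.array([0.5, 0.5]), corners))  # True
print(is_statically_stable(np.array([1.4, 0.5]), corners))  # False: tips over
```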

Jia-Bin Huang (@jbhuang0604)'s Twitter Profile Photo

The slide is bad; her response to an audience member is even worse: "Maybe there is one, maybe they are common, who knows what. I hope it was an outlier."

Yuandong Tian (@tydsh)'s Twitter Profile Photo

Unbelievable... This is explicit racial bias. How could this happen at NeurIPS? How could this be spoken by a top university professor, an invited keynote speaker?

Hongjie Wang (@hongjiewang3)'s Twitter Profile Photo

🎉Excited to introduce our latest work, LinGen: Towards High-Resolution Minute-Length Text-to-Video Generation with Linear Computational Complexity! ✨For the first time, we demonstrate high-resolution 68-second video generation at 16fps on a single GPU, without relying on
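For intuition on where linear complexity can come from (a generic kernelized-attention sketch, not LinGen's actual architecture): replacing softmax attention with linear attention lets all N tokens be summarized into a d×d state, so cost grows as O(N·d²) rather than O(N²·d), which is what makes minute-length, high-resolution token sequences affordable.

```python
import numpy as np

def linear_attention(Q, K, V, eps=1e-6):
    """Generic (non-causal) linear attention: O(N * d^2) instead of O(N^2 * d).
    Uses the positive feature map elu(x)+1 from Katharopoulos et al.;
    this is the complexity idea only, not LinGen's exact token mixer."""
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))  # elu(x) + 1
    Qp, Kp = phi(Q), phi(K)                   # (N, d) features
    KV = Kp.T @ V                             # (d, d): one pass over all tokens
    Z = Qp @ Kp.sum(axis=0) + eps             # (N,): per-query normalizer
    return (Qp @ KV) / Z[:, None]

N, d = 4096, 64                               # N can grow at only linear cost
Q, K, V = (np.random.randn(N, d) for _ in range(3))
print(linear_attention(Q, K, V).shape)        # (4096, 64)
```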

Hanwen Jiang (@hanwenjiang1)'s Twitter Profile Photo

💥 Think more real data is needed for scene reconstruction? Think again! Meet MegaSynth: scaling up feed-forward 3D scene reconstruction with synthesized scenes. In 3 days, it generates 700K scenes for training—70x larger than real data! ✨ The secret? Reconstruction is mostly
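As a toy picture of why synthesized scenes scale so cheaply (a hypothetical stand-in, nothing from MegaSynth's real pipeline): procedurally sampling random textured primitives needs no capture rigs and no annotation, so generating hundreds of thousands of scenes is just a loop.

```python
import numpy as np

def synth_scene(rng, n_shapes=12):
    """Sample a toy 'scene' of random colored boxes (a stand-in for
    MegaSynth-style scene synthesis; the real pipeline is more elaborate).
    Cheap to generate at scale: no capture, no labels."""
    return [{
        "center": rng.uniform(-1, 1, size=3),
        "size": rng.uniform(0.05, 0.4, size=3),
        "rotation_z": rng.uniform(0, 2 * np.pi),
        "albedo": rng.uniform(0, 1, size=3),   # random surface color
    } for _ in range(n_shapes)]

rng = np.random.default_rng(0)
scenes = [synth_scene(rng) for _ in range(1000)]  # scale the loop up to 700K
print(len(scenes), len(scenes[0]))                # 1000 12
```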

leloy! (@leloykun)'s Twitter Profile Photo

(Linear) Attention Mechanisms as Test-Time Regression

By now, you've probably already heard of linear attention, in-context learning, test-time scaling, etc...

Here, I'll discuss:

1. The unifying framework that ties them all together;
2. How to derive different linear
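A minimal version of that unifying view, in code (my own sketch, not the post's): treat the attention state W as a linear regressor fit at test time, one SGD step per token, on the key-to-value regression loss L(W) = ½‖Wk_t − v_t‖²; reading out Wq_t then recovers familiar linear-attention recurrences.

```python
import numpy as np

def test_time_regression_attention(keys, values, queries, lr=1.0):
    """Linear attention as test-time regression (minimal sketch).
    The state W is fit online to map keys -> values with one SGD step
    per token on L(W) = 0.5 * ||W k_t - v_t||^2; the output is W q_t.
    With lr=1 this is the delta-rule variant; dropping the 'W @ k'
    term from the gradient recovers vanilla linear attention."""
    d_k, d_v = keys.shape[1], values.shape[1]
    W = np.zeros((d_v, d_k))
    outputs = []
    for k, v, q in zip(keys, values, queries):
        grad = np.outer(W @ k - v, k)   # dL/dW at the current (k, v) pair
        W = W - lr * grad               # one test-time gradient step
        outputs.append(W @ q)           # read out the current regressor
    return np.array(outputs)

T, d = 16, 8
K, V, Q = (np.random.randn(T, d) for _ in range(3))
print(test_time_regression_attention(K, V, Q).shape)  # (16, 8)
```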
Yinbo Chen (@yinbochen)'s Twitter Profile Photo

Introducing “Diffusion Autoencoders are Scalable Image Tokenizers” (DiTo).

We show that with proper designs and scaling up, diffusion autoencoders (a single L2 loss) can outperform the GAN-LPIPS tokenizers (hybrid losses) used in current SOTA generative models. (1/4)
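Schematically, the single-loss recipe looks like this (the stub encoder/denoiser below are hypothetical placeholders, not DiTo's learned networks): noise the input, condition the denoiser on the encoder's latent, and train everything end to end with one L2 noise-prediction loss, with no GAN or LPIPS terms.

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder(x):          # stand-in tokenizer encoder (hypothetical stub)
    return x.mean(axis=(1, 2))                    # (B, C) "latent"

def denoiser(x_t, t, z): # stand-in conditional denoiser (hypothetical stub)
    return x_t * 0.0 + z[:, None, None, :] * t[:, None, None, None]

def diffusion_autoencoder_loss(x):
    """Schematic diffusion-autoencoder objective (not DiTo's exact nets):
    corrupt the input, condition a denoiser on the encoder's latent, and
    train with a single L2 noise-prediction loss."""
    B = x.shape[0]
    t = rng.uniform(size=B)                            # diffusion times
    eps = rng.normal(size=x.shape)                     # target noise
    x_t = np.sqrt(1 - t)[:, None, None, None] * x \
        + np.sqrt(t)[:, None, None, None] * eps        # noised input
    z = encoder(x)                                     # latent to decode from
    eps_hat = denoiser(x_t, t, z)
    return np.mean((eps_hat - eps) ** 2)               # the single L2 loss

x = rng.normal(size=(4, 32, 32, 3))
print(diffusion_autoencoder_loss(x))
```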
Tianyuan Zhang (@tianyuanzhang99)'s Twitter Profile Photo

Very interesting work from my MIT officemates! Diffusion Forcing with History Guidance introduces a novel approach to video generation, excelling at ultra-long sequences: 800+ frames shown in the paper!
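As a rough sketch of the guidance idea (classifier-free-guidance-style extrapolation on history conditioning; the function below is hypothetical, not the paper's code):

```python
import numpy as np

def history_guided_denoise(eps_cond, eps_uncond, w=1.5):
    """CFG-style history guidance (schematic only): extrapolate the
    denoiser prediction conditioned on past frames away from the
    history-free prediction, steering long autoregressive rollouts
    to stay consistent with what was already generated."""
    return eps_uncond + w * (eps_cond - eps_uncond)

# eps_cond / eps_uncond would come from one denoiser evaluated with and
# without the history frames in its context (one video frame each here).
eps_cond, eps_uncond = np.random.randn(2, 64, 64, 3)
print(history_guided_denoise(eps_cond, eps_uncond).shape)  # (64, 64, 3)
```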

Yilun Xu (@xuyilun2)'s Twitter Profile Photo

Tired of slow diffusion models? Our new paper introduces f-distill, enabling arbitrary f-divergence for one-step diffusion distillation. JS divergence gives SOTA results on text-to-image! Choose the divergence that suits your needs. Joint work with Weili Nie and Arash Vahdat. 1/N
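For reference, the knob being exposed: an f-divergence is D_f(p‖q) = E_q[f(p/q)] for a convex generator f with f(1) = 0, and different generators weight the teacher/student mismatch differently. A quick numeric illustration on toy discrete distributions (nothing from the paper's implementation):

```python
import numpy as np

# f-divergences D_f(p || q) = E_q[f(p/q)] over a discrete support.
# Different generators f give different training signals; f-distill's
# point is that the distillation loss can use any of them.
f_generators = {
    "forward-KL": lambda r: r * np.log(r),
    "reverse-KL": lambda r: -np.log(r),
    "JS": lambda r: 0.5 * (r * np.log(2 * r / (r + 1)) + np.log(2 / (r + 1))),
}

def f_divergence(p, q, f):
    r = p / q                 # density ratio on each support point
    return np.sum(q * f(r))   # expectation under q

p = np.array([0.5, 0.3, 0.2])  # "teacher" distribution
q = np.array([0.4, 0.4, 0.2])  # "student" distribution
for name, f in f_generators.items():
    print(f"{name}: {f_divergence(p, q, f):.4f}")
```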

Tianwei Yin (@tianweiy)'s Twitter Profile Photo

Super excited to share that I’ve officially defended my PhD, wrapped up an incredible journey at Massachusetts Institute of Technology (MIT) and Adobe Research, and joined Reve! Thrilled to be working alongside the same amazing founders I teamed up with back in the Adobe days. That experience gave me deep

Hong-Xing "Koven" Yu (@koven_yu) 's Twitter Profile Photo

🔥Want to capture 3D dancing fluids♨️🌫️🌪️💦? No specialized equipment, just one video! Introducing FluidNexus: Now you only need one camera to reconstruct 3D fluid dynamics and predict future evolution! 🧵1/4 Web: yuegao.me/FluidNexus/ Arxiv: arxiv.org/pdf/2503.04720

Ken Liu (@kenziyuliu)'s Twitter Profile Photo

An LLM generates an article verbatim—did it “train on” the article? It’s complicated: under n-gram definitions of train-set inclusion, LLMs can complete “unseen” texts—both after data deletion and adding “gibberish” data. Our results impact unlearning, MIAs & data transparency🧵
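To see how such a definition can be probed, here is one common n-gram inclusion criterion, sketched (hypothetical helper; the paper's exact definitions may differ):

```python
def ngram_inclusion(text, corpus_docs, n=4):
    """One n-gram notion of train-set 'inclusion' (a sketch, not the
    paper's exact criterion): a text counts as included if every one of
    its word n-grams appears somewhere in the corpus. Deleting a doc or
    adding 'gibberish' docs can flip this verdict without changing what
    the model can actually complete -- the tweet's point."""
    words = text.split()
    grams = {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}
    corpus_grams = set()
    for doc in corpus_docs:
        w = doc.split()
        corpus_grams |= {tuple(w[i:i + n]) for i in range(len(w) - n + 1)}
    return grams <= corpus_grams

docs = ["the quick brown fox jumps over the lazy dog every day"]
print(ngram_inclusion("quick brown fox jumps over the lazy dog", docs))  # True
```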

Hong-Xing "Koven" Yu (@koven_yu) 's Twitter Profile Photo

🔥Spatial intelligence requires world generation, and now we have the first comprehensive evaluation benchmark📏 for it! Introducing WorldScore: Unifying evaluation for 3D, 4D, and video models on world generation! 🧵1/7 Web: haoyi-duan.github.io/WorldScore/ arxiv: arxiv.org/abs/2504.00983

Haian Jin (@haian_jin)'s Twitter Profile Photo

Excited to attend #ICLR2025 in person this year! I’ll be presenting two papers: 1. LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias 🔹 Oral Presentation: Session 3C (Garnet 216-218) — Apr 25 (Fri), 11:06–11:18 a.m. 🔹 Poster: Hall 3 + Hall 2B, Poster #593 — Apr