John Nguyen (@__johnnguyen__)'s Twitter Profile
John Nguyen

@__johnnguyen__

Research at FAIR

ID: 1537295545189793793

https://johnlnguyen.com
16-06-2022 04:46:42

117 Tweets

441 Followers

153 Following

Andrej Karpathy (@karpathy):

Nice, short post illustrating how simple text (discrete) diffusion can be. Diffusion (i.e. parallel, iterated denoising, top) is the pervasive generative paradigm in image/video, but autoregression (i.e. go left to right bottom) is the dominant paradigm in text. For audio I've

Ellis Brown (@_ellisbrown):

MLLMs are great at understanding videos, but struggle with spatial reasoning—like estimating distances or tracking objects across time. the bottleneck? getting precise 3D spatial annotations on real videos is expensive and error-prone. introducing SIMS-V 🤖 [1/n]

John Nguyen (@__johnnguyen__):

Pre 2021: submit to a conference and on arxiv
2021-2024: submit to arxiv, make a website and tweet about
2025: all of the above plus promo video

John Nguyen (@__johnnguyen__):

Human reasoning requires the ability to revise and self-correct through multiple rounds of edits. Neither AR nor (masked) Diffusion LMs do this.