Soroush Mehraban (@soroushmhrbn) 's Twitter Profile
Soroush Mehraban

@soroushmhrbn

PhD student @UofT | Faculty Affiliate Researcher @ Vector Institute | Interested in Computer Vision

ID: 1650667214310191105

Link: http://youtube.com/@SoroushMehraban
Joined: 25-04-2023 01:05:14

86 Tweets

153 Followers

218 Following

Soroush Mehraban (@soroushmhrbn) 's Twitter Profile Photo

Just patchify an image, quantize it, and use Llama for next patch prediction😃 It's straightforward yet more powerful than diffusion models. I also liked the ablation on codebook design—larger codebook sizes with lower dimensions seem to work better.
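The patchify-then-quantize step described above can be sketched in a few lines. This is an illustrative toy, not the paper's implementation: the function names, patch size, and random codebook are assumptions, and a real system would use a learned VQ codebook and a transformer for the next-token prediction step.

```python
import numpy as np

def patchify(img, p):
    """Split an (H, W, C) image into (N, p*p*C) flattened patches."""
    H, W, C = img.shape
    patches = img.reshape(H // p, p, W // p, p, C).transpose(0, 2, 1, 3, 4)
    return patches.reshape(-1, p * p * C)

def quantize(patches, codebook):
    """Map each patch to the index of its nearest codebook entry (L2)."""
    d = ((patches[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return d.argmin(axis=1)

rng = np.random.default_rng(0)
img = rng.random((32, 32, 3))
codebook = rng.random((1024, 8 * 8 * 3))  # stand-in for a learned VQ codebook
tokens = quantize(patchify(img, 8), codebook)
# tokens is now a 1-D index sequence; an autoregressive LM (e.g. a
# Llama-style transformer) is trained to predict tokens[t] from tokens[:t].
print(tokens.shape)  # (16,)
```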

Soroush Mehraban (@soroushmhrbn) 's Twitter Profile Photo

Wow! Loved the paper. Pruning transformer tokens without any finetuning. A very intuitive paper that provides a visual explanation for every design choice they made. Also liked how they leveraged Google Search's PageRank algorithm and used it to find token importance.
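The PageRank-on-tokens idea can be sketched by treating the attention matrix as a transition graph and power-iterating to a stationary importance score. This is a hedged illustration of the general pattern, not the paper's exact formulation; the damping factor and pruning rule here are assumptions.

```python
import numpy as np

def token_importance(attn, damping=0.85, iters=50):
    """PageRank-style importance over tokens: rows of the attention
    matrix are normalized into a transition matrix, then power-iterated."""
    n = attn.shape[0]
    P = attn / attn.sum(axis=1, keepdims=True)  # row-stochastic
    r = np.full(n, 1.0 / n)                     # uniform initial rank
    for _ in range(iters):
        r = (1 - damping) / n + damping * (P.T @ r)
    return r

rng = np.random.default_rng(1)
attn = rng.random((6, 6))                 # stand-in for an attention map
scores = token_importance(attn)
keep = np.argsort(scores)[-4:]            # keep the 4 most important tokens
```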

Brett Adcock (@adcock_brett) 's Twitter Profile Photo

Google DeepMind published new research on the lab’s video-to-audio (V2A) system. It allows AI systems to generate soundtracks for videos, including newly generated videos or any video/film with no audio

Soroush Mehraban (@soroushmhrbn) 's Twitter Profile Photo

"Autoregressive Image Generation without Vector Quantization" is a new method that uses autoregressive models with a diffusion loss for image generation. Just posted a video on YouTube explaining how. Check it out if interested youtu.be/JoxCUetOADc?si…
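The core swap the tweet describes can be sketched as follows: each token stays a continuous vector, and instead of a cross-entropy over codebook indices, a small denoising head conditioned on the autoregressive backbone's output is trained with a noise-prediction loss. The toy noise schedule, sizes, and the placeholder head below are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(2)

def diffusion_loss(x, z, eps_head, T=1000):
    """Noise-prediction (diffusion) loss for one continuous token x,
    conditioned on the AR backbone's output z via eps_head."""
    t = int(rng.integers(1, T))                    # random timestep
    alpha_bar = np.cos(0.5 * np.pi * t / T) ** 2   # toy cosine schedule
    eps = rng.standard_normal(x.shape)
    x_t = np.sqrt(alpha_bar) * x + np.sqrt(1 - alpha_bar) * eps  # noised token
    eps_pred = eps_head(x_t, t, z)                 # denoiser sees x_t, t, z
    return float(np.mean((eps - eps_pred) ** 2))   # MSE on the noise

# Placeholder head (a real one would be a small learned MLP):
loss = diffusion_loss(rng.standard_normal(16),
                      rng.standard_normal(8),
                      eps_head=lambda x_t, t, z: np.zeros_like(x_t))
```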

Soroush Mehraban (@soroushmhrbn) 's Twitter Profile Photo

DDNM, proposed at ICLR 2023, is a cool zero-shot image restoration method for applications such as image inpainting, super-resolution, and colorization, using diffusion models without any training. Just posted a YouTube video explaining how it works: youtu.be/Mq-_PImmuy0
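DDNM's central trick is a range/null-space decomposition: for a known linear degradation y = A x, any estimate x0 can be corrected to exactly match the measurement via x_hat = A⁺y + (I − A⁺A)x0, leaving the diffusion model to fill in only the null-space content. The tiny random operator and sizes below are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 8, 4
A = rng.standard_normal((m, n))   # known degradation (e.g. downsampling)
x_true = rng.standard_normal(n)
y = A @ x_true                    # observed degraded image

x0 = rng.standard_normal(n)       # diffusion model's current guess
A_pinv = np.linalg.pinv(A)        # Moore-Penrose pseudoinverse
x_hat = A_pinv @ y + (np.eye(n) - A_pinv @ A) @ x0

# x_hat is data-consistent: applying A recovers y (up to float error)
print(np.allclose(A @ x_hat, y))  # True
```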

Soroush Mehraban (@soroushmhrbn) 's Twitter Profile Photo

The Prompt-to-Prompt (P2P) image editing method proposes a simple approach for editing images generated by a diffusion model. Posted a YouTube video explaining how it works: youtu.be/L-MAZlnlfiQ

Jon Barron (@jon_barron) 's Twitter Profile Photo

A thread of thoughts on radiance fields, from my keynote at 3DV:

Radiance fields have had 3 distinct generations. First was NeRF: just posenc and a tiny MLP. This was slow to train but worked really well, and it was unusually compressed --- The NeRF was smaller than the images.
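The "posenc" in that first-generation recipe is NeRF's frequency encoding: each input coordinate is mapped to sin/cos pairs at geometrically spaced frequencies before hitting the tiny MLP. A minimal sketch assuming the standard formulation (the number of bands L is a common default, not fixed by the thread):

```python
import numpy as np

def posenc(x, L=10):
    """NeRF-style positional encoding: each coordinate becomes
    [sin(2^0 pi x), cos(2^0 pi x), ..., sin(2^(L-1) pi x), cos(2^(L-1) pi x)]."""
    freqs = 2.0 ** np.arange(L) * np.pi       # (L,) frequency bands
    angles = x[..., None] * freqs             # (..., D, L)
    enc = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return enc.reshape(*x.shape[:-1], -1)     # (..., D * 2L)

pts = np.random.default_rng(4).random((5, 3))  # 5 sample points in 3-D
feat = posenc(pts)                             # features fed to the tiny MLP
print(feat.shape)  # (5, 60)
```

The encoding lets a small MLP represent high-frequency detail it could never fit from raw coordinates, which is why "just posenc and a tiny MLP" worked so well.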
Lightly (@lightlyai) 's Twitter Profile Photo

🥳 Today we're excited to officially launch LightlyTrain — a self-supervised pretraining framework that helps you build better vision models without any labels.

Try it Free: lnkd.in/en78gQiC
Read more: lnkd.in/e5yvuSKG
Star us on GitHub: lnkd.in/eEXfTqr8

Javad (@rajabi2001) 's Twitter Profile Photo

[1/7]⚡️Check out our recent work — "Token Perturbation Guidance for Diffusion Models"

A simple yet effective method based on token shuffling for extending the benefits of CFG to broader settings, including unconditional generation.

arXiv: arxiv.org/abs/2506.10036

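The guidance pattern the thread describes can be sketched as a CFG-style extrapolation where the "weak" branch comes from shuffling tokens rather than dropping the condition. The details below (where the shuffle is applied, the guidance weight) are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

rng = np.random.default_rng(5)

def shuffle_tokens(tokens):
    """Randomly permute the sequence dimension of an (N, D) token array."""
    return tokens[rng.permutation(tokens.shape[0])]

def guided_eps(eps_normal, eps_perturbed, w=3.0):
    """CFG-style extrapolation away from the perturbed prediction:
    eps_guided = eps_perturbed + w * (eps_normal - eps_perturbed)."""
    return eps_perturbed + w * (eps_normal - eps_perturbed)

tokens = rng.standard_normal((16, 4))
perturbed = shuffle_tokens(tokens)        # input to the "weak" branch
eps_n = rng.standard_normal((16, 4))      # stand-in denoiser outputs
eps_p = rng.standard_normal((16, 4))
eps = guided_eps(eps_n, eps_p)
```

With w = 1 this reduces to the normal prediction, mirroring how CFG degenerates without guidance.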
Babak Taati (@babak_taati) 's Twitter Profile Photo

Happy to share our paper "LIFT: Latent Implicit Functions for Task- and Data-Agnostic Encoding", now accepted at #ICCV2025.

We bridge the gap between global and local representations in neural implicit functions using a unified, task- and data-agnostic framework.
🧵👇

Vida Adeli (@vida_adl) 's Twitter Profile Photo

Our paper “CARE-PD: A Multi-Site Anonymized Clinical Dataset for Parkinson’s Disease Gait Assessment” is accepted at #NeurIPS2025 🎉

Explore CARE-PD: neurips2025.care-pd.ca

KITE Research Institute | U of T Department of Computer Science | Vector Institute

#MotionAnalysis #CAREPD #AI4Health