Soroush Mehraban (@soroushmhrbn) 's Twitter Profile
Soroush Mehraban

@soroushmhrbn

PhD student @UofT | Faculty Affiliate Researcher @ Vector Institute | Interested in Computer Vision

ID: 1650667214310191105

Link: http://youtube.com/@SoroushMehraban
Joined: 25-04-2023 01:05:14

86 Tweets

153 Followers

218 Following

Soroush Mehraban (@soroushmhrbn) 's Twitter Profile Photo

Just patchify an image, quantize it, and use Llama for next patch prediction😃 It's straightforward yet more powerful than diffusion models. I also liked the ablation on codebook design—larger codebook sizes with lower dimensions seem to work better.
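The patchify-then-quantize step described above can be sketched in a few lines. This is an illustrative toy, not the paper's implementation: the function names, patch size, and random codebook are assumptions, and a real system would use a learned VQ codebook and a transformer for the next-token prediction step.

```python
import numpy as np

def patchify(img, p):
    """Split an (H, W, C) image into (N, p*p*C) flattened patches."""
    H, W, C = img.shape
    patches = img.reshape(H // p, p, W // p, p, C).transpose(0, 2, 1, 3, 4)
    return patches.reshape(-1, p * p * C)

def quantize(patches, codebook):
    """Map each patch to the index of its nearest codebook entry (L2)."""
    d = ((patches[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return d.argmin(axis=1)

rng = np.random.default_rng(0)
img = rng.random((32, 32, 3))
codebook = rng.random((1024, 8 * 8 * 3))  # stand-in for a learned VQ codebook
tokens = quantize(patchify(img, 8), codebook)
# tokens is now a 1-D index sequence; an autoregressive LM (e.g. a
# Llama-style transformer) is trained to predict tokens[t] from tokens[:t].
print(tokens.shape)  # (16,)
```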

Soroush Mehraban (@soroushmhrbn) 's Twitter Profile Photo

Wow! Loved the paper. Pruning transformer tokens without any finetuning. A very intuitive paper that provides a visual explanation for every design choice they made. Also liked how they leveraged Google Search's PageRank algorithm and used it to find token importance.
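The PageRank-on-tokens idea can be sketched by treating the attention matrix as a transition graph and power-iterating to a stationary importance score. This is a hedged illustration of the general pattern, not the paper's exact formulation; the damping factor and pruning rule here are assumptions.

```python
import numpy as np

def token_importance(attn, damping=0.85, iters=50):
    """PageRank-style importance over tokens: rows of the attention
    matrix are normalized into a transition matrix, then power-iterated."""
    n = attn.shape[0]
    P = attn / attn.sum(axis=1, keepdims=True)  # row-stochastic
    r = np.full(n, 1.0 / n)                     # uniform initial rank
    for _ in range(iters):
        r = (1 - damping) / n + damping * (P.T @ r)
    return r

rng = np.random.default_rng(1)
attn = rng.random((6, 6))                 # stand-in for an attention map
scores = token_importance(attn)
keep = np.argsort(scores)[-4:]            # keep the 4 most important tokens
```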

Brett Adcock (@adcock_brett) 's Twitter Profile Photo

Google DeepMind published new research on the lab’s video-to-audio (V2A) system. It allows AI systems to generate soundtracks for videos, including newly generated videos or any video/film with no audio

Soroush Mehraban (@soroushmhrbn) 's Twitter Profile Photo

"Autoregressive Image Generation without Vector Quantization" is a new method that uses autoregressive models with a diffusion loss for image generation. Just posted a video on YouTube explaining how. Check it out if interested youtu.be/JoxCUetOADc?si…
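The core swap the tweet describes can be sketched as follows: each token stays a continuous vector, and instead of a cross-entropy over codebook indices, a small denoising head conditioned on the autoregressive backbone's output is trained with a noise-prediction loss. The toy noise schedule, sizes, and the placeholder head below are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(2)

def diffusion_loss(x, z, eps_head, T=1000):
    """Noise-prediction (diffusion) loss for one continuous token x,
    conditioned on the AR backbone's output z via eps_head."""
    t = int(rng.integers(1, T))                    # random timestep
    alpha_bar = np.cos(0.5 * np.pi * t / T) ** 2   # toy cosine schedule
    eps = rng.standard_normal(x.shape)
    x_t = np.sqrt(alpha_bar) * x + np.sqrt(1 - alpha_bar) * eps  # noised token
    eps_pred = eps_head(x_t, t, z)                 # denoiser sees x_t, t, z
    return float(np.mean((eps - eps_pred) ** 2))   # MSE on the noise

# Placeholder head (a real one would be a small learned MLP):
loss = diffusion_loss(rng.standard_normal(16),
                      rng.standard_normal(8),
                      eps_head=lambda x_t, t, z: np.zeros_like(x_t))
```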

Soroush Mehraban (@soroushmhrbn) 's Twitter Profile Photo

DDNM, proposed at ICLR 2023, is a cool zero-shot image restoration method for applications such as image inpainting, super-resolution, and colorization, using diffusion models without any training. Just posted a YouTube video explaining how it works: youtu.be/Mq-_PImmuy0
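DDNM's central trick is a range/null-space decomposition: for a known linear degradation y = A x, any estimate x0 can be corrected to exactly match the measurement via x_hat = A⁺y + (I − A⁺A)x0, leaving the diffusion model to fill in only the null-space content. The tiny random operator and sizes below are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 8, 4
A = rng.standard_normal((m, n))   # known degradation (e.g. downsampling)
x_true = rng.standard_normal(n)
y = A @ x_true                    # observed degraded image

x0 = rng.standard_normal(n)       # diffusion model's current guess
A_pinv = np.linalg.pinv(A)        # Moore-Penrose pseudoinverse
x_hat = A_pinv @ y + (np.eye(n) - A_pinv @ A) @ x0

# x_hat is data-consistent: applying A recovers y (up to float error)
print(np.allclose(A @ x_hat, y))  # True
```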

Soroush Mehraban (@soroushmhrbn) 's Twitter Profile Photo

The Prompt-to-Prompt (P2P) image editing method proposes a simple approach for editing images generated by a diffusion model. Posted a YouTube video explaining how it works: youtu.be/L-MAZlnlfiQ

Jon Barron (@jon_barron) 's Twitter Profile Photo

A thread of thoughts on radiance fields, from my keynote at 3DV:

Radiance fields have had 3 distinct generations. First was NeRF: just posenc and a tiny MLP. This was slow to train but worked really well, and it was unusually compressed --- The NeRF was smaller than the images.
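The "posenc" in that first-generation recipe is NeRF's frequency encoding: each input coordinate is mapped to sin/cos pairs at geometrically spaced frequencies before hitting the tiny MLP. A minimal sketch assuming the standard formulation (the number of bands L is a common default, not fixed by the thread):

```python
import numpy as np

def posenc(x, L=10):
    """NeRF-style positional encoding: each coordinate becomes
    [sin(2^0 pi x), cos(2^0 pi x), ..., sin(2^(L-1) pi x), cos(2^(L-1) pi x)]."""
    freqs = 2.0 ** np.arange(L) * np.pi       # (L,) frequency bands
    angles = x[..., None] * freqs             # (..., D, L)
    enc = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return enc.reshape(*x.shape[:-1], -1)     # (..., D * 2L)

pts = np.random.default_rng(4).random((5, 3))  # 5 sample points in 3-D
feat = posenc(pts)                             # features fed to the tiny MLP
print(feat.shape)  # (5, 60)
```

The encoding lets a small MLP represent high-frequency detail it could never fit from raw coordinates, which is why "just posenc and a tiny MLP" worked so well.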
Lightly (@lightlyai) 's Twitter Profile Photo

🥳 Today we're excited to officially launch LightlyTrain — a self-supervised pretraining framework that helps you build better vision models without any labels.

Try it Free: lnkd.in/en78gQiC
Read more: lnkd.in/e5yvuSKG
Star us on GitHub: lnkd.in/eEXfTqr8

Javad (@rajabi2001) 's Twitter Profile Photo

[1/7]⚡️Check out our recent work — "Token Perturbation Guidance for Diffusion Models"

A simple yet effective method based on token shuffling for extending the benefits of CFG to broader settings, including unconditional generation.

arXiv: arxiv.org/abs/2506.10036

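The guidance pattern the thread describes can be sketched as a CFG-style extrapolation where the "weak" branch comes from shuffling tokens rather than dropping the condition. The details below (where the shuffle is applied, the guidance weight) are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

rng = np.random.default_rng(5)

def shuffle_tokens(tokens):
    """Randomly permute the sequence dimension of an (N, D) token array."""
    return tokens[rng.permutation(tokens.shape[0])]

def guided_eps(eps_normal, eps_perturbed, w=3.0):
    """CFG-style extrapolation away from the perturbed prediction:
    eps_guided = eps_perturbed + w * (eps_normal - eps_perturbed)."""
    return eps_perturbed + w * (eps_normal - eps_perturbed)

tokens = rng.standard_normal((16, 4))
perturbed = shuffle_tokens(tokens)        # input to the "weak" branch
eps_n = rng.standard_normal((16, 4))      # stand-in denoiser outputs
eps_p = rng.standard_normal((16, 4))
eps = guided_eps(eps_n, eps_p)
```

With w = 1 this reduces to the normal prediction, mirroring how CFG degenerates without guidance.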
Babak Taati (@babak_taati) 's Twitter Profile Photo

Happy to share our paper "LIFT: Latent Implicit Functions for Task- and Data-Agnostic Encoding", now accepted at #ICCV2025.

We bridge the gap between global and local representations in neural implicit functions using a unified, task- and data-agnostic framework.
🧵👇

Vida Adeli (@vida_adl) 's Twitter Profile Photo

Our paper “CARE-PD: A Multi-Site Anonymized Clinical Dataset for Parkinson’s Disease Gait Assessment” is accepted at #NeurIPS2025 🎉

Explore CARE-PD: neurips2025.care-pd.ca

KITE Research Institute | U of T Department of Computer Science | Vector Institute

#MotionAnalysis #CAREPD #AI4Health