Subham Sahoo (@ssahoo_) Twitter Tweets • TwiCopy

Subham Sahoo

@ssahoo_


PhD candidate @cornell working on Diffusion Language Models. Previously @GoogleAI, @IITKgp.

ID: 155813173

Link: https://s-sahoo.com

Joined: 15-06-2010 06:38:32

99 Tweets

182 Followers

111 Following

Jinjie Ni @ ICLR'25 🇸🇬

@nijinjie

2 months ago

🍷Imagine you are the boss of Google DeepMind. To train the best diffusion language model in the world within 1 year, using 800 TPU pods, which model size will you go for? 🐿️ We build Quokka to help you decide–the first-ever large-scale scaling law for DLMs. Interesting facts: 1.

Likes: 287, Replies: 6, Retweets: 58

Subham Sahoo

@ssahoo_

2 months ago

Happening tomorrow at 2:30pm ET / 11:30 am PT

Likes: 21, Replies: 2, Retweets: 0

Subham Sahoo

@ssahoo_

2 months ago

🎓 Officially a doctor now 😊!!! As a first-gen college kid, this moment means the world to me. Grateful beyond words to all my mentors who’ve guided me along the way — from Georg Martius who first introduced me to research back in 2017, to Volodymyr Kuleshov 🇺🇦 who sparked my love for

Likes: 1.1K, Replies: 83, Retweets: 58

Subham Sahoo

@ssahoo_

a month ago

We’re dropping “The Diffusion Duality, Chapter 2” soon! So, stay tuned 🤗

Likes: 82, Replies: 0, Retweets: 7

Justin Deschenaux

@jdeschena

a month ago

✨ Masked Generative Models (MGMs) are powerful and can generate tokens in parallel. They’ve driven impressive results across text and images and are increasingly competitive with autoregressive (AR) models. Thrilled to share our latest work to accelerate MGMs (1/12) 🧵

Likes: 34, Replies: 2, Retweets: 12

Subham Sahoo

@ssahoo_

a month ago

Funny enough, after we released MDLM last year, Sasha Rush came up with the exact same idea!

Likes: 18, Replies: 1, Retweets: 0

Subham Sahoo

@ssahoo_

a month ago

Impressive work by Justin Deschenaux! They propose replacing the encoder-only denoising transformer with an encoder-decoder architecture, which leads to faster training and inference for MDLM.

Likes: 52, Replies: 1, Retweets: 4

Subham Sahoo

@ssahoo_

a month ago

How do you even compute such probabilities?

Likes: 12, Replies: 1, Retweets: 0

Subham Sahoo

@ssahoo_

a month ago

Happy Diwali — from mine to yours ✨

Likes: 11, Replies: 0, Retweets: 0

Zachary Horvitz

@zachary_horvitz

a month ago

✨Masked Diffusion Language Models✨ are great for reasoning, but not just for the reasons you think! Fast parallel decoding? 🤔 Any-order decoding? 🤨 Plot twist: MDLMs offer A LOT MORE for inference and post-training! 🎢🧵

Likes: 162, Replies: 4, Retweets: 35

The Discrete Diffusion Reading Group

@diffusionllms

25 days ago

Drowning in the sea of Discrete Diffusion papers? 🌊 We got you. Join our Reading Group! From theory → empirics, and language → molecules — we’ll decode the chaos together 💫 Join the cult—uh, I mean community 😇 👉 Google Group: groups.google.com/g/diffusion-ll… (1 / 2)

Likes: 21, Replies: 1, Retweets: 7

Subham Sahoo

@ssahoo_

25 days ago

Overwhelmed by the number of Diffusion LLM papers? 🌊 Same here 😭 So I’m starting a Discrete Diffusion Reading Group with my favorite disciples Justin Deschenaux and Zhihan Yang ✨ We’ll cover everything—from theory to empirics, from language to molecules. Join

Likes: 317, Replies: 20, Retweets: 40

Subham Sahoo

@ssahoo_

22 days ago

We’re building a space that connects researchers, students, and practitioners working on discrete diffusion. Join the Discord — collaborate, learn, and share! Whether you’re 💼hiring or showcasing your work, this is the place 👇 Discord: discord.gg/JxSCwpNb

Likes: 104, Replies: 0, Retweets: 8

Subham Sahoo

@ssahoo_

21 days ago

The term AGI gives me the same ick that “AI” did back in 2015. If it takes hundreds of billions of tokens just to get a respectable score on grade school math (GSM8K), that says everything about where we actually are.

Likes: 10, Replies: 0, Retweets: 0