Jiaxin Shi (@thjashin)'s Twitter Profile
Jiaxin Shi

@thjashin

Research Scientist @GoogleDeepMind | prev @Stanford @MSRNE @VectorInst @RIKEN_AIP_EN @Tsinghua_Uni. Building probabilistic & algorithmic models for learning.

ID: 702089336842375169

Link: http://jiaxins.io
Joined: 23-02-2016 11:15:16

575 Tweets

3.3K Followers

345 Following


Autoregressive models are too restrictive by forcing a fixed generation order, while masked diffusion is wasteful as it fits all possible orders. Can our model dynamically decide the next position to generate based on context? Learn more in our ICML paper arxiv.org/abs/2503.05979
