Kastan Day
@kastanday
Research SWE @NCSAatIllinois. prev: @IBM_Research, 3x NASA SWE intern. MS in CS from UIUC, Swarthmore, Phillips Academy. Plz donate to charity.
ID: 1262219683
http://kastan.ai 12-03-2013 15:17:46
1.1K Tweets
364 Followers
987 Following
This is interesting as a first large diffusion-based LLM. Most of the LLMs you've been seeing are ~clones as far as the core modeling approach goes. They're all trained "autoregressively", i.e. predicting tokens from left to right. Diffusion is different - it doesn't go left to