Jim Fan (@drjimfan)'s Twitter Profile
Jim Fan

@drjimfan

NVIDIA Sr. Research Manager. Co-Lead of GR00T (Humanoid Robotics) & GEAR Lab. Solving Physical AI, one motor at a time. Stanford Ph.D. OpenAI's 1st intern.

ID: 1007413134

Link: https://jimfan.me
Joined: 12-12-2012 22:11:27

3,3K Tweets

302,302K Followers

3,3K Following



DALL-E generates pixels from text. Now meet its cousin, VALL-E, that generates audio from text @MSFTResearch!

VALL-E’s resemblance to DALL-E v1 and Parti @GoogleAI is striking. Image and audio are both continuous signals, but they can be quantized into discrete tokens.

1/🧵
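
To illustrate the point about turning a continuous signal into discrete tokens, here is a minimal sketch of uniform scalar quantization. It is an assumption made for illustration only, not the method used by VALL-E, DALL-E, or Parti, which rely on learned codecs or tokenizers. The names `quantize` and `dequantize`, the 256-level vocabulary, and the sine-wave test signal are all hypothetical.

```python
import math

def quantize(samples, levels=256):
    """Map floats in [-1.0, 1.0] to integer tokens in [0, levels - 1]."""
    tokens = []
    for x in samples:
        x = max(-1.0, min(1.0, x))  # clip out-of-range samples
        tokens.append(round((x + 1.0) / 2.0 * (levels - 1)))
    return tokens

def dequantize(tokens, levels=256):
    """Map integer tokens back to approximate floats in [-1.0, 1.0]."""
    return [t / (levels - 1) * 2.0 - 1.0 for t in tokens]

# A short 440 Hz sine wave sampled at 16 kHz stands in for an audio signal
# (hypothetical test data, not taken from any real model or dataset).
signal = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(64)]
tokens = quantize(signal)        # discrete token sequence
restored = dequantize(tokens)    # lossy reconstruction of the signal
max_error = max(abs(a - b) for a, b in zip(signal, restored))
```

Each token is an integer from a fixed vocabulary, so a sequence model can treat the signal like text. The reconstruction error is bounded by half the quantization step, which here is 1/255.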