John Thickstun (@jwthickstun)'s Twitter Profile
John Thickstun

@jwthickstun

Assistant Professor @Cornell_CS.

Previously @StanfordCRFM @stanfordnlp @uwcse

Controllable Generative Models. AI for Music.

ID: 1232444191889711104

Link: https://johnthickstun.com/
Joined: 25-02-2020 23:16:19

287 Tweets

1.1K Followers

602 Following

Volodymyr Kuleshov 🇺🇦 (@volokuleshov)'s Twitter Profile Photo

If you’re at #iclr2025, you should catch Cornell PhD student Yair Schiff—check out his new paper that derives classifier-based and classifier-free guidance for discrete diffusion models.
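The paper's result is a guidance derivation for discrete diffusion specifically, but the general flavor of classifier-free guidance is easy to sketch. A toy illustration over categorical logits (the function name and mixing rule below are the standard CFG recipe, not the paper's derivation):

```python
import numpy as np

def cfg_logits(cond_logits, uncond_logits, w):
    """Classifier-free guidance on categorical logits: push the
    distribution toward (w > 1) or away from (w < 1) the condition."""
    return uncond_logits + w * (cond_logits - uncond_logits)

# Toy example over a 4-token vocabulary.
uncond = np.log(np.array([0.25, 0.25, 0.25, 0.25]))
cond = np.log(np.array([0.10, 0.60, 0.20, 0.10]))
guided = cfg_logits(cond, uncond, w=2.0)
probs = np.exp(guided) / np.exp(guided).sum()
```

With w=2 the conditional model's preference for token 1 is amplified; w=1 recovers the conditional logits exactly.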
Rose (@rose_e_wang)'s Twitter Profile Photo

I defended my PhD from Stanford CS Stanford NLP Group 🌲 w/ Stanford CS's first all-female committee!! My dissertation focused on AI methods, evaluations & interventions to improve Education.

So much gratitude for the support & love - and SO excited for the next chapter!!!! 🥳
Wenting Zhao (@wzhao_nlp)'s Twitter Profile Photo

Excited to announce our workshop on Visions of Language Modeling at COLM'25! 🔥

We thought that current LM research overly focuses on a narrow set of popular topics (e.g., test-time scaling and LLM agents), and we'd love to bring some entropy back 💪 To do this, we invited a
Wenting Zhao (@wzhao_nlp)'s Twitter Profile Photo

Some personal news: I'll join UMass Amherst CS as an assistant professor in fall 2026. Until then, I'll postdoc at Meta NYC. Reasoning will continue to be my main interest, with a focus on data-centric approaches 🤩 If you're also interested, apply to work with me (PhDs & a postdoc)!

Oliver Li (@oliver54244160)'s Twitter Profile Photo

🤯 GPT-4o knows H&M left Russia in 2022 but still recommends shopping at H&M in Moscow.

🤔 LLMs store conflicting facts from different times, leading to inconsistent responses. We dig into how to better update LLMs with fresh facts that contradict their prior knowledge.

🧵 1/6
Percy Liang (@percyliang)'s Twitter Profile Photo

What would truly open-source AI look like? Not just open weights, open code/data, but *open development*, where the entire research and development process is public *and* anyone can contribute. We built Marin, an open lab, to fulfill this vision:
Rishi Jha (@rishi_d_jha)'s Twitter Profile Photo

I’m stoked to share our new paper: “Harnessing the Universal Geometry of Embeddings” with jack morris, Collin Zhang, and Vitaly Shmatikov.

We present the first method to translate text embeddings across different spaces without any paired data or encoders.

Here's why we're excited: 🧵👇🏾
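For context on why "without any paired data" is the notable part: the classical way to translate between two embedding spaces, orthogonal Procrustes, needs paired examples of the same texts in both spaces. A sketch of that paired baseline (all names and the synthetic rotation below are illustrative; the paper's contribution is doing this *without* the pairs):

```python
import numpy as np

def procrustes_map(X, Y):
    """Orthogonal map W minimizing ||X @ W - Y||_F.
    Requires *paired* rows: row i of X and row i of Y must
    embed the same text -- the assumption the paper removes."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 16))                   # embeddings in space A
R = np.linalg.qr(rng.normal(size=(16, 16)))[0]   # hidden ground-truth rotation
Y = X @ R                                        # same texts embedded in space B
W = procrustes_map(X, Y)                         # recovers R from the pairs
```

Here `X @ W` matches `Y` because the spaces differ by an exact rotation; real embedding spaces differ non-orthogonally, which is part of what makes the unpaired setting hard.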
Dasaem Jeong (@dasaemj)'s Twitter Profile Photo

🎶 Now a neural network can read a scanned score image and generate performance audio end-to-end 😎 I'm super excited to introduce our work on Unified Cross-modal Translation between Score Image, Symbolic Music, and Audio. Why does it matter, and how did we make it? Check the thread 🧵

Andrew Ng (@andrewyng)'s Twitter Profile Photo

I am alarmed by the proposed cuts to U.S. funding for basic research, and the impact this would have for U.S. competitiveness in AI and other areas. Funding research that is openly shared benefits the whole world, but the nation it benefits most is the one where the research is

Tanishq Mathew Abraham, Ph.D. (@iscienceluvr)'s Twitter Profile Photo

Esoteric Language Models

"In this work, we introduce Eso-LMs, a new family of models that fuses AR and MDM paradigms, enabling smooth interpolation between their perplexities while overcoming their respective limitations."

"our method achieves up to **65x** faster inference
Subham Sahoo (@ssahoo_)'s Twitter Profile Photo

🚨 [New paper alert] Esoteric Language Models (Eso-LMs)

First Diffusion LM to support KV caching w/o compromising parallel generation.

🔥 Sets new SOTA on the sampling speed–quality Pareto frontier 🔥
🚀 65× faster than MDLM
⚡ 4× faster than Block Diffusion

📜 Paper:
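KV caching, the optimization this thread highlights, amortizes attention by storing each token's key/value projections instead of recomputing them at every decoding step. A minimal single-head numpy sketch (shapes and names are illustrative; this is the generic autoregressive cache, not the paper's diffusion variant):

```python
import numpy as np

def attend(q, K, V):
    """Single-head scaled dot-product attention for one query vector."""
    scores = K @ q / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ V

rng = np.random.default_rng(0)
d = 8
Wq = rng.normal(size=(d, d))
Wk = rng.normal(size=(d, d))
Wv = rng.normal(size=(d, d))

# Incremental decoding: project only the newest token, append its
# key/value to the cache, and attend over everything cached so far.
K_cache, V_cache = [], []
for step in range(5):
    x = rng.normal(size=d)      # hidden state of the new token
    K_cache.append(x @ Wk)      # cached: never recomputed on later steps
    V_cache.append(x @ Wv)
    out = attend(x @ Wq, np.array(K_cache), np.array(V_cache))
```

The win is that step t does O(t) cached lookups instead of reprojecting all t tokens; making this compatible with parallel masked-diffusion generation is the part Eso-LMs claim as new.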
jack morris (@jxmnop)'s Twitter Profile Photo

new paper from our work at Meta!

**GPT-style language models memorize 3.6 bits per param**

we compute capacity by measuring total bits memorized, using some theory from Shannon (1953)

shockingly, the memorization-datasize curves look like this:

    ___________
   /
  /

(🧵)
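The headline number implies a quick back-of-envelope capacity calculation. A sketch (the 3.6 bits/param constant is the tweet's figure; the 124M-parameter model size is just an example):

```python
BITS_PER_PARAM = 3.6  # capacity estimate quoted in the thread

def capacity_bytes(n_params, bits_per_param=BITS_PER_PARAM):
    """Back-of-envelope memorization capacity in bytes."""
    return n_params * bits_per_param / 8

# e.g. a 124M-parameter GPT-style model:
print(f"{capacity_bytes(124_000_000) / 1e6:.1f} MB")  # prints "55.8 MB"
```

The plateau in the sketched curve is this capacity: once the training set exceeds it, per-example memorization must fall.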
Zhihan Yang (@zhihanyangzy)'s Twitter Profile Photo

📢 Thrilled to share our new paper: Esoteric Language Models (Eso-LMs)

> 🔀 Fuses autoregressive (AR) and masked diffusion (MDM) paradigms
> 🚀 First to unlock KV caching for MDMs (65x speedup!)
> 🥇 Sets new SOTA on generation speed-vs-quality Pareto frontier

How? Dive in 👇
LLM360 (@llm360)'s Twitter Profile Photo

KV caching is great, but will it work for diffusion language models? Zhihan Yang and team showed how to make it work with a 65x speedup 🚀! Check out the new preprint: arxiv.org/abs/2506.01928 The LLM360 team is very interested in exploring new architectures.

TuringPost (@theturingpost)'s Twitter Profile Photo

.@NVIDIA never stops surprising

Together with Cornell University they presented Eso-LMs (Esoteric Language Models) — a new kind of LM that combines the best parts of autoregressive (AR) and diffusion models.

• It’s the first diffusion-based model that supports full KV caching.
• At the
Kevin Ellis (@ellisk_kellis)'s Twitter Profile Photo

New paper: World models + Program synthesis, by Wasu Top Piriyakulkij

1. World modeling on-the-fly by synthesizing programs w/ 4000+ lines of code
2. Learns new environments from minutes of experience
3. Positive score on Montezuma's Revenge
4. Compositional generalization to new environments

hardmaru (@hardmaru)'s Twitter Profile Photo

I agree with Jensen. If you want AI development to be done safely and responsibly, you do it in the open. Don’t do it in a dark room and tell me it’s “safe”.

Article archive:
archive.md/CC5VZ
Chris Donahue (@chrisdonahuey)'s Twitter Profile Photo

Excited to announce 🎵 Magenta RealTime, the first open-weights music generation model capable of real-time audio generation with real-time control.

👋 **Try Magenta RT on Colab TPUs**: colab.research.google.com/github/magenta…
👀 Blog post: g.co/magenta/rt
🧵 below