Zeyuan Allen-Zhu, Sc.D. (@zeyuanallenzhu)'s Twitter Profile
Zeyuan Allen-Zhu, Sc.D.

@zeyuanallenzhu

physics of language models @ Meta (FAIR, not GenAI)
🎓:Tsinghua Physics — MIT — Princeton/IAS
🏆:IOI — ACM-ICPC — USACO — Codejam — math MCM

ID: 136335720

Link: http://zeyuan.allen-zhu.com
Joined: 23-04-2010 16:59:01

317 Tweets

15.15K Followers

338 Following

Kamalika Chaudhuri (@kamalikac)

Papers I talked about: (1) One-model deja-vu memorization: arxiv.org/abs/2504.05651 (2) AgentDAM "data minimization" benchmark: arxiv.org/abs/2503.09780

Zeyuan Allen-Zhu, Sc.D. (@zeyuanallenzhu)

(9/8) People suggested I study Primer (arxiv.org/abs/2109.08668). Their multi-dconv-head attention is what I call Canon-B (no-res)—and we found issues with it. Yet, Primer is underrated with just 180 citations. They found meaningful signals from noisy real-life exp that I couldn't

Zeyuan Allen-Zhu, Sc.D. (@zeyuanallenzhu)

This person seems stressed and is spreading false rumors on our project. To clarify: this PDF is from our peer-reviewed spotlight paper accepted at ICLR 2025. We have 4 papers accepted at ICLR'25 (Parts 2.1, 2.2, 3.2, 3.3). I suggest you find healthier outlets to cope with stress

Zeyuan Allen-Zhu, Sc.D. (@zeyuanallenzhu)

Please stop spreading false rumors. This full arxiv paper underwent peer review. After 30 minutes of discussion, you’ve made no effort to verify the truth or retract the false claim despite my repeated requests. If you retract, I treat this as a misunderstanding, but you haven’t.

Zeyuan Allen-Zhu, Sc.D. (@zeyuanallenzhu)

I've wasted too much energy on X, naively thinking any of it mattered. Now I'm truly disillusioned—but finally awake. I'm shedding distractions, returning fully to research and meaningful work. No more replies, only occasional updates. Thanks to the few who truly supported me.

Zeyuan Allen-Zhu, Sc.D. (@zeyuanallenzhu)

No matter how AI evolves overnight—tech, career, how it may impact me—I remain committed to using "physics of language models" approach to predict next-gen AI. Due to my limited GPU access at Meta, Part 4.1 (+new 4.2) are still in progress, but results on Canon layers are shining

Zeyuan Allen-Zhu, Sc.D. (@zeyuanallenzhu)

Facebook AI Research (FAIR) is a small, prestigious lab in Meta. We don't train large models like GenAI or MSL, so it's natural that we have limited GPUs. GenAI or MSL's success or failure, past or future, doesn't reflect the work of FAIR. It is important to make this distinction
