leloy! (@leloykun)'s Twitter Profile
leloy!

@leloykun

Math @ AdMU • NanoGPT speedrunner • Muon fan 🤍 • soon RE @ ███ • prev ML @ XPD • 2x IOI & 2x ICPC • admonymous.co/leloy

ID: 1059904581604270080

Link: https://leloykun.github.io/
Joined: 06-11-2018 20:25:19

7.7K Tweets

4.4K Followers

3.3K Following

Laker Newhouse (@lakernewhouse)

[1/9] We created a performant Lipschitz transformer by spectrally regulating the weights—without using activation stability tricks: no layer norm, QK norm, or logit softcapping. We think this may address a “root cause” of unstable training.

Thomas Fel (@napoolar)

Great excuse to share something I really love: 1-Lipschitz nets. They give clean theory, certs for robustness, the right loss for W-GANs, even nicer grads for explainability!! Yet are still niche. Here’s a speed-run through some of my favorite papers on the field. 🧵👇

Laker Newhouse (@lakernewhouse)

[1/6] Curious about Muon, but not sure where to start? I wrote a 3-part blog series called “Understanding Muon” designed to get you up to speed—with The Matrix references, annotated source code, and thoughts on where Muon might be going.

leloy! (@leloykun)

I have so many interests I find it hard to focus on any of them.

I wanna study algebraic topology, category theory, optimization on Finsler manifolds.

But also, I wanna build. I can build the entire AI infra of an AI SaaS, even the UI. I've done it before.

Yet here I am,