leloy! (@leloykun) Twitter Tweets • TwiCopy

leloy!

@leloykun

Math @ AdMU • NanoGPT speedrunner • Muon fan 🤍 • soon RE @ ███ • prev ML @ XPD • 2x IOI & 2x ICPC • admonymous.co/leloy

ID: 1059904581604270080

Link: https://leloykun.github.io/ · Joined: 06-11-2018 20:25:19

7.7K Tweets

4.4K Followers

3.3K Following

Laker Newhouse

@lakernewhouse

6 months ago

[1/9] We created a performant Lipschitz transformer by spectrally regulating the weights—without using activation stability tricks: no layer norm, QK norm, or logit softcapping. We think this may address a “root cause” of unstable training.

562 likes · 13 replies · 77 reposts

Thomas Fel

@napoolar

6 months ago

Great excuse to share something I really love: 1-Lipschitz nets. They give clean theory, certs for robustness, the right loss for W-GANs, even nicer grads for explainability!! Yet are still niche. Here’s a speed-run through some of my favorite papers on the field. 🧵👇

432 likes · 5 replies · 53 reposts

Laker Newhouse

@lakernewhouse

6 months ago

[1/6] Curious about Muon, but not sure where to start? I wrote a 3-part blog series called “Understanding Muon” designed to get you up to speed—with The Matrix references, annotated source code, and thoughts on where Muon might be going.

314 likes · 7 replies · 39 reposts

leloy!

@leloykun

5 months ago

I have so many interests I find it hard to focus on any of them. I wanna study algebraic topology, category theory, optimization on Finsler manifolds. But also, I wanna build. I can build the entire AI infra of an AI SaaS, even the UI. I've done it before. Yet here I am,

569 likes · 30 replies · 21 reposts