Larry Dial
@classiclarryd
Data Engineer at AWS & AI research.
ID: 1791834924317560832
http://larrydial.com 18-05-2024 14:15:22
6 Tweets
117 Followers
19 Following
Excited to share an (unofficial till merged) collaborative WR of 159s on modded-nanogpt! github.com/KellerJordan/m… The 180s->159s improvement includes align bos, triton, sparse attn gate, FA3, drop MLP, & dynamic YaRN. Keller Jordan, love how the speedrun format enables immediate feedback on ideas!