
Lars Ankile
@larsankile
Doing research in ML at MIT.
ID: 1025927936
https://ankile.com
21-12-2012 08:11:50
82 Tweets
292 Followers
507 Following

Remember the llm.c repro of the GPT-2 (124M) training run? It took 45 min on 8xH100. Since then, Keller Jordan (and by now many others) have iterated on that extensively in the new modded-nanogpt repo that achieves the same result, now in only 5 min! Love this repo 👏 600 LOC








HOT 🔥 fastest, most precise, and most capable hand control setup ever... Less than $450 and fully open-source 🤯 by Hugging Face, Rob Knight, Martino Russi. This tendon-driven technology will disrupt robotics! Retweet to accelerate its democratization 🚀 A thread 🧵


HNY! Lately I took a crack at implementing the pi0 model from Physical Intelligence:
PaliGemma VLM (2.3B fine-tuned) + 0.3B "action expert" MoE + block attention
Flow matching w/ action chunking
Strong eval on Simpler w/ 75ms inference
github.com/allenzren/open… ckpts available! 👇(1/6)



Attending #ICLR2025 next week! I will be presenting Diffusion Policy Policy Optimization (DPPO) at the Friday morning poster session with Lars Ankile (diffusion-ppo.github.io). I also joined Physical Intelligence lately. Love to chat about what we've been up to at Pi!

