Guanyu Zhou
@tmartyr4951
Working with AIGC, exploring the next generation of intelligent systems
ID: 1756289864981741568
https://the-martyr.github.io 10-02-2024 12:12:03
9 Tweet
8 Followers
93 Following
After reading Hayden Prairie's paper, I became curious about what loop models really cost to train from the gradient and backward-pass perspective, in both FLOPs and memory. So what does it actually cost to train one? I wrote a blog post that analyzes loop model training