
Peter Humphreys
@p_humphreys
AI, quantum and neuroscience. Scientist @ DeepMind
ID: 1244596843
05-03-2013 20:45:57
10 Tweets
63 Followers
33 Following


Big fan of the work of Adam Santoro and others. Glad Google decided to finally release it! arxiv.org/abs/2404.02258


I have implemented Mixture-of-Depths, and it shows a significant memory reduction during training and a 10% speed increase. I will verify whether it achieves the same quality with 12.5% active tokens. github.com/thepowerfuldee… Thanks to Alex Hägele for the initial code.

