Alexandre Ramé (@ramealexandre)'s Twitter Profile
Alexandre Ramé

@ramealexandre

Research scientist @GoogleDeepMind. Previously PhD @Sorbonne_Univ_.

Post-training Gemma LLMs: distillation, RL and merging.

ID: 300445195

Website: https://alexrame.github.io/

Joined: 17-05-2011 19:37:18

662 Tweets

1.1K Followers

731 Following

Weight averaging strategies are super useful in deep learning, and succeed despite the non-linearities in networks' architectures. Our two works presented at #NeurIPS2022 analyze how they can help with out-of-distribution classification in computer vision! (1/4)