Nicolas Zucchet
@nicolaszucchet
PhD student in NeuroAI @CSatETH | prev. @Polytechnique
ID: 936865625233817600
https://nicolaszucchet.github.io 02-12-2017 07:52:25
100 Tweets
317 Followers
274 Following
S4, Mamba, and Hawk/Griffin are great – but do we really understand how they work? We fully characterize the power of gated (selective) SSMs mathematically using powerful tools from Rough Path Theory. All thanks to our math magician Nicola Muça Cirone arxiv.org/pdf/2402.19047… 🧵
🚨🚨New ICML 2024 Paper: arxiv.org/abs/2402.05787 How do Transformers perform In-Context Autoregressive Learning? We investigate how causal Transformers learn simple autoregressive processes of order 1. With Raja Giryes 💔, Taiji Suzuki, Mathieu Blondel and Gabriel Peyré 🙏