Mahan Fathi
@mahanfathi
research @nvidia👁️; ex @googledeepmind, @google🧠 & @mila_quebec.
ID: 321884742
http://mahanfathi.github.io/ 22-06-2011 08:52:28
65 Tweets
666 Followers
130 Following
Last week, I gave a talk at Mila - Institut québécois d'IA. The talk should be of interest to anyone working on predictive models, particularly in latent space. In collab. with Mahan Fathi, Clement Gehring, Jonathan Pilault, David Kanaa, and Pierre-Luc Bacon. See you at ICLR 2026 in 🇦🇹! drive.google.com/file/d/1mQSXFa…
Introducing our new paper explaining in-context learning through the lens of Occam’s razor, giving a normative account of next-token prediction objectives. This was with Tom Marty, Tejas Kasetty, Léo Gagnon, Sarthak Mittal, Mahan Fathi, Dhanya Sridhar, and Guillaume Lajoie. arxiv.org/abs/2410.14086
The talk I gave @ Mila on learning linearized representations of dynamical systems (Koopman representations) is on YouTube. The work was mainly carried out by Mahan Fathi in collaboration with Pierre-Luc Bacon's lab, and was presented at ICLR 2024. youtube.com/watch?v=wKyN5j…
🚀 Nemotron 3 Nano 30B-A3B is here! Open weights + open data + open source. AA Intelligence Index: 52 (Artificial Analysis) ✅ 1M‑token context ✅ up to 3.3× higher throughput vs similarly sized open models ✅ stronger reasoning/agentic + chat Details + links in the thread 🧵
Nemotron 3 Super is here: 120B total / 12B active, Hybrid SSM Latent MoE, designed for Blackwell. Truly open: permissive license, open data, open training infra. See the analysis on Artificial Analysis. Details in the thread 🧵 below: