Alireza Mousavi @ ICLR 2025 (@alirezamh_)'s Twitter Profile
Alireza Mousavi @ ICLR 2025

@alirezamh_

CS PhD student @UofT and @VectorInst. Interested in deep learning theory.

ID: 1844793221240483848

Link: https://mousavih.github.io · Joined: 11-10-2024 17:32:44

28 Tweets

215 Followers

183 Following

MTL MLOpt (@mtl_mlopt)'s Twitter Profile Photo

Join us on Wednesday, November 13th, at 12:30 PM EDT for a talk by Alireza Mousavi (UofT) on "Learning and Optimization with Mean-Field Langevin Dynamics" at Mila - Institut québécois d'IA in Montreal
Alireza Mousavi @ ICLR 2025 (@alirezamh_)'s Twitter Profile Photo

So can someone ask o3 to prove that with high probability over initialization, gradient descent on ResNet and CIFAR10 converges to 0 loss?

Vector Institute (@vectorinst)'s Twitter Profile Photo

Congratulations to Vector-affiliated researchers Alireza Mousavi-Hosseini and Mohammed Adnan, who were named RBC Borealis Fellows. The RBC Borealis Fellowships Program represents excellence in Canadian AI research and innovation. We’re proud to see our affiliated researchers

Eshaan Nichani (@eshaannichani)'s Twitter Profile Photo

Excited to announce a new paper with Yunwei Ren, Denny Wu, Jason Lee! We prove a neural scaling law in the SGD learning of extensive width two-layer neural networks. arxiv.org/abs/2504.19983 🧵below (1/10)
Jason Lee (@jasondeanlee)'s Twitter Profile Photo

New work arxiv.org/abs/2506.05500 on learning multi-index models with Alex Damian and Joan Bruna. Multi-index models are of the form y = g(Ux), where U is an r × d matrix mapping from d dimensions to r dimensions, with d >> r, and g is an arbitrary function. Examples of multi-index models include any neural net
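As a quick illustration (my own sketch, not code from the paper): the defining property of a multi-index model y = g(Ux) is that the label depends on the d-dimensional input x only through its r-dimensional projection Ux. The dimensions, link function g, and random U below are arbitrary choices for demonstration:

```python
import numpy as np

rng = np.random.default_rng(0)

d, r = 100, 3  # ambient dimension d >> index dimension r
U = rng.standard_normal((r, d)) / np.sqrt(d)  # r x d projection matrix

def g(z):
    # an arbitrary link function on the r-dimensional projection
    return np.tanh(z[0]) + z[1] * z[2]

def multi_index(x):
    # y = g(Ux): the label depends on x only through Ux
    return g(U @ x)

x = rng.standard_normal(d)
y = multi_index(x)
```

Perturbing x along any direction orthogonal to the rows of U leaves the output unchanged, which is what makes the r-dimensional subspace the object to be learned.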

masani (@mohammadhamani)'s Twitter Profile Photo

Why does RL struggle with tasks requiring long reasoning chains? Because “bumping into” a correct solution becomes exponentially less likely as the number of reasoning steps grows. We propose an adaptive backtracking algorithm: AdaBack. 1/n
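A back-of-the-envelope version of the "exponentially less likely" claim (my illustration, not from the paper): if each reasoning step is sampled correctly with some probability p < 1, independently across steps, then the chance of completing an n-step chain end-to-end is p^n, which decays exponentially in n:

```python
def chain_success_prob(p: float, n: int) -> float:
    # Probability of sampling a fully correct n-step chain when each
    # step is independently correct with probability p.
    return p ** n

for n in (1, 5, 10, 20):
    print(f"n={n:2d}  P(success)={chain_success_prob(0.9, n):.4f}")
```

Even at 90% per-step accuracy, a 20-step chain succeeds only about 12% of the time, so sparse end-of-chain rewards become very hard to reach by chance as n grows.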

Bruno Mlodozeniec (@kayembruno)'s Twitter Profile Photo

NeurIPS Conference, why take the option to provide figures in the rebuttals away from the authors during the rebuttal period? Grounding the discussion in hard evidential data (like plots) makes resolving disagreements much easier for both the authors and the reviewers. Left: NeurIPS