Volkan Cevher (@cevherlions)'s Twitter Profile
Volkan Cevher

@cevherlions

Associate Professor of Electrical Engineering, EPFL.
Amazon Scholar (AGI Foundations). IEEE Fellow. ELLIS Fellow.

ID: 1062706908

Link: http://lions.epfl.ch/
Joined: 05-01-2013 10:46:44

1.1K Tweets

3.3K Followers

625 Following

Caglar Gulcehre (@caglarml)

Volkan Cevher and I have spent a lot of time preparing these course materials on the "foundations of training LLMs." Now, we are excited to share them with the broader community. These lectures touch on both theoretical (like MuP) and empirical aspects of training LLMs.

Luca Viano (@lucaviano4)

1/n If you are developing a new IL algorithm that alternates between reward and SAC updates, read this new trick named SOAR! arxiv.org/abs/2502.19859 It has guarantees in tabular environments and halves the training time in MuJoCo ;) ICML work with Stefano and Volkan Cevher

Francesco Orabona (@bremen79)

I have an opening for a post-doc position: I am looking for smart people with a strong CV in optimization and/or online learning. All my ex post-docs (Kwang-Sung (Kwang) Jun, Mingrui Liu, and El Mehdi SAAD) became assistant professors, and I'd like to continue this trend 😉 Please share it!

You Jiacheng (@youjiacheng)

If you cite Muon, I think you should definitely cite SSD (proceedings.mlr.press/v38/carlson15.…) by Volkan Cevher et al. (sorry I can't find the handle of other authors) -- which proposed spectral descent.

Grigoris Chrysos (@grigoris_c)

🚨 Panel on "how are theoretical tools useful in vision?" with an amazing list of panelists: Volkan Cevher, Olga Russakovsky, Rene Vidal. Open to your questions, the more ambitious the better. At #CVPR2025: Room 107 A at 12 🎸.

Cohere Labs (@cohere_labs)

Join our ML Theory group next week as they welcome Tony S.F. on July 3rd for a presentation on "Training neural networks at any scale". Thanks to Andrej Jovanović, Anier Velasco Sotomayor, and Thang Chu for organizing this session 👏 Learn more: cohere.com/events/Cohere-…

Volkan Cevher (@cevherlions)

Excited to give a tutorial with Leena C Vankadara on Training Neural Networks at Any Scale (TRAINS) at the ICML Conference at 13:30 (West Ballroom A). Our slides can be found here: go.epfl.ch/ICML25TRAINS Please join us.

Fanghui Liu (@fanghui_sgra)

I will give the presentation today at 4pm at the #ICML2025 Oral session: Learning dynamics 2 @ West Ballroom B! Here are the poster and long-version slides (lfhsgre.org/files/talk_LoR…) if you’re interested.

Tony S.F. (@tonysilveti)

Marguerite Frank describing her memory of inventing the Frank-Wolfe/Conditional Gradient algorithm together with Philip Wolfe. It seems she was the one to come up with the idea of the linear minimization oracle, with Wolfe contributing the convergence proof and presentation.

Arshia Afzal (@rshia_afz)

Check out our 📚 Paper: arxiv.org/abs/2502.16249 🌐 Blogpost: arshiaafzal.github.io/blog/ 𝕏 Thread: x.com/CevherLIONS/st… Finally, huge thanks to Leyla Naz Candogan, Elias Abad Rocamora, Pol Puigdemont & Volkan Cevher, this wouldn’t have been possible without their support and help!

Peyman Milanfar (@docmilanfar)

To establish power law behavior we need statistical tests. This paper is a nice overview of statistical methods for testing power laws "Power-Law Distributions in Empirical Data" by A Clauset, CR Shalizi, & MEJ Newman SIAM Review, 51(4), 661–703 arxiv.org/abs/0706.1062 4/5
