Volkan Cevher (@cevherlions) Twitter Tweets • TwiCopy

Caglar Gulcehre

Volkan Cevher and I have spent a lot of time preparing these course materials on the "foundations of training LLMs." Now, we are excited to share them with the broader community. These lectures touch on both theoretical (like MuP) and empirical aspects of training LLMs.

33 likes · 0 replies · 4 reposts

Luca Viano

@lucaviano4

7 months ago

This will be presented at ICML!

14 likes · 0 replies · 2 reposts

Luca Viano

@lucaviano4

7 months ago

1/n If you are developing a new IL algorithm that alternates between reward and SAC updates, read about this new trick named SOAR! arxiv.org/abs/2502.19859 It has guarantees in tabular environments and halves the training time in MuJoCo ;) ICML work with Stefano and Volkan Cevher

14 likes · 1 reply · 3 reposts

Francesco Orabona

@bremen79

6 months ago

I have an opening for a post-doc position: I am looking for smart people with a strong CV in optimization and/or online learning. All my ex post-docs (Kwang-Sung (Kwang) Jun, Mingrui Liu, and El Mehdi SAAD) became assistant professors, and I'd like to continue this trend 😉 Please share it!

102 likes · 1 reply · 39 reposts

Tony S.F.

@tonysilveti

6 months ago

A short and sweet proof of convergence of steepest descent w.r.t. an arbitrary norm in the nonconvex (but smooth) setting.

30 likes · 1 reply · 6 reposts

Curiosity

@mastronomers

6 months ago

The Sun photographed for more than a year from the same spot at the same time ♾

27.27K likes · 390 replies · 3.3K reposts

Luca Viano

@lucaviano4

6 months ago

Finally, we have expert sample complexity bounds in multi agent imitation learning! arxiv.org/pdf/2505.17610 Joint work with Till Freihaut, Volkan Cevher, Matthieu and Giorgia Ramponi

32 likes · 2 replies · 4 reposts

You Jiacheng

@youjiacheng

5 months ago

If you cite Muon, I think you should definitely cite SSD (proceedings.mlr.press/v38/carlson15.…) by Volkan Cevher et al. (sorry I can't find the handle of other authors) -- which proposed spectral descent.

153 likes · 1 reply · 18 reposts

Arthur Mensch

@arthurmensch

5 months ago

We purposely made it great at optimal transport, as you may have guessed!

122 likes · 1 reply · 9 reposts

Grigoris Chrysos

@grigoris_c

5 months ago

🚨 Panel on "how are theoretical tools useful in vision?" with an amazing list of panelists: Volkan Cevher, Olga Russakovsky, and Rene Vidal. Open to your questions, the more ambitious the better. At #CVPR2025: Room 107 A at 12 🎸.

5 likes · 0 replies · 2 reposts

Cohere Labs

@cohere_labs

5 months ago

Join our ML Theory group next week as they welcome Tony S.F. on July 3rd for a presentation on "Training neural networks at any scale"

Thanks to Andrej Jovanović, Anier Velasco Sotomayor, and Thang Chu for organizing this session 👏

Learn more: cohere.com/events/Cohere-…

50 likes · 4 replies · 13 reposts

rohan anil

@_arohan_

4 months ago

Actually it's even older! Spectral stochastic gradient descent from 2015!

25 likes · 1 reply · 1 repost

Volkan Cevher

@cevherlions

4 months ago

Excited to give a tutorial with Leena C Vankadara on Training Neural Networks at Any Scale (TRAINS) at the ICML Conference at 13:30 (West Ballroom A). Our slides can be found here: go.epfl.ch/ICML25TRAINS Please join us.

84 likes · 3 replies · 12 reposts

Fanghui Liu

@fanghui_sgra

4 months ago

I will give the presentation today at 4pm at the #ICML2025 Oral session: Learning dynamics 2 @ West Ballroom B! Here are the poster and long-version slides (lfhsgre.org/files/talk_LoR…) if you're interested.

12 likes · 0 replies · 2 reposts

Tony S.F.

@tonysilveti

3 months ago

Marguerite Frank describing her memory of inventing the Frank-Wolfe/Conditional Gradient algorithm together with Philip Wolfe. It seems she was the one to come up with the idea of the linear minimization oracle, with Wolfe contributing the convergence proof and presentation.

13 likes · 1 reply · 1 repost

Giorgia Ramponi

@gio_ramponi

2 months ago

- Learning Equilibria from Data: Provably Efficient Multi-Agent Imitation Learning, Luca Viano, Till Freihaut, Matthieu Geist, Volkan Cevher
- On Feasible Rewards in Multi-Agent Inverse Reinforcement Learning, Till Freihaut

4 likes · 0 replies · 1 repost

Volkan Cevher

@cevherlions

2 months ago

Highly recommended!

2 likes · 0 replies · 0 reposts

Arshia Afzal

@rshia_afz

2 months ago

Check out our

📚 Paper: arxiv.org/abs/2502.16249
🌐 Blogpost: arshiaafzal.github.io/blog/
𝕏 Thread: x.com/CevherLIONS/st…

Finally, huge thanks to Leyla Naz Candogan, Elias Abad Rocamora, Pol Puigdemont & Volkan Cevher; this wouldn't have been possible without their support and help!

1 like · 0 replies · 1 repost

Peyman Milanfar

@docmilanfar

a month ago

To establish power law behavior we need statistical tests. This paper is a nice overview of statistical methods for testing power laws: "Power-Law Distributions in Empirical Data" by A Clauset, CR Shalizi, & MEJ Newman, SIAM Review, 51(4), 661–703 arxiv.org/abs/0706.1062 4/5

81 likes · 2 replies · 5 reposts
