Kwangjun Ahn
@kwangjuna
Senior Researcher at Microsoft Research // PhD from MIT EECS
ID: 1229766355622055936
http://kjahn.mit.edu/ 18-02-2020 13:56:04
40 Tweets
550 Followers
266 Following
Could it be that the emergence of the threshold units is related to these oscillations? Oscillations themselves have recently been under intense scrutiny by theoreticians under the name "Edge of Stability", a beautiful phenomenon discovered by Jeremy Cohen and co-authors. 5/8
Project was led by three incredible MIT students, Kwangjun Ahn, Sinho Chewi and Felipe Suárez Colmenares. I cannot recommend them strongly enough. Project went so far beyond what I expected to be true at the beginning, let alone what would be *provable*. Such a pleasure to work with them. 8/8
Damek Davis youtu.be/0tYpMncAKFs?fe… In case you haven't watched it yet :)
Excited to share our NeurIPS paper that Sebastien Bubeck mentioned in his post: arxiv.org/abs/2212.07469 Also check out a NeurIPS paper on understanding SAM (a companion paper!) arxiv.org/abs/2305.15287 My talk video from INFORMS about these works: youtu.be/TMmpeVBbD7o?si…
If you're at #NeurIPS2023, Kwangjun Ahn will be presenting his work on SpecTr++ at the Optimal Transport workshop, where he discusses improved transport plans for speculative decoding.
Exciting new paper by Kwangjun Ahn and Ashok Cutkosky! Adam with model exponential moving average is effective for nonconvex optimization arxiv.org/pdf/2405.18199 This approach to analyzing Adam is extremely promising IMHO.
In our ICML 2024 paper, joint w/ Zhiyu Zhang, Yunbum Kook, Yan Dai, we provide a new perspective on the Adam optimizer based on online learning. In particular, our perspective shows the importance of Adam's key components. (video: youtu.be/AU39SNkkIsA)