
Yaroslav Bulatov
@yaroslavvb
Together.AI (ex-Google Brain, OpenAI, Meta)
New Blog: medium.com/@yaroslavvb
Old Blog: yaroslavvb.blogspot.com
ID: 258031029
http://medium.com/@yaroslavvb 26-02-2011 20:22:57
1.1K Tweets
7.7K Followers
873 Following








Watching Zhuang Liu's "Transformers without Normalization": this slide is a reminder of how our optimizer and architecture choices are coupled




Enjoyed Jeremy Bernstein's thought-provoking talk on optimizers at ML Collective today. Are theories that motivate optimizers very useful? Adversarial for AdaGrad, natural gradient for KFAC. Non-linear solvers in scientific computing seem to advance without spending a lot of effort thinking

Came across this overview by Derek Lowe on the state of AI drugs a year ago ACS Central Science science.org/content/blog-p…

