Max Shad (@maxshadx)'s Twitter Profile
Max Shad

@maxshadx

Senior Director of AI/ML Research Engineering, Kempner Institute @KempnerInst @Harvard, views are my own

ID: 218606940

Link: https://www.linkedin.com/in/mmsh/

Joined: 22-11-2010 19:56:11

109 Tweets

438 Followers

440 Following

Kempner Institute at Harvard University (@kempnerinst)

NEW blog post! Edwin Zhang and Eran Malach show how generative models in #chess can surpass the performance of their training data. Read more on our #KempnerInstitute blog and check out the preprint: bit.ly/3XotNqz #transcendence

Alex Dimakis (@alexgdimakis)

This paper seems very interesting: say you train an LLM to play chess using only transcripts of games of players up to 1000 elo. Is it possible that the model plays better than 1000 elo? (i.e. "transcends" the training data performance?). It seems you get something from nothing,

Keyon Vafa (@keyonv)

New paper: How can you tell if a transformer has the right world model? We trained a transformer to predict directions for NYC taxi rides. The model was good. It could find shortest paths between new points. But had it built a map of NYC? We reconstructed its map and found this:

Kempner Institute at Harvard University (@kempnerinst)

NEW #KempnerInstitute blog: Rosie Zhao, Depen Morwani, David Brandfonbrener, Nikhil Vyas & Sham Kakade study a variety of #LLM training optimizers and find they are all fairly similar except for SGD, which is notably worse. Read more: bit.ly/3S5PmZk #ML #AI

Sham Kakade (@shamkakade6)

Please apply and spread the word! These positions are pretty great, and it's a wonderful community to study the most exciting AI and neuro questions!

Yilun Du (@du_yilun)

I'm recruiting PhD students this year with interest in machine learning, embodied AI, or AI for science! If you are interested in constructing fundamental tools to improve Generative AI and exploring how these tools can be used for intelligent embodied agents and science,

Sham Kakade (@shamkakade6)

1/5⚡Introducing Flash Inference: an *exact* method cutting inference time for Long Convolution Sequence Models (LCSMs) to near-linear O(L log² L) complexity! Faster inference, same precision. Learn how we accelerate LCSM inference.

Kempner Institute at Harvard University (@kempnerinst)

We're hiring! Our Associate Director of Educational Programs oversees all aspects of the #KempnerInstitute's fellowship and training programs for undergraduate, post-baccalaureate, and graduate students. Apply today: bit.ly/3C4IV3P #AI #education Harvard University

Kempner Institute at Harvard University (@kempnerinst)

Attending SC24 this week in Atlanta? Be sure to stop by the MGHPCC booth to learn about the #KempnerInstitute AI cluster, and more about our educational programs & other opportunities. Hope to see you then! Max Shad Yasin Mazloumi bit.ly/4fPrDWN

Max Shad (@maxshadx)

Eleven years ago, I named our first HPC cluster "Ludwig" to honor his contributions to science. With its computing power, we simulated complex fluid dynamics using Lattice Boltzmann Method! #science #hpc #fluid

Kempner Institute at Harvard University (@kempnerinst)

Congrats to #KempnerInstitute's Marinka Zitnik & colleagues whose work is featured as a spotlight paper at #NeurIPS2024! Check out "Generalized Protein Pocket Generation with Prior-Informed Flow Matching" at the 11a session today. Zaixi Zhang Qi Liu buff.ly/4fgzlsp

Kempner Institute at Harvard University (@kempnerinst)

New in the #DeeperLearningBlog: Kempner researchers Nikhil Anand and Chloe H. Su discuss new work on how numerical precision can impact the accuracy and stability of #LLMs. kempnerinstitute.harvard.edu/research/deepe… #AI (1/2)