Laurence Aitchison (@laurence_ai) 's Twitter Profile
Laurence Aitchison

@laurence_ai

LLMs and probabilistic machine learning. University of Bristol

ID: 1131602344200810503

Link: http://www.laurenceai.com · Joined: 23-05-2019 16:46:37

691 Tweets

1.1K Followers

494 Following

David Pfau (@pfau) 's Twitter Profile Photo

This is really impressive work, and congrats to the team, but as an aside...does anyone else find it weird that "world model" evolved from meaning "the minimal model needed to plan in an environment" to "action-conditional video model"?

Cengiz Pehlevan (@cpehlevan) 's Twitter Profile Photo

Congratulations, Blake Bordelon ☕️🧪👨‍💻! Working with you has been a privilege! I’ll miss having you at Harvard, but I’m excited to see the groundbreaking discoveries your new group will make. Wishing you every success!

G-Research Careers (@gresearchjobs) 's Twitter Profile Photo

🏙️ Want to spend your summer in London? Applications are open for our 10-week internship in quantitative research 2026. Work with leading researchers on projects in modelling, deep learning and optimisation. Apply here: eu1.hubs.ly/H0mvF8v0 #Internship #GResearch

Edward Milsom (@edward_milsom) 's Twitter Profile Photo

Excited to announce I'll be starting this September 2025 as a Lecturer (Assistant Professor) at the University of Bath! I will continue my research on deep learning foundations, and am open to ideas for collaborations. (Pictured: Bath. Not pictured: University of Bath)

Laurence Aitchison (@laurence_ai) 's Twitter Profile Photo

Super exciting that my former student Edward Milsom is starting his lab at the University of Bath. Ed has led super-fundamental work on representation learning and learning dynamics. And he’s great fun to work with, so I definitely recommend collaborating!

Lucy Farnik (@lucyfarnik) 's Twitter Profile Photo

I'm looking for an informal PhD supervisor in LLMs/post-training — any recommendations? My supervisor is leaving academia & the rest of the dep't doesn't work on LLMs, so I'm hoping to find someone external to collaborate with. More info 👇, RTs appreciated! 🙏

Shane Bergsma (@shanebergsma) 's Twitter Profile Photo

(1/4) Cerebras: Hot off the presses 🔥📄 arxiv.org/abs/2509.25087. If you're spending $1B to train an LLM, you need to know it's on track, every step of the way. With optimal AdamW τ + fixed TPP, loss curves collapse to a universal path → an early-warning signal for training.

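As a rough illustration of the "collapse" idea in the tweet above, here is a minimal sketch (my own, not the recipe in arxiv.org/abs/2509.25087) of normalising loss curves onto a fraction-of-training axis and flagging a big run that drifts off a reference path fitted from smaller runs; the function names, the interpolation, and the 2% tolerance are all assumptions.

```python
import numpy as np

def collapse_curve(steps, losses, total_steps):
    """Normalise a loss curve onto a fraction-of-training x-axis."""
    frac = np.asarray(steps, dtype=float) / total_steps
    return frac, np.asarray(losses, dtype=float)

def early_warning(frac, losses, reference, tol=0.02):
    """Return checkpoint indices where the run deviates from the reference path by > tol (relative)."""
    ref_losses = np.interp(frac, reference["frac"], reference["loss"])
    rel_dev = np.abs(losses - ref_losses) / ref_losses
    return np.where(rel_dev > tol)[0]

# Reference path fitted from smaller runs (values here are made up).
grid = np.linspace(0.01, 1.0, 100)
reference = {"frac": grid, "loss": 3.0 * grid ** -0.1}

frac, losses = collapse_curve(steps=[1_000, 5_000, 10_000],
                              losses=[5.1, 4.2, 3.9],
                              total_steps=100_000)
print(early_warning(frac, losses, reference))  # indices of suspicious checkpoints
```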
Shikai Qiu (@shikaiqiu) 's Twitter Profile Photo

Beautiful work on pretraining science, using scaling collapse to precisely predict, debug, and tune LLM training from small-scale and partial runs. So many insights on going beyond μP!

Albert Jiang (@albertqjiang) 's Twitter Profile Photo

Thrilled to announce Mistral AI's new formal math team following our $2B funding round. AI for formal math researchers wanted. We offer:
- Work on creating the state-of-the-art prover, autoformalizer, and automatic proof agent all in one model
- World class team (with

Simone Scardapane (@s_scardapane) 's Twitter Profile Photo

*Scale-invariant attention* by Ben Anson and Laurence Aitchison. Attention is "scale-invariant" if it acts similarly at different (length) scales. This paper proposes a simple modification to make it scale-invariant & improve length generalization. arxiv.org/abs/2505.17083

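For intuition only, here is a minimal sketch of a length-dependent attention temperature, one common trick for making the softmax behave similarly as the context grows. It is not necessarily the modification proposed in arxiv.org/abs/2505.17083; the log-length factor and the train_len parameter are assumptions.

```python
import math
import torch
import torch.nn.functional as F

def length_scaled_attention(q, k, v, train_len=128):
    # q, k, v: (batch, heads, seq, dim)
    d = q.size(-1)
    n = k.size(-2)
    # Standard 1/sqrt(d) scaling plus a log-length temperature
    # (an assumption, not necessarily the paper's approach).
    temp = math.log(n) / math.log(train_len)
    scores = (q @ k.transpose(-2, -1)) * temp / math.sqrt(d)
    return F.softmax(scores, dim=-1) @ v

# Toy check: the same attention applied at 2x the "training" length.
q = k = v = torch.randn(1, 4, 256, 64)
out = length_scaled_attention(q, k, v)
print(out.shape)  # torch.Size([1, 4, 256, 64])
```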
Xidulu (@xidulu) 's Twitter Profile Photo

Recent work from Nvidia uses the information gain of the reasoning chain as a reward function during pre-training (leftmost). Our recent work considers a similar quantity, but uses it as a signal for early exiting the reasoning chain: arxiv.org/abs/2509.26522

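To make the early-exit idea concrete, here is a hedged sketch of the general mechanism: stop generating reasoning steps once an extra step barely changes the model's answer distribution. The KL-based measure, the threshold, and the function names are my assumptions, not the exact criterion in arxiv.org/abs/2509.26522.

```python
import numpy as np

def kl(p, q, eps=1e-12):
    """KL divergence between two discrete distributions."""
    p, q = np.asarray(p) + eps, np.asarray(q) + eps
    return float(np.sum(p * np.log(p / q)))

def should_exit(prev_answer_dist, curr_answer_dist, threshold=0.01):
    """Exit the chain if the latest reasoning step barely moved the answer distribution."""
    return kl(curr_answer_dist, prev_answer_dist) < threshold

# Toy example: answer distribution over 4 options before/after one more reasoning step.
prev = [0.30, 0.30, 0.20, 0.20]
curr = [0.31, 0.29, 0.20, 0.20]
print(should_exit(prev, curr))  # True -> stop generating further steps
```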
Xidulu (@xidulu) 's Twitter Profile Photo

Wooo, so Laurence Aitchison and I proposed micro adam for batch size invariance and linear batch-size-to-learning-rate scaling (opt-ml.org/papers/2024/pa…) a while ago, and this paper from Google provides an efficient implementation, plus more in-depth analysis and extensions!
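For reference, the linear batch-size/learning-rate scaling rule mentioned above is simple to state in code; the sketch below uses placeholder base values and is not tied to the micro adam paper's exact setup.

```python
def scaled_lr(base_lr, base_batch_size, batch_size):
    """Linear rule: multiplying the batch size by k multiplies the learning rate by k."""
    return base_lr * batch_size / base_batch_size

# Placeholder base values, not taken from the paper.
print(scaled_lr(base_lr=3e-4, base_batch_size=256, batch_size=1024))  # 0.0012
```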

Ravid Shwartz Ziv (@ziv_ravid) 's Twitter Profile Photo

I'm not sure how to put it, but even if you scale your model with different compute, fit a function to these points, and present it on a log-log scale, it's not necessarily a scaling law! Thank you for this matter

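The point is easy to demonstrate: a straight-line fit on a log-log plot is almost always possible, so the fit alone says little; a minimal sanity check is whether it extrapolates to a held-out larger-compute run. The sketch below uses made-up numbers.

```python
import numpy as np

# Made-up (compute, loss) points.
compute = np.array([1e18, 3e18, 1e19, 3e19])
loss = np.array([3.2, 2.9, 2.7, 2.5])

# Fit log(loss) = a * log(compute) + b on the three cheapest runs only.
a, b = np.polyfit(np.log(compute[:3]), np.log(loss[:3]), deg=1)

# Does the fit extrapolate to the held-out, more expensive run?
pred = np.exp(a * np.log(compute[3]) + b)
print(f"predicted {pred:.2f} vs observed {loss[3]:.2f}")
```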