Laurence Aitchison (@laurence_ai) 's Twitter Profile
Laurence Aitchison

@laurence_ai

LLMs and probabilistic machine learning. University of Bristol

ID: 1131602344200810503

Link: http://www.laurenceai.com · Joined: 23-05-2019 16:46:37

691 Tweets

1.1K Followers

494 Following

David Pfau (@pfau) 's Twitter Profile Photo

This is really impressive work, and congrats to the team, but as an aside...does anyone else find it weird that "world model" evolved from meaning "the minimal model needed to plan in an environment" to "action-conditional video model"?

Cengiz Pehlevan (@cpehlevan) 's Twitter Profile Photo

Congratulations, Blake Bordelon ☕️🧪👨‍💻! Working with you has been a privilege! I’ll miss having you at Harvard, but I’m excited to see the groundbreaking discoveries your new group will make. Wishing you every success!

G-Research Careers (@gresearchjobs) 's Twitter Profile Photo

🏙️ Want to spend your summer in London? Applications are open for our 10-week internship in quantitative research 2026. Work with leading researchers on projects in modelling, deep learning and optimisation. Apply here: eu1.hubs.ly/H0mvF8v0 #Internship #GResearch

Edward Milsom (@edward_milsom) 's Twitter Profile Photo

Excited to announce I'll be starting this September 2025 as a Lecturer (Assistant Professor) at the University of Bath! I will continue my research on deep learning foundations, and am open to ideas for collaborations. (Pictured: Bath. Not pictured: University of Bath)

Laurence Aitchison (@laurence_ai) 's Twitter Profile Photo

Super exciting that my former student Edward Milsom is starting his lab at the University of Bath. Ed has led super-fundamental work on representation learning and learning dynamics. And he’s great fun to work with, so I definitely recommend collaborating!

Lucy Farnik (@lucyfarnik) 's Twitter Profile Photo

I'm looking for an informal PhD supervisor in LLMs/post-training — any recommendations? My supervisor is leaving academia & the rest of the dep't doesn't work on LLMs, so I'm hoping to find someone external to collaborate with. More info 👇, RTs appreciated! 🙏

Shane Bergsma (@shanebergsma) 's Twitter Profile Photo

(1/4) Cerebras: Hot off the presses 🔥📄 arxiv.org/abs/2509.25087. If you're spending $1B to train an LLM, you need to know it's on track, every step of the way. With optimal AdamW τ + fixed TPP, loss curves collapse to a universal path → an early-warning signal for training.

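As a rough illustration of the "collapse" idea in the tweet above, here is a minimal sketch (my own, not the recipe in arxiv.org/abs/2509.25087) of normalising loss curves onto a fraction-of-training axis and flagging a big run that drifts off a reference path fitted from smaller runs; the function names, the interpolation, and the 2% tolerance are all assumptions.

```python
import numpy as np

def collapse_curve(steps, losses, total_steps):
    """Normalise a loss curve onto a fraction-of-training x-axis."""
    frac = np.asarray(steps, dtype=float) / total_steps
    return frac, np.asarray(losses, dtype=float)

def early_warning(frac, losses, reference, tol=0.02):
    """Return checkpoint indices where the run deviates from the reference path by > tol (relative)."""
    ref_losses = np.interp(frac, reference["frac"], reference["loss"])
    rel_dev = np.abs(losses - ref_losses) / ref_losses
    return np.where(rel_dev > tol)[0]

# Reference path fitted from smaller runs (values here are made up).
grid = np.linspace(0.01, 1.0, 100)
reference = {"frac": grid, "loss": 3.0 * grid ** -0.1}

frac, losses = collapse_curve(steps=[1_000, 5_000, 10_000],
                              losses=[5.1, 4.2, 3.9],
                              total_steps=100_000)
print(early_warning(frac, losses, reference))  # indices of suspicious checkpoints
```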
Shikai Qiu (@shikaiqiu) 's Twitter Profile Photo

Beautiful work on pretraining science, using scaling collapse to precisely predict, debug, and tune LLM training from small-scale and partial runs. So many insights on going beyond μP!

Albert Jiang (@albertqjiang) 's Twitter Profile Photo

Thrilled to announce Mistral AI's new formal math team following our $2B funding round. AI for formal math researchers wanted. We offer:
- Work on creating the state-of-the-art prover, autoformalizer, and automatic proof agent all in one model
- World class team (with

Simone Scardapane (@s_scardapane) 's Twitter Profile Photo

*Scale-invariant attention* by Ben Anson and Laurence Aitchison. Attention is "scale-invariant" if it acts similarly at different (length) scales. This paper proposes a simple modification to make it scale-invariant & improve length generalization. arxiv.org/abs/2505.17083

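For intuition only, here is a minimal sketch of a length-dependent attention temperature, one common trick for making the softmax behave similarly as the context grows. It is not necessarily the modification proposed in arxiv.org/abs/2505.17083; the log-length factor and the train_len parameter are assumptions.

```python
import math
import torch
import torch.nn.functional as F

def length_scaled_attention(q, k, v, train_len=128):
    # q, k, v: (batch, heads, seq, dim)
    d = q.size(-1)
    n = k.size(-2)
    # Standard 1/sqrt(d) scaling plus a log-length temperature
    # (an assumption, not necessarily the paper's approach).
    temp = math.log(n) / math.log(train_len)
    scores = (q @ k.transpose(-2, -1)) * temp / math.sqrt(d)
    return F.softmax(scores, dim=-1) @ v

# Toy check: the same attention applied at 2x the "training" length.
q = k = v = torch.randn(1, 4, 256, 64)
out = length_scaled_attention(q, k, v)
print(out.shape)  # torch.Size([1, 4, 256, 64])
```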
Xidulu (@xidulu) 's Twitter Profile Photo

Recent work from Nvidia uses the information gain of the reasoning chain as a reward function during pre-training (leftmost). Our recent work considers a similar quantity, but uses it as a signal for early exiting the reasoning chain: arxiv.org/abs/2509.26522

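To make the early-exit idea concrete, here is a hedged sketch of the general mechanism: stop generating reasoning steps once an extra step barely changes the model's answer distribution. The KL-based measure, the threshold, and the function names are my assumptions, not the exact criterion in arxiv.org/abs/2509.26522.

```python
import numpy as np

def kl(p, q, eps=1e-12):
    """KL divergence between two discrete distributions."""
    p, q = np.asarray(p) + eps, np.asarray(q) + eps
    return float(np.sum(p * np.log(p / q)))

def should_exit(prev_answer_dist, curr_answer_dist, threshold=0.01):
    """Exit the chain if the latest reasoning step barely moved the answer distribution."""
    return kl(curr_answer_dist, prev_answer_dist) < threshold

# Toy example: answer distribution over 4 options before/after one more reasoning step.
prev = [0.30, 0.30, 0.20, 0.20]
curr = [0.31, 0.29, 0.20, 0.20]
print(should_exit(prev, curr))  # True -> stop generating further steps
```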
Xidulu (@xidulu) 's Twitter Profile Photo

Wooo, so Laurence Aitchison and I proposed micro adam for batch size invariance and linear batch-size-to-learning-rate scaling (opt-ml.org/papers/2024/pa…) a while ago, and this paper from Google provides an efficient implementation, plus more in-depth analysis and extensions!
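For reference, the linear batch-size/learning-rate scaling rule mentioned above is simple to state in code; the sketch below uses placeholder base values and is not tied to the micro adam paper's exact setup.

```python
def scaled_lr(base_lr, base_batch_size, batch_size):
    """Linear rule: multiplying the batch size by k multiplies the learning rate by k."""
    return base_lr * batch_size / base_batch_size

# Placeholder base values, not taken from the paper.
print(scaled_lr(base_lr=3e-4, base_batch_size=256, batch_size=1024))  # 0.0012
```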

Ravid Shwartz Ziv (@ziv_ravid) 's Twitter Profile Photo

I'm not sure how to put it, but even if you scale your model with different compute, fit a function to these points, and present it on a log-log scale, it's not necessarily a scaling law! Thank you for this matter

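The point is easy to demonstrate: a straight-line fit on a log-log plot is almost always possible, so the fit alone says little; a minimal sanity check is whether it extrapolates to a held-out larger-compute run. The sketch below uses made-up numbers.

```python
import numpy as np

# Made-up (compute, loss) points.
compute = np.array([1e18, 3e18, 1e19, 3e19])
loss = np.array([3.2, 2.9, 2.7, 2.5])

# Fit log(loss) = a * log(compute) + b on the three cheapest runs only.
a, b = np.polyfit(np.log(compute[:3]), np.log(loss[:3]), deg=1)

# Does the fit extrapolate to the held-out, more expensive run?
pred = np.exp(a * np.log(compute[3]) + b)
print(f"predicted {pred:.2f} vs observed {loss[3]:.2f}")
```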