Jonny Cook (@jonnycoook)'s Twitter Profile
Jonny Cook

@jonnycoook

DPhil Student in AI @FLAIR_Ox
Prev. RS Intern @cohere, @DeepMind Scholar

ID: 1452746225074200582

Joined: 25-10-2021 21:17:53

48 Tweets

305 Followers

514 Following

Anastasios Gerontopoulos (@nasosger)

1/n Multi-token prediction boosts LLMs (DeepSeek-V3), tackling key limitations of the next-token setup:
• Short-term focus
• Struggles with long-range decisions
• Weaker supervision

Prior methods add complexity (extra layers)
🔑 Our fix? Register tokens—elegant and powerful
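
As a reader's sketch (not the paper's implementation): one way register tokens could provide multi-token supervision is to append a few learned embeddings to the sequence and train each one to predict a token further ahead, reusing the existing LM head rather than stacking extra prediction layers. The class name, shapes, and target scheme below are assumptions.

```python
import torch
import torch.nn as nn

# Rough sketch of the register-token idea: a few learned embeddings ride along
# with the sequence, and each is supervised to predict a token further ahead.
# Names, shapes, and the exact target scheme are illustrative assumptions.

class RegisterTokens(nn.Module):
    def __init__(self, d_model: int, vocab_size: int, horizons=(2, 3, 4)):
        super().__init__()
        self.horizons = horizons
        self.registers = nn.Parameter(torch.randn(len(horizons), d_model) * 0.02)
        self.lm_head = nn.Linear(d_model, vocab_size, bias=False)

    def append(self, token_embeds: torch.Tensor) -> torch.Tensor:
        # token_embeds: (batch, seq_len, d_model); the backbone transformer then
        # contextualises the register slots like ordinary tokens.
        regs = self.registers.unsqueeze(0).expand(token_embeds.size(0), -1, -1)
        return torch.cat([token_embeds, regs], dim=1)

    def auxiliary_loss(self, hidden: torch.Tensor, future_targets: torch.Tensor) -> torch.Tensor:
        # hidden: backbone output, (batch, seq_len + n_registers, d_model)
        # future_targets: (batch, n_registers) token ids, one per prediction horizon
        n_reg = len(self.horizons)
        logits = self.lm_head(hidden[:, -n_reg:, :])  # (batch, n_registers, vocab)
        return nn.functional.cross_entropy(
            logits.reshape(-1, logits.size(-1)), future_targets.reshape(-1)
        )
```
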
Clarisse Wibault (@clarissewibault)

How can we bypass the need for online hyper-parameter tuning in offline RL? Foerster Lab for AI Research is introducing two fully offline algorithms: SOReL, for accurate offline regret approximation, and TOReL, for offline hyper-parameter tuning! arxiv.org/html/2505.2244…
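
From the headline description alone, fully offline hyper-parameter tuning in the TOReL spirit could be pictured as ranking candidate configurations by an offline regret estimate rather than rolling each one out in the environment. The sketch below is a placeholder for that idea, with `train_offline` and `estimate_offline_regret` standing in for the actual algorithms.

```python
# Hypothetical sketch of offline hyper-parameter tuning: select the config whose
# trained policy has the lowest estimated regret, using only a fixed offline
# dataset and no environment interaction. All function names are placeholders.

def tune_offline(dataset, candidate_configs, train_offline, estimate_offline_regret):
    results = []
    for config in candidate_configs:
        policy = train_offline(dataset, config)            # any offline RL trainer
        regret = estimate_offline_regret(policy, dataset)  # SOReL-style estimate
        results.append((regret, config, policy))
    results.sort(key=lambda r: r[0])
    best_regret, best_config, best_policy = results[0]
    return best_config, best_policy, best_regret
```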

Amrith Setlur (@setlur_amrith)

Since R1 there has been a lot of chatter 💬 on post-training LLMs with RL. Is RL only sharpening the distribution over correct responses sampled by the pretrained LLM OR is it exploring and discovering new strategies 🤔? Find answers in our latest post ⬇️ tinyurl.com/rlshadis
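
One common way to probe the sharpening-vs-discovery question (not necessarily the linked post's method) is to compare pass@k between the base and RL-tuned models: if RL only sharpens, pass@1 improves while pass@large-k should not exceed the base model's. A minimal sketch:

```python
# Illustrative pass@k comparison (not taken from the linked post): genuinely new
# strategies would show up as problems the base model never solves at any k.

def pass_at_k(sample_fn, is_correct, problems, k: int) -> float:
    solved = 0
    for problem in problems:
        attempts = [sample_fn(problem) for _ in range(k)]
        if any(is_correct(problem, a) for a in attempts):
            solved += 1
    return solved / len(problems)

# Usage sketch with hypothetical model wrappers:
# base_rate = pass_at_k(base_model.sample, check_answer, eval_problems, k=256)
# rl_rate   = pass_at_k(rl_model.sample,   check_answer, eval_problems, k=256)
```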

Silvia Sapora (@silviasapora)

🧵 Check out our latest preprint: "Programming by Backprop". What if LLMs could internalize algorithms just by reading code, with no input-output examples? This could reshape how we train models to reason algorithmically. Let's dive into our findings 👇

Laura Ruis (@lauraruis)

LLMs can be programmed by backprop 🔎

In our new preprint, we show they can act as fuzzy program interpreters and databases. After being ‘programmed’ with next-token prediction, they can retrieve, evaluate, and even *compose* programs at test time, without seeing I/O examples.
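
Reading this as an outsider, the protocol seems to be: fine-tune on program source code alone (no input-output pairs), then ask the model to evaluate a program on a concrete input at test time. A toy sketch of that setup, with hypothetical training and generation helpers:

```python
# Hypothetical sketch of the evaluation setup suggested above: the model is
# fine-tuned only on program *source code* (no I/O examples), then queried on a
# concrete input at test time. Prompts and helper APIs are illustrative assumptions.

train_document = '''
def collatz_steps(n):
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps
'''

test_prompt = "collatz_steps(6) ="   # no worked examples were ever shown in training

# With a hypothetical fine-tuning / generation API:
# model = finetune_on_text(base_model, [train_document])  # next-token prediction only
# answer = model.generate(test_prompt, max_new_tokens=4)
# print(answer)  # a "fuzzy interpreter" should tend toward "8"
```
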
Ulyana Piterbarg (@ulyanapiterbarg)

The way programs are represented in training data can have strong effects on generalization & reasoning -- really great (and truly scientific) work

Andrei Lupu (@_andreilupu)

Theory of Mind (ToM) is crucial for next gen LLM Agents, yet current benchmarks suffer from multiple shortcomings. Enter 💽 Decrypto, an interactive benchmark for multi-agent reasoning and ToM in LLMs! Work done with Timon Willi & Jakob Foerster at AI at Meta & Foerster Lab for AI Research 🧵👇

Ola Kalisz (@olakalisz8)

Antiviral therapy design is myopic 🦠🙈, optimised only for the current strain. That's why you need a different flu vaccine every year! Our #ICML2025 paper ADIOS proposes "shaper therapies" that steer viral evolution in our favour & remain effective. Work done at Foerster Lab for AI Research 🧵👇

Alexi Gladstone (@alexiglad)

How can we unlock generalized reasoning?

⚡️Introducing Energy-Based Transformers (EBTs), an approach that out-scales (feed-forward) transformers and unlocks generalized reasoning/thinking on any modality/problem without rewards.
TLDR:
- EBTs are the first model to outscale the
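
For intuition on the inference side: energy-based models typically "think" by iteratively refining a candidate prediction via gradient descent on a learned energy function. The loop below is a generic EBM refinement sketch, not EBTs' exact training or inference recipe.

```python
import torch

# Generic energy-based refinement loop: start from a candidate prediction and
# descend the learned energy E(context, prediction). Lower energy = more compatible.
# This is an illustration of EBM-style "thinking", not the EBT paper's procedure.

def refine_prediction(energy_fn, context, init_prediction, steps: int = 16, lr: float = 0.1):
    pred = init_prediction.clone().requires_grad_(True)
    for _ in range(steps):
        energy = energy_fn(context, pred)             # scalar energy of the candidate
        (grad,) = torch.autograd.grad(energy, pred)
        with torch.no_grad():
            pred -= lr * grad                          # one "thinking" step
    return pred.detach()
```
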
Micah Goldblum (@micahgoldblum)

🚨 Did you know that small-batch vanilla SGD without momentum (i.e. the first optimizer you learn about in intro ML) is virtually as fast as AdamW for LLM pretraining on a per-FLOP basis? 📜 1/n
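
For concreteness, the two configurations being contrasted would look roughly like this in PyTorch; the learning rates, betas, and weight decay below are placeholders, not the thread's actual values.

```python
import torch

# The two setups being compared, roughly: plain small-batch SGD with no momentum
# vs. the usual AdamW recipe. Hyper-parameter values here are placeholders.

def make_optimizer(model: torch.nn.Module, which: str) -> torch.optim.Optimizer:
    if which == "vanilla_sgd":
        # "The first optimizer you learn about": no momentum, no weight decay,
        # intended to be run at a small batch size.
        return torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.0)
    elif which == "adamw":
        return torch.optim.AdamW(model.parameters(), lr=3e-4,
                                 betas=(0.9, 0.95), weight_decay=0.1)
    raise ValueError(which)
```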