Taylor Webb (@taylorwwebb) 's Twitter Profile
Taylor Webb

@taylorwwebb

Studying cognition in humans and machines.

ID: 921368137668362240

Link: https://scholar.google.com/citations?user=WCmrJoQAAAAJ&hl=en
Joined: 20-10-2017 13:30:57

425 Tweets

889 Followers

547 Following

Dongyu Gong (@dongyu_gong) 's Twitter Profile Photo

Introducing our new work on mechanistic interpretability of LLM cognition🤖🧠: why do Transformer-based LLMs have limited working memory capacity, as measured by N-back tasks? (1/7)

openreview.net/pdf?id=dXjQgm9…
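
For readers unfamiliar with the paradigm: in an N-back task, each item must be compared with the one presented N positions earlier, so accuracy as N grows indexes working memory capacity. A minimal sketch of how such a task might be posed to an LLM is below; the prompt format and scoring are illustrative assumptions, not necessarily the setup used in the paper.

```python
# Minimal sketch of an N-back probe for an LLM (illustrative; the paper's
# exact prompt format, stimuli, and scoring may differ).
import random

def make_nback_stream(n=2, length=20, alphabet="ABCDEF", match_rate=0.3):
    """Generate a letter stream plus ground-truth match ('m') / no-match ('-') labels."""
    stream = [random.choice(alphabet) for _ in range(n)]
    labels = []
    for _ in range(length):
        if random.random() < match_rate:
            stream.append(stream[-n])  # force an n-back match
        else:
            stream.append(random.choice([c for c in alphabet if c != stream[-n]]))
        labels.append("m" if stream[-1] == stream[-1 - n] else "-")
    return stream, labels

def build_prompt(stream, n):
    """Present the whole sequence and ask for an 'm'/'-' judgement per position."""
    return (f"Here is a sequence of letters: {' '.join(stream)}\n"
            f"For each letter from position {n + 1} onward, answer 'm' if it matches "
            f"the letter {n} positions earlier and '-' otherwise, separated by spaces.")

stream, labels = make_nback_stream(n=2)
print(build_prompt(stream, n=2))
print("Ground truth:", " ".join(labels))
# The model's accuracy against these labels, as n increases, is one way to
# quantify its effective working memory capacity.
```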
Earl K. Miller (@millerlabmit) 's Twitter Profile Photo

More evidence that working memory is not persistent activity. Instead, it relies on dynamic on/off states with short-term synaptic plasticity. Intermittent rate coding and cue-specific ensembles support working memory. nature.com/articles/s4158… #neuroscience

Laura Ruis (@lauraruis) 's Twitter Profile Photo

How do LLMs learn to reason from data? Are they ~retrieving the answers from parametric knowledge🦜? In our new preprint, we look at the pretraining data and find evidence against this:

Procedural knowledge in pretraining drives LLM reasoning ⚙️🔢

🧵⬇️
Taylor Webb (@taylorwwebb) 's Twitter Profile Photo

It would be great to have a precise enough formulation of ‘approximate retrieval’ for this hypothesis to be rigorously tested. There is a concern that virtually any task can be characterized in this way, by appealing to a vague notion of similarity with other tasks.

Valentina Pyatkin (@valentina__py) 's Twitter Profile Photo

Open Post-Training recipes! 

Some of my personal highlights:
💡 We significantly scaled up our preference data! (using more than 330k preference pairs for our 70b model!)
💡 We used RL with Verifiable Rewards to improve targeted skills like math and precise instruction following
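
For concreteness, "verifiable rewards" here means binary rewards computed by programmatic checks rather than a learned reward model. The sketch below illustrates the idea with a math-answer check and a simple instruction constraint; the function names and answer formats are assumptions, and the recipe's actual verifiers and RL loop are more involved.

```python
# Illustrative sketch of binary "verifiable rewards" for RL post-training.
# The real recipe's verifiers are more involved; the function names and
# answer formats below are assumptions for illustration only.
import re

def math_reward(model_output: str, gold_answer: str) -> float:
    """Reward 1.0 iff the last number in the output matches the reference answer."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", model_output)
    return 1.0 if numbers and numbers[-1] == gold_answer else 0.0

def instruction_reward(model_output: str, constraint: dict) -> float:
    """Reward 1.0 iff a checkable instruction constraint is satisfied,
    e.g. {'type': 'max_words', 'value': 50}."""
    if constraint["type"] == "max_words":
        return 1.0 if len(model_output.split()) <= constraint["value"] else 0.0
    if constraint["type"] == "must_include":
        return 1.0 if constraint["value"].lower() in model_output.lower() else 0.0
    return 0.0

# These binary rewards replace a learned reward model for targeted skills:
# sampled completions are scored and the policy is updated with RL.
print(math_reward("... so the answer is 42", gold_answer="42"))                 # 1.0
print(instruction_reward("Short reply.", {"type": "max_words", "value": 50}))   # 1.0
```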
Matthias Michel (@matthiasmichel_) 's Twitter Profile Photo

In this new preprint @smfleming and I present a theory of the functions and evolution of conscious vision. This is a big project: osf.io/preprints/psya…. We'd love to get your comments!
Alexa R. Tartaglini (@artartaglini) 's Twitter Profile Photo

🚨 New paper at NeurIPS Conference w/ Michael Lepori! Most work on interpreting vision models focuses on concrete visual features (edges, objects). But how do models represent abstract visual relations between objects? We adapt NLP interpretability techniques for ViTs to find out! 🔍
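
One NLP interpretability technique that transfers naturally to ViTs is linear probing of intermediate token representations. The sketch below shows that generic recipe on a torchvision ViT; the choice of layer, the [CLS] pooling, and the same/different framing are illustrative assumptions rather than the paper's exact method.

```python
# Generic linear-probe sketch for a ViT (illustrative; the paper's models,
# layers, stimuli, and probing method may differ).
import torch
import numpy as np
from torchvision.models import vit_b_16, ViT_B_16_Weights
from sklearn.linear_model import LogisticRegression

model = vit_b_16(weights=ViT_B_16_Weights.DEFAULT).eval()

# Capture token representations at an intermediate encoder layer via a hook.
acts = {}
def hook(_module, _inp, out):
    acts["tokens"] = out.detach()          # (batch, 197, 768) for ViT-B/16
model.encoder.layers[6].register_forward_hook(hook)

def cls_features(images: torch.Tensor) -> np.ndarray:
    """Return the [CLS]-token embedding at the hooked layer."""
    with torch.no_grad():
        model(images)
    return acts["tokens"][:, 0, :].numpy()

# Stand-in data: in practice these would be rendered image pairs labelled
# "same" vs. "different" (e.g. two shapes that do or do not match).
images = torch.randn(32, 3, 224, 224)
labels = np.random.randint(0, 2, size=32)

probe = LogisticRegression(max_iter=1000).fit(cls_features(images), labels)
print("probe train accuracy:", probe.score(cls_features(images), labels))
# Above-chance accuracy on held-out data would indicate the relation is
# linearly decodable from this layer's representation.
```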

Michael Lepori (@michael_lepori) 's Twitter Profile Photo

Even ducklings🐣can represent abstract visual relations. Can your favorite ViT? In our new NeurIPS Conference paper, we use mechanistic interpretability to find out!

Dylan Foster 🐢 (@canondetortugas) 's Twitter Profile Photo

Given a high-quality verifier, language model accuracy can be improved by scaling inference-time compute (e.g., w/ repeated sampling). When can we expect similar gains without an external verifier? 

New paper: Self-Improvement in Language Models: The Sharpening Mechanism
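
The verifier-based setting referred to here is essentially best-of-N sampling: draw several candidate responses and keep the one the verifier scores highest. A minimal sketch is below, with hypothetical generate and verifier_score stand-ins.

```python
# Minimal sketch of verifier-guided repeated sampling (best-of-N).
# `generate` and `verifier_score` are hypothetical stand-ins for an LLM
# sampling call and an external verifier (e.g., a unit test or answer checker).
import random
from typing import Callable, List

def best_of_n(prompt: str,
              generate: Callable[[str], str],
              verifier_score: Callable[[str, str], float],
              n: int = 16) -> str:
    """Sample n candidate responses and return the one the verifier rates highest."""
    candidates: List[str] = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda resp: verifier_score(prompt, resp))

# Toy stand-ins so the sketch runs end to end.
def generate(prompt: str) -> str:
    return f"answer={random.randint(0, 9)}"

def verifier_score(prompt: str, response: str) -> float:
    return 1.0 if response.endswith("7") else 0.0   # pretend 7 is correct

print(best_of_n("What digit am I thinking of?", generate, verifier_score))
# The paper asks when similar gains are possible with no external verifier,
# e.g. by relying on the model's own assessment of its responses ("sharpening").
```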
François Chollet (@fchollet) 's Twitter Profile Photo

Today OpenAI announced o3, its next-gen reasoning model. We've worked with OpenAI to test it on ARC-AGI, and we believe it represents a significant breakthrough in getting AI to adapt to novel tasks.

It scores 75.7% on the semi-private eval in low-compute mode (for $20 per task
Mikel Bober-Irizar (@mikb0b) 's Twitter Profile Photo

Why do pre-o3 LLMs struggle with generalization tasks like ARC Prize? It's not what you might think.

OpenAI o3 shattered the ARC-AGI benchmark. But the hardest puzzles didn’t stump it because of reasoning, and this has implications for the benchmark as a whole.

Analysis below🧵
Stephanie Chan (@scychan_brains) 's Twitter Profile Photo

New work led by Aaditya Singh: "Strategy coopetition explains the emergence and transience of in-context learning in transformers." We find some surprising things!! E.g. that circuits can simultaneously compete AND cooperate ("coopetition") 😯 🧵👇

Mengdi Wang (@mengdiwang10) 's Twitter Profile Photo

🚨 Discover the Science of LLM! We uncover how LLMs (Llama3-70B) achieve abstract reasoning through emergent symbolic mechanisms: 

1️⃣ Symbol Abstraction Heads: Early layers convert input tokens into abstract variables based on their relationships. 
2️⃣ Symbolic Induction Heads:
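
As a rough picture of the kind of abstract-rule task such analyses are typically run on, the sketch below builds an in-context ABA identity-rule prompt and checks a small causal LM's continuation. The prompt format is an illustrative assumption, and GPT-2 stands in for the Llama3-70B model studied in the paper purely so the sketch runs on modest hardware.

```python
# Illustrative identity-rule (ABA) prompt of the kind used to study abstract
# reasoning; the exact stimuli and model in the paper differ.
import random
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def aba_prompt(n_examples: int = 5):
    """Build in-context ABA examples; the final line omits its answer."""
    words = random.sample(["cup", "tree", "sky", "rock", "bird", "lamp",
                           "road", "fish", "door", "corn", "ship", "sand"],
                          2 * n_examples)
    lines = []
    for i in range(n_examples - 1):
        a, b = words[2 * i], words[2 * i + 1]
        lines.append(f"{a} {b} {a}")
    a, b = words[-2], words[-1]
    lines.append(f"{a} {b}")               # the model should continue with `a`
    return "\n".join(lines), a

prompt, target = aba_prompt()
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=2, do_sample=False,
                        pad_token_id=tokenizer.eos_token_id)
completion = tokenizer.decode(output[0][inputs["input_ids"].shape[1]:])
print(prompt)
print("model continuation:", completion, "| expected:", target)
# Accuracy over many such prompts measures whether the model induces the
# abstract A-B-A rule rather than memorising specific token sequences.
```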
Brian Odegaard (@brianodegaard2) 's Twitter Profile Photo

Led by postdoc Doyeon Lee and grad student Joseph Pruitt, our lab has a new Perspectives piece in PNAS Nexus: "Metacognitive sensitivity: The key to calibrating trust and optimal decision-making with AI" academic.oup.com/pnasnexus/arti… With co-authors Tianyu Zhou and Eric Du 1/

Raphaël Millière (@raphaelmilliere) 's Twitter Profile Photo

The final version of this paper has now been published in open access in the Journal of Memory and Language (link below). This was a long-running but very rewarding project. Here are a few thoughts on our methodology and main findings. 1/9
Tom McCoy (@rtommccoy) 's Twitter Profile Photo

🤖🧠 NEW PAPER ON COGSCI & AI 🧠🤖

Recent neural networks capture properties long thought to require symbols: compositionality, productivity, rapid learning

So what role should symbols play in theories of the mind? For our answer...read on!

Paper: arxiv.org/abs/2508.05776

1/n