CogInterp Workshop @ NeurIPS 2025 (@coginterp)'s Twitter Profile
CogInterp Workshop @ NeurIPS 2025

@coginterp


https://coginterp.github.io/neurips2025/ · Joined 05-07-2025

5 Tweets · 81 Followers · 0 Following

CogInterp Workshop @ NeurIPS 2025 (@coginterp)

For our second spotlight talk, Yifei Cao, Chonghao Cai, and Liyuan Li present work using hybrid neural-cognitive models to explain strategies in reversal learning.

Sonia Murthy (@soniakmurthy)

Excited to be presenting our work on using cognitive models to interpret pluralistic values in LLMs once again as a spotlight talk 🌟 at the NeurIPS CogInterp workshop! Come by upper level room 5AB today and check out the paper here: arxiv.org/abs/2506.20666

CogInterp Workshop @ NeurIPS 2025 (@coginterp)

Ari Holtzman (@universeinanegg) takes us into the mind of an LLM to help us understand how these models see the world, and what might be a good road forward to studying them.

CogInterp Workshop @ NeurIPS 2025 (@coginterp)

Erin Grant (@ermgrant) discusses dissociations between function and representation, and asks whether representational alignment is enough for understanding deep neural networks.

CogInterp Workshop @ NeurIPS 2025 (@coginterp)

For our third spotlight talk, Sonia Murthy (@soniakmurthy) uses probabilistic cognitive models to understand value trade-offs in LLMs that enable pragmatic reasoning about politeness in speech acts.

CogInterp Workshop @ NeurIPS 2025 (@coginterp)

In our fourth spotlight talk, neural network legend Paul Smolensky uses symbolic programs such as production systems to understand how neural networks process symbols.

CogInterp Workshop @ NeurIPS 2025 (@coginterp)

Our final speaker, Sydney Levine (@sydneymlevine), makes a radical proposal: building computational models of human moral judgements to use as an AI system for making moral judgements.

CogInterp Workshop @ NeurIPS 2025 (@coginterp)

Our Best Paper Award goes to Nathaniel Imel and Noga Zaslavsky (@NogaZaslavsky) for their excellent paper “Culturally transmitted color categories in LLMs reflect a learning bias toward efficient compression”!

Noga Zaslavsky (@nogazaslavsky)

Honored and thrilled that our work received the CogInterp Workshop @ NeurIPS 2025 best paper award! 💫 📄 Extended paper: arxiv.org/pdf/2509.08093 🧵 Highlights: x.com/NogaZaslavsky/… NeurIPS Conference #NeurIPS2025

Christopher Potts (@chrisgpotts)

Safety-oriented interpretability researchers should be focused on AI systems, not individual model artifacts. A snippet from the NeurIPS CogInterp workshop panel on Sunday:

Goodfire (@goodfireai)

Our last Stanford guest lecture: Ekdeep Singh on what counts as an explanation & a neuro-inspired "model systems approach" to interpretability. Plus, how in-context learning and many-shot jailbreaking are explained by LLM representations changing in-context (as a case study for that approach).