Manqing Liu (@manqingliu5)'s Twitter Profile
Manqing Liu

@manqingliu5

PhD Candidate @Harvard interested in causal machine learning; She/Her

ID: 1945170565

Website: https://manqingliu.github.io/ · Joined: 07-10-2013 20:56:36

116 Tweets

121 Followers

345 Following

Manqing Liu (@manqingliu5)

How Claude 3.7 romanticizes functional analysis with striking analogies, this is so beautiful 
#AI #Math #FunctionalAnalysis

MatthewBerman (@matthewberman)

We knew very little about how LLMs actually work...until now.

<a href="/AnthropicAI/">Anthropic</a> just dropped the most insane research paper, detailing some of the ways AI "thinks."

And it's completely different than we thought.

Here are their wild findings: 🧵

Neel Nanda (@neelnanda5)

After supervising 20+ papers, I have highly opinionated views on writing great ML papers. When I entered the field I found this all frustratingly opaque

So I wrote a guide on turning research into high-quality papers with scientific integrity! Hopefully still useful for NeurIPS

Ethan Mollick (@emollick)

Huh. Looks like Plato was right.

A new paper shows all language models converge on the same "universal geometry" of meaning. Researchers can translate between ANY model's embeddings without seeing the original text.

Implications for philosophy and vector databases alike.

Petar Veličković (@petarv_93)

the 1st draft 'g' chapter of the geometric deep learning book is live! 🚀

alice enters the magical, branchy world of graphs & gnns 🕸️ (llms are there too!)

i've spent 7+ years studying, researching & talking about graphs. this text conveys what i've learnt.

more in thread 💎

Neel Nanda (@neelnanda5)

I'm very excited about our vision for "mech interp" of CoT:

Study reasoning steps and their connections - analogous to activations

Don't just read it: study attn, causally intervene, and, crucially, resampling - study the distn over CoTs, not just this one

There's lots to do!
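
A minimal sketch of what "study the distn over CoTs, not just this one" could look like in practice, assuming a hypothetical generate(prompt) function that samples a single completion from the model; this illustrates the resampling idea only, not the authors' actual tooling.

```python
from collections import Counter

def answer_distribution(generate, prompt, cot_prefix, n_samples=50):
    """Resample many chain-of-thought continuations from the same reasoning
    prefix and tally the final answers, approximating the distribution over
    CoTs rather than reading a single sampled chain."""
    answers = Counter()
    for _ in range(n_samples):
        completion = generate(prompt + cot_prefix)   # one sampled continuation
        lines = completion.strip().splitlines()
        answers[lines[-1] if lines else ""] += 1     # crude final-answer extraction
    return answers

# Causal intervention: edit one reasoning step, resample again, and compare
# the two answer distributions to estimate that step's downstream effect.
# baseline   = answer_distribution(generate, prompt, original_step)
# intervened = answer_distribution(generate, prompt, edited_step)
```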

Peng Qi (@qi2peng2)

Seven years ago, I co-led a paper called 𝗛𝗼𝘁𝗽𝗼𝘁𝗤𝗔 that has motivated and facilitated many #AI #Agents research works since. Today, I'm asking that you stop using HotpotQA blindly for agents research in 2025 and beyond. In my new blog post, I revisit the brief history of

Anthropic (@anthropicai)

New Anthropic research: Why do some language models fake alignment while others don't?

Last year, we found a situation where Claude 3 Opus fakes alignment.

Now, we’ve done the same analysis for 25 frontier LLMs—and the story looks more complex.

Scott Emmons (@emmons_scott)

Is CoT monitoring a lost cause due to unfaithfulness? 🤔

We say no. The key is the complexity of the bad behavior. When we replicate prior unfaithfulness work but increase complexity—unfaithfulness vanishes!

Our finding: "When Chain of Thought is Necessary, Language Models

Rohin Shah (@rohinmshah)

Chain of thought monitoring looks valuable enough that we’ve put it in our Frontier Safety Framework to address deceptive alignment. This paper is a good explanation of why we’re optimistic – but also why it may be fragile, and what to do to preserve it. x.com/balesni/status…

Gabriele Berton (@gabriberton)

If you think NeurIPS reviews are getting worse because of LLMs, think again.
The seminal 2015 distillation paper from Geoffrey Hinton, Oriol Vinyals, and Jeff Dean was rejected by NeurIPS for lack of impact, was published as a workshop paper, and now has 26k citations🤯

alphaXiv (@askalphaxiv)

In-context learning is just gradient descent without explicit training!

This paper "Learning without training: The implicit dynamics of in-context learning" shows that ICL can be mathematically interpreted as an implicit low-rank weight update during inference.

Emmanuel Ameisen (@mlpowered)

Earlier this year, we showed a method to interpret the intermediate steps a model takes to produce an answer.

But we were missing a key bit of information: explaining why the model attends to specific concepts.

Today, we do just that 🧵

ludwig (@ludwigabap)

The "Circuit Analysis Research Landscape" for August 2025 is out and is an interesting read on "the landscape of interpretability methods" and model biology 

Qwen3 4B is also out on Circuit Tracer

steve hsu (@hsu_steve)

Is Chain-of-Thought Reasoning of LLMs a Mirage?

... Our results reveal that CoT reasoning is a brittle mirage that vanishes when it is pushed beyond training distributions. This work offers a deeper understanding of why and when CoT reasoning fails, emphasizing the ongoing

Pingbang Hu 🇹🇼 (@pingbanghu)

As a PhD student, sometimes I feel isolated. Not from the world, but from myself. A while ago, at an event full of startup founders, someone asked me what I really wanted to do, and I said: "I want to do impactful and meaningful things." That's from the bottom of my heart, for

Andrej Karpathy (@karpathy)

In the era of pretraining, what mattered was internet text. You'd primarily want a large, diverse, high-quality collection of internet documents to learn from. In the era of supervised finetuning, it was conversations. Contract workers are hired to create answers for questions, a bit

Andrej Karpathy (@karpathy)

Excited to release new repo: nanochat!
(it's among the most unhinged I've written).

Unlike my earlier similar repo nanoGPT which only covered pretraining, nanochat is a minimal, from scratch, full-stack training/inference pipeline of a simple ChatGPT clone in a single,