Neel Nanda (@neelnanda5) 's Twitter Profile
Neel Nanda

@neelnanda5

Mechanistic Interpretability lead at DeepMind. Formerly @AnthropicAI, independent. In this to reduce AI X-risk. Neural networks can be understood, let's go do it!

ID: 1542528075128348674

Website: http://neelnanda.io

Joined: 30-06-2022 15:18:58

4.4K Tweets

25.25K Followers

117 Following


IMO chain of thought monitoring is actually pretty great. It gives additional safety and should be widely used on frontier models.

CoT improves capabilities. Thoughts are intermediate state of computation. On the hardest tasks, they have real info.

It's not perfect, but what is?