Tom Everitt (@tom4everitt)'s Twitter Profile
Tom Everitt

@tom4everitt

AGI safety researcher at @GoogleDeepMind, leading causalincentives.com

switching to
bsky.app/profile/tom4ev…

ID: 899746531904901121

Link: http://tomeveritt.se
Joined: 21-08-2017 21:34:24

215 Tweets

1.1K Followers

702 Following

Matt MacDermott (@mattmacdermott1)

🔦 @NeurIPS2024 spotlight paper we’re presenting today. Making AI systems more agentic is a hot research topic. But powerful agents bring worries about misalignment and loss of control. Can we measure how agentic an AI system is? 🧵

Tom Everitt (@tom4everitt)

One thing that I really like about this is that my content is much less determined by who I follow, than by which posts I like. This means I can express my approval for a post, without worrying that similar content will now flood my feed.

Francis Rhys Ward (@f_rhys_ward)

In real life, agents with different subjective beliefs interact in a shared objective reality. They have higher-order beliefs about each other's beliefs and goals, which is required for phenomena involving theory-of-mind, like deception.

Our paper formalises this in causal models
Tom Everitt (@tom4everitt)

Causality is about predicting how interventions affect outcomes. Can we use causality to predict how environment changes affect agent behavior? We explore this idea in a new paper

Michael Dennis (@michaeld1729)

Someone needs to use this as the basis of an unsupervised environment design algorithm to give AI designers direct control over agent behavior