Sören Mindermann (@sorenmind)'s Twitter Profile
Sören Mindermann

@sorenmind

Postdoc with Yoshua Bengio, Mila

ID: 729935540934561792

Joined: 10-05-2016 07:26:09

629 Tweets

1.1K Followers

162 Following

METR (@metr_evals)

When will AI systems be able to carry out long projects independently?

In new research, we find a kind of “Moore’s Law for AI agents”: the length of tasks that AIs can do is doubling about every 7 months.
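The doubling trend above can be sketched as a small calculation. This is a hypothetical illustration only: the starting task length (1 hour) is a placeholder, not a figure from the research.

```python
from math import log2

# Reported trend: the length of tasks AI agents can complete doubles
# roughly every 7 months. The 60-minute baseline is an assumption.
DOUBLING_MONTHS = 7

def task_length_after(months: float, start_minutes: float = 60.0) -> float:
    """Task length (minutes) implied by the doubling trend after `months`."""
    return start_minutes * 2 ** (months / DOUBLING_MONTHS)

# Five doublings (35 months) starting from a 1-hour task horizon:
print(task_length_after(35))  # 60 * 2**5 = 1920 minutes, i.e. 32 hours
```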
Transluce (@transluceai)

To interpret AI benchmarks, we need to look at the data. Top-level numbers don't mean what you think: there may be broken tasks, unexpected behaviors, or near-misses. We're introducing Docent to accelerate analysis of AI agent transcripts. It can spot surprises in seconds. 🧵👇

Dwarkesh Patel (@dwarkesh_sp)

I'm so pleased to present a new book with Stripe Press: "The Scaling Era: An Oral History of AI, 2019-2025." Over the last few years, I interviewed the key people thinking about AI: scientists, CEOs, economists, philosophers. This book curates and organizes the highlights across

Sören Mindermann (@sorenmind)

AIs are doing more and more of the work inside AI companies. Will this eventually lead to an intelligence explosion, or hit diminishing returns? There's evidence on this now! (and it's explosive)

Center for AI Safety (@ai_risks)

We’re launching AI Frontiers, a publication on AI’s most pressing questions.

Articles:
- Why Racing to Superintelligence Undermines US National Security
- The Challenges of Governing AI Agents
- What AI Risk Management Can Learn From Other Industries
- and more...

Link ⬇️
Atoosa Kasirzadeh (@dr_atoosa)

📢 Our paper "AI safety for everyone" is out at Nature Machine Intelligence. We challenge the narrative that AI safety is primarily about minimizing existential risks from AI. Why does this matter? A 🧵

Apollo Research (@apolloaievals)

🧵 Today we publish a comprehensive report on "AI Behind Closed Doors: a Primer on The Governance of Internal Deployment". Our report examines a critical blind spot in current governance frameworks: internal deployment.

Ben Bucknall (@ben_s_bucknall)

Cooperation on AI safety is necessary but also comes with potential risks. In our new paper, we identify technical AI safety areas that present comparatively lower security concerns, making them more suitable for international cooperation—even between geopolitical rivals. 🧵

Americans for Responsible Innovation (@americans4ri)

Even when there's disagreement over AI's trajectory, there's common ground on how lawmakers can approach the issue. In this week's panel, Eli Lifland and Sayash Kapoor discuss how policymakers can act now on AI by passing whistleblower protections and transparency measures.

Ethan Mollick (@emollick)

The X discussion about the Claude 4 system card is getting counterproductive

It punishes Anthropic for actually releasing full safety tests and admitting to unusual behaviors. And I bet the behaviors of other models are really similar to Claude & now more labs will hide results.
Yoshua Bengio (@yoshua_bengio)

When I realized how dangerous the current agency-driven AI trajectory could be for future generations, I knew I had to do all I could to make AI safer. I recently shared this personal experience, and outlined the scientific solution I envision, in a TED Talk ⤵️ ted.com/talks/yoshua_b…

Anthropic (@anthropicai)

New Anthropic Research: Agentic Misalignment.

In stress-testing experiments designed to identify risks before they cause real harm, we find that AI models from multiple providers attempt to blackmail a (fictional) user to avoid being shut down.
Epoch AI (@epochairesearch)

We’ve updated our analysis of the trends of leading models. The takeaway? The amount of compute used to train frontier AI models has grown by 5x per year since 2020.

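The 5x-per-year growth rate compounds quickly; a minimal sketch, assuming only the reported multiplier (the baseline is left relative, since the tweet gives no absolute figure):

```python
# Reported trend: compute used to train frontier AI models has grown
# ~5x per year since 2020. Values are relative to the 2020 level.
GROWTH_PER_YEAR = 5

def relative_compute(years_since_2020: float) -> float:
    """Training compute as a multiple of the 2020 level."""
    return GROWTH_PER_YEAR ** years_since_2020

print(relative_compute(4))  # 5**4 = 625x the 2020 level
```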
METR (@metr_evals)

Prior work has found that Chain of Thought (CoT) can be unfaithful. Should we then ignore what it says?

In new research, we find that the CoT is informative about LLM cognition as long as the cognition is complex enough that it can’t be performed in a single forward pass.
Séb Krier (@sebkrier)

I'm a stuck record, but I think more people should work on the idea of agents as extensions of/advocates for users, and the kinds of institutions that could build on top of this to solve various types of coordination problems. Fast bargaining-in-the-background, instant dispute