Lewis Ho's (@_lewisho) Twitter Profile
Lewis Ho

@_lewisho

Research Scientist at Google DeepMind

ID: 877336277330165760

Joined: 21-06-2017 01:24:03

38 Tweets

232 Followers

165 Following

METR (@metr_evals)'s Twitter Profile Photo

We're excited to share a proposal for evals-based catastrophic risk reduction that AI developers can adopt today: Responsible Scaling Policies (RSPs) that establish conditions under which it would be unsafe to continue advancing AI capabilities without additional safety measures.

harry law (hopfield network truther) (@lawhsw)'s Twitter Profile Photo

1/9 Amidst lots of discussion about what an appropriate international governance regime for AI might look like, Lewis Ho and I wrote for @nature about whether an organisation with a ‘dual mandate’ to manage risk and spread benefits could be a promising model to explore

Zach Freitas-Groff 🔸 (@zdgroff)'s Twitter Profile Photo

📈Job market paper time📉 I’m excited to finally share my job market paper! My JMP studies whether and why policy choices are stubbornly persistent. For example, Oregon has an income tax, but Washington doesn’t—seemingly because of nearly century-old choices. Is this typical?

Toby (@tshevl)'s Twitter Profile Photo

In 2024, the AI community will develop more capable AI systems than ever before. How do we know what new risks to protect against, and what the stakes are? Our research team at Google DeepMind built a set of evaluations to measure potentially dangerous capabilities: 🧵

Lewis Ho (@_lewisho)'s Twitter Profile Photo

GDM's 1st step towards the ambitious ideals of responsible scaling, these being: identifying AI capabilities that pose severe risk, using evals to detect such capabilities, preparing and articulating mitigation plans, and involving external parties in the process as appropriate.

Sarah Cogan (@sarah_cogan)'s Twitter Profile Photo

Curious about how we evaluate dangerous capabilities at Google DeepMind? 🤔 The Frontier Safety team just open-sourced resources for our in-house CTF & self-proliferation challenges! Check it out: github.com/google-deepmin…

Allan Dafoe (@allandafoe)'s Twitter Profile Photo

We are hiring! Google DeepMind's Frontier Safety and Governance team is dedicated to mitigating frontier AI risks; we work closely with technical safety, policy, responsibility, security, and GDM leadership. Please encourage great people to apply! 1/ boards.greenhouse.io/deepmind/jobs/…

Séb Krier (@sebkrier)'s Twitter Profile Photo

Are you tired of reading bad Twitter takes on AGI governance? Do you want to work on some of the most exciting and thorny questions relating to AGI safety and governance? Then you should apply for this Research Scientist position with the Frontier Safety & Governance team ASAP.

Chris Painter (@chrispainteryup)'s Twitter Profile Photo

We thought it would be helpful to have all of the similar themes/components from each of DeepMind's Frontier Safety Framework, OpenAI's Preparedness Framework, and Anthropic's Responsible Scaling Policy in one place.

David Lindner (@davlindner)'s Twitter Profile Photo

New Google DeepMind safety paper! LLM agents are coming – how do we stop them finding complex plans to hack the reward? Our method, MONA, prevents many such hacks, *even if* humans are unable to detect them! Inspired by myopic optimization but better performance – details in🧵

Allan Dafoe (@allandafoe)'s Twitter Profile Photo

I'm proud of GoogleDeepMind/Google's v2 update to our Frontier Safety Framework. We were the first major tech company to produce an explicit risk management framework for extreme risks, and I'm glad we are continuing to push ahead on safety best practice. deepmind.google/discover/blog/…

Victoria Krakovna (@vkrakovna)'s Twitter Profile Photo

We are excited to release a short course on AGI safety! The course offers a concise and accessible introduction to AI alignment problems and our technical & governance approaches, consisting of short recorded talks and exercises (75 minutes total). deepmindsafetyresearch.medium.com/1072adb7912c

Allan Dafoe (@allandafoe)'s Twitter Profile Photo

Thanks Rob for a great conversation about important topics: why technology drives history, and the rare opportunity of steering it.

Rohin Shah (@rohinmshah)'s Twitter Profile Photo

We're hiring! Join an elite team that sets an AGI safety approach for all of Google -- both through development and implementation of the Frontier Safety Framework (FSF), and through research that enables a future stronger FSF.

Atul Gawande (@atul_gawande)'s Twitter Profile Photo

Yesterday, Rubio terminated 5800 USAID contracts – more than 90% of its foreign aid programs – in defiance of the courts. Here’s a list of just some of the lifesaving awards that were terminated. Nearly all were Congressionally mandated. They've saved millions of lives. 🧵

Rohin Shah (@rohinmshah)'s Twitter Profile Photo

Just released GDM’s 100+ page approach to AGI safety & security! (Don’t worry, there’s a 10 page summary.) AGI will be transformative. It enables massive benefits, but could also pose risks. Responsible development means proactively preparing for severe harms before they arise.

Lewis Ho (@_lewisho)'s Twitter Profile Photo

We have updated the Gemini 2.5 Pro model card with results from our FSF evaluations. These continue to be critical for helping us understand how to keep our systems safe amidst the dizzyingly impressive capability improvements.