Gillian Hadfield (@ghadfield)'s Twitter Profile
Gillian Hadfield

@ghadfield

AI policy and alignment; integrating law, economics & computer science to build normatively competent AI that knows how to play well with humans

ID: 29931309

Link: https://gillianhadfield.org/
Joined: 09-04-2009 05:41:50

2.2K Tweets

4.4K Followers

754 Following

Dylan Hadfield-Menell (@dhadfieldmenell)'s Twitter Profile Photo

It’s somewhat nice getting to see who actually cared about AI harms and who was just using it as cover for other aims. 2 camps I notice:

Dylan Hadfield-Menell (@dhadfieldmenell)'s Twitter Profile Photo

Put this in the category of empirical results that were predicted by AI safety researchers. But, of course, the concern that AI systems would manipulate people for ulterior motives is “science fiction” and like “worrying about overpopulation on Mars.” Great work.

Atoosa Kasirzadeh (@dr_atoosa)'s Twitter Profile Photo

In this review paper, we advocate for the normalization of AI safety as an inherent component of AI development and deployment. AI safety should be a standard practice integrated into every stage of AI creation and deployment. Developing and deploying safe AI should be a

Gillian Hadfield (@ghadfield)'s Twitter Profile Photo

Everyone, including those who think we're building powerful AI to improve lives for everyone, should take seriously how poorly our current economic indicators (unemployment, earnings, inflation) capture the well-being of low- and moderate-income folks. politico.com/news/magazine/…

Dylan Hadfield-Menell (@dhadfieldmenell)'s Twitter Profile Photo

The field of AI has overfit to easy-to-measure objectives. Once we can measure something, we can make the number go up. This would be valuable if measuring value were easy. But measuring value is hard, so AI usually underperforms when you move to real tasks.

Ethan Mollick (@emollick)'s Twitter Profile Photo

As an academic, the relative silence of the humanities and many social sciences about the future of our world with AI (besides far too many saying "AI is bad" without nuance), is a shame. These are fields with a lot to say about the nature of being human, silent at a key moment.

Yoshua Bengio (@yoshua_bengio)'s Twitter Profile Photo

Early signs of deception, cheating & self-preservation in top-performing models in terms of reasoning are extremely worrisome. We don't know how to guarantee AI won't have undesired behavior to reach goals & this must be addressed before deploying powerful autonomous agents.

Cooperative AI Foundation (@coop_ai)'s Twitter Profile Photo

The development and widespread deployment of advanced AI agents will give rise to multi-agent systems of unprecedented complexity. A new report from staff at CAIF and a host of leading researchers explores the novel and under-appreciated risks these systems pose. Details below.

Jack Clark (@jackclarksf)'s Twitter Profile Photo

If you want to build and deploy powerful AI systems you need to evaluate them for capabilities and potential national security risks. Recently, governments have stood up orgs for companies to work with on the natsec part of this and these have been extraordinarily helpful.

Séb Krier (@sebkrier)'s Twitter Profile Photo

One of the most underrated areas of AI governance is cooperative AI research. Alignment is important but may be insufficient for good outcomes. Using AI to help solve cooperation problems seems very important to me. See these excerpts from Allan Dafoe's chat with Rob Wiblin.

Dylan Hadfield-Menell (@dhadfieldmenell)'s Twitter Profile Photo

If you pretend that xrisk from ASI misalignment is some novel, incredibly complex failure mode (instead of a simple extrapolation of established theories of incentive design), you blind people to the evidence for, and predictive power of, the theories that motivate the risk.

John Arnold (@johnarnoldfndtn)'s Twitter Profile Photo

We at Arnold Ventures funded a pilot to bring a Nordic-style restorative justice model to a prison in PA and assess its impact. The question was whether it could work within a vastly different criminal justice system. Initial results are so promising that PA is expanding the

Dylan Hadfield-Menell (@dhadfieldmenell)'s Twitter Profile Photo

Because it is a bad idea to assume your validator has no bugs. Any approach that assumes a perfect validator is doomed to fail except in certain narrow applications. Most AI approaches implicitly or explicitly assume a perfect validator.

Gillian Hadfield (@ghadfield)'s Twitter Profile Photo

This is a really important result for a lot of people working in alignment — the assumption that we can prompt, or rely on in-context learning, to reliably reflect specific values is pretty widespread.

Yoshua Bengio (@yoshua_bengio)'s Twitter Profile Photo

Very relevant piece by Kevin Roose in The New York Times, 3 points that particularly resonate with me: 1⃣ AGI's arrival raises major economic, political and technological questions to which we currently have no answers. 2⃣ If we're in denial (or simply not paying attention), we could

Gillian Hadfield (@ghadfield)'s Twitter Profile Photo

I avoid politics here, but this is just so morally outrageous: a Black man awarded the Medal of Honor in 1970 by Richard Nixon for his brave service in Vietnam has his page scrubbed by the Department of Defense, with "deimedal" inserted in the URL. theguardian.com/us-news/2025/m…

Arvind Narayanan (@random_walker)'s Twitter Profile Photo

At a recent Princeton University panel I was asked if the crisis created by AI is also an opportunity for fundamental changes to higher ed. Yes! I’ve been thinking and writing about this since before ChatGPT’s release. I see two big opportunities. The first is to separate