Joshua Clymer (@joshua_clymer)'s Twitter Profile
Joshua Clymer

@joshua_clymer

Sorting out safety evaluation methodologies at Redwood Research.

ID: 1515777073163427840

Joined: 17-04-2022 19:40:15

377 Tweets

2.2K Followers

102 Following

Neel Nanda (@neelnanda5)'s Twitter Profile Photo

New post: I'm all for investment in interpretability but IMO this overstates its importance vs other safety methods

I disagree that interp is the only path to reliable safeguards on powerful AI. IMO high reliability is implausible by any means and interp's role is in a portfolio
Joshua Clymer (@joshua_clymer)'s Twitter Profile Photo

AI companies can potentially share internal misalignment incidents without hurting their reputation -- just hand them to a third party who will anonymize their origin and publish them.

Joshua Clymer (@joshua_clymer)'s Twitter Profile Photo

I'm glad for more serious investigation of existential risk from think tanks, but I think ASI will be much better at identifying paths to human extinction than these authors. I wish they did not make sweeping claims like "Extinction threats posed by AI are immensely challenging"

Joshua Clymer (@joshua_clymer)'s Twitter Profile Photo

no matter what you think about ai alignment, we'll eventually need gov oversight of AI to preserve democracy. otherwise, there's no way to stop ASI from manipulating voters.

"If I don't do it, my competitor will." those will be the last words of the free world

Joshua Clymer (@joshua_clymer)'s Twitter Profile Photo

The greatest AI security threat isn’t stealing secrets—it’s planting secret loyalties into models. Whoever controls the AI that seeds an intelligence explosion controls the future.

Joshua Clymer (@joshua_clymer)'s Twitter Profile Photo

Principles for a positive ASI future:
- People maintain at least their current standards of health, living conditions, and political representation.
- The benefits created by ASI are distributed so as to promote equitable human empowerment.
- Humans can control the resources at

Inference (@inferencemag)'s Twitter Profile Photo

Inference is hosting some of the world’s leading experts for a debate on the possibility and potential consequences of automated AI research. 

The debate will be hosted in London on July 1st. There are limited spaces available. Register your interest below