Ryan Kidd (@ryan_kidd44)'s Twitter Profile
Ryan Kidd

@ryan_kidd44

Co-Director @MATSprogram, Co-Founder @LondonSafeAI, Regrantor @Manifund | PhD in physics | Accelerate AI alignment + build a better future for all

ID: 1102399276334759936

Joined: 04-03-2019 02:44:03

1.1K Tweets

1.1K Followers

1.1K Following

Marius Hobbhahn (@mariushobbhahn)

We're hiring for Research Scientists / Engineers!
- We closely work with all frontier labs
- We're a small org and can move fast
- We can choose our own agenda and what we publish
We're especially looking for people who enjoy fast empirical research. Deadline: 31 Oct!

Vaidehi Agarwalla (@vaidehiagrwalla)

we're hiring for several roles at Theorem - reach out if you're exploring!
- ml research scientists
- ml research engineers
- compiler engineers
- senior SWEs

Wenx (@firebirdwen)


🔔New paper: Can reasoning models hide their reasoning?
We stress-tested Chain-of-Thought (CoT) monitoring and found that while monitors detect ~96% of hidden malicious intent under normal conditions, ⚠️detection can collapse to ~10% under strong obfuscation pressure.  🧵
Scott Alexander (@slatestarcodex)

Sriram Krishnan, Ryonan, Dean W. Ball: Thanks for your interest. I'm not expecting too much danger in the next 18 months, so these would mostly be small updates, but to answer the question:

MORE WORRIED:
- Anything that looks like shorter timelines, especially superexponential progress on METR time horizons graph or

Stanislav Fort (@stanislavfort)


In 2025, only 4 security vulnerabilities with CVEs were disclosed in OpenSSL = the crypto library securing most of the internet. 

Aisle's autonomous AI system discovered 3 out of the 4. And proposed the fixes that remediated them.
AI Impacts (@aiimpacts)


Our surveys’ findings that AI researchers assign a median 5-10% to extinction or similar made a splash (NYT, NBC News, TIME..)

But people sometimes underestimate our survey’s methodological quality due to various circulating misconceptions.

Today, an FAQ correcting key errors:
Sharan (@_maiush)


AI that is “forced to be good” v “genuinely good”
Should we care about the difference? (yes!)

We’re releasing the first open implementation of character training. We shape the persona of AI assistants in a more robust way than alternatives like prompting or activation steering.
Andy Masley (@andymasley)

Hammered out some thoughts on why I was motivated to post a lot about data centers: the popular conversation about them has been disproportionately informed by very low-trust intuitions I think are bad andymasley.substack.com/p/data-centers…

Mike McCormick (@mikemccormick_)


This post is an experiment!

I want to fund & help accomplished people and nascent companies + nonprofits working to make AI safe, secure and good for humanity.

If you're doing that, or know somebody super credible who is, ping me.

And if you're a fan of Halcyon Futures' work,
Ryan Kidd (@ryan_kidd44)

I wrote a blog post on why I think the AI safety ecosystem undervalues founders and field-builders and what to do about it! lesswrong.com/posts/yw9B5jQa…

Ryan Kidd (@ryan_kidd44)

As a counterpoint to "e/accs", I like the label "AI safers". This is:
- Less unwieldy than "AI notkilleveryoneists"
- More accurate than "AI doomers"
- More inclusive than "EAs"
"Safer" also implies that AI can be made more safe by gradation, rather than being an absolutist term.