Tyler John (@tyler_m_john) Twitter Tweets • TwiCopy

Kevin Roose

6 days ago

Podcast hosts: never forget to ask your silly joke questions, like "should there be a Polymarket-for-kids inside Roblox?" You never know when you'll get a sincere, enthusiastic yes.

Likes: 1,1K · Replies: 19 · Reposts: 107

Igor Kurganov

Agree with Sriram that an update is appropriate, but most reactions miss that updating 6 months into the prediction is a feature, not a bug, of writing out concrete scenarios. We want more people to put their neck on the line with concrete, path-dependent predictions rather than

Likes: 75 · Replies: 2 · Reposts: 5

Neel Nanda

@neelnanda5

5 days ago

New video: If a future LLM is dangerously misaligned, could we tell? I don't know and this is an issue. I discuss the emerging area of science of misalignment: what does misaligned cognition look like and *why* might LLMs act misaligned? Plus: Sound interesting? Apply to MATS!

Likes: 135 · Replies: 4 · Reposts: 10

Ilya Sutskever

@ilyasut

5 days ago

Important work

Likes: 5,5K · Replies: 223 · Reposts: 384

Nick Matarese

@nmatares

5 days ago

Nano Banana Pro can solve crosswords. Took about 180s in AI Studio. "Solve it. Use red pen."

Likes: 1,1K · Replies: 26 · Reposts: 75

Tyler John

@tyler_m_john

4 days ago

Oh my. Last I heard labs had given up on solving learning and were just going to scale up RL-augmented in-context learning. Will follow experimental results in this paradigm with great interest.

Likes: 1 · Replies: 0 · Reposts: 0

Dwarkesh Patel

@dwarkesh_sp

3 days ago

Tomorrow

Likes: 11,11K · Replies: 643 · Reposts: 467

Tyler John

@tyler_m_john

3 days ago

This is a dream job

Likes: 21 · Replies: 2 · Reposts: 1

Joel Becker

@joel_bkr

3 days ago

How might METR's time horizon trend change if compute growth slows? In a new paper, Parker Whitfill, Ben Snodin, and I show that trends + a common (and contestable -- read on!) economic model of algorithmic progress can imply substantial delays in AI capability milestones.

Likes: 145 · Replies: 7 · Reposts: 27

Apollo Research

@apolloaievals

3 days ago

“Loss of control” lacks a common, actionable, definition and conceptualization. In our new research report we: 1) propose a new taxonomy, 2) put forward actionable mitigations today, and 3) motivate the need for preparedness. We propose a taxonomy for loss of control 👇🧵

Likes: 38 · Replies: 4 · Reposts: 12

Nuño Sempere

@nunosempere

3 days ago

"forecasters believe there’s a 51% chance (45% to 60%) that there will be an AI-assisted cyberattack causing at least $1 billion in damages over the next three months, slightly up from a 44% chance (37% to 50%) in week 35 of this year."

Likes: 5 · Replies: 1 · Reposts: 1

Charlotte Stix

@charlotte_stix

3 days ago

Despite increasing policy and research attention to Loss of Control, decision- and policymakers are still operating in the absence of a uniform conceptualization and definition of LoC. Today, we bridge this gap through a novel taxonomy & preparedness framework 👇 Apollo Research

Likes: 3 · Replies: 0 · Reposts: 1

Shakeel

@shakeelhashim

3 days ago

I’m so tired

Likes: 92 · Replies: 5 · Reposts: 1

Marius Hobbhahn

@mariushobbhahn

3 days ago

Our governance team wrote a new paper on Loss of Control. I think it is the best overall characterization and explanation of the concept so far. I especially like this figure, which tries to quantify previous reports using the term!

Likes: 34 · Replies: 0 · Reposts: 5

Tyler John

@tyler_m_john

3 days ago

You guys took this too seriously

Likes: 7 · Replies: 1 · Reposts: 1

Daniel Eth (yes, Eth is my actual last name)

@daniel_271828

3 days ago

Alex hits it out of the park in this interview. He also exposes the hypocrisy of the Andreessen-OpenAI super PAC - they claim to want preemption just to avoid a patchwork of state laws… but Alex’s platform is for an actual federal standard, and they’re aiming at him the same

Likes: 38 · Replies: 2 · Reposts: 7

Sriram Krishnan

@sriramk

3 days ago

Very excited for the Genesis Mission -> whitehouse.gov/fact-sheets/20…

Likes: 485 · Replies: 29 · Reposts: 36
