Ben Levinstein (@ben_levinstein)'s Twitter Profile
Ben Levinstein

@ben_levinstein

Philosophy prof at Illinois interested in AI alignment, epistemology, and decision theory.

ID: 1340337260793880577

http://www.levinstein.org · Joined 19-12-2020 16:45:17

1.1K Tweets

1.1K Followers

529 Following

Seán Ó hÉigeartaigh (@s_oheigeartaigh)'s Twitter Profile Photo

It's not done yet. Hearing reports that the Nobel prize for literature will be going to the authors of "OpenAI's nonprofit governance structure" for outstanding contributions to creative fiction.

Ben Levinstein (@ben_levinstein)'s Twitter Profile Photo

Why are humans winning the prize instead of the AI solving the actual problems? AlphaFold is more deserving. This seems like some woke DEI thing.

Harry Crane (@harrydcrane)'s Twitter Profile Photo

Dissecting Inefficiency in Prediction Markets

It is well known that polls are subject to statistical errors, and this error is accounted for by the margin of error.  Betting markets, on the other hand, are subject to inefficiency.  These inefficiencies can be accounted for by
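For reference, the "margin of error" invoked here is the standard survey-sampling quantity; a worked textbook example (my illustration, not part of the thread) for a poll of n = 1000 respondents at p = 0.5:

\[
\mathrm{MOE} = z_{0.975}\sqrt{\frac{p(1-p)}{n}}
  = 1.96\sqrt{\frac{0.5 \times 0.5}{1000}} \approx 0.031,
\]

i.e. roughly ±3.1 percentage points at 95% confidence.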
Owain Evans (@owainevans_uk)'s Twitter Profile Photo

New paper:
Are LLMs capable of introspection, i.e. special access to their own inner states?
Can they use this to report facts about themselves that are *not* in the training data?
Yes — in simple tasks at least! This has implications for interpretability + moral status of AI 🧵
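As a toy illustration of the self-prediction setup the thread describes (comparing a model's report about its own behavior with its actual behavior), here is a Python sketch with made-up stand-ins; the deterministic "model" and all names are hypothetical, not the paper's code:

# Toy illustration of self-prediction: does a model's report about
# its own behavior match its actual behavior on unseen prompts?
# Everything here is a made-up stand-in, not the paper's setup.

def object_level_output(prompt: str) -> str:
    # Stand-in for the model's actual behavior: return the
    # longest word in the prompt.
    return max(prompt.split(), key=len)

def introspective_report(prompt: str) -> str:
    # Stand-in for the model's answer to "what would you output
    # for this prompt?", here an imperfect introspector that
    # guesses the last word instead.
    return prompt.split()[-1]

prompts = ["models can introspect", "simple tasks at least", "inner states"]

# Score: fraction of prompts where the self-report matches behavior.
hits = sum(introspective_report(p) == object_level_output(p) for p in prompts)
print(f"self-prediction accuracy: {hits}/{len(prompts)}")

The point of the comparison is that above-chance accuracy on prompts absent from training is evidence of privileged access to one's own dispositions rather than memorization.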
Ben Levinstein (@ben_levinstein)'s Twitter Profile Photo

This was pretty cool to play around with. I asked it to turn the whole world into paperclips, though, and it struggled to find anything useful from Saks Fifth Avenue.

Ben Levinstein (@ben_levinstein)'s Twitter Profile Photo

Sports are a counterexample to Kant's claim that you need to adopt the position of a disinterested observer to appreciate art.

Ben Levinstein (@ben_levinstein)'s Twitter Profile Photo

Can any AI do a good job turning hand-drawn diagrams into TikZ equivalents? I want to do this in TikZ and also hate using TikZ.

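For reference, the target output is short but fiddly to write by hand. A minimal sketch of the kind of TikZ a hand-drawn two-node diagram might become (a hypothetical example, compiled with pdflatex):

\documentclass{standalone}
\usepackage{tikz}
\begin{document}
\begin{tikzpicture}
  % Two labeled nodes joined by an arrow: the kind of diagram
  % one might draw by hand and want reproduced in TikZ.
  \node[draw, circle] (a) at (0,0) {A};
  \node[draw, circle] (b) at (3,0) {B};
  \draw[->, thick] (a) -- (b) node[midway, above] {$f$};
\end{tikzpicture}
\end{document}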
Ben Levinstein (@ben_levinstein)'s Twitter Profile Photo

GPT-4o seems so dumb and useless these days compared to Claude. Claude tells me to STFU multiple times a day, which stops lots of my work and hurts my feelings. I've tried switching over to GPT, but it's not the same. Do people still use 4o much for work- or coding-related tasks?

Joe Carlsmith (@jkcarlsmith)'s Twitter Profile Photo

My current take on Apollo's recent scheming paper is that they aren’t emphasizing the most interesting results, which are the sandbagging results in section 3.6 and appendix A.6 (screenshot of the key numbers below).

In particular: the paper frames its results centrally as
Anthropic (@anthropicai)'s Twitter Profile Photo

New Anthropic research: Alignment faking in large language models.

In a series of experiments with Redwood Research, we found that Claude often pretends to have different views during training, while actually maintaining its original preferences.
Sean Carroll (@seanmcarroll)'s Twitter Profile Photo

Mindscape 301 | Tina Eliassi-Rad on AI, Networks, and Epistemic Instability. If we're all just vectors in a huge dataset, might as well turn it to our advantage. #MindscapePodcast
preposterousuniverse.com/podcast/2025/0…