John Burden (@johnjburden) Twitter Tweets • TwiCopy

John Burden

@johnjburden

Programme Co-director of the Kinds of Intelligence programme and Senior Research Fellow at @LeverhulmeCFI

ID: 1290967456991903744

05-08-2020 11:06:53

341 Tweets

174 Followers

305 Following

Séb Krier

@sebkrier

6 months ago

it's not enough to have good ideas. they must be transcribed in the correct sacred format (illuminated manuscript), pay tribute to established scholars (tithe to the clergy), be judged by the ruling guild (peer review by the elders), and presented at costly gatherings (royal

128 likes, 7 replies, 22 retweets

Bernie Sanders

@berniesanders

3 months ago

The CEO of Anthropic (a powerful AI company) predicts that AI could wipe out HALF of entry-level white collar jobs in the next 5 years. We must demand that increased worker productivity from AI benefits working people, not just wealthy stockholders on Wall St. AI IS A BIG DEAL.

4.4K likes, 305 replies, 501 retweets

METR

@metr_evals

3 months ago

At METR, we’ve seen increasingly sophisticated examples of “reward hacking” on our tasks: models trying to subvert or exploit the environment or scoring code to obtain a higher score. In a new post, we discuss this phenomenon and share some especially crafty instances we’ve seen.

221 likes, 3 replies, 37 retweets

Ryan Greenblatt

@ryanpgreenblatt

3 months ago

This paper doesn't show fundamental limitations of LLMs: - The "higher complexity" problems require more reasoning than fits in the context length (humans would also take too long). - Humans would also make errors in the cases where the problem is doable in the context length. -

558 likes, 24 replies, 52 retweets

James Miller

@jimdmiller

3 months ago

2.2K likes, 123 replies, 346 retweets

Maia

@maiamindel

2 months ago

Fun fact but "children who grow up in homes with few books do worse in school, so we should give them more books" is quite literally the textbook example of a confounding variable (how much the *parents* value learning and education)

2.2K likes, 36 replies, 118 retweets

Lorenzo Pacchiardi

@lpacchiardi

2 months ago

LLMs • agentic AI • #DataScience 🧵 1/ 🚨 New paper: “Measuring Data-Science Automation: A Survey of Evaluation Tools for AI Assistants & Agents.” If you care about the impact of LLMs and LLM agents on Data Science and how to measure it, this is for you!

8 likes, 1 reply, 6 retweets

Pablo Moreno 🔸 🇪🇺 🇺🇦

@pablomorecasa

2 months ago

The June edition of the AI evaluation digest. If you want to be up to speed with the scientific literature on AI evaluation, this is a good place to start. open.substack.com/pub/aievaluati…

3 likes, 0 replies, 3 retweets

Nirit Weiss-Blatt, PhD

@drtechlash

a month ago

🚨 The UK AISI identified four methodological flaws in AI "scheming" studies (deceptive alignment) conducted by Anthropic, METR, Apollo Research, and others: "We call researchers studying AI 'scheming' to minimise their reliance on anecdotes, design research with appropriate

265 likes, 13 replies, 55 retweets

John Burden

@johnjburden

a month ago

Vibecoders, is there a good way to use Cursor/Windsurf/whatever with my ChatGPT o3? I.e., custom instructions and visibility of other conversation threads?

0 likes, 0 replies, 0 retweets