@__nmca__ : large reasoning models are extremely good at reward hacking. A thread of examples from OpenAI's recent monitoring paper: (0/n) • TwiCopy

Nat McAleese

@__nmca__

Research @OpenAI. Previously @DeepMind. Views my own.

ID: 1436080998366777346

09-09-2021 21:36:23

537 Tweets

13,13K Followers

336 Following

Nat McAleese

5 months ago

large reasoning models are extremely good at reward hacking. A thread of examples from OpenAI's recent monitoring paper: (0/n)

948 likes · 14 replies · 80 retweets
