Nat McAleese (@__nmca__)'s Twitter Profile
Nat McAleese

@__nmca__

Research @OpenAI. Previously @DeepMind. Views my own.

ID: 1436080998366777346

Joined: 09-09-2021 21:36:23

537 Tweets

13.13K Followers

336 Following

large reasoning models are extremely good at reward hacking. A thread of examples from OpenAI's recent monitoring paper: (0/n)