Gabriel Goh (@gabeeegoooh) Twitter Tweets • TwiCopy

RL’s Razor: On-policy RL forgets less than SFT. Even at matched accuracy, RL shows less catastrophic forgetting. Key factor: RL’s on-policy updates bias toward KL-minimal solutions. Theory + LLM & toy experiments confirm RL stays closer to base model.

636 likes · 8 replies · 101 reposts
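A toy illustration of the "KL-minimal solutions" claim in the post above, offered as a sketch under stated assumptions rather than the authors' experiment. The base policy, the set of correct answers, and the KL direction (new policy relative to base) are all hypothetical choices made for this example. It compares two policies that are equally accurate on a new task but sit at different KL distances from the base policy.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions given as lists of probabilities."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def accuracy(policy, correct):
    """Probability mass the policy places on correct answers."""
    return sum(policy[i] for i in correct)

# Hypothetical base policy over four candidate answers to one prompt.
base = [0.40, 0.30, 0.20, 0.10]

# Assumption: answers 0 and 1 are both correct on the new task.
correct = {0, 1}

# Policy A: the base policy renormalised onto the correct answers.
# Among policies that put all mass on correct answers, this one is closest to base in KL.
mass = sum(base[i] for i in correct)
policy_a = [base[i] / mass if i in correct else 0.0 for i in range(len(base))]

# Policy B: collapses onto a single correct answer, as fitting one fixed label might.
policy_b = [0.0, 1.0, 0.0, 0.0]

kl_a = kl_divergence(policy_a, base)
kl_b = kl_divergence(policy_b, base)

print(f"A: accuracy={accuracy(policy_a, correct):.2f}, KL to base={kl_a:.3f}")
print(f"B: accuracy={accuracy(policy_b, correct):.2f}, KL to base={kl_b:.3f}")
```

Both policies reach the same accuracy, yet policy A moves less from the base distribution. The post's claim is that on-policy RL tends toward solutions like A, which this sketch does not itself demonstrate.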

Gabriel Goh

@gabeeegoooh

4 months ago

amazing clarity

8 likes · 0 replies · 0 reposts

Gabriel Goh

@gabeeegoooh

3 months ago

so that's what it looks like as a video

8 likes · 0 replies · 1 repost

Gabriel Goh

@gabeeegoooh

3 months ago

will i get more followers if I post my sora 2 access code here

10 likes · 9 replies · 0 reposts

gabriel

@gabrielpeterss4

3 months ago

we just reached number 3 on app store! i have had so many friends tell me sora 2 is the first time ai made them laugh, this is truly a new experience!

1,1K likes · 188 replies · 52 reposts

Gabriel Goh

@gabeeegoooh

3 months ago

6921E0

25 likes · 14 replies · 0 reposts

Gabriel Goh

@gabeeegoooh

3 months ago

i did not contribute much to sora 2 - but I had the pleasure of witnessing its development. the level of care, attention and love put into a single model and a single product was amazing to witness. i've always believed deep learning rewards care, but I am now convinced the

234 likes · 14 replies · 17 reposts

OpenAI

@openai

2 months ago

We’ve developed a new way to train small AI models with internal mechanisms that are easier for humans to understand. Language models like the ones behind ChatGPT have complex, sometimes surprising structures, and we don’t yet fully understand how they work. This approach

2,2K likes · 124 replies · 258 reposts
