@qw3rtman : Excited to discuss "SFT Memorizes, RL Generalizes" tomorrow at @haizelabs's NYC AI Reading Group with @leonardtang_ and @willccbb! We'll also explore a broader theme — "what does RL actually learn?", guided by some related works from the past week.

Nimit Kalra

@qw3rtman

research @haizelabs, aligning rewards. ex @citadel @utaustin

$ pip install verdict

ID: 385428300

https://nimit.io/

05-10-2011 13:50:20

99 Tweets

781 Followers

2.2K Following

Nimit Kalra

@qw3rtman

5 months ago

Excited to discuss "SFT Memorizes, RL Generalizes" tomorrow at Haize Labs's NYC AI Reading Group with Leonard Tang and will brown! We'll also explore a broader theme — "what does RL actually learn?", guided by some related works from the past week.

91 Likes

5 Replies

6 Retweets
