Kyle Corbitt (@corbtt)'s Twitter Profile
Kyle Corbitt

@corbtt

Currently building @OpenPipeAI. Formerly @ycombinator, @google. I am always down to go on a quest.

ID: 823506858

14-09-2012 15:44:30

1.1K Tweets

12.12K Followers

221 Following

"RL from a single example works"
"RL with random rewards works"
"Base model pass@256 can match RL model pass@1"
"RL updates a small % of params"

Recent papers all point in the same direction: RL is mostly just eliciting latent behavior already learned in pretraining, not