Costa Huang (@vwxyzjn)'s Twitter Profile
Costa Huang

@vwxyzjn

RLHF @allen_ai; main dev of @cleanrl_lib; CS PhD @DrexelUniv; Ex @huggingface @CuraiHQ @weights_biases @NVIDIAAI @riotgames.

ID: 1238049606

Link: https://costa.sh

Joined: 03-03-2013 06:26:46

1.1K Tweets

5.5K Followers

1.1K Following

Happy to share our work on reproducing the RLHF scaling behaviors in OpenAI's work on summarizing from feedback. We built an RLHF pipeline from scratch and enumerated 20+ implementation details 🚀 Fun collab with Michael Noukhovitch (@mnoukhov), Arian Hosseini (@arianTBD), Kashif Rasul (@krasul), wang (@weixunwang), and Lewis Tunstall (@_lewtun) 📜
