
Sihao Chen
@soshsihao
Researcher @ Microsoft #OAR. Previously: @upennnlp @cogcomp @GoogleAI; #NLProc. Opinions my own.
ID: 2430441172
http://sihaoc.github.io 23-03-2014 07:08:52
107 Tweets
864 Followers
513 Following



📢 WildFeedback: a large-scale preference dataset built from real user interactions with ChatGPT. [?]k+ preference pairs. 🗣️ Built from [?] chats. Annotated with dialogue state, domain, intent, and more. huggingface.co/datasets/micro…


Want to cut RFT training time by up to [?]× and boost performance? Meet AdaRFT, a lightweight, plug-and-play curriculum learning method you can drop into any mainstream RFT algorithm (PPO, GRPO, REINFORCE). Less compute. Better results. 🧵 1/n







Huge congrats Ashish Sharma!!
