
Nimit Kalra
@qw3rtman
research @haizelabs, aligning rewards. ex @citadel @utaustin
$ pip install verdict
https://nimit.io/ 05-10-2011 13:50:20

Still noodling on this, but the generation-verification gap proposed by Yuda Song, Hanlin Zhang, Sham Kakade, Udaya Ghai, et al. in arxiv.org/abs/2412.02674 is a very nice framework that unifies a lot of thoughts around self-improvement/verification/bootstrapping reasoning