Arian Hosseini (@ariantbd)'s Twitter Profile
Arian Hosseini

@ariantbd

PhD candidate @Mila_Quebec working on language models, reasoning and alignment - Intern @Google Ex: @MSFTResearch

ID: 274357354

Link: https://arianhosseini.github.io/

Joined: 30-03-2011 05:33:50

232 Tweets

496 Followers

272 Following

Excited to share our new paper V-STaR

- Common self-improvement methods only use correct self-generated solutions to bootstrap models
- V-STaR utilizes iteratively self-generated correct and incorrect solutions to train a verifier using DPO

arxiv.org/abs/2402.06457
🧵(1/4)