Arian Hosseini (@ariantbd)'s Twitter Profile
Arian Hosseini

@ariantbd

Research Scientist @GoogleDeepMind - LLM reasoning and alignment - prev: @Google @MSFTResearch

ID: 274357354

Website: https://arianhosseini.github.io/
Joined: 30-03-2011 05:33:50

360 Tweets

1.1K Followers

313 Following

Excited to share our new paper V-STaR

- Common self-improvement methods only use correct self-generated solutions to bootstrap models
- V-STaR utilizes iteratively self-generated correct and incorrect solutions to train a verifier using DPO

arxiv.org/abs/2402.06457
🧵(1/4)