Alessandro Sordoni
@murefil
ML Team / MSR Montréal. Views are my own.
ID: 124319949
19-03-2010 01:02:45
499 Tweets
846 Followers
908 Following
Our paper on multi-head routing in modular LLMs has now been accepted at NeurIPS Conference (arxiv.org/abs/2211.03831). It was fun to work with Lucas Caccia and @sordonia! EdinburghNLP Mila - Institut québécois d'IA Microsoft Research
Thanks for the mention! 🎉🎉 I'm thrilled to contribute to the Hugging Face community with Polytropon & MHR Edoardo Ponti Alessandro Sordoni 🚀🚀 Your awesome methods have shown impressive results in our multitask use cases Ant Group ...and we have more to share soon. Stay tuned! 😉
LLM self-improvement works (STaR, SPIN, Self-Rewarding LM). We use correct/incorrect solutions generated during self-improvement to train a verifier with DPO, and use it to rank solutions at test time. DPO rankers work well! Thx Arian Hosseini and Rishabh Agarwal for leading the project