Geoffrey Cideron (@cdrgeo)'s Twitter Profile
Geoffrey Cideron

@cdrgeo

Research Engineer at Google DeepMind.
Spent time at FAIR London, INRIA Lille, and Instadeep.

ID: 1103977437346627584

08-03-2019 11:15:06

20 Tweets

229 Followers

397 Following

Geoffrey Cideron (@cdrgeo)'s Twitter Profile Photo

It was great to work with Amartya Sanyal, Tim Rocktäschel, and Edward Grefenstette at FAIR London. This line of research is fascinating! Thank you for the opportunity! Additional gratitude to Roberto Calandra for the support and advice.

Johan Ferret (@johanferret)'s Twitter Profile Photo

Excited to announce that our #AAMAS2022 paper "Lazy-MDPs: Towards Interpretable RL by Learning When to Act" is on arXiv! 🦥 tl;dr - we introduce lazy-MDPs, modified MDPs that allow agents to defer decision-making to a third-party policy 📜 arxiv.org/abs/2203.08542 🧵👇

Olivier Bachem (@olivierbachem)'s Twitter Profile Photo

A common belief is that text autoencoders produce badly structured latent spaces with holes. We were surprised to find that using round-trip translations (e.g. en->de->en) one can obtain nicely structured latent spaces. Check out arxiv.org/pdf/2209.06792….

ëugene kharitonov 🏴‍☠️ (@n0mad_0)'s Twitter Profile Photo

We* are looking for a Student Researcher** to work with us on a project at the intersection of modeling/generating speech/audio, NLP, and representation learning. *AudioLM team @ Google Research (Zalán Borsos, Neil Zeghidour, myself and many others!) **not-last-year PhD student

Johan Ferret (@johanferret)'s Twitter Profile Photo

Our #ACL2023 paper "Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback" is now on arXiv! tl;dr - we improve the factuality of summaries via RL, without human feedback! 📜 arxiv.org/abs/2306.00186 Thread (1/10) 👇

AK (@_akhaliq)'s Twitter Profile Photo

Google presents MusicRL

Aligning Music Generation to Human Preferences

paper page: huggingface.co/papers/2402.04…

propose MusicRL, the first music generation system finetuned from human feedback. Appreciation of text-to-music models is particularly subjective since the concept of
Neil Zeghidour (@neilzegh)'s Twitter Profile Photo

Very proud of the work done by Geoffrey Cideron, one of my last projects at Google. When we released MusicLM in May ’23, we incorporated a feedback system to realize the first ever large-scale, organic improvement of music generation through RLHF. 🎶🧵

Johan Ferret (@johanferret)'s Twitter Profile Photo

Online feedback is crucial for alignment, so we propose a simple recipe to make any direct alignment method (think DPO / IPO / SLiC-HF) online using AI feedback 🧙‍♂️ In human evals, online methods yield on avg 66% wins, 28% ties and 6% losses vs offline methods (on TL;DR) 👀

Robert Dadashi (@robdadashi)'s Twitter Profile Photo

I am so proud to see Gemma released today! I have had a fantastic time working on post-training and RLHF with an amazing team. Cannot wait to see what the community builds with these models!

Robert Dadashi (@robdadashi)'s Twitter Profile Photo

I am very happy to announce that Gemma 1.1 Instruct 2B and “7B” are out! Here are a few details about the new models: 1/11