Geoffrey Cideron (@cdrgeo) Twitter Tweets • TwiCopy

Geoffrey Cideron

@cdrgeo

Research Engineer at Google DeepMind.
Spent time at FAIR London, INRIA Lille, and Instadeep.

ID: 1103977437346627584

Joined: 08-03-2019 11:15:06

20 Tweets

229 Followers

397 Following

Xuedong F.C.J.S Shang

@absolutsamuel

6 years ago

Matteo Hessel and Oriol Vinyals giving talks on Deep RL and games at #RLSS2019 @DeepMindAI

Likes: 6 · Replies: 0 · Retweets: 4

It was great to work with Amartya Sanyal, Tim Rocktäschel, and Edward Grefenstette at FAIR London. This line of research is fascinating! Thank you for the opportunity! Additional gratitude to Roberto Calandra for the support and advice.

Likes: 18 · Replies: 0 · Retweets: 4

Johan Ferret

@johanferret

4 years ago

Excited to announce that our #AAMAS2022 paper "Lazy-MDPs: Towards Interpretable RL by Learning When to Act" is on arXiv! 🦥 tl;dr - we introduce lazy-MDPs, modified MDPs that allow agents to defer decision-making to a third-party policy 📜 arxiv.org/abs/2203.08542 🧵👇

Likes: 46 · Replies: 3 · Retweets: 16

Olivier Bachem

@olivierbachem

3 years ago

A common belief is that text auto encoders produce badly structured latent spaces with holes. We were surprised to find that using round-trip translations (e.g. en->de->en) one can obtain nicely structured latent spaces. Check out arxiv.org/pdf/2209.06792….

Likes: 96 · Replies: 4 · Retweets: 29

Robert Dadashi

@robdadashi

3 years ago

Very proud to contribute to making RL agents more accessible and reproducible!

Likes: 46 · Replies: 0 · Retweets: 7

ëugene kharitonov 🏴‍☠️

@n0mad_0

3 years ago

We* are looking for a Student Researcher** to work with us on a project in intersection of modeling/generating speech/audio, NLP, and representation learning. *AudioLM team @ Google Research (Zalán Borsos, Neil Zeghidour, myself and many others!) **not-last-year PhD student

Likes: 37 · Replies: 3 · Retweets: 12

Johan Ferret

@johanferret

2 years ago

Our #ACL2023 paper "Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback" is now on arXiv! tl;dr - we improve the factuality of summaries via RL, without human feedback! 📜 arxiv.org/abs/2306.00186 Thread (1/10) 👇

Likes: 87 · Replies: 4 · Retweets: 25

AK

@_akhaliq

2 years ago

Google presents MusicRL Aligning Music Generation to Human Preferences paper page: huggingface.co/papers/2402.04… propose MusicRL, the first music generation system finetuned from human feedback. Appreciation of text-to-music models is particularly subjective since the concept of

Likes: 277 · Replies: 0 · Retweets: 53

Neil Zeghidour

@neilzegh

2 years ago

Very proud of the work done by Geoffrey Cideron, one of my last projects at Google. When we released MusicLM in May ’23, we incorporated a feedback system to realize the first ever large-scale, organic improvement of music generation through RLHF. 🎶🧵

Likes: 60 · Replies: 1 · Retweets: 4

Johan Ferret

@johanferret

2 years ago

Online feedback is crucial for alignment, so we propose a simple recipe to make any direct alignment method (think DPO / IPO / SLiC-HF) online using AI feedback 🧙‍♂️ In human evals, online methods yield on avg 66% wins, 28% ties and 6% losses vs offline methods (on TL;DR) 👀

Likes: 30 · Replies: 1 · Retweets: 6

Robert Dadashi

@robdadashi

2 years ago

I am so proud to see Gemma released today! I have had a fantastic time working on post-training and RLHF with an amazing team. Cannot wait to see what the community builds with these models!

Likes: 55 · Replies: 3 · Retweets: 7

Robert Dadashi

@robdadashi

2 years ago

I am very happy to announce that Gemma 1.1 Instruct 2B and “7B” are out! Here are a few details about the new models: 1/11

Likes: 368 · Replies: 13 · Retweets: 68