@m_wulfmeier: Imitation is the foundation of #LLM training. And it is a #ReinforcementLearning problem! Compared to supervised learning, RL (here, inverse RL) better exploits sequential structure and online data, and additionally extracts rewards. Beyond thrilled for our @GoogleDeepMind paper!

Markus Wulfmeier

@m_wulfmeier

Large-Scale Interactive Intelligence - Research @GoogleDeepMind European @ELLISforEurope - priors: @oxfordrobots @berkeley_ai @ETH @MIT

ID: 4484386293

Link: https://sites.google.com/view/mwulfmeier/bio
Joined: 14-12-2015 19:55:48

2.2K Tweets

12.12K Followers

1.1K Following

Markus Wulfmeier

@m_wulfmeier

10 months ago

Imitation is the foundation of #LLM training. And it is a #ReinforcementLearning problem! Compared to supervised learning, RL (here, inverse RL) better exploits sequential structure and online data, and additionally extracts rewards. Beyond thrilled for our Google DeepMind paper!

Likes: 374

Replies: 10

Reposts: 68