Kamil Ciosek (@mlciosek) Twitter Tweets • TwiCopy

Kamil Ciosek

@mlciosek

Research Scientist @Spotify. Interested in machine learning, particularly reinforcement learning.

ID: 2937076767

Website: https://www.ciosek.net/

Joined: 22-12-2014 15:59:19

26 Tweets

460 Followers

1.1K Following

Tristan Deleu

@tristandeleu

7 years ago

Our work on the reproducibility of meta-RL baselines (Bandits + MDPs) with MAML and Reptile is at the Reproducibility in ML workshop (C2) #ICML2018 together with Arian Hosseini & Simon Guiroy @MILAMontreal

88 Likes

2 Replies

21 Retweets

Kamil Ciosek

@mlciosek

6 years ago

Like policy gradients? In "Expected Policy Gradients for Reinforcement Learning", we study various quadrature schemes to decrease variance in gradient estimates. Final version is now published in JMLR (Journal of Machine Learning Research). See jmlr.org/papers/v21/18-….

53 Likes

0 Replies

12 Retweets
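The variance-reduction idea in the tweet above can be illustrated on a toy problem. The sketch below is not the paper's algorithm: it assumes a single-state problem with three discrete actions, a softmax policy, and known action values (`theta` and `q_values` are made-up numbers). It compares the usual sampled score-function gradient with the gradient obtained by summing over all actions, which is the discrete analogue of integrating over the action space.

```python
import math
import random
import statistics

def softmax(theta):
    """Softmax policy probabilities for a vector of action preferences."""
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    z = sum(exps)
    return [e / z for e in exps]

def score(theta, a):
    """Gradient of log pi(a) w.r.t. theta for a softmax policy: one_hot(a) - pi."""
    pi = softmax(theta)
    return [(1.0 if i == a else 0.0) - pi[i] for i in range(len(theta))]

def sampled_gradient(theta, q, rng):
    """Score-function estimate from one sampled action (noisy)."""
    pi = softmax(theta)
    a = rng.choices(range(len(theta)), weights=pi)[0]
    return [g * q[a] for g in score(theta, a)]

def summed_gradient(theta, q):
    """Gradient obtained by summing over every action (no sampling noise)."""
    pi = softmax(theta)
    grad = [0.0] * len(theta)
    for a, pa in enumerate(pi):
        for i, g in enumerate(score(theta, a)):
            grad[i] += pa * g * q[a]
    return grad

# Hypothetical preferences and action values, chosen only for illustration.
theta = [0.0, 0.5, -0.5]
q_values = [1.0, 2.0, 0.0]
rng = random.Random(0)

exact = summed_gradient(theta, q_values)
samples = [sampled_gradient(theta, q_values, rng) for _ in range(5000)]
mc_mean = [statistics.fmean(s[i] for s in samples) for i in range(3)]
mc_var = [statistics.pvariance([s[i] for s in samples]) for i in range(3)]

print("summed over actions:", exact)
print("mean of sampled estimates:", mc_mean)
print("per-component variance of sampled estimates:", mc_var)
```

Both estimators target the same gradient, but only the sampled one has nonzero variance across draws. In continuous action spaces the sum becomes an integral, which is where a quadrature scheme would come in.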

Kritika Prakash

@kritipraks

4 years ago

Kullback-Leibler divergence is not the same as Leibler-Kullback divergence

3.3K Likes

49 Replies

285 Retweets
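The joke above rests on a real property: Kullback-Leibler divergence is not symmetric in its arguments. A minimal sketch, assuming two made-up discrete distributions `p` and `q` on the same three outcomes:

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions on the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Two hypothetical distributions over three outcomes, chosen only for illustration.
p = [0.8, 0.1, 0.1]
q = [0.4, 0.3, 0.3]

forward = kl_divergence(p, q)  # KL(p || q)
reverse = kl_divergence(q, p)  # KL(q || p)
print(f"KL(p||q) = {forward:.4f}, KL(q||p) = {reverse:.4f}")
```

Swapping the arguments changes the value, so the order in which the two distributions are written matters.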

David Lindner

@davlindner

4 years ago

I'm excited to present our work on active reward learning at #NeurIPS2021! We propose a general way to make queries that are informative about the optimal policy. Joint work with Matteo Turchetta, Sebastian Tschiatschek, Kamil Ciosek, and Andreas Krause: arxiv.org/abs/2102.12466 👇(1/6)

14 Likes

2 Replies

2 Retweets

Spotify Research

@spotifyresearch

4 years ago

Interested in Imitation Learning? You can do it using a single call to a Reinforcement Learning oracle. See our #ICLR2022 paper “Imitation Learning by Reinforcement Learning” (openreview.net/pdf?id=1zwleyt…).

23 Likes

0 Replies

3 Retweets

Spotify Research

@spotifyresearch

4 years ago

Want to do imitation learning in a simple and efficient way? We released code for the ICLR 2022 paper “Imitation Learning by Reinforcement Learning”. See github.com/spotify-resear….

10 Likes

0 Replies

1 Retweet

Zhenwen Dai

@zhenwendai

3 years ago

Interested in working on exciting ML ideas for Spotify? Join us! We are looking for a Research Scientist Intern to join our research lab in London for summer 2023. lifeatspotify.com/jobs/summer-in… Spotify Research

20 Likes

1 Reply

3 Retweets

Kamil Ciosek

@mlciosek

2 months ago

For anyone worried their LLM might be making stuff up, we made a budget‐friendly truth serum (semantic entropy + Bayesian). See for yourself: youtube.com/watch?v=x_8ORG… Paper: arxiv.org/pdf/2504.03579

3 Likes

0 Replies

7 Retweets

Shimon Whiteson

@shimon8282

8 years ago

Our new paper: using Fourier analysis to derive policy gradients: we recast the integrals as convolutions, which a Fourier transform turns into multiplications. The resulting analysis unifies existing policy gradient results. arxiv.org/abs/1802.06891

103 Likes

1 Reply

24 Retweets
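The property the tweet above relies on, that a Fourier transform turns convolution into multiplication, can be checked numerically. The sketch below is only a discrete analogue, not the paper's derivation: it uses a hand-written discrete Fourier transform and two made-up length-4 sequences `f` and `g`, and compares a direct circular convolution with the result of multiplying the transforms and inverting.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform of a sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Naive inverse discrete Fourier transform."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def circular_convolution(f, g):
    """Direct circular convolution: (f * g)[t] = sum_s f[s] * g[(t - s) mod n]."""
    n = len(f)
    return [sum(f[s] * g[(t - s) % n] for s in range(n)) for t in range(n)]

# Hypothetical sequences, chosen only for illustration.
f = [1.0, 2.0, 0.0, -1.0]
g = [0.5, 0.0, 0.25, 0.25]

direct = circular_convolution(f, g)
# Multiply the transforms pointwise, then transform back.
via_fourier = [z.real for z in idft([a * b for a, b in zip(dft(f), dft(g))])]

print("direct convolution: ", direct)
print("via Fourier product:", via_fourier)
```

The two results agree up to floating-point error, which is the discrete form of the convolution theorem.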

WhiRL

@whi_rl

8 years ago

"Fourier Policy Gradients" by Matthew Fellows, Kamil Ciosek and Shimon Whiteson arxiv.org/pdf/1802.06891…

10 Likes

0 Replies

6 Retweets

WhiRL

@whi_rl

8 years ago

Learn about our work on "Expected Policy Gradients" in this 13-minute video by Kamil Ciosek - with Kamil Ciosek and Shimon Whiteson youtube.com/watch?v=x2NFiP…

12 Likes

0 Replies

4 Retweets