Michael Cohen (@michael05156007)'s Twitter Profile
Michael Cohen

@michael05156007

I do AGI Safety research. michael-k-cohen.com/publications. Once I was swiss chard for Halloween. Once Bill Clinton elbowed me in the face.

ID: 1158625106660225029

Link: http://michael-k-cohen.com
Joined: 06-08-2019 06:25:26

1.1K Tweets

1.1K Followers

162 Following

New paper! Over-optimization in RL is well-known, but it even occurs when KL(policy || base model) is constrained fairly tightly. Why? And can we fix it? 🧵