
Reece Keller
@rdkeller
CS+Neuro @CarnegieMellon. PhD Student with @xaqlab and @aran_nayebi working on autonomy in embodied agents.
ID: 1363568642542108673
21-02-2021 19:18:11
13 Tweets
202 Followers
351 Following

New paper: World models + Program synthesis by Wasu Top Piriyakulkij
1. World modeling on-the-fly by synthesizing programs w/ 4000+ lines of code
2. Learns new environments from minutes of experience
3. Positive score on Montezuma's Revenge
4. Compositional generalization to new environments

Given the confusion around what RL does for reasoning in LLMs, Amrith Setlur & I wrote a new blog post on when RL simply sharpens the base model & when it discovers new reasoning strategies. Learn how to measure discovery + methods to enable it ⬇️ tinyurl.com/rlshadis
