Anirudh Buvanesh
@anirudhbuvanesh
ID: 1580609992784113665
13-10-2022 17:23:16
4 Tweet
5 Followers
127 Following
Introducing linear scaling of reasoning: ๐๐ก๐ ๐๐๐ซ๐ค๐จ๐ฏ๐ข๐๐ง ๐๐ก๐ข๐ง๐ค๐๐ซ Reformulate RL so thinking scales ๐(๐ง) ๐๐จ๐ฆ๐ฉ๐ฎ๐ญ๐, not O(n^2), with O(1) ๐ฆ๐๐ฆ๐จ๐ซ๐ฒ, architecture-agnostic. Train R1-1.5B into a markovian thinker with 96K thought budget, ~2X accuracy ๐งต