
Rohan Pandey
@khoomeik
research @OpenAI || prev @CarnegieMellon '23 @ReworkdAI (YC S23) @AGIHouseSF
ID: 1228506265665462272
https://rpandey.tech · Joined 15-02-2020 02:28:25
4.4K Tweets
24.24K Followers
1.1K Following


Padding a transformer’s input with blank tokens (...) is a simple form of test-time compute. Can it increase the computational power of LLMs? 👀 New work with Ashish Sabharwal addresses this with *exact characterizations* of the expressive power of transformers with padding 🧵
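
A minimal sketch of what "padding with blank tokens" means mechanically, not the paper's construction: append k filler tokens to the prompt before decoding so the model gets extra forward-pass positions to compute over. The model name, k, and the pad-token fallback are illustrative assumptions.

```python
# Hedged sketch, not the method from the paper: append k "blank" padding tokens
# to the prompt as a simple form of extra test-time compute.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")           # stand-in model
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Is the string (()()) balanced?"
k = 32  # number of blank tokens appended as extra compute (arbitrary choice)

ids = tok(prompt, return_tensors="pt").input_ids
pad_id = tok.pad_token_id if tok.pad_token_id is not None else tok.eos_token_id
padded = torch.cat([ids, torch.full((1, k), pad_id, dtype=ids.dtype)], dim=1)

with torch.no_grad():
    out = model.generate(
        padded,
        attention_mask=torch.ones_like(padded),
        max_new_tokens=8,
    )
print(tok.decode(out[0, padded.shape[1]:]))
```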


word on the street is that Justus Mattern is forfeiting the open-source RL championship fight


Could you use sparse RL gradients to identify & understand circuits that are relevant for a behavior from an interp pov? cc Aryaman Arora
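
Not an established method, just one way the question could be operationalized: take the gradient of a REINFORCE-style surrogate for a single rewarded behavior, keep only the largest-magnitude entries, and see which blocks the surviving gradient mass concentrates in. The model, reward, keep fraction, and block grouping below are all illustrative assumptions.

```python
# Hedged sketch: sparsified RL-style gradients as a crude candidate-circuit readout.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")           # stand-in model
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt, completion = "2 + 2 =", " 4"
reward = 1.0  # pretend this completion exhibits the behavior of interest

ids = tok(prompt + completion, return_tensors="pt").input_ids
prompt_len = tok(prompt, return_tensors="pt").input_ids.shape[1]

# REINFORCE-style surrogate: -reward * log p(completion | prompt)
logits = model(ids).logits[:, :-1]
logprobs = torch.log_softmax(logits, dim=-1)
token_lp = logprobs.gather(-1, ids[:, 1:].unsqueeze(-1)).squeeze(-1)
loss = -(reward * token_lp[:, prompt_len - 1:]).sum()
loss.backward()

# Keep only the top fraction of gradient entries per parameter and report
# which blocks hold the surviving mass.
keep_frac = 1e-3
block_mass = {}
for name, p in model.named_parameters():
    if p.grad is None:
        continue
    g = p.grad.abs().flatten()
    k = max(1, int(keep_frac * g.numel()))
    block = ".".join(name.split(".")[:3])  # e.g. "transformer.h.5"
    block_mass[block] = block_mass.get(block, 0.0) + torch.topk(g, k).values.sum().item()

for block, mass in sorted(block_mass.items(), key=lambda kv: -kv[1])[:5]:
    print(f"{block}\t{mass:.3f}")
```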

