
Rohan Pandey
@khoomeik
research @OpenAI || prev @CarnegieMellon '23 @ReworkdAI (YC S23) @AGIHouseSF
ID: 1228506265665462272
https://rpandey.tech · Joined 15-02-2020 02:28:25
4.4K Tweets
24.24K Followers
1.1K Following


Padding a transformer’s input with blank tokens (...) is a simple form of test-time compute. Can it increase the computational power of LLMs? 👀 New work with Ashish Sabharwal addresses this with *exact characterizations* of the expressive power of transformers with padding 🧵
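
A minimal sketch of what "padding with blank tokens" means mechanically, not the paper's construction: append k filler tokens to the prompt before decoding so the model gets extra forward-pass positions to compute over. The model name, k, and the pad-token fallback are illustrative assumptions.

```python
# Hedged sketch, not the method from the paper: append k "blank" padding tokens
# to the prompt as a simple form of extra test-time compute.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")           # stand-in model
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Is the string (()()) balanced?"
k = 32  # number of blank tokens appended as extra compute (arbitrary choice)

ids = tok(prompt, return_tensors="pt").input_ids
pad_id = tok.pad_token_id if tok.pad_token_id is not None else tok.eos_token_id
padded = torch.cat([ids, torch.full((1, k), pad_id, dtype=ids.dtype)], dim=1)

with torch.no_grad():
    out = model.generate(
        padded,
        attention_mask=torch.ones_like(padded),
        max_new_tokens=8,
    )
print(tok.decode(out[0, padded.shape[1]:]))
```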


word on the street is that Justus Mattern is forfeiting the open-source RL championship fight


Could you use sparse RL gradients to identify & understand circuits that are relevant for a behavior from an interp pov? cc Aryaman Arora
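
Not an established method, just one way the question could be operationalized: take the gradient of a REINFORCE-style surrogate for a single rewarded behavior, keep only the largest-magnitude entries, and see which blocks the surviving gradient mass concentrates in. The model, reward, keep fraction, and block grouping below are all illustrative assumptions.

```python
# Hedged sketch: sparsified RL-style gradients as a crude candidate-circuit readout.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")           # stand-in model
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt, completion = "2 + 2 =", " 4"
reward = 1.0  # pretend this completion exhibits the behavior of interest

ids = tok(prompt + completion, return_tensors="pt").input_ids
prompt_len = tok(prompt, return_tensors="pt").input_ids.shape[1]

# REINFORCE-style surrogate: -reward * log p(completion | prompt)
logits = model(ids).logits[:, :-1]
logprobs = torch.log_softmax(logits, dim=-1)
token_lp = logprobs.gather(-1, ids[:, 1:].unsqueeze(-1)).squeeze(-1)
loss = -(reward * token_lp[:, prompt_len - 1:]).sum()
loss.backward()

# Keep only the top fraction of gradient entries per parameter and report
# which blocks hold the surviving mass.
keep_frac = 1e-3
block_mass = {}
for name, p in model.named_parameters():
    if p.grad is None:
        continue
    g = p.grad.abs().flatten()
    k = max(1, int(keep_frac * g.numel()))
    block = ".".join(name.split(".")[:3])  # e.g. "transformer.h.5"
    block_mass[block] = block_mass.get(block, 0.0) + torch.topk(g, k).values.sum().item()

for block, mass in sorted(block_mass.items(), key=lambda kv: -kv[1])[:5]:
    print(f"{block}\t{mass:.3f}")
```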

