subho ghosh (@subhoghosh02) 's Twitter Profile
subho ghosh

@subhoghosh02

22. seeking local optima in ML & global maxima in life. training on life

ID: 1201157604212236288

Link: http://Github.com/iGhoshSubho · Joined: 01-12-2019 15:14:32

6.6K Tweets

1.1K Followers

407 Following

Yunhao (Robin) Tang (@robinphysics) 's Twitter Profile Photo

Maybe to one's surprise, taking KL estimates as `kl_loss` to minimize does *not* enforce the KL. This implementation, however, is quite common in open source RL repos and recent research papers. In short: grad of an unbiased KL estimate is not an unbiased estimate of KL grad.

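A tiny numeric sketch of the point (my own Gaussian example, not from the thread): take pi = N(mu, 1) and ref = N(0, 1), so KL(pi || ref) = mu^2/2 and the true gradient w.r.t. mu is mu. The per-sample k1 estimate `log pi(x) - log ref(x)` is unbiased for the KL *value*, but differentiating it while treating the samples as constants (which is what autodiff does to a `kl_loss` built from sampled tensors) yields a gradient whose expectation is 0, not mu:

```python
import numpy as np

# Assumed toy setup (not from the tweet): pi = N(mu, 1), ref = N(0, 1).
# Closed form: KL(pi || ref) = mu^2 / 2, so d(KL)/d(mu) = mu.
rng = np.random.default_rng(0)
mu = 2.0
x = rng.normal(mu, 1.0, size=1_000_000)  # samples drawn from pi

# Per-sample k1 estimate: log pi(x) - log ref(x) = x^2/2 - (x - mu)^2/2.
# Its mean is an unbiased estimate of the KL value (~ mu^2/2 = 2.0).
kl_est = (x**2 / 2 - (x - mu) ** 2 / 2).mean()

# Naive "kl_loss" gradient: differentiate the same expression w.r.t. mu
# with x held fixed: d/dmu [x^2/2 - (x - mu)^2/2] = (x - mu).
# Its expectation under x ~ N(mu, 1) is 0, not the true gradient mu = 2,
# because the score-function term from the sampling distribution is missing.
naive_grad = (x - mu).mean()

print(kl_est, naive_grad)
```

With a million samples, `kl_est` lands near 2.0 while `naive_grad` hovers near 0: the value estimate is unbiased, but its pathwise gradient tells the optimizer nothing about the true KL gradient.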
Naga/Abhi (@nagasaiabhinay) 's Twitter Profile Photo

Too early to be sure, but I'm trying to use optimal control à la RB Modulation, replacing the consistency loss with a reward signal, as a kind of test-time scaling technique. Baseline vs with reward model.

Mathurin Massias (@mathusmassias) 's Twitter Profile Photo

New paper on the generalization of Flow Matching arxiv.org/abs/2506.03719 🤯 Why does flow matching generalize? Did you know that the flow matching target you're trying to learn **can only generate training points**? With Quentin Bertrand, Anne Gagneux & Rémi Emonet 👇👇👇

Naga/Abhi (@nagasaiabhinay) 's Twitter Profile Photo

Hmm, still using the LAION Aesthetic reward on Flux. You can tell the difference, but it doesn't feel quite there yet. Will play around with a couple of other reward models and their combinations. This is relative reward, by the way. Much more stable than absolute reward.

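A minimal sketch of what "relative reward" could mean here (my own reading, the tweet doesn't specify the exact scheme): instead of training on each sample's raw (absolute) reward-model score, normalize the scores within each sampled batch so the signal becomes comparison-based and invariant to the reward model's offset and scale, similar to the group-normalized advantages used in GRPO-style RL:

```python
import numpy as np

def relative_reward(scores: np.ndarray) -> np.ndarray:
    """Hypothetical helper: center and scale raw reward-model scores
    within a batch, so only the ranking among samples matters."""
    return (scores - scores.mean()) / (scores.std() + 1e-8)

# Example: absolute aesthetic scores for four generated images.
raw = np.array([5.1, 5.4, 4.9, 5.6])
rel = relative_reward(raw)
print(rel)  # zero-mean, unit-variance training signal
```

The usual argument for the relative form is stability: a reward model whose scores drift or sit on an arbitrary scale still yields a well-conditioned, zero-centered signal after per-batch normalization.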
Zhihao Jia (@jiazhihao) 's Twitter Profile Photo

One of the best ways to reduce LLM latency is by fusing all computation and communication into a single GPU megakernel. But writing megakernels by hand is extremely hard. 🚀Introducing Mirage Persistent Kernel (MPK), a compiler that automatically transforms LLMs into optimized
