Priyam Mazumdar (@data_adventurer) Twitter Tweets • TwiCopy

Priyam Mazumdar

@data_adventurer

Our eyes let us see the universe, AI may let us finally understand it! | PhD @ UIUC | Researcher @ NCSA AI Innovation | R&D Intern @ Sandia | Avid Photographer

ID: 1828057343885529088

Link: https://www.priyammazumdar.com/
Date: 26-08-2024 13:10:28

175 Tweets

38 Followers

39 Following

Priyam Mazumdar

@data_adventurer

4 months ago

I always find the weirdest fruits to try lol

Likes: 3, Replies: 0, Reposts: 0

Priyam Mazumdar

@data_adventurer

4 months ago

Does anyone else use coil whine for confirmation that something is actually happening?? I can’t tell if my GPUs are talking or screaming in pain

Likes: 1, Replies: 0, Reposts: 0

Priyam Mazumdar

@data_adventurer

4 months ago

LayerNorm has been updated to be parameterless, so we can do it without a weight or bias now! On to RMSNorm

Likes: 0, Replies: 0, Reposts: 0
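As a rough illustration of what "parameterless" means in the LayerNorm tweet above, here is a minimal NumPy sketch. It is not the MyTorch code; the function name `layer_norm` and the `eps` default are assumptions made for this example. It normalizes over the last axis and applies no learnable weight or bias.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize over the last axis with no learnable weight or bias.

    Hypothetical helper for illustration only; eps is an assumed default.
    """
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)  # biased (population) variance
    return (x - mean) / np.sqrt(var + eps)

x = np.array([[1.0, 2.0, 3.0, 4.0],
              [10.0, 20.0, 30.0, 40.0]])
y = layer_norm(x)
# Each row of y now has approximately zero mean and unit variance.
```

Since CuPy mirrors much of the NumPy array API, a sketch like this would likely carry over by swapping the import, though that is not verified here.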

Priyam Mazumdar

@data_adventurer

4 months ago

RMSNorm is up! Just RoPE to go!

Likes: 1, Replies: 0, Reposts: 0
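For readers unfamiliar with RMSNorm, a minimal NumPy sketch follows. It is an illustration under stated assumptions, not the implementation the tweet refers to: the name `rms_norm`, the optional `weight` argument, and the `eps` default are all hypothetical. RMSNorm divides by the root mean square of the features and skips the mean subtraction that LayerNorm performs.

```python
import numpy as np

def rms_norm(x, weight=None, eps=1e-6):
    """Scale x by the reciprocal root-mean-square over the last axis.

    Hypothetical helper for illustration only. `weight` is an optional
    per-feature gain; pass None for a parameterless variant.
    """
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    y = x / rms
    return y if weight is None else y * weight

x = np.array([[3.0, 4.0]])
y = rms_norm(x)
# The mean of y**2 along the last axis is now approximately 1.
```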

Priyam Mazumdar

@data_adventurer

4 months ago

Also, Liger Kernels, y'all are incredible for open sourcing so many of these Triton kernels! It has been an incredible learning tool! github.com/linkedin/Liger…

Likes: 4, Replies: 0, Reposts: 0

Priyam Mazumdar

@data_adventurer

4 months ago

Writing a distributed dataloader that can also save and resume training sucked... the more I build MyTorch, the more I appreciate the awesome people who made PyTorch!!

Likes: 2, Replies: 0, Reposts: 0

Priyam Mazumdar

@data_adventurer

3 months ago

Have a few test runs going on a 560M-param model using only MyTorch! Having some stability issues I'm sorting through, but this low-key might actually work!

Likes: 0, Replies: 0, Reposts: 0

Priyam Mazumdar

@data_adventurer

3 months ago

Training LLMs in FP16 really sucks... I wish CuPy had BF16 support!

Likes: 1, Replies: 0, Reposts: 0

Policy Gradients are an extremely important idea that allows us to avoid the two-step approach of first estimating Q-values and then deriving the policy from them. The main trick is how to describe the derivative of our policy! Today we do that derivation and then a

Likes: 1, Replies: 0, Reposts: 0
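The tweet above is cut off, so the exact derivation it refers to is not shown here. As a hedged sketch, the standard log-derivative form of the policy gradient is given below. The notation is an assumption of this example: $\pi_\theta$ is the policy, $\tau$ a trajectory with probability $p_\theta(\tau)$, and $R(\tau)$ its return.

```latex
\begin{align}
\nabla_\theta J(\theta)
  &= \nabla_\theta \, \mathbb{E}_{\tau \sim p_\theta}\!\left[ R(\tau) \right]
   = \int \nabla_\theta p_\theta(\tau) \, R(\tau) \, d\tau \\
  &= \int p_\theta(\tau) \, \nabla_\theta \log p_\theta(\tau) \, R(\tau) \, d\tau
   \qquad \text{since } \nabla_\theta p_\theta = p_\theta \nabla_\theta \log p_\theta \\
  &= \mathbb{E}_{\tau \sim p_\theta}\!\left[ R(\tau) \sum_{t} \nabla_\theta \log \pi_\theta(a_t \mid s_t) \right]
\end{align}
```

The last step relies on the environment dynamics not depending on $\theta$, so only the policy terms survive in $\nabla_\theta \log p_\theta(\tau)$.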

Priyam Mazumdar

@data_adventurer

13 days ago

EnCodec reimplementation video coming soon!

Likes: 2, Replies: 0, Reposts: 0