Shagun Sodhani (@shagunsodhani)'s Twitter Profile
Shagun Sodhani

@shagunsodhani

FAIR Researcher @AIatMeta. Previously @Mila_Quebec, @MSFTResearch, @AdobeResearch, @IITRoorkee. All views are my own.

ID: 2568319752

Link: https://shagunsodhani.com

Joined: 15-06-2014 04:36:09

658 Tweets

2.2K Followers

555 Following

Shagun Sodhani (@shagunsodhani)

3 papers (with 1 spotlight) accepted at ICLR 2025:
1. TorchRL (torchrl): decision-making library for PyTorch openreview.net/forum?id=QxIto…
2. Motif: Intrinsic Motivation from AI Feedback openreview.net/forum?id=tmBKI…
3. Decision transformers for #offline #RL openreview.net/forum?id=vpV7f…

Shagun Sodhani (@shagunsodhani)

Kudos to vmoens for enabling pointwise arithmetic operations (using foreach) in tensordicts (the data structure for torchrl)! github.com/pytorch/tensor… This should be helpful for use cases like implementing optimizers!

Shagun Sodhani (@shagunsodhani)

Attending my first Collision Conf next week and the vibes are very different from the typical science conferences that I am used to. Any tips from pro attendees on how to make the most of the #CollisionConf will be helpful! And hit me up if you want to meet during the event!

Shagun Sodhani (@shagunsodhani)

Giving a talk on "torch.func: Functional Transforms in PyTorch" next week at the Toronto Machine Learning Summit (Toronto Machine Learning Society, TMLS). If you are attending and want to connect, send me a DM :)

Shagun Sodhani (@shagunsodhani)

Just came across nnsight.net for #interpreting the internals of PyTorch models. Skimmed the docs and it looks pretty awesome. Would love to hear some thoughts from people who have already tried it

Albert Gu (@_albertgu)

distillation.... mmm 🍻 state-of-the-art Mamba models with 1% of the compute, by leveraging pretrained Transformers! key insight: project the (quadratic) attention matrices onto (structured) SSM matrix mixers before end-to-end training led by students Aviv Bick Kevin Li

Haque Ishfaq (@haqueishfaq)

📢 𝐈 𝐚𝐦 𝐨𝐧 𝐭𝐡𝐞 𝐣𝐨𝐛 𝐦𝐚𝐫𝐤𝐞𝐭 for both industry and academic positions. Please reach out if I'd be a good fit for your industry research group or academic department. Research: reinforcement learning, efficient exploration, RLHF & more! hmishfaq.github.io

Shagun Sodhani (@shagunsodhani)

I have some 1-month gift subscriptions for The Pragmatic Engineer to give away (courtesy of Gergely Orosz). I’ve been subscribed for a while and find it super helpful for staying updated on tech and engineering. If you’d like to try it out, just reply 🙂

vmoens (@vincentmoens)

Today we're open-sourcing LeanRL, a simple RL library that provides recipes for fast RL training using torch.compile and cudagraphs. Using these, we get >6x speed-ups compared to the original CleanRL implementations. github.com/pytorch-labs/l… A thread ⬇️

Shagun Sodhani (@shagunsodhani)

My team at AI at Meta is hiring research #interns to work on newer architectures like #SSMs and #distillation (esp. cross-architecture distillation). If this interests you, please apply at metacareers.com/jobs/119904986…. You can DM your resume/queries here or at linkedin.com/in/shagunsodha…

Shagun Sodhani (@shagunsodhani)

I learnt a lot from the CUDA mode community just by following along with the discussions in the Discord. Excited that the community is expanding its scope to GPU mode. Looking forward to learning even more from the community!

Shagun Sodhani (@shagunsodhani)

This is a great opportunity, esp for MS students who are often not eligible for research internships. Gabriel Synnaeve is an amazing mentor and FAIR Paris is an amazing team!

Shagun Sodhani (@shagunsodhani)

Excited to host a tutorial on training large deep learning models with PyTorch FSDP at #ODSCWest this Tuesday, where I'll share strategies for scaling models efficiently. 🔥 Open Data Science #AI #ML #PyTorch #scaling
