Shashank Gupta (@shashank27392) Twitter Tweets • TwiCopy

Ji-Ha

6 months ago

I got recommended Terence Tao's YouTube channel created in 2010, where he uploaded his first video just yesterday! He showcases his process of formalizing a proof in Lean 4 with the help of GitHub Copilot and the "canonical" tactic in Lean.

595 likes · 7 replies · 43 reposts

Pramod Goyal

@goyal__pramod

6 months ago

GPT-2 is just 174 lines of code... How crazy is that?

3,3K likes · 122 replies · 197 reposts

Probability and Statistics

@probnstat

6 months ago

Lectures on Unbiased Estimation by Lester Mackey web.stanford.edu/~lmackey/stats…

219 likes · 4 replies · 38 reposts

Kevin Patrick Murphy

@sirbayes

6 months ago

I am pleased to announce a new version of my RL tutorial. Major update to the LLM chapter (e.g., DPO, GRPO, thinking), minor updates to the MARL and MBRL chapters and various sections (e.g., offline RL, DPG, etc.). Enjoy! arxiv.org/abs/2412.05265

2,2K likes · 23 replies · 445 reposts

Ahmad Beirami @ ICLR 2025

@abeirami

6 months ago

As we go through a lot of excitement about RL recently with lots of cool work/results, here is a reminder that RL with a reverse KL-regularizer to the base model cannot learn new skills that were not already present in the base model. It can only amplify the existing weak skills.

475 likes · 12 replies · 52 reposts
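
One way to read the claim in the post above, as a minimal sketch rather than the author's own argument: on a finite set of responses, the maximizer of E_pi[r] - beta * KL(pi || pi_ref) has the well-known closed form pi*(y) proportional to pi_ref(y) * exp(r(y) / beta). Any response with zero probability under the base model therefore keeps zero probability, however large its reward. The function name, the three-response toy setup, and the numbers below are assumptions chosen purely for illustration.

```python
import math


def kl_regularized_optimum(ref_probs, rewards, beta):
    """Closed-form maximizer of E_pi[r] - beta * KL(pi || ref) on a finite set.

    Hypothetical helper for illustration: pi*(y) is proportional to
    ref(y) * exp(r(y) / beta).
    """
    weights = [p * math.exp(r / beta) for p, r in zip(ref_probs, rewards)]
    total = sum(weights)
    return [w / total for w in weights]


# Toy base model over three responses; the third is outside its support.
ref_probs = [0.7, 0.3, 0.0]
# The out-of-support response carries by far the highest reward.
rewards = [0.0, 1.0, 10.0]

policy = kl_regularized_optimum(ref_probs, rewards, beta=0.5)

for y, (p_ref, p_new) in enumerate(zip(ref_probs, policy)):
    print(f"response {y}: base={p_ref:.3f} optimum={p_new:.3f}")
```

In this toy case the weak but present second response is amplified, while the third response stays at probability zero despite its reward. This only illustrates the exact optimum of the regularized objective; it says nothing about approximate training or other regularizers.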

Simon Shaolei Du

@simonshaoleidu

6 months ago

PPO vs. DPO? 🤔 Our new paper proves that it depends on whether your models can represent the optimal policy and/or reward. Paper: arxiv.org/abs/2505.19770 Led by Ruizhe Shi and Minhak Song.

97 likes · 0 replies · 18 reposts

Tianyuan Zhang

@tianyuanzhang99

6 months ago

Bored of linear recurrent memories (e.g., linear attention) and want a scalable, nonlinear alternative? Our new paper “Test-Time Training Done Right” proposes LaCT (Large Chunk Test-Time Training), a highly efficient, massively scalable nonlinear memory with: 💡 Pure PyTorch

390 likes · 5 replies · 74 reposts

yobibyte

@y0b1byte

5 months ago

another good one!

436 likes · 1 reply · 42 reposts

Charlie London

@charlielondon02

5 months ago

I believe that policy gradient methods with only terminal rewards will have to break down at some level of task OOD-ness/complexity, and PRMs will be necessary. This looks like a really cool addition to the theory of PRMs and CoT.

5 likes · 0 replies · 1 repost

alphaXiv

@askalphaxiv

4 months ago

Google has shared the system prompt that got Gemini 2.5 Pro an IMO 2025 Gold Medal 🏅 The paper is now #1 trending on alphaXiv 📈

2,2K likes · 26 replies · 248 reposts

Niloofar (on faculty job market!)

@niloofar_mire

4 months ago

🧵 Academic job market season is almost here! There's so much rarely discussed—nutrition, mental and physical health, uncertainty, and more. I'm sharing my statements, essential blogs, and personal lessons here, with more to come in the upcoming weeks! ⬇️ (1/N)

245 likes · 3 replies · 36 reposts

Satnam Singh

@satnam6502

4 months ago

Delip Rao e/σ Amey | अमेय I have often failed interviews. I have even failed an interview where I was asked a question I used regularly at Google, which I knew inside out. I fail coding interviews not because I can't code, but because the stressful synthetic nature of the situation causes my brain

368 likes · 18 replies · 17 reposts

Fortune India

@fortuneindia

4 months ago

Shah Rukh Khan (@iamsrk) wins his first-ever National Award, 33 years after debuting on the big screen, for his performance in the action thriller film Jawan at the 2023 National Awards. The veteran actor shares the accolade with Vikrant Massey (@VikrantMassey), who won the award

2,2K likes · 24 replies · 657 reposts

Tanishq Mathew Abraham, Ph.D.

@iscienceluvr

4 months ago

The awesome Welch Labs released an incredible YouTube video about how AI image/video generation works! This great intro video discusses CLIP, diffusion models, and classifier-free guidance in a visual, approachable, concise way. Definitely check it out!

416 likes · 8 replies · 61 reposts

Shashank Gupta

@shashank27392

3 months ago

🚀 Excited to give a talk at Lossfunk, this Friday evening on Reinforcement Learning for Recommender Systems and Foundational Models! If you’re in Bangalore, I’d love to see you there! Please feel free to join in person or online. Registration link: lnkd.in/eN2RSxmG.

8 likes · 0 replies · 0 reposts

𝑺𝒉𝒆𝒃𝒂𝒔

@shebas_10dulkar

3 months ago

A few more Answers from 𝗦𝗮𝗰𝗵𝗶𝗻 𝗧𝗲𝗻𝗱𝘂𝗹𝗸𝗮𝗿 during today's ‘Ask Me Anything’ session on Reddit 💙 (1/7)

7,7K likes · 33 replies · 431 reposts

Maria Heuß

@maria_heuss

3 months ago

The call for papers for the #ECIR2026 IR-for-Good Track is now online here: ecir2026.eu/calls/call-for… Abstracts due: October 21. Papers due: October 28. We are revamping this track. For a summary of the changes that we are introducing this year, follow Bhaskar's thread.

3 likes · 0 replies · 3 reposts

Maria Heuß

@maria_heuss

2 months ago

Submissions are now OPEN for the #IR4Good track at #ECIR2026! Submit your societally-motivated papers to this special track: easychair.org/conferences/?c…. Abstracts due October 21, papers due October 28. Call: ecir2026.eu/calls/call-for…

1 like · 2 replies · 2 reposts

Nan Jiang

@nanjiang_cs

2 months ago

My 3rd blog post on PG, the topic I am least familiar with but get asked about a lot, so I thought I'd just put together the very limited stuff I know on this topic. Somehow the post gets cynical from time to time🙃 nanjiang.cs.illinois.edu/2025/09/29/pg.…

142 likes · 1 reply · 23 reposts

Caglar

@caglar_ee

a month ago

Video lectures, UCLA Reinforcement Learning of Large Language Models spring 2025, by Ernest Ryu ernestryu.com/courses/RL-LLM… youtube.com/playlist?list=…

596 likes · 1 reply · 65 reposts