Nived Rajaraman (@nived_rajaraman)'s Twitter Profile
Nived Rajaraman

@nived_rajaraman

EECS PhD student at Berkeley. Former intern at DeepMind. Reinforcement learning. I organize the BLISS seminar bliss.eecs.berkeley.edu/Seminar/index.…

ID: 2612375550

Website: https://nivedr.github.io/ · Joined: 08-07-2014 21:38:16

9 Tweets

77 Followers

111 Following

Banghua Zhu (@banghuaz)'s Twitter Profile Photo

I'll be at #NeurIPS2023, and the academic job market this year! RT will be greatly appreciated!

I work on statistics and information theory, with applications in robust statistics, offline RL, game theory, human-AI interactions and LLMs.

I'm recently working on better
Tanishq Mathew Abraham, Ph.D. (@iscienceluvr)'s Twitter Profile Photo

Scaling Test-Time Compute Without Verification or RL is Suboptimal

"In this paper, we prove that finetuning LLMs with verifier-based (VB) methods based on RL or search is far superior to verifier-free (VF) approaches based on distilling or cloning search traces, given a fixed
Amrith Setlur (@setlur_amrith)'s Twitter Profile Photo

🚨 RL or distillation/SFT: what to use to train next reasoning model? Which 📈 perf faster as we scale test compute?

We answer these in a principled way so you don't have to burn GPUs🔥.

🎯 Ans: RL w/ rewards or verification >> SFT/distillation 😱
arxiv.org/pdf/2502.12118 🧵⤵️
Eric Zhao (@ericzhao28)'s Twitter Profile Photo

Thinking for longer (e.g. o1) is only one of many axes of test-time compute. In a new Google AI paper, we instead focus on scaling the search axis. By just randomly sampling 200x & self-verifying, Gemini 1.5 ➡️ o1 performance. The secret: self-verification is easier at scale!
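
A rough sketch of the recipe this tweet describes (best-of-N sampling with self-verification): draw many independent candidate solutions, have the model score each one, and keep the highest-scoring candidate. The `generate` and `verify` callables below are hypothetical stand-ins for the underlying model calls, not the paper's actual API.

```python
from typing import Callable, List

def sample_and_self_verify(
    prompt: str,
    generate: Callable[[str], str],       # samples one candidate solution from the model
    verify: Callable[[str, str], float],  # model's self-verification score (higher = more likely correct)
    num_samples: int = 200,               # the tweet mentions sampling roughly 200x
) -> str:
    """Best-of-N search: sample candidates independently, return the one
    the model itself rates highest under self-verification."""
    candidates: List[str] = [generate(prompt) for _ in range(num_samples)]
    scores = [verify(prompt, c) for c in candidates]
    best_idx = max(range(num_samples), key=lambda i: scores[i])
    return candidates[best_idx]
```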
Aviral Kumar (@aviral_kumar2)'s Twitter Profile Photo

@nived_rajaraman will give an oral talk at the VerifAI workshop on why RL or verification is needed to effectively scale test-time compute! Lots of interesting insights in this paper!

At VerifAI workshop, 3:45pm, April 27 

arxiv.org/abs/2502.12118

x.com/setlur_amrith/…
Nived Rajaraman (@nived_rajaraman)'s Twitter Profile Photo

The abstract submission deadline for FoPt has been extended to the 21st of May (11:59pm UTC). Submission website: openreview.net/group?id=learn…

Dylan Foster 🐢 (@canondetortugas)'s Twitter Profile Photo

Announcing the first workshop on Foundations of Language Model Reasoning (FoRLM) at NeurIPS 2025!

📝Soliciting abstracts that advance foundational understanding of reasoning in language models, from theoretical analyses to rigorous empirical studies. 

📆 Deadline: Sept 3, 2025
Dylan Foster 🐢 (@canondetortugas)'s Twitter Profile Photo

Excited to announce our NeurIPS ’25 tutorial:

Foundations of Imitation Learning: From Language Modeling to Continuous Control 

With Adam Block & Max Simchowitz