Chirag Nagpal (@nagpalchirag)'s Twitter Profile
Chirag Nagpal

@nagpalchirag

ID: 107757999

Link: http://cs.cmu.edu/~chiragn · Joined: 23-01-2010 16:43:38

1.1K Tweets

1.1K Followers

728 Following

James Campbell (@jam3scampbell)'s Twitter Profile Photo

I fucking love CMU. Looking through the course catalog, there's like >25 courses that cover LLMs and topics on the frontier of AI. This is what happens when you give Machine Learning, Language Technology, Robotics, etc their own entire departments, as god intended 🫡

Lucas Beyer (bl16) (@giffmana)'s Twitter Profile Photo

People think my job is juggling equations and shit, but in reality 80% of my focus time is spent reducing memory usage one way or another.

Prateek Jain (@jainprateek_)'s Twitter Profile Photo

Excited to share that the Machine Learning and Optimization team at Google DeepMind India is hiring Research Scientists and Research Engineers! If you're passionate about cutting-edge AI research and building efficient, elastic, and safe LLMs, we'd love to hear from you. Check

Aran Komatsuzaki (@arankomatsuzaki)'s Twitter Profile Photo


Google presents Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning

Achieves 5-6× gain in sample efficiency, 1.5-5× more compute efficiency, and >6% gain in accuracy over ORMs on test-time search

arxiv.org/abs/2410.08146
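To make the ORM-vs-PRM contrast in the tweet above concrete, here is a minimal Python sketch of the two test-time search styles. All names (generate, orm_score, prm_score, extend, is_done) are illustrative placeholders, not anything from the paper's code: an outcome reward model (ORM) can only score complete solutions, so search degenerates to sample-and-rerank, while a dense process reward model (PRM) scores every partial prefix and can prune early.

    from typing import Callable, List

    def best_of_n_with_orm(
        generate: Callable[[], List[str]],        # samples one complete solution (a list of steps)
        orm_score: Callable[[List[str]], float],  # ORM: scores only finished solutions
        n: int = 16,
    ) -> List[str]:
        # ORM-style search: sample n full solutions, rerank, keep the best.
        candidates = [generate() for _ in range(n)]
        return max(candidates, key=orm_score)

    def beam_search_with_prm(
        extend: Callable[[List[str]], List[List[str]]],  # proposes next-step continuations of a prefix
        prm_score: Callable[[List[str]], float],         # PRM: scores any partial prefix
        is_done: Callable[[List[str]], bool],
        beam_width: int = 4,
        max_steps: int = 10,
    ) -> List[str]:
        # PRM-style search: dense per-step scores let us prune weak prefixes
        # early instead of paying for full rollouts, which is plausibly where
        # the reported efficiency gains over ORMs come from.
        beams: List[List[str]] = [[]]
        for _ in range(max_steps):
            expanded = [cand for beam in beams for cand in extend(beam)]
            beams = sorted(expanded, key=prm_score, reverse=True)[:beam_width]
            if all(is_done(beam) for beam in beams):
                break
        return max(beams, key=prm_score)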
Amrith Setlur (@setlur_amrith)'s Twitter Profile Photo


🚨 Exciting new results with dense process reward models (PRMs) for reasoning. Our PRMs scale
✅ search compute by 1.5-5x
✅ RL sample efficiency by 6x
✅ 3-4x ⬆️ accuracy gains vs prior works
❌ human supervision

What's the secret sauce 🤔? See 🧵 ⬇️
arxiv.org/pdf/2410.08146
Aviral Kumar (@aviral_kumar2)'s Twitter Profile Photo


🚨 New paper led by Amrith Setlur on process rewards for reasoning!

Our PRMs, which model a specific notion of "progress" reward (NO human supervision), improve:
- compute efficiency of search by 1.5-5x
- online RL by 6x
- 3-4x vs past PRM results

arxiv.org/abs/2410.08146

How? 🧵👇
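As a rough illustration of the "progress" reward the tweet describes: the per-step reward is the change in the estimated probability that a prover policy eventually solves the problem, estimated with Monte-Carlo rollouts, so no human step-level labels are needed. The sketch below is an assumption-laden paraphrase, not the authors' code; prover_rollout and is_correct are hypothetical helpers.

    from typing import Callable, List

    def success_prob(
        prefix: List[str],
        prover_rollout: Callable[[List[str]], List[str]],  # hypothetical: completes a solution from the prefix
        is_correct: Callable[[List[str]], bool],           # hypothetical: checks the final answer
        n_rollouts: int = 8,
    ) -> float:
        # Monte-Carlo estimate of P(prover eventually succeeds | prefix).
        wins = sum(is_correct(prover_rollout(prefix)) for _ in range(n_rollouts))
        return wins / n_rollouts

    def progress_reward(
        prefix: List[str],
        step: str,
        prover_rollout: Callable[[List[str]], List[str]],
        is_correct: Callable[[List[str]], bool],
    ) -> float:
        # Dense, label-free process reward: how much did this single step
        # move the prover's chance of solving the problem? Positive = progress.
        before = success_prob(prefix, prover_rollout, is_correct)
        after = success_prob(prefix + [step], prover_rollout, is_correct)
        return after - before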
Amit Sethi (@amit_sethi)'s Twitter Profile Photo

My student's paper was accepted at a NeurIPS workshop, but the Canadian authorities denied his visa, saying that he doesn't have ties outside Canada (his entire family is in India) and that attending the most important machine learning conference isn't legitimate business. 😭

Aravind Srinivas (@aravsrinivas)'s Twitter Profile Photo

Yep. I have been waiting for my green card for like the last 3 years. Still haven't gotten it. People mostly have no idea when they talk about immigration.

Abhilasha Ravichander (@lasha_nlp)'s Twitter Profile Photo

✨ I'm on the faculty job market for 2024-2025! ✨ My research focuses on advancing Responsible AI: enhancing factuality, robustness, and transparency in AI systems. I'm at #EMNLP2024 this week 🌴 and would love to chat about research and hear any advice!

Chirag Nagpal (@nagpalchirag)'s Twitter Profile Photo

Many years ago a certain Sir Humphrey and Jim Hacker from a certain Department of Administrative Affairs were tasked to reduce government inefficiency. I'm sure today they are proud of their cousins across the pond continuing their timeless legacy.

Ahmad Beirami (@abeirami)'s Twitter Profile Photo

Excited to share ๐ˆ๐ง๐Ÿ๐€๐ฅ๐ข๐ ๐ง! Alignment optimization objective implicitly assumes ๐˜ด๐˜ข๐˜ฎ๐˜ฑ๐˜ญ๐˜ช๐˜ฏ๐˜จ from the resulting aligned model. But we are increasingly using different and sometimes sophisticated inference-time compute algorithms. How to resolve this discrepancy?๐Ÿงต

Excited to share ๐ˆ๐ง๐Ÿ๐€๐ฅ๐ข๐ ๐ง!

Alignment optimization objective implicitly assumes sampling from the resulting aligned model. But we are increasingly using different and sometimes sophisticated inference-time compute algorithms.

How to resolve this discrepancy? 🧵
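To see the discrepancy the thread is pointing at, consider best-of-n decoding: the standard alignment objective is written for single samples from the policy, but the deployed system returns the highest-reward draw out of n, which follows a different, reward-tilted distribution. A minimal sketch, with placeholder callables rather than InfAlign's actual API:

    from typing import Callable, List

    def best_of_n_decode(
        sample: Callable[[str], str],         # placeholder: draws one response y ~ pi(. | x)
        reward: Callable[[str, str], float],  # placeholder: reward model r(x, y)
        prompt: str,
        n: int = 8,
    ) -> str:
        # The deployed "policy" is the argmax over n draws: a reward-tilted
        # distribution that differs from the pi that the KL-regularized
        # alignment objective E_{y~pi}[r(x, y)] - beta * KL(pi || pi_ref)
        # was optimized for.
        draws: List[str] = [sample(prompt) for _ in range(n)]
        return max(draws, key=lambda y: reward(prompt, y))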
Marktechpost AI Research News ⚡ (@marktechpost)'s Twitter Profile Photo


Google DeepMind Researchers Introduce InfAlign: A Machine Learning Framework for Inference-Aware Language Model Alignment

Researchers at Google DeepMind and Google Research have developed InfAlign, a machine-learning framework designed to align language models with