Paria Rashidinejad (@paria_rd) 's Twitter Profile
Paria Rashidinejad

@paria_rd

Incoming Assistant Professor of ECE @USC | Research Scientist at FAIR, GenAI @AIatMeta | PhD @Berkeley_EECS @berkeley_ai @CHAI_Berkeley

ID: 1634327660007669760

Joined: 10-03-2023 22:57:55

10 Tweets

104 Followers

191 Following

Max Simchowitz (@max_simchowitz) 's Twitter Profile Photo

Hey Everyone!! I will be giving a lecture at the RL Theory Virtual Seminar tomorrow, on my new paper about the "Pitfalls of Imitation Learning" in continuous action spaces. 🧵 below; please read because the time is somewhat TBD..... 🧐

Jason Lee (@jasondeanlee) 's Twitter Profile Photo

Our new work on scaling laws includes compute, model size, and number of samples. The result builds on an extremely fine-grained analysis of online SGD, developed over the last 8 years of understanding SGD on simple toy models (tensors, single-index models, multi-index models).

Nived Rajaraman (@nived_rajaraman) 's Twitter Profile Photo


Announcing the first workshop on Foundations of Post-Training (FoPT) at COLT 2025!

πŸ“ Soliciting abstracts/posters exploring theoretical & practical aspects of post-training and RL with language models!
πŸ—“οΈ Deadline: May 19, 2025
Yuandong Tian (@tydsh) 's Twitter Profile Photo

📢 Our travel planner solver (arxiv.org/abs/2410.16456, published in the EMNLP'24 Demo Track, and arxiv.org/abs/2411.13904) is now open-sourced at github.com/facebookresear… 🚀🚀 In these works, we build an LLM-equipped agent that can take user inputs in natural language, in either the

Simon Shaolei Du (@simonshaoleidu) 's Twitter Profile Photo

PPO vs. DPO? 🤔 Our new paper proves that it depends on whether your models can represent the optimal policy and/or reward. Paper: arxiv.org/abs/2505.19770 Led by Ruizhe Shi and Minhak Song

Xian Li (@xl_nlp) 's Twitter Profile Photo

Scaling RL is 🔥 what's the fuel? We propose a new paradigm to generate "synthetic environments": unlimited tasks with verifiable rewards for self-improvement. Welcome to the era of experience 🙌

Mehrdad Farajtabar (@mfarajtabar) 's Twitter Profile Photo


🧵 1/8 The Illusion of Thinking: Are reasoning models like o1/o3, DeepSeek-R1, and Claude 3.7 Sonnet really "thinking"? 🤔 Or are they just throwing more compute towards pattern matching?

The new Large Reasoning Models (LRMs) show promising gains on math and coding benchmarks,
Amrith Setlur (@setlur_amrith) 's Twitter Profile Photo


Introducing e3 🔥 Best <2B model on math 💪
Are LLMs implementing algos ⚒️ OR is thinking an illusion 🎩? Is RL only sharpening the base LLM distribution 🤔 OR discovering novel strategies outside the base LLM 💡? We answer these ⬇️
🚨 arxiv.org/abs/2506.09026
🚨 matthewyryang.github.io/e3/
Jiawei Zhao (@jiawzhao) 's Twitter Profile Photo

You can skip prompts that aren't useful for the current policy during training! 🔍 Efficient prompt selection is key to scaling RL training for LLM reasoning. We are actively building algorithms for an efficient and scalable RL training system. Stay tuned!

Xian Li (@xl_nlp) 's Twitter Profile Photo

The future of pretraining is synthetically organic ♻️♻️🌱🌱 We show that grounded synthetic data outperforms high-quality human-written web data such as curated DCLM.

Csaba Szepesvari (@csabaszepesvari) 's Twitter Profile Photo

First position paper I ever wrote. "Beyond Statistical Learning: Exact Learning Is Essential for General Intelligence" arxiv.org/abs/2506.23908 Background: I'd like LLMs to help me do math, but statistical learning seems inadequate to make this happen. What do you all think?

Brandon Amos (@brandondamos) 's Twitter Profile Photo

Excited to release AlgoTune!! It's a benchmark and coding agent for optimizing the runtime of numerical code 🚀 algotune.io 📚 algotune.io/paper.pdf 🤖 github.com/oripress/AlgoT… with Ofir Press, Ori Press, Patrick Kidger, Bartolomeo Stellato, Arman Zharmagambetov & many others 🧵