Aditya Modi (@adityamodi94) Twitter Tweets • TwiCopy

Filip Piekniewski🌻 🐘

8 years ago

Another #AI winter article this time from John Langford (Microsoft). Apparently I'm not isolated in my views. hunch.net/?p=9604328

Hal Daumé III

@haldaume3

7 years ago

John Langford on "A Real World Reinforcement Learning Research Program" -- basically laying out alternatives to the "games first, real problems later" approach to reinforcement learning research. And also a note about hiring.... 😁😁😁 hunch.net/?p=9828091

Amir-massoud Farahmand

@sologen

7 years ago

If you are interested in model-based reinforcement learning (MBRL), you want to read Iterative Value-Aware Model Learning, which is accepted at #NeurIPS2018. papers.nips.cc/paper/8121-ite…

Ben Recht

@beenwrekt

7 years ago

Dimitri has a new monograph on RL and is soliciting feedback: web.mit.edu/dimitrib/www/R…

Debadeepta Dey

@debadeepta

7 years ago

RL in the real-world: How to optimize computational pipelines on-the-fly! Eric Horvitz Besmira Nushi 💙💛 Adith Swaminathan @seanandrist alekh agarwal aditya modi Microsoft Research

Microsoft Research

@msftresearch

6 years ago

In a dynamic world, static configurations are no longer enough. Researchers propose a metareasoning approach to software pipeline optimization that leverages RL to monitor pipelines and adjust module parameters on the fly for optimal performance: aka.ms/AA78h6t #AAAI20

Clément Canonne (on Blue🦋Sky)

@ccanonne_

6 years ago

📊 So, the results... Learning discrete distributions over a finite domain of size k to distance ε, with probability 1-δ: how hard can it be? 1/9 x.com/ccanonne_/stat…

TCS+

@tcs_plus

6 years ago

We just scheduled Thomas Steinke to talk about his recent paper "Reasoning About Generalization via Conditional Mutual Information" (with Lydia Zakynthinou) on March 11! Mark your calendars, and stay tuned for further details! arxiv.org/abs/2001.09122

Yann LeCun

@ylecun

3 years ago

A new flavor of ConvNet crushes various flavors of transformers (as well as state-space models) for sequence modeling with long-range dependencies.

Nan Jiang

@nanjiang_cs

3 years ago

Come to Hall J #315 at 11a and Jinglin & Aditya Modi will tell you abt general learnability of Reward-free RL! R-f RL exhaustively explores the env & thus has heavily relied on linear structures. We now can handle non-linear FA w/ Bellman-eluder dim. More findings👇(1/2)

Aniket Deshmukh

@aniketde92

3 years ago

The 2nd Workshop on Decision Making for Modern IR and Recsys at The Web Conference is calling for papers (decisionmaking4ir.github.io/WWW-2023/)! The paper submission deadline is Feb 6. #AI #ML #recsys #decisionmaking #bandits #reinforcementlearning #informationretrieval

Aditya Modi

@adityamodi94

a year ago

Come, check out our poster on contextual goal-oriented RL tmr at NeurIPS!

Allen Nie (🇺🇦☮️)

@allen_a_nie

6 months ago

Decision-making with LLM can be studied with RL! Can an agent solve a task with text feedback (OS terminal, compiler, a person) efficiently? How can we understand the difficulty? We propose a new notion of learning complexity to study learning with language feedback only. 🧵👇

Allen Nie (🇺🇦☮️)

@allen_a_nie

5 months ago

Provably Learning from Language Feedback TLDR: RL theory can help us do better inference-time exploration with feedback. Work done with Wanqiao Xu, Ruijie Zheng, Ching-An Cheng @ICML2025, Aditya Modi, Adith Swaminathan 📰 arxiv.org/pdf/2506.10341 📍EXAIT Best Paper/Oral Sat 8:45-9:30 am

Allen Nie (🇺🇦☮️)

@allen_a_nie

5 months ago

If you missed Wanqiao Xu’s presentation, here are some of our slides! (The workshop will post full slides later on their website) Paper: arxiv.org/abs/2506.10341

ICML Conference

@icmlconf

8 years ago

ICML 2017 videos have been posted icml.cc/Conferences/20…

Steve Noah

@steve_os

8 years ago

No idea what this game is called, but whoever made it, is the devil. x.com/P_MEN876/statu…

Ben Recht

@beenwrekt

8 years ago

At 4% test error, the y-axis of this plot is contained in a 95% confidence interval. Each *data point* required 450 GPUs for 7 days. x.com/OriolVinyalsML…
