Mehrdad Moghimi (@mehrdadm96) Twitter Tweets • TwiCopy

Parishad BehnamGhader

a year ago

Instruction-following retrievers can efficiently and accurately search for harmful and sensitive information on the internet! 🌐💣 Retrievers need to be aligned too! 🚨🚨🚨 Work done with the wonderful Nicholas Meade and Siva Reddy 🔗 mcgill-nlp.github.io/malicious-ir/ Thread: 🧵👇

Likes: 42, Replies: 2, Reposts: 16

Amirhossein Kazemnejad

@a_kazemnejad

a year ago

Introducing nanoAhaMoment: Karpathy-style, single file RL for LLM library (<700 lines)
- super hackable
- no TRL / Verl, no abstraction💆‍♂️
- Single GPU, full param tuning, 3B LLM
- Efficient (R1-zero countdown < 10h)
comes with a from-scratch, fully spelled out YT video [1/n]

Likes: 1.1K, Replies: 15, Reposts: 164

Matthew Jackson

@jacksonmattt

a year ago

🌹 Today we're releasing Unifloral, our new library for Offline Reinforcement Learning! We make research easy:
⚛️ Single-file
🤏 Minimal
⚡️ End-to-end Jax
Best of all, we unify prior methods into one algorithm - a single hyperparameter space for research! ⤵️

Likes: 138, Replies: 5, Reposts: 35

Gautam Kamath

@thegautamkamath

a year ago

I wrote a post on how to connect with people (i.e., make friends) at CS conferences. These events can be intimidating, so here are some suggestions on how to navigate them. I'm late for #ICLR2025 #NAACL2025, but just in time for #AISTATS2025 and timely for #ICML2025 acceptances! 1/4

Likes: 647, Replies: 4, Reposts: 83

Jacob E. Kooi

@jacobekooi

10 months ago

📢New paper on arXiv: Hadamax Encoding: Elevating Performance in Model-Free Atari. (arxiv.org/abs/2505.15345) Our Hadamax (Hadamard max-pooling) encoder architecture improves the recent PQN algorithm’s Atari performance by 80%, allowing it to significantly surpass Rainbow-DQN!

Likes: 47, Replies: 2, Reposts: 9

Younggyo Seo

@younggyoseo

10 months ago

Excited to present FastTD3: a simple, fast, and capable off-policy RL algorithm for humanoid control -- with an open-source code to run your own humanoid RL experiments in no time! Thread below 🧵

Likes: 517, Replies: 14, Reposts: 107

Peyman Milanfar

@docmilanfar

9 months ago

Likes: 119, Replies: 0, Reposts: 7

Mehrdad Moghimi

@mehrdadm96

8 months ago

I’ll be at #ICML2025 and would love to connect and chat about everything RL. Feel free to reach out if you’re around!

Likes: 7, Replies: 0, Reposts: 1

Mehrdad Moghimi

@mehrdadm96

8 months ago

Had a great time presenting my work at #ICML2025! Really enjoyed the conversations and meeting so many awesome people.

Likes: 44, Replies: 0, Reposts: 3

Mehrdad Moghimi

@mehrdadm96

8 months ago

One of the best talks I’ve attended at #ICML2025: “Open-Ended and AI-Generating Algorithms in the Era of Foundation Models” by the brilliant Jeff Clune at the EXAIT workshop.

Likes: 10, Replies: 0, Reposts: 0

Mehrdad Moghimi

@mehrdadm96

7 months ago

Keynotes from RLC 2025 are out! youtube.com/playlist?list=…

Likes: 29, Replies: 0, Reposts: 3

Milad Aghajohari

@maghajohari

5 months ago

Introducing linear scaling of reasoning: 𝐓𝐡𝐞 𝐌𝐚𝐫𝐤𝐨𝐯𝐢𝐚𝐧 𝐓𝐡𝐢𝐧𝐤𝐞𝐫 Reformulate RL so thinking scales 𝐎(𝐧) 𝐜𝐨𝐦𝐩𝐮𝐭𝐞, not O(n^2), with O(1) 𝐦𝐞𝐦𝐨𝐫𝐲, architecture-agnostic. Train R1-1.5B into a markovian thinker with 96K thought budget, ~2X accuracy 🧵

Likes: 919, Replies: 14, Reposts: 200

Gautam Kamath

@thegautamkamath

3 months ago

Thomas G. Dietterich X This Chrome extension allows you to disable that tab (and hide a bunch of other features that I don't care about) chromewebstore.google.com/detail/control…

Likes: 9, Replies: 0, Reposts: 1

Mehrdad Moghimi

@mehrdadm96

3 months ago

Highly recommend!

Likes: 1, Replies: 0, Reposts: 0

Open Source Intel

@osint613

2 months ago

We have passed the 60 hour mark of the nationwide internet blackout in Iran. - NetBlocks

Likes: 768, Replies: 55, Reposts: 225

Ahmad Beirami @ ICLR 2025

@abeirami

2 months ago

x.com/i/article/2012…

Likes: 453, Replies: 1, Reposts: 80

Peyman Milanfar

@docmilanfar

2 months ago

So the story goes that Iran was ruled by Zahhak, an evil tyrant who had two snakes growing from his shoulders that required a daily meal of human brains. For decades, the country lived in terror as young men were sacrificed to feed the snakes. One day, Kaveh, a simple blacksmith

Likes: 1.1K, Replies: 19, Reposts: 226

Milad Aghajohari

@maghajohari

a month ago

We're organizing a workshop at ICML on multi-agent societies and are looking for reviewers. Review at most two papers (April 27-May 12). We will hand out 10 best reviewer awards of $100 as thanks. Register to review here: forms.gle/z3znC6Ed9zdnk9…

Likes: 64, Replies: 1, Reposts: 9