Fahim Tajwar (@fahimtajwar10)'s Twitter Profile
Fahim Tajwar

@fahimtajwar10

PhD Student @mldcmu @SCSatCMU
BS/MS from @Stanford

ID: 1385279468126625798

Website: https://tajwarfahim.github.io/ · Joined: 22-04-2021 17:09:23

58 Tweets

360 Followers

303 Following

Murtaza Dalal (@mihdalal):

Incredibly excited to share that Neural MP got accepted to IROS as an Oral presentation!! Huge congrats to the whole team (Jiahui (Jim) Yang, Russell Mendonca, Youssef Khaky, Russ Salakhutdinov, Deepak Pathak), but especially Jiahui (Jim) Yang for making this happen after I graduated! This now…

Allen Nie (🇺🇦☮️) (@allen_a_nie):

Decision-making with LLMs can be studied with RL! Can an agent solve a task efficiently with only text feedback (an OS terminal, a compiler, a person)? How can we understand the difficulty? We propose a new notion of learning complexity to study learning from language feedback alone. 🧵👇
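A minimal sketch of the kind of loop this setting describes, assuming only a generic LLM call and a text-returning environment (`query_llm` and `run_program` are hypothetical stand-ins, not the paper's API): the agent never sees a scalar reward, only the text its attempts produce.

```python
# Minimal sketch (not the paper's algorithm): an agent that must improve
# using only text feedback, never a numeric reward.

def query_llm(prompt: str) -> str:
    raise NotImplementedError("plug in any chat-completion call here")

def run_program(code: str) -> str:
    """Hypothetical environment: returns compiler/terminal output as text."""
    raise NotImplementedError

def solve_with_text_feedback(task: str, max_turns: int = 5) -> str:
    history = []  # (attempt, feedback) pairs -- the only learning signal
    attempt = ""
    for _ in range(max_turns):
        prompt = task + "\n" + "\n".join(
            f"Previous attempt:\n{a}\nFeedback:\n{f}" for a, f in history
        )
        attempt = query_llm(prompt)
        feedback = run_program(attempt)       # text only, no scalar reward
        if "error" not in feedback.lower():   # crude success check
            return attempt
        history.append((attempt, feedback))
    return attempt
```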

Yiding Jiang (@yidingjiang):

A mental model I find useful: all data acquisition (web scrapes, synthetic data, RL rollouts, etc.) is really an exploration problem 🔍. This perspective has some interesting implications for where AI is heading. Wrote down some thoughts: yidingjiang.github.io/blog/post/expl…

Sukjun (June) Hwang (@sukjun_hwang):

Tokenization has been the final barrier to truly end-to-end language models. We developed the H-Net: a hierarchical network that replaces tokenization with a dynamic chunking process directly inside the model, automatically discovering and operating over meaningful units of data
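A toy sketch of the dynamic-chunking idea, not the H-Net architecture itself: a learned scorer marks candidate chunk boundaries over byte embeddings, and each chunk is pooled into a single vector for a higher-level model. The hard threshold here is purely illustrative; an end-to-end model needs a differentiable chunking mechanism.

```python
import torch
import torch.nn as nn

class ToyDynamicChunker(nn.Module):
    """Illustrative only: score per-byte boundaries, pool each chunk."""
    def __init__(self, d_model: int = 64, vocab: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab, d_model)
        self.boundary_scorer = nn.Linear(d_model, 1)

    def forward(self, byte_ids: torch.Tensor, threshold: float = 0.5):
        x = self.embed(byte_ids)                                 # (seq, d_model)
        p = torch.sigmoid(self.boundary_scorer(x)).squeeze(-1)   # boundary prob per byte
        boundaries = (p > threshold).nonzero().flatten().tolist()
        starts = [0] + [b for b in boundaries if b > 0]
        chunks = []
        for i, s in enumerate(starts):
            e = starts[i + 1] if i + 1 < len(starts) else x.size(0)
            if e > s:
                chunks.append(x[s:e].mean(dim=0))  # mean-pool each chunk
        return torch.stack(chunks), p              # (num_chunks, d_model), probs

byte_ids = torch.randint(0, 256, (32,))
chunks, probs = ToyDynamicChunker()(byte_ids)
```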

Yiding Jiang (@yidingjiang):

I will be at ICML next week. If you are interested in chatting about anything related to generalization, exploration, and algorithmic information theory + computation, please get in touch 😀 (DM or email)!

My coauthors and I will be presenting 2 papers 👇:
Yiding Jiang (@yidingjiang):

Abitha will be presenting our work on training language models to predict further into the future beyond the next token and the benefits this objective brings. x.com/gm8xx8/status/…

Alex Robey (@alexrobey23):

On Monday, I'll be presenting a tutorial on jailbreaking LLMs + the security of AI agents with Hamed Hassani and Amin Karbasi at ICML. I'll be in Vancouver all week -- send me a DM if you'd like to chat about jailbreaking, AI agents, robots, distillation, or anything else!

Gokul Swamy (@g_k_swamy):

Recent work has seemed somewhat magical: how can RL with *random* rewards make LLMs reason? We pull back the curtain on these claims and find that this unexpected behavior hinges on certain *heuristics* baked into the RL algorithm. Our blog post: tinyurl.com/heuristics-con…
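A tiny numeric illustration of why random rewards are not automatically a no-op, assuming GRPO-style group-normalized advantages (this is just the setup the post interrogates, not its analysis): noise still produces non-zero advantages, and heuristics such as clipping then interact with them.

```python
import numpy as np

# Even pure-noise rewards yield non-zero group-normalized advantages,
# so the policy still gets pushed around.
rng = np.random.default_rng(0)
group_size = 8
rewards = rng.random(group_size)                         # random rewards in [0, 1)
adv = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
print("advantages from random rewards:", np.round(adv, 2))  # not all zero

# PPO/GRPO-style ratio clipping treats these noisy advantages asymmetrically,
# which is one of the heuristics whose effect the blog post examines.
```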

Sachin Goyal (@goyalsachin007):

1/Excited to share the first in a series of my research updates on LLM pretraining🚀.
Our new work shows *distilled pretraining*—increasingly used to train deployable models—has trade-offs:
✅ Boosts test-time scaling
⚠️ Weakens in-context learning
✨ Needs tailored data curation
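A minimal sketch of a generic distilled-pretraining objective, assuming standard knowledge distillation (next-token cross-entropy mixed with a KL term toward a teacher's next-token distribution); the paper's exact recipe may differ.

```python
import torch
import torch.nn.functional as F

def distilled_lm_loss(student_logits, teacher_logits, targets,
                      alpha: float = 0.5, temperature: float = 1.0):
    # Standard next-token cross-entropy on the hard targets.
    ce = F.cross_entropy(student_logits.view(-1, student_logits.size(-1)),
                         targets.view(-1))
    # KL divergence toward the teacher's (temperature-softened) distribution.
    t = temperature
    kd = F.kl_div(F.log_softmax(student_logits / t, dim=-1),
                  F.softmax(teacher_logits / t, dim=-1),
                  reduction="batchmean") * (t * t)
    return (1 - alpha) * ce + alpha * kd

# Toy shapes: batch=2, seq=4, vocab=10.
s = torch.randn(2, 4, 10, requires_grad=True)
tch = torch.randn(2, 4, 10)
tgt = torch.randint(0, 10, (2, 4))
loss = distilled_lm_loss(s, tch, tgt)
loss.backward()
```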
Anikait Singh (@anikait_singh_):

🚨🚨New Paper: Training LLMs to Discover Abstractions for Solving Reasoning Problems

Introducing RLAD, a two-player RL framework for LLMs to discover 'reasoning abstractions'—natural language hints that encode procedural knowledge for structured exploration in reasoning.🧵⬇️
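A hedged sketch of the two-player rollout described above; names like `abstraction_model`, `solver_model`, `generate`, and `is_correct` are illustrative stubs, not RLAD's API. One policy proposes a natural-language abstraction, the other solves the problem conditioned on it, and the abstraction is credited by how much it helps.

```python
def generate(model, prompt: str) -> str:
    raise NotImplementedError("any LLM sampling call")

def is_correct(answer: str, reference: str) -> bool:
    raise NotImplementedError("task-specific answer checker")

def rlad_rollout(problem: str, reference: str,
                 abstraction_model, solver_model, num_solutions: int = 4):
    # Player 1: propose a reasoning abstraction (a natural-language hint).
    hint = generate(abstraction_model,
                    f"Give a reasoning abstraction for: {problem}")
    # Player 2: sample several solutions conditioned on the hint.
    solutions = [generate(solver_model, f"{problem}\nHint: {hint}")
                 for _ in range(num_solutions)]
    rewards = [float(is_correct(s, reference)) for s in solutions]
    # The abstraction generator is rewarded by how useful its hint was,
    # e.g. the solvers' average success rate.
    abstraction_reward = sum(rewards) / len(rewards)
    return hint, solutions, rewards, abstraction_reward
```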
Sachin Goyal (@goyalsachin007):

📢 Multi-token prediction has long struggled with defining the right “auxiliary target,” leading to tons of heuristics. We show a core limitation of these and propose a simple & sweet idea: future summary prediction.

Introducing what I call 
🚀TL;DR token pretraining🚀
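A hedged sketch of what a "future summary" auxiliary objective could look like, assuming the summary target is simply the mean of the next-k token embeddings; this is one plausible instantiation for illustration, not necessarily the paper's target or loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FutureSummaryHead(nn.Module):
    """Auxiliary head: predict a summary of the upcoming window instead of
    each future token individually (illustrative target choice)."""
    def __init__(self, d_model: int):
        super().__init__()
        self.proj = nn.Linear(d_model, d_model)

    def forward(self, hidden, token_embeddings, k: int = 4):
        # hidden, token_embeddings: (batch, seq, d_model)
        pred = self.proj(hidden[:, :-k])  # prediction made at position t
        # Target: mean embedding of tokens t+1 ... t+k (a simple "future summary").
        future = torch.stack(
            [token_embeddings[:, i + 1 : i + 1 + k].mean(dim=1)
             for i in range(hidden.size(1) - k)], dim=1)
        return F.mse_loss(pred, future.detach())

# Toy usage: combine this auxiliary loss with the usual next-token CE loss.
B, T, D = 2, 16, 32
h = torch.randn(B, T, D)
emb = torch.randn(B, T, D)
aux_loss = FutureSummaryHead(D)(h, emb)
```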