Fahim Tajwar (@fahimtajwar10)'s Twitter Profile
Fahim Tajwar

@fahimtajwar10

PhD Student @mldcmu @SCSatCMU
BS/MS from @Stanford

ID: 1385279468126625798

Website: https://tajwarfahim.github.io/ · Joined: 22-04-2021 17:09:23

58 Tweets

360 Followers

303 Following

Murtaza Dalal (@mihdalal):

Incredibly excited to share that Neural MP got accepted to IROS as an Oral presentation!! Huge congrats to the whole team (Jiahui (Jim) Yang, Russell Mendonca, Youssef Khaky, Russ Salakhutdinov, Deepak Pathak), but especially Jiahui (Jim) Yang for making this happen after I graduated! This now…

Allen Nie (🇺🇦☮️) (@allen_a_nie):

Decision-making with LLMs can be studied with RL! Can an agent solve a task efficiently with only text feedback (an OS terminal, a compiler, a person)? How can we understand the difficulty? We propose a new notion of learning complexity to study learning from language feedback alone. 🧵👇
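A minimal sketch of the kind of loop this setting describes, assuming only a generic LLM call and a text-returning environment (`query_llm` and `run_program` are hypothetical stand-ins, not the paper's API): the agent never sees a scalar reward, only the text its attempts produce.

```python
# Minimal sketch (not the paper's algorithm): an agent that must improve
# using only text feedback, never a numeric reward.

def query_llm(prompt: str) -> str:
    raise NotImplementedError("plug in any chat-completion call here")

def run_program(code: str) -> str:
    """Hypothetical environment: returns compiler/terminal output as text."""
    raise NotImplementedError

def solve_with_text_feedback(task: str, max_turns: int = 5) -> str:
    history = []  # (attempt, feedback) pairs -- the only learning signal
    attempt = ""
    for _ in range(max_turns):
        prompt = task + "\n" + "\n".join(
            f"Previous attempt:\n{a}\nFeedback:\n{f}" for a, f in history
        )
        attempt = query_llm(prompt)
        feedback = run_program(attempt)       # text only, no scalar reward
        if "error" not in feedback.lower():   # crude success check
            return attempt
        history.append((attempt, feedback))
    return attempt
```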

Yiding Jiang (@yidingjiang):

A mental model I find useful: all data acquisition (web scrapes, synthetic data, RL rollouts, etc.) is really an exploration problem 🔍. This perspective has some interesting implications for where AI is heading. Wrote down some thoughts: yidingjiang.github.io/blog/post/expl…

Sukjun (June) Hwang (@sukjun_hwang):

Tokenization has been the final barrier to truly end-to-end language models. We developed the H-Net: a hierarchical network that replaces tokenization with a dynamic chunking process directly inside the model, automatically discovering and operating over meaningful units of data
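A toy sketch of the dynamic-chunking idea, not the H-Net architecture itself: a learned scorer marks candidate chunk boundaries over byte embeddings, and each chunk is pooled into a single vector for a higher-level model. The hard threshold here is purely illustrative; an end-to-end model needs a differentiable chunking mechanism.

```python
import torch
import torch.nn as nn

class ToyDynamicChunker(nn.Module):
    """Illustrative only: score per-byte boundaries, pool each chunk."""
    def __init__(self, d_model: int = 64, vocab: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab, d_model)
        self.boundary_scorer = nn.Linear(d_model, 1)

    def forward(self, byte_ids: torch.Tensor, threshold: float = 0.5):
        x = self.embed(byte_ids)                                 # (seq, d_model)
        p = torch.sigmoid(self.boundary_scorer(x)).squeeze(-1)   # boundary prob per byte
        boundaries = (p > threshold).nonzero().flatten().tolist()
        starts = [0] + [b for b in boundaries if b > 0]
        chunks = []
        for i, s in enumerate(starts):
            e = starts[i + 1] if i + 1 < len(starts) else x.size(0)
            if e > s:
                chunks.append(x[s:e].mean(dim=0))  # mean-pool each chunk
        return torch.stack(chunks), p              # (num_chunks, d_model), probs

byte_ids = torch.randint(0, 256, (32,))
chunks, probs = ToyDynamicChunker()(byte_ids)
```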

Yiding Jiang (@yidingjiang):

I will be at ICML next week. If you are interested in chatting about anything related to generalization, exploration, and algorithmic information theory + computation, please get in touch 😀 (DM or email)!

My coauthors and I will be presenting 2 papers 👇:
Yiding Jiang (@yidingjiang):

Abitha will be presenting our work on training language models to predict further into the future beyond the next token and the benefits this objective brings. x.com/gm8xx8/status/…

Alex Robey (@alexrobey23):

On Monday, I'll be presenting a tutorial on jailbreaking LLMs + the security of AI agents with Hamed Hassani and Amin Karbasi at ICML. I'll be in Vancouver all week -- send me a DM if you'd like to chat about jailbreaking, AI agents, robots, distillation, or anything else!

Gokul Swamy (@g_k_swamy):

Recent work has seemed somewhat magical: how can RL with *random* rewards make LLMs reason? We pull back the curtain on these claims and find that this unexpected behavior hinges on certain *heuristics* baked into the RL algorithm. Our blog post: tinyurl.com/heuristics-con…
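A tiny numeric illustration of why random rewards are not automatically a no-op, assuming GRPO-style group-normalized advantages (this is just the setup the post interrogates, not its analysis): noise still produces non-zero advantages, and heuristics such as clipping then interact with them.

```python
import numpy as np

# Even pure-noise rewards yield non-zero group-normalized advantages,
# so the policy still gets pushed around.
rng = np.random.default_rng(0)
group_size = 8
rewards = rng.random(group_size)                         # random rewards in [0, 1)
adv = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
print("advantages from random rewards:", np.round(adv, 2))  # not all zero

# PPO/GRPO-style ratio clipping treats these noisy advantages asymmetrically,
# which is one of the heuristics whose effect the blog post examines.
```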

Sachin Goyal (@goyalsachin007):

1/Excited to share the first in a series of my research updates on LLM pretraining🚀.
Our new work shows *distilled pretraining*—increasingly used to train deployable models—has trade-offs:
✅ Boosts test-time scaling
⚠️ Weakens in-context learning
✨ Needs tailored data curation
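A minimal sketch of a generic distilled-pretraining objective, assuming standard knowledge distillation (next-token cross-entropy mixed with a KL term toward a teacher's next-token distribution); the paper's exact recipe may differ.

```python
import torch
import torch.nn.functional as F

def distilled_lm_loss(student_logits, teacher_logits, targets,
                      alpha: float = 0.5, temperature: float = 1.0):
    # Standard next-token cross-entropy on the hard targets.
    ce = F.cross_entropy(student_logits.view(-1, student_logits.size(-1)),
                         targets.view(-1))
    # KL divergence toward the teacher's (temperature-softened) distribution.
    t = temperature
    kd = F.kl_div(F.log_softmax(student_logits / t, dim=-1),
                  F.softmax(teacher_logits / t, dim=-1),
                  reduction="batchmean") * (t * t)
    return (1 - alpha) * ce + alpha * kd

# Toy shapes: batch=2, seq=4, vocab=10.
s = torch.randn(2, 4, 10, requires_grad=True)
tch = torch.randn(2, 4, 10)
tgt = torch.randint(0, 10, (2, 4))
loss = distilled_lm_loss(s, tch, tgt)
loss.backward()
```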
Anikait Singh (@anikait_singh_):

🚨🚨New Paper: Training LLMs to Discover Abstractions for Solving Reasoning Problems

Introducing RLAD, a two-player RL framework for LLMs to discover 'reasoning abstractions'—natural language hints that encode procedural knowledge for structured exploration in reasoning.🧵⬇️
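A hedged sketch of the two-player rollout described above; names like `abstraction_model`, `solver_model`, `generate`, and `is_correct` are illustrative stubs, not RLAD's API. One policy proposes a natural-language abstraction, the other solves the problem conditioned on it, and the abstraction is credited by how much it helps.

```python
def generate(model, prompt: str) -> str:
    raise NotImplementedError("any LLM sampling call")

def is_correct(answer: str, reference: str) -> bool:
    raise NotImplementedError("task-specific answer checker")

def rlad_rollout(problem: str, reference: str,
                 abstraction_model, solver_model, num_solutions: int = 4):
    # Player 1: propose a reasoning abstraction (a natural-language hint).
    hint = generate(abstraction_model,
                    f"Give a reasoning abstraction for: {problem}")
    # Player 2: sample several solutions conditioned on the hint.
    solutions = [generate(solver_model, f"{problem}\nHint: {hint}")
                 for _ in range(num_solutions)]
    rewards = [float(is_correct(s, reference)) for s in solutions]
    # The abstraction generator is rewarded by how useful its hint was,
    # e.g. the solvers' average success rate.
    abstraction_reward = sum(rewards) / len(rewards)
    return hint, solutions, rewards, abstraction_reward
```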
Sachin Goyal (@goyalsachin007):

📢 Multi-token prediction has long struggled with defining the right “auxiliary target,” leading to tons of heuristics. We show a core limitation of these and propose a simple & sweet idea: future summary prediction.

Introducing what I call 
🚀TL;DR token pretraining🚀
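A hedged sketch of what a "future summary" auxiliary objective could look like, assuming the summary target is simply the mean of the next-k token embeddings; this is one plausible instantiation for illustration, not necessarily the paper's target or loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FutureSummaryHead(nn.Module):
    """Auxiliary head: predict a summary of the upcoming window instead of
    each future token individually (illustrative target choice)."""
    def __init__(self, d_model: int):
        super().__init__()
        self.proj = nn.Linear(d_model, d_model)

    def forward(self, hidden, token_embeddings, k: int = 4):
        # hidden, token_embeddings: (batch, seq, d_model)
        pred = self.proj(hidden[:, :-k])  # prediction made at position t
        # Target: mean embedding of tokens t+1 ... t+k (a simple "future summary").
        future = torch.stack(
            [token_embeddings[:, i + 1 : i + 1 + k].mean(dim=1)
             for i in range(hidden.size(1) - k)], dim=1)
        return F.mse_loss(pred, future.detach())

# Toy usage: combine this auxiliary loss with the usual next-token CE loss.
B, T, D = 2, 16, 32
h = torch.randn(B, T, D)
emb = torch.randn(B, T, D)
aux_loss = FutureSummaryHead(D)(h, emb)
```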