Athul Paul Jacob (@apjacob03) 's Twitter Profile
Athul Paul Jacob

@apjacob03

@MIT_CSAIL @UWaterloo | Co-created AIs: CICERO, Diplodocus | Ex: @google, @generalcatalyst, FAIR @MetaAI, @MSFTResearch AI, @MITIBMLab, @Mila_Quebec

ID: 373885410

Joined: 15-09-2011 10:27:31

356 Tweets

1.1K Followers

481 Following

Ekin Akyürek (@akyurekekin) 's Twitter Profile Photo

Why do we treat train and test times so differently? Why is one “training” and the other “in-context learning”? Just take a few gradient steps at test time — a simple way to increase test-time compute — and get SoTA on the ARC public validation set: 61%, the average human score! ARC Prize

Shikhar (@shikharmurty) 's Twitter Profile Photo

Super excited to share NNetnav : A new method for generating complex demonstrations to train web agents—driven entirely via exploration! Here's how we’re building useful browser agents, without expensive human supervision: 🧵👇 Code: github.com/MurtyShikhar/N… Preprint:

Laura Ruis (@lauraruis) 's Twitter Profile Photo

How do LLMs learn to reason from data? Are they ~retrieving the answers from parametric knowledge🦜? In our new preprint, we look at the pretraining data and find evidence against this: Procedural knowledge in pretraining drives LLM reasoning ⚙️🔢 🧵⬇️

Sherjil Ozair (@sherjilozair) 's Twitter Profile Photo

Very happy to hear that GANs are getting the Test of Time Award at NeurIPS 2024. The NeurIPS Test of Time Awards are given to papers that have stood the test of time for a decade. I took some time to reminisce about how GANs came about and how AI has evolved in the last decade.

Andrej Karpathy (@karpathy) 's Twitter Profile Photo

The (true) story of development and inspiration behind the "attention" operator, the one in "Attention is All you Need" that introduced the Transformer. From personal email correspondence with the author 🇺🇦 Dzmitry Bahdanau @ NeurIPS ~2 years ago, published here and now (with permission) following

Jacob Andreas (@jacobandreas) 's Twitter Profile Photo

Is your CS dept worried about what academic research should be in the age of LLMs? Hire one of my lab members! Leshem Choshen, Pratyusha Sharma and Ekin Akyürek are all on the job market with unique perspectives on the future of NLP: 🧵

Jacob Andreas (@jacobandreas) 's Twitter Profile Photo

Leshem Choshen is developing the technical+social infrastructure needed to support communal training of large-scale ML systems. Before MIT, he was one of the inventors of "model merging" techniques; these days he’s thinking about collecting & learning from user feedback.

Jacob Andreas (@jacobandreas) 's Twitter Profile Photo

Pratyusha Sharma studies how language shapes learned representations in artificial and biological intelligence. Her discoveries have shaped our understanding of systems as diverse as LLMs and sperm whales.

Jacob Andreas (@jacobandreas) 's Twitter Profile Photo

Ekin Akyürek builds tools for understanding & controlling algorithms that underlie reasoning in language models. You’ve likely seen his work on in-context learning; I'm just as excited about past work on linguistic generalization & future work on test-time scaling.

Eugene Vinitsky 🍒🦋 (@eugenevinitsky) 's Twitter Profile Photo

We've built a simulated driving agent that we trained on 1.6 billion km of driving with no human data. It is SOTA on every planning benchmark we tried. In self-play, it goes 20 years between collisions.

Samuel Sokota (@ssokota) 's Twitter Profile Photo

Model-free deep RL algorithms like NFSP, PSRO, ESCHER, & R-NaD are tailor-made for games with hidden information (e.g. poker). We performed the largest-ever comparison of these algorithms. We find that they do not outperform generic policy gradient methods, such as PPO. 1/N

Gokul Swamy (@g_k_swamy) 's Twitter Profile Photo

1.5 yrs ago, we set out to answer a seemingly simple question: what are we *actually* getting out of RL in fine-tuning? I'm thrilled to share a pearl we found on the deepest dive of my PhD: the value of RL in RLHF seems to come from *generation-verification gaps*. Get ready to🤿!

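The intuition behind a generation-verification gap is that checking a candidate answer is often much easier than producing one, so a policy paired with a verifier can recover much of RL's value simply by sampling several candidates and keeping the best-scored one. The best-of-n sketch below illustrates that intuition only; `generate` and `verify` are hypothetical stand-ins for a policy and a reward/verifier model, not the paper's method.

```python
import random

def best_of_n(generate, verify, prompt, n=8):
    """Sample n candidates and return the one the verifier scores highest.

    `generate(prompt)` and `verify(prompt, candidate)` are hypothetical
    placeholders for a generator policy and a verifier/reward model.
    """
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda c: verify(prompt, c))

# Toy usage: the "generator" guesses numbers, the "verifier" scores closeness.
target = 42
guess = lambda _prompt: random.randint(0, 100)
score = lambda _prompt, c: -abs(c - target)   # higher is better
print(best_of_n(guess, score, prompt="guess the number"))
```
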
Eugene Vinitsky 🍒🦋 (@eugenevinitsky) 's Twitter Profile Photo

Hiring researchers and engineers for a stealth, applied research company with a focus on RL x foundation models. Folks on the team already are leading RL / learning researchers. If you think you'd be good at the research needed to get things working in practice, email me

Belinda Li @ ICLR 2025 (@belindazli) 's Twitter Profile Photo

Past work has shown that world state is linearly decodable from LMs trained on text and games like Othello. But how do LMs *compute* these states? We investigate state tracking using permutation composition as a model problem, and discover interpretable, controllable procedures🧵

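Permutation composition as a model problem for state tracking can be made concrete in a few lines: the hidden "world state" after a sequence of moves is just the running composition of the permutations seen so far, and that composed state is what probes attempt to decode from the LM's activations. The snippet below is a generic illustration of the problem setup, not the paper's code; the composition convention and toy episode are assumptions.

```python
from itertools import permutations
import random

def compose(p, q):
    """Return the permutation 'apply q first, then p' (both given as index tuples)."""
    return tuple(p[i] for i in q)

# A toy state-tracking episode over S_3, the permutation group on 3 items.
elements = list(permutations(range(3)))
state = tuple(range(3))                      # identity: nothing has moved yet
moves = [random.choice(elements) for _ in range(5)]

for step, move in enumerate(moves, 1):
    state = compose(move, state)             # the "world state" is the running composition
    print(f"after move {step}: state = {state}")
```
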
MIT NLP (@nlp_mit) 's Twitter Profile Photo

Hello everyone! We are quite a bit late to the Twitter party, but welcome to the MIT NLP Group account! Follow along for the latest research from our labs as we dive deep into language, learning, and logic 🤖📚🧠

Ashish Vaswani (@ashvaswani) 's Twitter Profile Photo

Check out our latest research on data. We're releasing 24T tokens of richly labelled web data. We found it very useful for our internal data curation efforts. Excited to see what you build using Essential-Web v1.0!

Eugene Vinitsky 🍒🦋 (@eugenevinitsky) 's Twitter Profile Photo

Still in stealth but our team has grown to 20 and we're still hiring. If you're interested in joining the research frontier of deploying RL+LLM systems, shoot me an email to chat at ICML!

Eugene Vinitsky 🍒🦋 (@eugenevinitsky) 's Twitter Profile Photo

You've got a couple of GPUs and a desire to build a self-driving car from scratch. How can you turn those GPUs and self-play training into an incredible-looking agent? Come to W-821 at 4:30 in B2-B3 to learn from Marco Cusumano-Towner, Taylor W. Killian and Ozan Sener
