Athul Paul Jacob (@apjacob03) 's Twitter Profile
Athul Paul Jacob

@apjacob03

@MIT_CSAIL @UWaterloo | Co-created AIs: CICERO, Diplodocus | Ex: @google, @generalcatalyst, FAIR @MetaAI, @MSFTResearch AI, @MITIBMLab, @Mila_Quebec

ID: 373885410

Joined: 15-09-2011 10:27:31

356 Tweets

1.1K Followers

481 Following

Ekin Akyürek (@akyurekekin) 's Twitter Profile Photo

Why do we treat train and test times so differently? Why is one “training” and the other “in-context learning”? Just take a few gradient steps at test time — a simple way to increase test-time compute — and get SoTA on the ARC public validation set: 61%, the average human score! ARC Prize

Shikhar (@shikharmurty) 's Twitter Profile Photo

Super excited to share NNetnav : A new method for generating complex demonstrations to train web agents—driven entirely via exploration! Here's how we’re building useful browser agents, without expensive human supervision: 🧵👇 Code: github.com/MurtyShikhar/N… Preprint:

Laura Ruis (@lauraruis) 's Twitter Profile Photo

How do LLMs learn to reason from data? Are they ~retrieving the answers from parametric knowledge🦜? In our new preprint, we look at the pretraining data and find evidence against this: Procedural knowledge in pretraining drives LLM reasoning ⚙️🔢 🧵⬇️

Sherjil Ozair (@sherjilozair) 's Twitter Profile Photo

Very happy to hear that GANs are getting the Test of Time Award at NeurIPS 2024. The NeurIPS Test of Time Awards are given to papers that have stood the test of time for a decade. I took some time to reminisce about how GANs came about and how AI has evolved in the last decade.

Andrej Karpathy (@karpathy) 's Twitter Profile Photo

The (true) story of development and inspiration behind the "attention" operator, the one in "Attention is All you Need" that introduced the Transformer. From personal email correspondence with the author 🇺🇦 Dzmitry Bahdanau @ NeurIPS ~2 years ago, published here and now (with permission) following

Jacob Andreas (@jacobandreas) 's Twitter Profile Photo

Is your CS dept worried about what academic research should be in the age of LLMs? Hire one of my lab members! Leshem Choshen, Pratyusha Sharma and Ekin Akyürek are all on the job market with unique perspectives on the future of NLP: 🧵

Jacob Andreas (@jacobandreas) 's Twitter Profile Photo

Leshem Choshen is developing the technical+social infrastructure needed to support communal training of large-scale ML systems. Before MIT, he was one of the inventors of "model merging" techniques; these days he’s thinking about collecting & learning from user feedback.

Jacob Andreas (@jacobandreas) 's Twitter Profile Photo

Pratyusha Sharma studies how language shapes learned representations in artificial and biological intelligence. Her discoveries have shaped our understanding of systems as diverse as LLMs and sperm whales.

Jacob Andreas (@jacobandreas) 's Twitter Profile Photo

Ekin Akyürek builds tools for understanding & controlling algorithms that underlie reasoning in language models. You’ve likely seen his work on in-context learning; I'm just as excited about past work on linguistic generalization & future work on test-time scaling.

Eugene Vinitsky 🍒🦋 (@eugenevinitsky) 's Twitter Profile Photo

We've built a simulated driving agent that we trained on 1.6 billion km of driving with no human data. It is SOTA on every planning benchmark we tried. In self-play, it goes 20 years between collisions.

Samuel Sokota (@ssokota) 's Twitter Profile Photo

Model-free deep RL algorithms like NFSP, PSRO, ESCHER, & R-NaD are tailor-made for games with hidden information (e.g. poker). We performed the largest-ever comparison of these algorithms. We find that they do not outperform generic policy gradient methods, such as PPO. 1/N

Gokul Swamy (@g_k_swamy) 's Twitter Profile Photo

1.5 yrs ago, we set out to answer a seemingly simple question: what are we *actually* getting out of RL in fine-tuning? I'm thrilled to share a pearl we found on the deepest dive of my PhD: the value of RL in RLHF seems to come from *generation-verification gaps*. Get ready to🤿!

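The intuition behind a generation-verification gap is that checking a candidate answer is often much easier than producing one, so a policy paired with a verifier can recover much of RL's value simply by sampling several candidates and keeping the best-scored one. The best-of-n sketch below illustrates that intuition only; `generate` and `verify` are hypothetical stand-ins for a policy and a reward/verifier model, not the paper's method.

```python
import random

def best_of_n(generate, verify, prompt, n=8):
    """Sample n candidates and return the one the verifier scores highest.

    `generate(prompt)` and `verify(prompt, candidate)` are hypothetical
    placeholders for a generator policy and a verifier/reward model.
    """
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda c: verify(prompt, c))

# Toy usage: the "generator" guesses numbers, the "verifier" scores closeness.
target = 42
guess = lambda _prompt: random.randint(0, 100)
score = lambda _prompt, c: -abs(c - target)   # higher is better
print(best_of_n(guess, score, prompt="guess the number"))
```
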
Eugene Vinitsky 🍒🦋 (@eugenevinitsky) 's Twitter Profile Photo

Hiring researchers and engineers for a stealth, applied research company with a focus on RL x foundation models. Folks on the team already are leading RL / learning researchers. If you think you'd be good at the research needed to get things working in practice, email me

Belinda Li @ ICLR 2025 (@belindazli) 's Twitter Profile Photo

Past work has shown that world state is linearly decodable from LMs trained on text and games like Othello. But how do LMs *compute* these states? We investigate state tracking using permutation composition as a model problem, and discover interpretable, controllable procedures🧵

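Permutation composition as a model problem for state tracking can be made concrete in a few lines: the hidden "world state" after a sequence of moves is just the running composition of the permutations seen so far, and that composed state is what probes attempt to decode from the LM's activations. The snippet below is a generic illustration of the problem setup, not the paper's code; the composition convention and toy episode are assumptions.

```python
from itertools import permutations
import random

def compose(p, q):
    """Return the permutation 'apply q first, then p' (both given as index tuples)."""
    return tuple(p[i] for i in q)

# A toy state-tracking episode over S_3, the permutation group on 3 items.
elements = list(permutations(range(3)))
state = tuple(range(3))                      # identity: nothing has moved yet
moves = [random.choice(elements) for _ in range(5)]

for step, move in enumerate(moves, 1):
    state = compose(move, state)             # the "world state" is the running composition
    print(f"after move {step}: state = {state}")
```
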
MIT NLP (@nlp_mit) 's Twitter Profile Photo

Hello everyone! We are quite a bit late to the Twitter party, but welcome to the MIT NLP Group account! Follow along for the latest research from our labs as we dive deep into language, learning, and logic 🤖📚🧠

Ashish Vaswani (@ashvaswani) 's Twitter Profile Photo

Check out our latest research on data. We're releasing 24T tokens of richly labelled web data. We found it very useful for our internal data curation efforts. Excited to see what you build using Essential-Web v1.0!

Eugene Vinitsky 🍒🦋 (@eugenevinitsky) 's Twitter Profile Photo

Still in stealth but our team has grown to 20 and we're still hiring. If you're interested in joining the research frontier of deploying RL+LLM systems, shoot me an email to chat at ICML!

Eugene Vinitsky 🍒🦋 (@eugenevinitsky) 's Twitter Profile Photo

You've got a couple of GPUs and a desire to build a self-driving car from scratch. How can you turn those GPUs and self-play training into an incredible-looking agent? Come to W-821 at 4:30 in B2-B3 to learn from Marco Cusumano-Towner, Taylor W. Killian and Ozan Sener
