AvivTamarLab (@avivtamarlab) 's Twitter Profile
AvivTamarLab

@avivtamarlab

Aviv Tamar's Robot Learning Lab at Technion ECE

ID: 1756314953613537280

Created: 10-02-2024 13:52:30

14 Tweets

15 Followers

1 Following

Aviv Tamar (@avivtamar1) 's Twitter Profile Photo

On Mar 21 we forgot to shut down a machine running a simple sanity check experiment. A couple days later, we were surprised to see beautiful results, which we couldn’t quite explain! 👇

Aviv Tamar (@avivtamar1) 's Twitter Profile Photo

DOTE won BEST PAPER at #nsdi23 !!!

DOTE trains a deep neural network that directly outputs traffic engineering configurations. This works great for traffic that is difficult to predict accurately, e.g., MSFT's customer-facing WAN.

Yarin Perry will present Wed 14:40 EDT
👇
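"Directly outputs traffic engineering configurations" means the network maps demand history straight to routing decisions and is trained on the TE objective itself, rather than first predicting future demands. A toy sketch of the output side only (hypothetical names, not DOTE's actual architecture): a softmax per demand over its candidate paths yields valid split ratios.

```python
import numpy as np

def logits_to_split_ratios(logits):
    """Turn raw network outputs into a valid TE configuration:
    for each demand (row), a softmax over its candidate paths gives
    non-negative split ratios that sum to 1."""
    z = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(42)
raw = rng.normal(size=(3, 4))        # 3 demands, 4 candidate paths each
ratios = logits_to_split_ratios(raw)
```

Because the output is valid routing by construction, the whole pipeline can be trained end-to-end against the TE cost, which is why inaccurate demand forecasts stop being a bottleneck.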
Aviv Tamar (@avivtamar1) 's Twitter Profile Photo

Working on robotic bin picking? #ICRA2023 work led by Osaro’s research team shows how to improve throughput at deployment time. Idea: optimize the sequence of tool changes based on pretrained grasp success maps = better throughput for free! arxiv.org/abs/2302.07940 Poster Wed 9am

Aviv Tamar (@avivtamar1) 's Twitter Profile Photo

Meta-RL is all about inferring the task from a history of observations. But how to best learn a history embedding? In ContraBAR (#ICML2023 w/ Era Choshen) we investigate a contrastive learning approach. Paper: arxiv.org/pdf/2306.02418… Code: github.com/ec2604/ContraB… 👇
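The specific loss isn't spelled out in the tweet; as a rough illustration of the contrastive idea, a generic InfoNCE objective over history embeddings might look like this (all names and the batch setup are hypothetical, not ContraBAR's actual formulation):

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.1):
    """Generic InfoNCE: each anchor's positive is the matching row;
    all other rows in the batch serve as negatives."""
    # Normalize so the dot product is a cosine similarity.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                # (batch, batch) similarities
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Cross-entropy with the diagonal (matching pairs) as labels.
    return float(-np.mean(np.diag(log_probs)))

rng = np.random.default_rng(0)
h = rng.normal(size=(8, 16))                      # history embeddings (anchors)
loss = info_nce_loss(h, h + 0.01 * rng.normal(size=(8, 16)))
```

Minimizing this pulls embeddings of matching histories together and pushes mismatched ones apart, which is the sense in which the embedding learns to identify the task.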

Aviv Tamar (@avivtamar1) 's Twitter Profile Photo

Check out these beautiful videos 🤩 Deep *dynamic* latent particles is a new object-based video prediction method, led by Tal Daniel. Key idea: latent variables = particles, making it easier to learn latent dynamics 👇
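The "latents = particles" idea can be caricatured in a few lines (a toy sketch, not the actual architecture): each latent is a particle with a 2-D position plus features, and dynamics act per particle instead of on one monolithic latent vector.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy particle-structured latent state: K particles, each with an (x, y)
# position and a small feature vector.
K, feat_dim = 5, 4
positions = rng.uniform(0, 1, size=(K, 2))
features = rng.normal(size=(K, feat_dim))
velocities = rng.normal(scale=0.01, size=(K, 2))

def step(positions, velocities):
    """Trivial per-particle dynamics: constant velocity. The point is
    that dynamics act on each low-dimensional particle independently,
    a much easier prediction target than a full image latent."""
    return positions + velocities, velocities

for _ in range(10):
    positions, velocities = step(positions, velocities)
```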

Aviv Tamar (@avivtamar1) 's Twitter Profile Photo

Teacher-student algos are great for learning w/ partial observability: a teacher is trained with full info -> the student imitates it. But what if the full-info policy is very different from the partial-info one? TGRL cleverly balances imitation with RL, leading to a very practical method #ICML2023
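The balancing act the tweet describes can be sketched generically. In this toy version (hypothetical names; TGRL's contribution is adapting the trade-off automatically, whereas here the coefficient is a fixed hyperparameter), the student's objective adds a KL term pulling it toward the teacher's actions:

```python
import numpy as np

def kl(p, q):
    """KL divergence between two discrete action distributions."""
    return float(np.sum(p * np.log(p / q)))

def teacher_student_loss(rl_loss, student_probs, teacher_probs, alpha):
    """Hypothetical teacher-student objective: task (RL) loss plus an
    imitation term. Large alpha trusts the full-info teacher; small
    alpha lets RL take over when the teacher is not achievable under
    partial observability."""
    return rl_loss + alpha * kl(teacher_probs, student_probs)

teacher = np.array([0.7, 0.2, 0.1])   # full-information policy
student = np.array([0.4, 0.4, 0.2])   # partial-information policy
loss = teacher_student_loss(1.0, student, teacher, alpha=0.5)
```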

Orr Krupnik (@orrkrup) 's Twitter Profile Photo

What do you do when your robot world model just doesn't cut it? Fine-tune it, of course! New paper in #CoRL2023 next week, "Fine-Tuning Generative Models as an Inference Method for Robotic Tasks" 1/ >>> orrkrup.com/mace

Aviv Tamar (@avivtamar1) 's Twitter Profile Photo

We recently had a bit of a breakthrough in generalization in RL, led by Ev Zisselman. TL;DR: learning MaxEnt exploration generalizes better than maximizing reward. We use this to set a new SOTA for ProcGen + significantly improve on hard games like Heist! #NeurIPS2023 Details👇
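To make "learning MaxEnt exploration" concrete, here is a generic illustration of the kind of quantity such an objective maximizes (a sketch with hypothetical names, not the paper's method): the entropy of the policy's empirical state-visitation distribution, which rewards covering states rather than camping on reward.

```python
import numpy as np

def visitation_entropy(states, num_states):
    """Entropy of the empirical state-visitation distribution.
    A MaxEnt exploration objective trains the policy to maximize this,
    instead of directly maximizing task reward."""
    counts = np.bincount(states, minlength=num_states).astype(float)
    p = counts / counts.sum()
    nz = p[p > 0]
    return float(-(nz * np.log(nz)).sum())

# A policy that spreads visits uniformly achieves maximum entropy...
uniform_visits = np.repeat(np.arange(8), 10)
# ...while a reward-greedy policy camping in one state scores zero.
greedy_visits = np.zeros(80, dtype=int)

h_uniform = visitation_entropy(uniform_visits, 8)   # log(8)
h_greedy = visitation_entropy(greedy_visits, 8)     # 0.0
```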

Zohar Rimon (@zoharrimon) 's Twitter Profile Photo

Our work "MAMBA: An Effective World Model Approach for Meta Reinforcement Learning" got accepted to ICLR 2024! It was super fun working on this one with tom jurgenson, Orr Krupnik, Gilad Adler, and Aviv Tamar. Paper: arxiv.org/abs/2403.09859 Code: github.com/zoharri/mamba 🧵 [1/9]

Aviv Tamar (@avivtamar1) 's Twitter Profile Photo

Generalization in RL is hard. Compositional generalization is even harder… We made some progress in our #ICLR2024 spotlight w/ Dan Haramati and Tal Daniel: RL trains a robotic manipulation policy that generalizes to different numbers of objects. Code+paper: sites.google.com/view/entity-ce…

Mirco Mutti (@mirco_mutti) 's Twitter Profile Photo

When does meta-training truly benefit RL efficiency? In our #ICML2024 paper, Aviv Tamar and I analyse the conditions under which fast regret rates can be achieved at test time arxiv.org/abs/2406.02282 1/5

Aviv Tamar (@avivtamar1) 's Twitter Profile Photo

This project completely reshaped my view on tree search + neural networks arxiv.org/abs/2406.02103 Using a NN for value/policy in MCTS is standard, but if the network errs, search performance goes down. We asked: if we have uncertainty estimates, can we exploit them?

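The question posed above, exploiting value-uncertainty estimates during search, can be illustrated with a generic uncertainty-aware selection rule (a sketch, not the paper's algorithm): add the value's standard deviation as an optimism bonus when picking which child node to expand.

```python
def select_child(children, beta=1.0):
    """Pick the child maximizing value + beta * uncertainty.
    'children' is a list of (mean_value, std_estimate) pairs from a
    network that also reports how unsure it is. With beta > 0 the
    search hedges against network errors by trying uncertain nodes."""
    scores = [v + beta * s for v, s in children]
    return max(range(len(children)), key=lambda i: scores[i])

# Child 0 looks slightly better on mean value, but child 1's estimate
# is far less certain, so an optimistic search tries child 1 first.
children = [(0.60, 0.01), (0.55, 0.30)]
best = select_child(children, beta=1.0)   # -> 1
```

With beta = 0 this reduces to trusting the network's point estimate, which is exactly the failure mode when the network errs.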
Aviv Tamar (@avivtamar1) 's Twitter Profile Photo

Want to learn / teach RL? Check out new book draft: Reinforcement Learning - Foundations
sites.google.com/view/rlfoundat…
W/ shiemannor and Yishay Mansour
This is a rigorous first course in RL, based on our teaching at TAU CS and Technion ECE.
Aviv Tamar (@avivtamar1) 's Twitter Profile Photo

Robots will eventually be able to explore/adapt, but how can we trust strategies that are hard to interpret? We take the first step in *interpretable* exploration, and find a tree-like exploration rule that is both efficient (low regret) and interpretable (shallow tree) #ICML25