Alexander Nikulin @ ICLR (@how_uhh) 's Twitter Profile

@how_uhh

howuhh.github.io dunnolab.ai

ID: 533400228

Joined: 22-03-2012 17:18:12

1.1K Tweets

194 Followers

607 Following

Peter Henderson (@peterhndrsn) 's Twitter Profile Photo

The next ~1-4 years will be taking the 2017-2020 years of Deep RL and scaling up: exploration, generalization, long-horizon tasks, credit assignment, continual learning, multi-agent interaction! Lots of cool work to be done! 🎮🤖

But we shouldn't forget big lessons from back
Denis Tarasov (@ml_is_overhyped) 's Twitter Profile Photo

LLMs are amazing because they can learn in context — read, adapt, and act.

Can we do the same for reinforcement learning? That’s the promise of In-Context RL (ICRL).

But existing offline ICRL methods don’t even optimize rewards.

Our new paper shows why RL matters 🧵
Vladislav Kurenkov (@vladkurenkov) 's Twitter Profile Photo

🚀 Introducing cadrille: a new SOTA model for CAD reconstruction from images, point clouds, and text — all in one framework using RLVR.

Multimodal inputs + RLVR = clean, editable 3D models.

🧵👇
Alexander Nikulin @ ICLR (@how_uhh) 's Twitter Profile Photo

I think Space Station 13 could be the perfect benchmark for VLA agents... very deep skill system, multi-agency, open-endedness, a lot of role-play... a custom server with agents and humans would be a dream

Tim Rocktäschel (@_rockt) 's Twitter Profile Photo

Great post by Mikael Henaff (after ascending, what an achievement!) on what makes The NetHack Learning Environment so extremely difficult for AI (even LLMs: balrogai.com). "While NetHack is complex in comparison to other RL benchmarks, it still contains only a tiny fraction of the

Ilya Zisman (@suessmannn) 's Twitter Profile Photo

Replying to Ivan Rubachev: During my university years, I thought it was cool to skip classes because I had a real, paying job. Now I’m skipping work to learn what I missed in those classes. 🤭

Karl Pertsch (@karlpertsch) 's Twitter Profile Photo

We’re releasing the RoboArena today!🤖🦾 Fair & scalable evaluation is a major bottleneck for research on generalist policies. We’re hoping that RoboArena can help! We provide data, model code & sim evals for debugging! Submit your policies today and join the leaderboard! :) 🧵

Denis Tarasov (@ml_is_overhyped) 's Twitter Profile Photo

I’m asking for help. I was meant to start my PhD with Tim Rocktäschel and Roberta Raileanu at UCL, but my UK background check was refused. My appeal seems unlikely to succeed, so I’m urgently searching for any PhD or research positions in academia or industry. Any help is appreciated.

Nikita Kachaev (@judokach) 's Twitter Profile Photo

👀 Action fine-tuning often blinds VLA models: they lose the visual–language (VL) priors that made them smart. We show how to keep those priors intact with a tiny alignment loss. 🤖 ↓

Vladislav Kurenkov (@vladkurenkov) 's Twitter Profile Photo

We released 87 hours of LeRobot SO100/SO101 datasets. It is a unified, cleaned, and annotated repackage of 598 open-source community datasets (SO100 and SO101), totaling 22,709 episodes, ~9.4M frames, and 563 tasks.
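The headline numbers in the tweet are internally consistent, which is easy to verify. A minimal sanity-check sketch, assuming a ~30 fps capture rate (a common default for SO-100/SO-101 recordings; the fps value is not stated in the tweet itself):

```python
# Figures quoted in the tweet.
total_hours = 87
episodes = 22_709
frames = 9_400_000  # "~9.4M frames"

# Implied capture rate: total frames over total recording time.
implied_fps = frames / (total_hours * 3600)

# Average episode length in frames and (given the implied fps) seconds.
frames_per_episode = frames / episodes
avg_episode_seconds = frames_per_episode / implied_fps

print(f"implied fps: {implied_fps:.1f}")            # ~30 fps
print(f"frames/episode: {frames_per_episode:.0f}")  # ~414 frames
print(f"avg episode: {avg_episode_seconds:.1f} s")
```

The implied frame rate works out to almost exactly 30 fps, so the hours, frames, and episode counts line up with each other.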