Eric Cai (@eywcai)'s Twitter Profile
Eric Cai

@eywcai

ID: 1815163002607247360

Joined: 21-07-2024 23:13:05

53 Tweets

39 Followers

81 Following

Geoffrey Hinton (@geoffreyhinton)'s Twitter Profile Photo

I think Elon Musk should be expelled from the British Royal Society. Not because he peddles conspiracy theories and makes Nazi salutes, but because of the huge damage he is doing to scientific institutions in the US. Now let's see if he really believes in free speech.

Vikash Kumar (@vikashplus)'s Twitter Profile Photo

Everyone, let's EVALUATE grasps on more than deformable objects (which conform to any grasp you make) to assess generalization.

Suneel Belkhale (@suneel_belkhale)'s Twitter Profile Photo

Trained a robot policy and want to measure generalization? Generalization evals vary across studies, and this makes progress hard to track. Enter ★-Gen: a taxonomy of generalization for manipulation that guides evaluation design and fosters new benchmarks. 🧵⬇️ 1/8

Tairan He (@tairanhe99)'s Twitter Profile Photo

Today I was asked "Why humanoid?" again. The answer I gave is "It is the best embodiment to leverage human data." Introducing Humanoid Policy ~ Human Policy! A scalable way to collect and co-train humanoid policy with human data! human-as-robot.github.io Code and datasets are

Tairan He (@tairanhe99)'s Twitter Profile Photo

Back in undergrad, I spent days inventing "new RL algos" on single-CPU Humanoid-v2 just to hit SOTA. Totally overfit. Now vanilla PPO gets humanoids walking in 10 mins with parallel envs. I’ve come to value results that either: 1. Scale RL via massive sim + multi-GPU/node
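
To make the "parallel envs" point concrete, here is a minimal sketch of the vectorized rollout pattern that a vanilla PPO update consumes. This is an illustrative assumption, not Tairan's setup: Gymnasium's vector API and CartPole stand in for a GPU humanoid simulator, and a random policy stands in for the actor network.

```python
# Minimal sketch of the parallel-rollout pattern behind "vanilla PPO +
# parallel envs". CartPole substitutes for a humanoid sim; the env count,
# horizon, and random policy are placeholder assumptions.
import numpy as np
import gymnasium as gym

N_ENVS, HORIZON = 8, 128  # real setups use thousands of GPU-sim envs

envs = gym.vector.SyncVectorEnv(
    [lambda: gym.make("CartPole-v1") for _ in range(N_ENVS)]
)
obs, _ = envs.reset(seed=0)

# Collect one batch of experience: (HORIZON, N_ENVS, ...) arrays that a
# PPO update would consume. Actions come from a random policy here.
obs_buf, act_buf, rew_buf = [], [], []
for _ in range(HORIZON):
    actions = envs.action_space.sample()          # policy(obs) in real PPO
    next_obs, rewards, term, trunc, _ = envs.step(actions)
    obs_buf.append(obs)
    act_buf.append(actions)
    rew_buf.append(rewards)
    obs = next_obs                                # vector envs auto-reset

batch_obs = np.stack(obs_buf)                     # (HORIZON, N_ENVS, obs_dim)
print(batch_obs.shape, np.stack(rew_buf).sum(axis=0))  # per-env returns
```

With thousands of simulator instances in place of N_ENVS=8, the same loop produces the large on-policy batches that let plain PPO get humanoids walking in minutes.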

Stefan Stojanov (@sstj389)'s Twitter Profile Photo

Video prediction foundation models implicitly learn how objects move in videos. Can we learn how to extract these representations to accurately track objects in videos _without_ any supervision? Yes! 🧵 Work done with: Rahul Venkatesh, Seungwoo (Simon) Kim, Jiajun Wu and Daniel Yamins
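
For intuition, here is a hedged sketch of the generic recipe this line of work builds on: if a frozen video model yields dense per-frame features, a point can be tracked label-free by nearest-neighbor matching of feature descriptors from frame to frame. The random tensors below stand in for real model features; this shows the general idea, not the paper's specific extraction method.

```python
# Hedged sketch: unsupervised point tracking via nearest-neighbor
# matching of dense per-frame features. Random tensors are placeholder
# stand-ins for features from a frozen video prediction model.
import torch

T, C, H, W = 8, 64, 32, 32
feats = torch.randn(T, C, H, W)                   # placeholder features
feats = feats / feats.norm(dim=1, keepdim=True)   # unit-norm each descriptor

query = (16, 16)                                  # (row, col) in frame 0
track = [query]
q_feat = feats[0, :, query[0], query[1]]          # descriptor of query point

for t in range(1, T):
    sim = torch.einsum("c,chw->hw", q_feat, feats[t])  # cosine similarity map
    idx = sim.flatten().argmax()                  # best match in frame t
    r, c = divmod(idx.item(), W)
    track.append((r, c))
    q_feat = feats[t, :, r, c]                    # update template as we go

print(track)  # per-frame (row, col) estimates for the tracked point
```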

Mihir Prabhudesai (@mihirp98)'s Twitter Profile Photo

1/ Happy to share UniDisc (Unified Multimodal Discrete Diffusion). We train a 1.5 billion parameter transformer model from scratch on 250 million image/caption pairs using a **discrete diffusion objective**. Our model has all the benefits of diffusion models but now in
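
For readers unfamiliar with the objective family named here, below is a hedged sketch of one training step of masked ("absorbing-state") discrete diffusion: corrupt a token sequence by masking a time-dependent fraction of positions, then train the model to predict the original tokens at the masked positions. The toy denoiser, vocabulary size, and linear masking schedule are illustrative assumptions, not UniDisc's actual architecture or schedule.

```python
# Hedged sketch of a masked discrete diffusion training step.
# Everything below (model, sizes, schedule) is an illustrative stand-in.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, MASK_ID, SEQ, DIM = 1024, 1024, 32, 128    # MASK_ID is one extra token

model = nn.Sequential(                            # toy denoiser stand-in
    nn.Embedding(VOCAB + 1, DIM),
    nn.TransformerEncoder(
        nn.TransformerEncoderLayer(DIM, 4, batch_first=True), num_layers=2
    ),
    nn.Linear(DIM, VOCAB),
)

x0 = torch.randint(0, VOCAB, (8, SEQ))            # clean token sequences
t = torch.rand(8, 1).clamp(min=0.05)              # diffusion time in (0, 1]
mask = torch.rand(8, SEQ) < t                     # mask more tokens as t -> 1
xt = torch.where(mask, torch.full_like(x0, MASK_ID), x0)

logits = model(xt)                                # predict original tokens
loss = F.cross_entropy(logits[mask], x0[mask])    # score masked positions only
loss.backward()
print(float(loss))
```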

Anirudha Majumdar (@majumdar_ani)'s Twitter Profile Photo

I sent a message to my PhD students and postdocs at Princeton University a couple of weeks ago regarding freezes/cuts to federal research funding (this was before the freeze on federal funding to Princeton). I am sharing it here in case others find it helpful in having similar

Chris Paxton (@chris_j_paxton)'s Twitter Profile Photo

It really, really bothers me when people (in ai/robotics) are overhyping/overselling this stuff. I generally choose what I say sort of carefully, and I do genuinely believe it when I say I'm extremely optimistic about robotics/ai. Often feels like many do not

Ayush Jain (@ayushjain1144)'s Twitter Profile Photo

1/ Despite having access to rich 3D inputs, embodied agents still rely on 2D VLMs—due to the lack of large-scale 3D data and pre-trained 3D encoders. We introduce UniVLG, a unified 2D-3D VLM that leverages 2D scale to improve 3D scene understanding. univlg.github.io

David Bau (@davidbau)'s Twitter Profile Photo

Dear MAGA friends, I have been worrying about STEM in the US a lot, because right now the Senate is writing new laws that cut 75% of the STEM budget in the US. Sorry for the long post, but the issue is really important, and I want to share what I know about it. The entire

Phillip Isola (@phillip_isola)'s Twitter Profile Photo

Our computer vision textbook is now available for free online here: visionbook.mit.edu We are working on adding some interactive components like search and (beta) integration with LLMs. Hope this is useful and feel free to submit Github issues to help us improve the text!

Yunzhu Li (@yunzhuliyz)'s Twitter Profile Photo

**Steerability** remains one of the key issues for current vision-language-action models (VLAs). Natural language is often ambiguous and vague: "Hang a mug on a branch" vs "Hang the left mug on the right branch." Many works claim to handle language input, yet the tasks are often

Rajat Kumar Jenamani (@rkjenamani)'s Twitter Profile Photo

Most assistive robots live in labs. We want to change that. FEAST enables care recipients to personalize mealtime assistance in-the-wild, with minimal researcher intervention across diverse in-home scenarios. 🏆 Outstanding Paper & Systems Paper Finalist at Robotics: Science and Systems 🧵1/8

Yufei Wang (@yufeiwang25)'s Twitter Profile Photo

Introducing ArticuBot 🤖 at #RSS2025, in which we learn a single policy for manipulating diverse articulated objects across 3 robot embodiments in different labs, kitchens & lounges, achieved via large-scale simulation and hierarchical imitation learning. articubot.github.io 🧵

Rajat Kumar Jenamani (@rkjenamani)'s Twitter Profile Photo

Really excited to share that FEAST won the Best Paper Award at #RSS2025! Huge thanks to everyone who’s shaped this work, from roboticists to care recipients, caregivers, and occupational therapists. ❤️

Yunchu (@yunchuzh)'s Twitter Profile Photo

How can we continuously improve large pretrained behavior policies when 0-shot performance is not good enough? Directly finetuning the base policy via RL tends to be sample-inefficient. Can we squeeze more juice from the base policy to enable automatic and efficient performance
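
One common recipe for "squeezing more juice" out of a frozen base policy, offered here only as a hedged illustration and not necessarily the method in this thread, is best-of-N action selection: sample several candidate actions from the base policy and rerank them with a learned critic, so only the small critic needs training. The policy head, critic, and N below are placeholder assumptions.

```python
# Hedged sketch: improve a frozen base policy at inference time by
# sampling N candidate actions and reranking with a learned critic.
# All modules here are toy stand-ins, not the thread's actual method.
import torch
import torch.nn as nn

OBS_DIM, ACT_DIM, N = 16, 4, 32

base_policy = nn.Linear(OBS_DIM, ACT_DIM * 2)     # frozen Gaussian head stand-in
critic = nn.Linear(OBS_DIM + ACT_DIM, 1)          # learned Q(s, a) stand-in

def select_action(obs):
    with torch.no_grad():
        mu, log_std = base_policy(obs).chunk(2, dim=-1)
        cand = mu + log_std.exp() * torch.randn(N, ACT_DIM)   # N samples
        q = critic(torch.cat([obs.expand(N, -1), cand], dim=-1)).squeeze(-1)
    return cand[q.argmax()]                       # best-of-N under the critic

print(select_action(torch.randn(OBS_DIM)))
```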

Russ Tedrake (@russtedrake)'s Twitter Profile Photo

TRI's latest Large Behavior Model (LBM) paper landed on arXiv last night! Check out our project website: toyotaresearchinstitute.github.io/lbm1/ One of our main goals for this paper was to put out a very careful and thorough study on the topic to help people understand the state of the