It's notoriously difficult to model the mechanics of compliant robot jaw tips during grasping! We found that a new tool from computer graphics can help. IPC-GraspSim, from AUTOLab at UC Berkeley. Paper, data, video: sites.google.com/berkeley.edu/i… (1/9)
Wouldn't it be nice if ChatGPT could find your missing keys for you? Our latest research from Berkeley AI Research + Google AI suggests that robots can use large language models (LLMs) to find hidden objects faster. 🧵👇
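A minimal sketch of the general idea (not the paper's implementation): ask an LLM to score how likely each candidate location is to contain the missing object, then search in that order. `query_llm` is a hypothetical placeholder for whatever LLM API you use.

```python
# Hypothetical sketch: use an LLM's semantic prior to order a search over
# candidate locations. `query_llm` is a stub standing in for a real LLM call.

def query_llm(prompt: str) -> float:
    """Placeholder for a real LLM call that parses a likelihood from the reply."""
    return 0.5  # stub; replace with an actual API call and response parsing

def rank_search_locations(target: str, candidates: list[str]) -> list[str]:
    """Order candidate locations by LLM-estimated probability of containing `target`."""
    scores = {}
    for loc in candidates:
        prompt = (
            f"On a scale from 0 to 1, how likely is a '{target}' "
            f"to be found in/on the '{loc}'? Answer with a single number."
        )
        scores[loc] = query_llm(prompt)
    return sorted(candidates, key=lambda loc: scores[loc], reverse=True)

# Usage: search the most semantically likely spots first.
order = rank_search_locations("keys", ["kitchen drawer", "coat pocket", "bathtub"])
```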
Can vision and language models be extended to include touch? Yes! We will present a new touch-vision-language dataset collected in the wild and Touch-Vision-Language Models (TVLMs) trained on this dataset at #ICML2024. 1/6
tactile-vlm.github.io
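A hedged sketch of how touch can be pulled into a vision-language embedding space (illustrative, not the released TVL training code): train a small tactile encoder against frozen vision/language embeddings with a CLIP-style contrastive loss. All dimensions and the toy encoder are assumptions.

```python
# Hedged sketch: align a tactile encoder with a frozen vision-language embedding
# space via an InfoNCE contrastive loss. Shapes and architecture are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TactileEncoder(nn.Module):
    """Toy ConvNet mapping a tactile image to the shared embedding space."""
    def __init__(self, embed_dim: int = 512):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.proj = nn.Linear(64, embed_dim)

    def forward(self, x):
        return F.normalize(self.proj(self.backbone(x)), dim=-1)

def contrastive_loss(tactile_emb, anchor_emb, temperature: float = 0.07):
    """Symmetric InfoNCE over a batch of paired (tactile, vision/language) embeddings."""
    logits = tactile_emb @ anchor_emb.t() / temperature
    targets = torch.arange(len(logits), device=logits.device)
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

# Usage with dummy data: paired tactile frames and frozen vision/text embeddings.
tactile = torch.randn(8, 3, 64, 64)
frozen_vl_emb = F.normalize(torch.randn(8, 512), dim=-1)  # e.g., from a CLIP-style model
loss = contrastive_loss(TactileEncoder()(tactile), frozen_vl_emb)
```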
Vision-language models perform diverse tasks via in-context learning. Time for robots to do the same! Introducing In-Context Robot Transformer (ICRT): a robot policy that learns new tasks by prompting with robot trajectories, without any fine-tuning.
icrt.dev
[1/N]
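A hedged sketch of the in-context idea (not the ICRT architecture itself): interleave (observation, action) tokens from demonstrations of a new task as a prompt, append the current observation, and let a causal transformer predict the next action with no weight updates. Dimensions and the toy model are assumptions.

```python
# Hedged sketch: a causal transformer policy prompted with demonstration
# (observation, action) pairs; it decodes an action for the query observation
# without any fine-tuning. All sizes are illustrative.
import torch
import torch.nn as nn

class InContextPolicy(nn.Module):
    def __init__(self, obs_dim=32, act_dim=7, d_model=128, n_layers=4, n_heads=4):
        super().__init__()
        self.obs_in = nn.Linear(obs_dim, d_model)
        self.act_in = nn.Linear(act_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, n_layers)
        self.act_out = nn.Linear(d_model, act_dim)

    def forward(self, prompt_obs, prompt_act, current_obs):
        # Interleave prompt tokens as [obs_0, act_0, obs_1, act_1, ...].
        B, T, _ = prompt_obs.shape
        tokens = torch.stack([self.obs_in(prompt_obs), self.act_in(prompt_act)], dim=2)
        tokens = tokens.reshape(B, 2 * T, -1)
        # Append the query observation as the final token.
        tokens = torch.cat([tokens, self.obs_in(current_obs).unsqueeze(1)], dim=1)
        # Causal mask so each token only attends to earlier context.
        S = tokens.shape[1]
        mask = torch.triu(torch.full((S, S), float("-inf")), diagonal=1)
        h = self.transformer(tokens, mask=mask)
        return self.act_out(h[:, -1])  # action predicted for the query observation

# Usage: a short demo trajectory serves as the "prompt" for a new task.
policy = InContextPolicy()
action = policy(torch.randn(1, 10, 32), torch.randn(1, 10, 7), torch.randn(1, 32))
```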
1/N Most Vision-Language-Action models need tons of data for finetuning, and still fail for new objects and instructions. Introducing OTTER, a lightweight, easy-to-train model that uses text-aware visual features to nail unseen tasks out of the box! Here's how it works 👇
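One way to read "text-aware visual features", sketched below under assumptions (this is illustrative, not the OTTER code): let the instruction's text embedding query frozen visual patch tokens via cross-attention, so only language-relevant features reach the policy head.

```python
# Hedged sketch: text-conditioned selection of frozen visual features feeding a
# small policy head. Encoders are assumed frozen and external; sizes illustrative.
import torch
import torch.nn as nn

class TextAwarePolicy(nn.Module):
    def __init__(self, vis_dim=768, txt_dim=512, d_model=256, act_dim=7):
        super().__init__()
        self.vis_proj = nn.Linear(vis_dim, d_model)
        self.txt_proj = nn.Linear(txt_dim, d_model)
        self.cross_attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.policy_head = nn.Sequential(
            nn.Linear(d_model, 256), nn.ReLU(), nn.Linear(256, act_dim)
        )

    def forward(self, patch_tokens, text_emb):
        # patch_tokens: (B, N, vis_dim) from a frozen vision encoder (e.g., ViT patches)
        # text_emb:     (B, txt_dim)    from a frozen text encoder for the instruction
        q = self.txt_proj(text_emb).unsqueeze(1)   # (B, 1, d_model) language query
        kv = self.vis_proj(patch_tokens)           # (B, N, d_model) visual keys/values
        fused, _ = self.cross_attn(q, kv, kv)      # language-selected visual feature
        return self.policy_head(fused.squeeze(1))

# Usage with dummy frozen features:
policy = TextAwarePolicy()
action = policy(torch.randn(2, 196, 768), torch.randn(2, 512))
```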
Can we scale up robot data collection without a robot? We propose a real2render2real pipeline that scales a robot dataset from a single human demonstration; policies trained on the generated data can be deployed directly on a real robot.
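A high-level, hedged sketch of the demo-multiplication loop (all functions below are placeholders, not the actual pipeline): replay the demonstrated object-relative motion under many randomized scene configurations, render each one, and save the synthetic rollouts.

```python
# Hedged sketch: multiply one human demo into many rendered rollouts by
# randomizing the scene and retargeting the demonstrated motion. All helpers
# (`randomize_object_pose`, `retarget_demo`, `render_observation`) are stubs.
import numpy as np

def randomize_object_pose(rng):
    """Placeholder: sample a new object pose (x, y, yaw) on the workspace."""
    return rng.uniform([-0.2, -0.2, -np.pi], [0.2, 0.2, np.pi])

def retarget_demo(demo_waypoints, object_pose):
    """Placeholder: express object-frame demo waypoints in the new scene."""
    x, y, yaw = object_pose
    c, s = np.cos(yaw), np.sin(yaw)
    R = np.array([[c, -s], [s, c]])
    return demo_waypoints @ R.T + np.array([x, y])

def render_observation(object_pose, gripper_xy):
    """Placeholder: call a renderer/simulator and return an image array."""
    return np.zeros((128, 128, 3), dtype=np.uint8)

def generate_dataset(demo_waypoints, n_rollouts=1000, seed=0):
    rng = np.random.default_rng(seed)
    dataset = []
    for _ in range(n_rollouts):
        pose = randomize_object_pose(rng)
        traj = retarget_demo(demo_waypoints, pose)
        rollout = [(render_observation(pose, wp), wp) for wp in traj]
        dataset.append(rollout)
    return dataset

# One human demo (2-D waypoints in the object frame) becomes many rendered rollouts.
data = generate_dataset(np.array([[0.0, 0.1], [0.0, 0.05], [0.0, 0.0]]), n_rollouts=10)
```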
Can we track object part motions from a monocular video? Check out POD! With an object scan and a monocular video, we can learn an object configuration model. This could be useful for reconstructing articulated objects for robot learning.
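A hedged sketch of the general recipe for this kind of tracking (not the POD implementation): given a part-segmented object scan and a monocular video, optimize per-frame part poses by minimizing a rendering loss against the frames. `render_parts` is a placeholder for a differentiable renderer.

```python
# Hedged sketch: per-frame part-pose optimization against a monocular video.
# `render_parts` is a stub; a real version might use a differentiable renderer
# such as nvdiffrast or PyTorch3D.
import torch

def render_parts(part_meshes, part_poses):
    """Placeholder returning an (H, W, 3) image that depends on the poses."""
    return part_poses.sum() * torch.ones(64, 64, 3)

def track_part_poses(part_meshes, video_frames, n_steps=200, lr=1e-2):
    T, n_parts = len(video_frames), len(part_meshes)
    poses = torch.zeros(T, n_parts, 6, requires_grad=True)  # per-frame 6-DoF part poses
    opt = torch.optim.Adam([poses], lr=lr)
    for _ in range(n_steps):
        opt.zero_grad()
        # Photometric loss between rendered parts and observed frames.
        loss = sum(
            ((render_parts(part_meshes, poses[t]) - video_frames[t]) ** 2).mean()
            for t in range(T)
        )
        loss.backward()
        opt.step()
    return poses.detach()

# Usage with dummy data: two "parts" tracked over a 5-frame video.
poses = track_part_poses(part_meshes=[None, None],
                         video_frames=torch.rand(5, 64, 64, 3), n_steps=5)
```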
(1/N) How close are we to enabling robots to solve the long-horizon, complex tasks that matter in everyday life?
🚨 We are thrilled to invite you to join the 1st BEHAVIOR Challenge @NeurIPS 2025, submission deadline: 11/15.
🏆 Prizes:
🥇 $1,000
🥈 $500
🥉 $300