Michael Ryoo (@ryoo_michael)'s Twitter Profile
Michael Ryoo

@ryoo_michael

Professor at Stony Brook University / Research Scientist at Salesforce AI Research

ID: 1448359208769015811

Website: http://michaelryoo.com/
Joined: 13-10-2021 18:45:42

35 Tweets

333 Followers

68 Following

Ted Xiao (@xiao_ted)'s Twitter Profile Photo

Looking forward to showcasing one of the first foundation models for robotics at #RSS2023 next week! Presenting "RT-1: Robotics Transformer for Real-world Control at Scale" from the Google DeepMind robotics team. Website: robotics-transformer.github.io Session: Tuesday 7/12, 3PM-5PM

Xiang Li (@xiangli54505720)'s Twitter Profile Photo

Introducing Crossway Diffusion, a diffusion-based visuomotor policy that takes advantage of self-supervised learning (SSL). In short: we add state decoders to reconstruct the states while training the diffusion policy, and it works better. More at: arxiv.org/abs/2307.01849

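The idea in the tweet above — adding a state-reconstruction decoder as an auxiliary SSL objective alongside the diffusion policy's denoising loss — can be sketched roughly as follows. This is a minimal numpy illustration, not the paper's implementation: the linear "encoder", "decoder", the loss weight, and all shapes are stand-in assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(obs, W_enc):
    # Stand-in for the visual encoder (a CNN in the actual method).
    return np.tanh(obs @ W_enc)

def denoising_loss(latent, noise_target, W_noise):
    # Stand-in for the diffusion objective: predict the injected noise
    # on the action from the observation latent.
    pred = latent @ W_noise
    return np.mean((pred - noise_target) ** 2)

def reconstruction_loss(latent, obs, W_dec):
    # The auxiliary SSL head: decode the latent back to the input state.
    recon = latent @ W_dec
    return np.mean((recon - obs) ** 2)

obs = rng.normal(size=(8, 16))           # batch of flattened observations
noise_target = rng.normal(size=(8, 4))   # noise added to the actions
W_enc = 0.1 * rng.normal(size=(16, 32))
W_noise = 0.1 * rng.normal(size=(32, 4))
W_dec = 0.1 * rng.normal(size=(32, 16))

latent = encode(obs, W_enc)
# Joint objective: denoising loss plus a weighted reconstruction term
# (the 0.1 weight is an arbitrary illustrative choice).
loss = denoising_loss(latent, noise_target, W_noise) \
     + 0.1 * reconstruction_loss(latent, obs, W_dec)
```

Both terms share the same encoder latent, which is the point: the reconstruction head pressures the latent to retain state information useful to the policy.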
Google DeepMind (@googledeepmind)'s Twitter Profile Photo

Today, we announced RT-2: a first-of-its-kind vision-language-action model to control robots. 🤖 It learns from both web and robotics data and translates this knowledge into generalised instructions. Find out more: dpmd.ai/introducing-rt2

Karol Hausman (@hausman_k)'s Twitter Profile Photo

PaLM-E or GPT-4 can speak in many languages and understand images. What if they could speak robot actions? Introducing RT-2: robotics-transformer2.github.io our new model that uses a VLM (up to 55B params) backbone and fine-tunes it to directly output robot actions!
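Fine-tuning a VLM to "directly output robot actions", as described above, is commonly done by discretizing each continuous action dimension into bins and emitting the bin indices as ordinary text tokens. The sketch below illustrates that general idea; the bin count, action range, and function names are illustrative assumptions, not RT-2's actual tokenizer.

```python
import numpy as np

def action_to_tokens(action, low=-1.0, high=1.0, n_bins=256):
    # Map each continuous action dimension to one of n_bins discrete bins,
    # so a language model can emit actions as a short token sequence.
    clipped = np.clip(np.asarray(action, dtype=float), low, high)
    bins = np.round((clipped - low) / (high - low) * (n_bins - 1)).astype(int)
    return bins.tolist()

def tokens_to_action(tokens, low=-1.0, high=1.0, n_bins=256):
    # Inverse mapping: token (bin) ids back to continuous values.
    return [low + t / (n_bins - 1) * (high - low) for t in tokens]

tokens = action_to_tokens([0.0, -1.0, 0.5])
decoded = tokens_to_action(tokens)
```

The round trip loses at most half a bin width of precision, which is the usual trade-off of representing actions in the model's text vocabulary.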

Michael Ryoo (@ryoo_michael)'s Twitter Profile Photo

Introducing LLaRA!!! github.com/LostXine/LLaRA It's a new robot action model, dataset, and framework based on LLMs/VLMs. It's open-source and trainable at an academic scale (7B, LLaVA-based), so you can fine-tune it for your robotics task!

AK (@_akhaliq)'s Twitter Profile Photo

Salesforce presents xGen-MM (BLIP-3)

A Family of Open Large Multimodal Models

discuss: huggingface.co/papers/2408.08…

This report introduces xGen-MM (also known as BLIP-3), a framework for developing Large Multimodal Models (LMMs). The framework comprises meticulously curated
Salesforce AI Research (@sfresearch)'s Twitter Profile Photo

🚨🎥🚨🎥🚨 xGen-MM-Vid (BLIP-3-Video) is now available on Hugging Face!

Our compact VLM achieves SOTA performance with just 32 tokens for video understanding. Features explicit temporal encoder + BLIP-3 architecture. Try it out!

🤗32 Token Model: bit.ly/3PBNBBz
🤗128
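Compressing a whole video into just 32 tokens, as the tweet describes, is typically achieved with a temporal encoder that pools many per-frame visual tokens into a small, fixed set of video tokens. Below is a minimal numpy sketch of one such mechanism, cross-attention pooling with learnable queries; the query count, shapes, and single-head attention are illustrative assumptions, not BLIP-3-Video's exact architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def temporal_pool(frame_tokens, queries):
    # Cross-attention pooling: a fixed set of learnable query vectors
    # attends over all per-frame tokens and summarizes them into
    # len(queries) video tokens, regardless of video length.
    scale = np.sqrt(queries.shape[1])
    attn = softmax(queries @ frame_tokens.T / scale)
    return attn @ frame_tokens

rng = np.random.default_rng(0)
frames, tokens_per_frame, dim = 8, 64, 32
frame_tokens = rng.normal(size=(frames * tokens_per_frame, dim))
queries = rng.normal(size=(32, dim))  # 32 video tokens, per the tweet

video_tokens = temporal_pool(frame_tokens, queries)
print(video_tokens.shape)  # (32, 32)
```

The key property is that the output size is fixed by the number of queries, so the downstream LLM sees the same 32-token video summary whether the clip has 8 frames or 80.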
Michael Ryoo (@ryoo_michael)'s Twitter Profile Photo

LLaRA will appear at #ICLR2025!! It is an efficient transformation of a VLM into a robot VLA. For more details: github.com/LostXine/LLaRA
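Turning a VLM into a robot VLA, as this tweet describes, generally means reformatting robot trajectory data into the visual instruction-tuning format the VLM already understands. The sketch below shows that data-conversion idea; the field names, prompt wording, and answer format are hypothetical illustrations, not LLaRA's actual templates.

```python
def step_to_instruction(task, action):
    # Hypothetical conversion of one robot step into a VQA-style turn:
    # the image placeholder plus a task question becomes the prompt, and
    # the action (here, a 2D gripper target and a rotation) becomes the
    # textual answer the VLM is fine-tuned to produce.
    x, y, rot = action
    return {
        "question": f"<image>\nWhat action should the robot take to {task}?",
        "answer": f"Move the gripper to ({x:.2f}, {y:.2f}) and rotate {rot:.2f} rad.",
    }

sample = step_to_instruction("pick up the red block", (0.31, -0.12, 1.57))
```

Because the converted data looks like ordinary VQA conversations, the same fine-tuning pipeline used for a 7B LLaVA-style model can be reused for the robot task with no architectural changes.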

Conference on Robot Learning (@corl_conf)'s Twitter Profile Photo

#CoRL2025 Hey Robot Learning Community! CoRL 2025 will be held in Seoul, Korea, Sep 27-30. Submission deadline: Apr 30 AoE. Only two weeks to go! Information: corl.org We are excited to receive your great work on robot learning!