Qingqing Zhao
@qingqing_zhao_
PhD candidate at Stanford
ID: 950583149846671361
https://qingqing-zhao.github.io/ 09-01-2018 04:20:58
72 Tweets
1.1K Followers
639 Following
✨ Introducing OpenVLA – an open-source vision-language-action model for robotics! 🚀 - SOTA generalist policy - 7B params - outperforms Octo, RT-2-X on zero-shot evals 🦾 - trained on 970k episodes from OpenX dataset 🤖 - fully open: model/code/data all online 🤗 🧵👇
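Since the tweet says model, code, and data are all online, here is a minimal inference sketch following the usage pattern published on the OpenVLA HuggingFace model card. The prompt template, the `unnorm_key="bridge_orig"` choice, and the `predict_action` helper come from that model card, not from the tweet itself, so treat them as assumptions; the image path is a placeholder.

```python
import torch
from PIL import Image
from transformers import AutoModelForVision2Seq, AutoProcessor

# Load the published 7B checkpoint; the model ships custom code, hence
# trust_remote_code=True.
processor = AutoProcessor.from_pretrained("openvla/openvla-7b", trust_remote_code=True)
vla = AutoModelForVision2Seq.from_pretrained(
    "openvla/openvla-7b",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
).to("cuda:0")

# One RGB observation from the robot's camera (placeholder file here).
image = Image.open("observation.png")
prompt = "In: What action should the robot take to pick up the remote?\nOut:"

inputs = processor(prompt, image).to("cuda:0", dtype=torch.bfloat16)
# predict_action is exposed by the checkpoint's custom code; it decodes the
# generated action tokens into a continuous end-effector action, un-normalized
# with the statistics of the named training mix.
action = vla.predict_action(**inputs, unnorm_key="bridge_orig", do_sample=False)
print(action)  # e.g., a 7-dim array: position delta, rotation delta, gripper
```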
Cambrian-1 🪼 Through a vision-centric lens, we study every aspect of building Multimodal LLMs except the LLMs themselves. As a byproduct, we achieve superior performance at the 8B, 13B, 34B scales. 📄 arxiv.org/abs/2406.16860 🌐 cambrian-mllm.github.io 🤗 huggingface.co/nyu-visionx
Introducing Open-TeleVision 🤖: We need an intuitive and remote teleoperation interface to collect more robot data. TeleVision lets you immersively operate a robot even if you are 3,000 miles away, like in the movie *Avatar*. Open-sourced!
Want to #WalkTheDog in the metaverse? In our project at #SIGGRAPH2024 with Sebastian Starke, Yuting Ye, and Olga Sorkine-Hornung, we develop an approach to learn a common 1D phase manifold from motion datasets across different morphologies, *without* any supervision (1/2)
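The tweet does not spell out the method, so as a rough illustration of what "a 1D phase" over a motion clip can mean, here is a sketch in the spirit of periodic latent models (e.g., DeepPhase by Starke et al.): take a scalar latent curve over time and read phase and frequency off its dominant Fourier component. All names and shapes here are illustrative, not the paper's code.

```python
import numpy as np

def phase_from_latent(z: np.ndarray, fps: float) -> tuple[float, float]:
    """Extract (phase in [0, 1), frequency in Hz) from a 1-D latent curve z(t)
    via the dominant non-DC component of its FFT. Illustrative only."""
    z = z - z.mean()                          # remove the DC offset
    spectrum = np.fft.rfft(z)
    freqs = np.fft.rfftfreq(len(z), d=1.0 / fps)
    k = 1 + np.argmax(np.abs(spectrum[1:]))  # dominant non-DC frequency bin
    phase = (np.angle(spectrum[k]) / (2 * np.pi)) % 1.0
    return phase, float(freqs[k])

# Toy usage: a noisy 1.5 Hz gait cycle sampled at 30 fps.
t = np.arange(60) / 30.0
z = np.sin(2 * np.pi * 1.5 * t + 0.7) + 0.05 * np.random.randn(60)
print(phase_from_latent(z, fps=30.0))
```

Because phase is a single periodic scalar, motions from very different morphologies (human, dog) can in principle be aligned on the same 1D manifold, which is the point the tweet is making.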
My first SIGGRAPH at #SIGGRAPH2024! Chen-Hsuan Lin, Jiashu Xu, and Donglai Xiang will show 3D scene generation in real time from scratch, along with 10 other RTL participants. Join us at 6 pm today!
Introducing OFT – an Optimized Fine-Tuning recipe for VLAs! Fine-tuning OpenVLA w/ OFT, we see: - 25-50x faster inference ⚡️ - SOTA 97.1% avg success rate in LIBERO 💪 - high-freq control w/ 7B model on real bimanual robot - outperforms π0, RDT-1B, DiT Policy, MDT, Diffusion Policy, ACT 🧵👇
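The tweet lists outcomes but not the recipe. Per the accompanying OFT paper, the speedup comes largely from replacing autoregressive action-token decoding with a single parallel pass that regresses a continuous chunk of future actions under an L1 loss. Below is an illustrative PyTorch sketch of such a chunked action head; the module name, dimensions, and horizon are made up for the example.

```python
import torch
import torch.nn as nn

class ChunkedActionHead(nn.Module):
    """Illustrative continuous action head: maps pooled VLA backbone features
    to a chunk of H future actions in one forward pass (no token-by-token loop)."""
    def __init__(self, embed_dim: int = 4096, action_dim: int = 7, horizon: int = 8):
        super().__init__()
        self.horizon, self.action_dim = horizon, action_dim
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim, 1024), nn.GELU(),
            nn.Linear(1024, horizon * action_dim),
        )

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, embed_dim) -> (batch, horizon, action_dim)
        return self.mlp(h).view(-1, self.horizon, self.action_dim)

head = ChunkedActionHead()
h = torch.randn(2, 4096)                  # stand-in for VLA backbone features
pred = head(h)                            # (2, 8, 7) action chunk in one pass
target = torch.randn_like(pred)           # stand-in for demonstration actions
loss = torch.abs(pred - target).mean()    # L1 regression, as in the OFT recipe
```

Decoding a whole chunk at once is what enables high-frequency control with a 7B model: one backbone forward yields many control steps instead of one token at a time.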
Curious about how cities have changed in the past decade? We use MLLMs to analyse 40 million Street View images to answer this. Did you know that "juice shops became a thing in NYC" and "miles of overpasses were painted BLUE in SF"? More at → boyangdeng.com/visual-chronic… (vid w/ 🔊)
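As a sketch of the kind of pipeline the tweet hints at (mine temporal trends from geo-tagged imagery), here is a toy aggregation loop. `query_mllm` is a hypothetical stand-in for whatever captioning model or API the project uses, and the "trend" is just a per-year tag count; nothing here is the paper's actual code.

```python
from collections import Counter

def query_mllm(image_path: str, prompt: str) -> list[str]:
    # Hypothetical stand-in: a real pipeline would send the image and prompt
    # to a multimodal LLM; canned tags are returned here so the sketch runs.
    return ["juice shop"] if "2019" in image_path else ["deli"]

def tag_trends(images_by_year: dict[int, list[str]], prompt: str) -> dict[str, dict[int, int]]:
    """Count how often each MLLM-produced tag appears per year; a tag whose
    count rises over time (e.g. 'juice shop') is a candidate visual trend."""
    trends: dict[str, Counter] = {}
    for year, paths in images_by_year.items():
        for path in paths:
            for tag in query_mllm(path, prompt):
                trends.setdefault(tag, Counter())[year] += 1
    return {tag: dict(counts) for tag, counts in trends.items()}

print(tag_trends(
    {2013: ["sv_2013_001.jpg"], 2019: ["sv_2019_001.jpg"]},
    prompt="List the storefront types visible in this image.",
))
```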