Jiayuan Mao (@maojiayuan) 's Twitter Profile
Jiayuan Mao

@maojiayuan

PhD Student at @MIT_LISLab/@MITCoCoSci

ID: 1054159620

Website: http://jiayuanm.com · Joined: 02-01-2013 04:18:41

16 Tweets

684 Followers

123 Following

Learning and Intelligent Systems (LIS) @ MIT (@mit_lislab) 's Twitter Profile Photo

Hello, world! We are the Learning and Intelligent Systems group at MIT CSAIL, headed by Leslie Pack Kaelbling & Tomás Lozano-Pérez. We work on AI, ML, and robotics, and we’ll be mostly tweeting about new work by our group.

Learning and Intelligent Systems (LIS) @ MIT (@mit_lislab) 's Twitter Profile Photo

New blog alert! In our lab blog's first ever post, Leslie Kaelbling writes about the engineering science of embodied intelligence. lis.csail.mit.edu/the-engineerin…

Joy Hsu (@joycjhsu) 's Twitter Profile Photo

How can we build a modular and compositional system that understands 3D scenes? Excited to introduce our #CVPR2023 paper — NS3D: Neuro-Symbolic Grounding of 3D Objects and Relations, with Jiayuan Mao and Jiajun Wu. Check out our poster next week at Tue-AM-249.

Zhutian Yang (@zhutianyang_) 's Twitter Profile Photo

Our #CoRL2023 paper shows that by composing the energies of diffusion models, each trained to sample for an individual constraint type such as collision-free placement, spatial relations, or physical stability, the composed model can solve novel combinations of known constraints. diffusion-ccsp.github.io (🧵 1/N)
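The compositional idea above can be caricatured in plain Python (a hypothetical toy sketch, not the paper's diffusion-based sampler; the constraint functions and numbers are invented): each constraint contributes its own energy, the energies are summed, and a pose satisfying a novel combination of constraints is found by minimizing the sum.

```python
import numpy as np

def collision_energy(x):
    # Toy "collision-free" term: penalize poses closer than 1.0
    # to a fixed obstacle at the origin.
    return max(0.0, 1.0 - np.linalg.norm(x)) ** 2

def relation_energy(x):
    # Toy "spatial relation" term: penalize poses whose x-coordinate
    # is left of a target line at 2.0.
    return max(0.0, 2.0 - x[0]) ** 2

def composed_energy(x):
    # A novel combination of constraints = sum of independently
    # defined energies.
    return collision_energy(x) + relation_energy(x)

def minimize(energy, x0, steps=500, lr=0.05, eps=1e-4):
    # Gradient descent with central finite differences.
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        g = np.array([
            (energy(x + eps * np.eye(len(x))[i]) -
             energy(x - eps * np.eye(len(x))[i])) / (2 * eps)
            for i in range(len(x))
        ])
        x -= lr * g
    return x

x = minimize(composed_energy, [0.5, 0.5])
print(composed_energy(x))  # near zero: both constraints satisfied
```

The point of the toy is that neither energy function knows about the other, yet descending on their sum yields a pose satisfying both at once.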

Yilun Du (@du_yilun) 's Twitter Profile Photo

Introducing a way to convert synthesized robot videos to robot execution without using any action labels! flow-diffusion.github.io We also release a codebase (with pretrained models) for text-to-video generation. Train your own models for robot control in only 1 day with 4 GPUs!

Joy Hsu (@joycjhsu) 's Twitter Profile Photo

What’s left w/ foundation models? We found that they still can't ground modular concepts across domains. We present Logic-Enhanced FMs:🤝FMs & neuro-symbolic concept learners. We learn abstractions of concepts like “left” across domains & do domain-independent reasoning w/ LLMs.

Nishanth Kumar (@nishanthkumar23) 's Twitter Profile Photo

Ever heard about "Bilevel Planning" or "Task and Motion Planning", but been unsure what those words mean? Ever wanted a gentle intro to these methods so you can just understand what's going on? Our new blog post might help! lis.csail.mit.edu/bilevel-planni…

Jiayuan Mao (@maojiayuan) 's Twitter Profile Photo

Definitely one of my top 3 favourite papers :) It marries deep learning with a minimal set of universal grammar rules for grounded language learning. It draws inspiration from lexicalist linguistics and cognitive science (bootstrapping from core knowledge).

Jiayuan Mao (@maojiayuan) 's Twitter Profile Photo

Check out our new framework that automatically generates planning domain knowledge with LLMs, learns to ground it, and verifies it through interaction. We believe that learning such verifiable and compositional planning representations from language is important for embodied AI!

Fangchen Liu (@fangchenliu_) 's Twitter Profile Photo

Can we leverage VLMs for robot manipulation in the open world? Check out our new work MOKA, a simple and effective visual prompting method!

Fangchen Liu (@fangchenliu_) 's Twitter Profile Photo

The key idea is to query GPT-4V to make a multiple-choice selection from a set of keypoints and waypoints. Here is an example: suppose the current task is to sweep the trash bag off the table. We mark the involved objects with a set of points, and overlay the image with grids.

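As a rough illustration of this prompting style (a hypothetical sketch: the point labels, coordinates, grid size, and wording are invented, and no real GPT-4V call is made), one might assemble the multiple-choice query like this:

```python
def build_prompt(task, keypoints, grid=(5, 5)):
    # Assemble a text prompt asking a VLM to choose among candidate
    # points that have been drawn onto the image.
    lines = [
        f"Task: {task}",
        f"The image is overlaid with a {grid[0]}x{grid[1]} grid.",
        "Candidate points marked on the image:",
    ]
    for label, (x, y) in sorted(keypoints.items()):
        lines.append(f"  {label}: pixel ({x}, {y})")
    lines.append("Answer with the label of the grasp point, then the "
                 "label of the waypoint for the sweeping motion.")
    return "\n".join(lines)

prompt = build_prompt(
    "sweep the trash bag off the table",
    {"P1": (120, 340), "P2": (260, 310), "P3": (400, 355)},
)
print(prompt)
```

Reducing the query to a choice among marked points is what makes the VLM's answer directly executable: the chosen labels map back to pixel coordinates, and from there to robot motions.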
Chen Wang (@chenwang_j) 's Twitter Profile Photo

Can we use wearable devices to collect robot data without actual robots? Yes! With a pair of gloves🧤! Introducing DexCap, a portable hand motion capture system that collects 3D data (point cloud + finger motion) for training robots with dexterous hands. Everything is open-sourced!

Yilun Du (@du_yilun) 's Twitter Profile Photo

Introducing our @icml_conf paper: Learning Iterative Reasoning through Energy Diffusion! We formulate reasoning as optimizing a sequence of energy landscapes. This enables us to solve harder problems at test time with more complex optimization. Website: energy-based-model.github.io/ired/
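The "sequence of energy landscapes" idea can be caricatured with a tiny toy (hypothetical, not the IRED code; the smoothing schedule and the problem are invented): solve a problem by gradient descent on progressively sharper energies, where smoothed early landscapes hide spurious local minima and later, sharper ones refine the answer.

```python
import math

def energy_grad(x, sigma):
    # Gradient of a bumpy landscape (x^2 - 2)^2 + 0.5*cos(10x), with the
    # high-frequency bumps attenuated by exp(-50*sigma^2), mimicking
    # Gaussian smoothing: large sigma -> smooth, easy landscape.
    bump = math.exp(-50.0 * sigma * sigma)
    return 4.0 * x * (x * x - 2.0) - 5.0 * bump * math.sin(10.0 * x)

def descend(x, sigma, steps=2000, lr=0.01):
    for _ in range(steps):
        x -= lr * energy_grad(x, sigma)
    return x

x = 3.0
for sigma in [0.5, 0.25, 0.1, 0.0]:  # anneal: smooth -> sharp
    x = descend(x, sigma)
print(x)  # settles in a minimum of the full (sigma=0) landscape
```

Descending on the smoothed landscape first steers the solution toward the right basin; the final sharp landscape then pins it to a genuine minimum, which is the coarse-to-fine flavor of test-time optimization described above.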