Sanskar Pandey (@sanskxr02)'s Twitter Profile
Sanskar Pandey

@sanskxr02

building the cognitive layer for compliance + side quests

ID: 1774869724016599045

Link: https://github.com/sanskar240 · Joined: 01-04-2024 18:41:49

2.2K Tweets

98 Followers

266 Following

Chris Paxton (@chris_j_paxton)

World models for policy evaluation; the fact that world model performance is highly correlated with real-world performance is incredibly valuable on its own.
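A minimal sketch of why that correlation is valuable on its own: if a learned world model preserves the ranking of candidate policies, it can stand in for expensive real-world evaluation. Everything below is hypothetical; the scores are invented and SciPy's `spearmanr` is just one convenient way to measure rank agreement.

```python
# Hypothetical illustration: if policy scores obtained inside a learned world model
# track real-world scores, the world model can be used to rank policies cheaply.
# All numbers below are made up.
import numpy as np
from scipy.stats import spearmanr

# Success rates for the same set of candidate policies, evaluated two ways.
world_model_scores = np.array([0.62, 0.48, 0.81, 0.30, 0.74])  # rollouts in the world model
real_world_scores  = np.array([0.58, 0.41, 0.77, 0.35, 0.70])  # rollouts on the real robot

rho, p_value = spearmanr(world_model_scores, real_world_scores)
print(f"rank correlation: {rho:.2f} (p={p_value:.3f})")
# A high rank correlation means the world model preserves the *ordering* of policies,
# which is what you need to pick a good policy without deploying every candidate.
```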

Sanskar Pandey (@sanskxr02)

Interesting conversation - I am particularly curious about how WMs/video models stand to benefit from object-centric interaction data with just enough causal and semantic descriptors.

Michelle (@michellelsun)

Love seeing egocentric data treated as more than "just video". Explicit state transitions (hands, objects, actions) are the missing piece between raw videos and world models. Curious to see how these scale across tasks and environments!

Sanskar Pandey (@sanskxr02)

Today we’re open-sourcing a small first subset of state–action–state′ trajectories from egocentric factory video. Built on Build AI data and enriched by us, this is early infrastructure for learning-ready video in world models and offline RL. Still very much iterating on the
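A minimal sketch of what one state–action–state′ record from egocentric video might look like. The field names and types below are assumptions for illustration only, not the released schema.

```python
# Hypothetical shape of a single (s, a, s') transition extracted from an
# egocentric factory clip. Field names and types are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class FrameState:
    timestamp_s: float        # time within the source clip
    objects: list[str]        # visible object labels, e.g. ["wrench", "bolt"]
    hand_pose: list[float]    # coarse hand keypoints or a pose summary
    scene_caption: str = ""   # optional semantic descriptor of the scene


@dataclass
class Transition:
    video_id: str             # which source clip the transition comes from
    state: FrameState         # s: state before the action
    action: str               # a: a semantic action label, e.g. "pick up wrench"
    next_state: FrameState    # s': state after the action
    causal_note: str = ""     # optional free-text causal descriptor


# Example record (all values invented):
t = Transition(
    video_id="factory_clip_0001",
    state=FrameState(3.2, ["wrench", "bolt"], [0.1, 0.4, 0.9], "hand approaching wrench"),
    action="pick up wrench",
    next_state=FrameState(4.0, ["wrench", "bolt"], [0.2, 0.5, 0.8], "wrench grasped in right hand"),
)
```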

Prime Intellect (@primeintellect)

We believe the next breakthrough in long-horizon agents is training models to manage their own context. Introducing our new research direction on Recursive Language Models. We are sharing our initial experiments showing the promise of RLMs. primeintellect.ai/blog/rlm
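The blog post has the details; as a rough mental model only (my own sketch, not Prime Intellect's method or API), a recursive language model lets the model itself compress its working context when it grows too large, then recurse on the task. The `llm` function below is a placeholder stub.

```python
# Toy sketch of "a model managing its own context": when the working context
# exceeds a budget, the model is asked to compress it, then the call recurses.
# `llm` is a placeholder stub, not any real API.

def llm(prompt: str) -> str:
    # Placeholder: in practice this would call an actual language model.
    return f"[model output for {len(prompt)} chars of prompt]"


def recursive_answer(task: str, context: str, budget_chars: int = 4000, depth: int = 0) -> str:
    if depth > 8:                       # hard stop so recursion always terminates
        return llm(f"Task: {task}\nContext: {context[:budget_chars]}\nAnswer:")
    if len(context) <= budget_chars:    # context fits: answer directly
        return llm(f"Task: {task}\nContext: {context}\nAnswer:")
    # Context too large: let the model compress its own context, then recurse.
    compressed = llm(
        f"Task: {task}\n"
        f"Summarize only what is needed for this task from the context below:\n{context}"
    )
    return recursive_answer(task, compressed, budget_chars, depth + 1)


print(recursive_answer("find the bug in module X", "long transcript ... " * 500))
```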

Chris Paxton (@chris_j_paxton)

Teleoperation data does not seem to scale well on its own -- we need lots of video data of humans and robots interacting with their environment to learn physics and environment interaction. This will unlock general robot intelligence.

Sanskar Pandey (@sanskxr02)

This is an interesting take for sure, but the biggest issue I've had with any "in-the-wild" data collection technique is simply that the data collected is constrained to simple grasping or quasi-static pick-and-place actions.

Sanskar Pandey (@sanskxr02)

To drive the point home - regardless of how detailed your IMU collection stack is, this point still holds. Trying to recover n-DoF actions does not solve the embodiment gap. It really just seems like a painful way of solving an under-specified inverse problem.
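To make "under-specified inverse problem" concrete with a toy example (my own illustration, with random numbers standing in for a 7-DoF arm): a wrist trajectory recovered from an IMU stack pins down at most 6 constraints per timestep, so mapping it back to joint-space actions leaves a null space of motions that the observation cannot distinguish, before even accounting for the human-versus-robot embodiment mismatch.

```python
# Minimal numerical illustration: recovering n-DoF robot actions from a 6-DoF
# wrist pose is under-determined whenever n > 6. The Jacobian here is a random
# stand-in for a 7-DoF arm; the point is purely about dimensions.
import numpy as np

rng = np.random.default_rng(0)
J = rng.standard_normal((6, 7))              # maps 7 joint velocities -> 6-DoF end-effector twist

x_dot = rng.standard_normal(6)               # observed end-effector motion (from the human wrist)
q_dot_min_norm = np.linalg.pinv(J) @ x_dot   # one particular joint-velocity solution

# Any vector in the null space of J can be added without changing the observed motion.
_, s, Vt = np.linalg.svd(J)
null_dim = 7 - int(np.sum(s > 1e-10))
null_vec = Vt[-1]                            # a direction of "free" joint motion

print("null space dimension:", null_dim)     # -> 1: infinitely many consistent actions
print("same observed twist?",
      np.allclose(J @ (q_dot_min_norm + 3.0 * null_vec), x_dot))  # -> True
```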