Seonghyeon Ye (@seonghyeonye)'s Twitter Profile
Seonghyeon Ye

@seonghyeonye

PhD student KAIST AI (@kaist_ai), Research Intern @NVIDIAAI GEAR | Prev: @MSFTResearch

ID: 1356902144016670720

Link: https://seonghyeonye.github.io/ | Joined: 03-02-2021 09:47:46

227 Tweets

984 Followers

474 Following

Jack Parker-Holder (@jparkerholder)

Introducing 🧞Genie 2 🧞 - our most capable large-scale foundation world model, which can generate a diverse array of consistent worlds, playable for up to a minute. We believe Genie 2 could unlock the next wave of capabilities for embodied agents 🧠.

Seungone Kim @ NAACL2025 (@seungonekim)

#NLProc Just because GPT-4o is 17 times more expensive than GPT-4o-mini, does that mean it generates synthetic data 17 times better? Introducing the AgoraBench, a benchmark for evaluating data generation capabilities of LMs.

Jiafei Duan (@djiafei)

🚀Excited to introduce our latest work- SAT: Spatial Aptitude Training, a groundbreaking approach to enhance spatial reasoning in Multimodal Language Models (MLMs). SAT isn't just about understanding static object positions but dives deep into dynamic spatial reasoning. 🧵

Jim Fan (@drjimfan)

I believe solving robotics = 90% engineering + 10% research vision. Project GR00T is NVIDIA's moonshot initiative to build physical AGI for humanoid robots. The GEAR Lab is assembling a crack team right now. Join us! Openings: - Sr. Research Engineer, Robotics Systems - Sr. RE,

Shayne Longpre (@shayneredford)

✨New Report✨ Our data ecosystem audit across text, speech, and video (✏️,📢,📽️) finds: 📈 Rising reliance on web, synthetic, and YouTube data. 🛑 80%+ datasets carry hidden restrictions. 🌍 Relative representation in languages and creators has not improved for 10+ yrs.

Zhou Xian (@zhou_xian_)

Everything you love about generative models — now powered by real physics! Announcing the Genesis project — after a 24-month large-scale research collaboration involving over 20 research labs — a generative physics engine able to generate 4D dynamical worlds powered by a physics

Jim Fan (@drjimfan)

Introducing NVIDIA Cosmos, an open-source, open-weight Video World Model. It's trained on 20M hours of videos and weighs from 4B to 14B. Cosmos offers two flavors: diffusion (continuous tokens) and autoregressive (discrete tokens); and two generation modes: text->video and

Seongyun Lee (@sylee_ai)

🎉 Excited to share that our paper "How Does Vision-Language Adaptation Impact the Safety of Vision Language Models?" has been accepted to #ICLR2025! 🖼 Vision-Language Adaptation empowers LLMs to process visual information—but how does it impact their safety? 🛡 And what about

Jiafei Duan (@djiafei)

Can we build a generalist robotic policy that doesn’t just memorize training data and regurgitate it during test time, but instead remembers past actions as memory and conditions its decisions on them?🤖💡 Introducing SAM2Act—a multi-view robotic transformer-based policy that

Tairan He (@tairanhe99)

🚀 Can we make a humanoid move like Cristiano Ronaldo, LeBron James and Kobe Bryant? YES! 🤖 Introducing ASAP: Aligning Simulation and Real-World Physics for Learning Agile Humanoid Whole-Body Skills Website: agile.human2humanoid.com Code: github.com/LeCAR-Lab/ASAP

Jim Fan (@drjimfan)

We RL'ed humanoid robots to Cristiano Ronaldo, LeBron James, and Kobe Bryant! These are neural nets running on real hardware at our GEAR lab. Most robot demos you see online speed videos up. We actually *slow them down* so you can enjoy the fluid motions. I'm excited to announce

Aran Komatsuzaki (@arankomatsuzaki)

Microsoft presents: Magma: A Foundation Model for Multimodal AI Agents - SotA on UI navigation and robotic manipulation tasks - Pretrained on a large dataset annotated with Set-of-Mark (SoM) for action grounding and Trace-of-Mark (ToM) for action planning.

Jianwei Yang (@jw2yang4ai)

Thanks for featuring our work! Aran Komatsuzaki. 🔥Today we are thrilled to announce our MSR flagship project Magma! This is a fully open-sourced project. We will roll out all the stuff: code, model and training data through the following days. Check out our full work here:

Jianwei Yang (@jw2yang4ai)

🔥In Magma, we talked a lot about spatial/temporal intelligence beyond verbal intelligence as advocated by Dr. Fei-Fei Li. So how to interpret it? Today I am happy to announce a new demo Magma-Gaming: 👉huggingface.co/spaces/microso… Rather than asking LLMs to write game code, we

Yuke Zhu (@yukez)

Thrilled to announce GR00T N1, our open foundation model for generalist humanoid robots! GR00T N1 adopts a dual-system design, leverages the entire data pyramid for model training, and supports various robot embodiments. GR00T N1 embodies years of fundamental research, spanning

Seonghyeon Ye (@seonghyeonye)

Finally, our GR00T N1 is released! 🦾🦾 Excited to contribute and share the model. Great to see our previous work, LAPA, being used as a pretraining objective for large-scale training on human, robot, and synthetic videos. More exciting things to come this year!!

Jim Fan (@drjimfan)

We got lots of great community feedback on our open-source GR00T N1! Check out our Github, star, fork, contribute back! Let's solve generally intelligent robots together, one commit at a time. github.com/NVIDIA/Isaac-G…

Shayne Longpre (@shayneredford)

Thrilled our global data ecosystem audit was accepted to #ICLR2025! Empirically, we find: 1⃣ Soaring synthetic text data: ~10M tokens (pre-2018) to 100B+ (2024). 2⃣ YouTube is now 70%+ of speech/video data but could block third-party collection. 3⃣ <0.2% of data from

Seungone Kim @ NAACL2025 (@seungonekim)

🏆Glad to share that our BiGGen Bench paper has received the best paper award at NAACL HLT 2025! x.com/naaclmeeting/s… 📅 Ballroom A, Session I: Thursday May 1st, 16:00-17:30 (MDT) 📅 Session M (Plenary Session): Friday May 2nd, 15:30-16:30 (MDT) 📅 Virtual Conference: Tuesday