Taekyung Ki (@taekyungki) 's Twitter Profile
Taekyung Ki

@taekyungki

AI Researcher / KAIST AI / Interested in generative models, machine learning, and computer vision.

ID: 1537455681288404994

Link: https://taekyungki.github.io · Joined: 16-06-2022 15:23:12

108 Tweets

23 Followers

119 Following

Baku (@bk_sakurai) 's Twitter Profile Photo

*Video generation: I posted a note article on having an original song made with Suno sung by ComfyUI-FLOAT. #comfyui note.com/bakushu/n/n1f8…

Neta Shaul (@shaulneta) 's Twitter Profile Photo

DTM vs FM👇 Lots of interest in how Difference Transition Matching (DTM) connects to Flow Matching (FM). Here is a short animation that illustrates Theorem 1 in our paper: For a very small step size (1/T), DTM converges to an Euler step of FM.
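
The statement the animation illustrates can be sanity-checked numerically. Below is a hedged toy sketch (my own construction, not code from the paper): for a simple linear ODE, the exact one-step "difference transition" that a DTM-style model would regress agrees with the Euler step of the FM velocity field up to O(1/T²) as the step size 1/T shrinks. The field `v` and all names are illustrative assumptions.

```python
import numpy as np

# Toy velocity field of a flow-matching ODE: x' = v(x, t) = -x,
# whose exact flow is x(t0 + dt) = x(t0) * exp(-dt).
def v(x, t):
    return -x

T = 1000            # number of discretization steps
dt = 1.0 / T        # the "very small step size (1/T)" in the theorem
x0, t0 = 2.0, 0.3

# Euler step of the FM ODE.
euler = x0 + dt * v(x0, t0)

# Exact one-step increment: the quantity a DTM-style model would predict.
diff_transition = x0 * np.exp(-dt) - x0

# The two agree up to O(dt^2).
print(abs(diff_transition - dt * v(x0, t0)))
```

Increasing T tightens the agreement quadratically, consistent with DTM converging to an Euler step of FM.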

Pika (@pika_labs) 's Twitter Profile Photo

Some news: We're building the next big thing — the first-ever AI-only social video app, built on a highly expressive human video model. Over the past few weeks, we’ve been testing it in private beta. Now, we’re opening early access: download the iOS app to join the waitlist, or

Minki Kang (@mkkang_1133) 's Twitter Profile Photo

Our Agent Distillation paper is accepted at #NeurIPS2025 Spotlight! 🚀 Turn your small LM into a strong agent 💪 Code: github.com/Nardien/agent-…

Jaehyeong Jo (@jaehyeong_jo) 's Twitter Profile Photo

I'll be at NeurIPS to present the final paper of my PhD: Continuous Diffusion Model for Language Modeling (arxiv.org/abs/2502.11564) We present a continuous diffusion model for language modeling using tools from Riemannian geometry, opening up a new direction for diffusion LMs!

Dongki kim (@dongkikim95) 's Twitter Profile Photo

I'm excited to be presenting our work at NeurIPS 2025! 🗓 When: Wed, Dec 3, 2025 11:00 AM - 2:00 PM (PST) 📍Where: Exhibit Hall C, D, E #1606 If you're attending NeurIPS, please stop by. I'd love to chat about the work and AI4Science!

DailyPapers (@huggingpapers) 's Twitter Profile Photo

Avatar Forcing A real-time interactive head avatar generation framework that enables natural conversation with 500ms latency. Uses diffusion forcing for causal motion generation and direct preference optimization for expressive interactions without labeled data.

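
The "diffusion forcing" ingredient mentioned in the summary — giving each frame its own noise level so generation can proceed causally — can be illustrated with a hedged toy (my own construction, unrelated to the Avatar Forcing system; the oracle denoiser below stands in for a learned model):

```python
import numpy as np

rng = np.random.default_rng(0)
T = 8
clean = np.sin(np.linspace(0, np.pi, T))         # stand-in "motion" sequence

# Per-frame noise schedule: earlier frames less noisy than later ones,
# which is what lets a causal model commit to the past while the future
# is still uncertain.
noise_levels = np.linspace(0.0, 1.0, T)
noisy = clean + noise_levels * rng.normal(size=T)

# A real model predicts the clean frame from past frames and the frame's
# noise level; here an oracle denoiser stands in for it.
def denoise_step(x, level, target):
    return x + 0.5 * (target - x) if level > 0 else x

for _ in range(10):                               # iterative refinement
    noisy = np.array([denoise_step(noisy[t], noise_levels[t], clean[t])
                      for t in range(T)])

print(np.abs(noisy - clean).max())                # residual shrinks toward 0
```

The only point of the toy is the per-frame schedule: frame 0 starts clean, later frames start noisy, and all converge under refinement.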
Ryan Chan (@ryan_resolution) 's Twitter Profile Photo

We just upgraded XLeRobot 🚀

Built by the MakerMods team Isaac Sin, Mr Thompson and QI LIU.

• Easier to build
• Improved chassis
• Reduced 3D-print time and material
• Designed in collaboration with the original author Vector Wang

Fully open-source
Full build

Wildminder (@wildmindai) 's Twitter Profile Photo

Self-Refining Video Sampling: an inference-time method that uses a video generator as its own refiner to correct physics and motion. No retraining needed; scores >70% human preference; validated on Wan2.2 & Cosmos. agwmon.github.io/self-refine-vi…

Sangwon Jang (@jangsangwon7) 's Twitter Profile Photo

What if your video generator could refine itself—at inference time? ❌No new models. ❌No retraining. ❌No external verifier. 💡 Introducing Self-Refining Video Sampling By reinterpreting a pretrained generator (Wan2.2, Cosmos) as a denoising autoencoder, we enable iterative
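
The perturb-then-denoise loop hinted at here can be sketched abstractly. A hedged toy (my own construction, not the paper's code and nothing to do with Wan2.2/Cosmos): treat a pretrained denoiser as an autoencoder whose fixed points lie near the data manifold, and iterate re-noising and denoising at inference time. The "manifold" is the unit circle and `denoise` is a stand-in projection, not a learned model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "data manifold": the unit circle in 2D. A denoiser trained on this
# data would map noisy points back toward the circle; we simulate it with
# a soft projection.
def denoise(x, strength=0.5):
    proj = x / np.linalg.norm(x)       # nearest point on the manifold
    return x + strength * (proj - x)   # partial step toward it

# Self-refinement loop: perturb, then denoise, repeated at inference time.
x = np.array([1.8, -0.6])              # off-manifold initial sample
for _ in range(20):
    x_noisy = x + 0.05 * rng.normal(size=2)   # forward (re-noise)
    x = denoise(x_noisy)                      # backward (denoise)

print(abs(np.linalg.norm(x) - 1.0))    # distance to the manifold shrinks
```

This mirrors the intuition in the thread below: refinement keeps samples pinned to a low-dimensional manifold without any new model or verifier.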

Saining Xie (@sainingxie) 's Twitter Profile Photo

if you are building video diffusion / world simulators, try this new sampler. temporal consistency pins videos to a low-dimensional manifold in the total pixel space. self-refinement sampling keeps them there.

Ilir Aliu - eu/acc (@iliraliu_) 's Twitter Profile Photo

Learning from robot data? Standard. Direct Video-Action Models (DVA) is different: treat robot control as video generation, then translate the generated video into actions. Built by , the system pre-trains causal video models from scratch and can run complex

Zhikai Zhang (@zhikai273) 's Twitter Profile Photo

🎾Introducing LATENT: Learning Athletic Humanoid Tennis Skills from Imperfect Human Motion Data Dynamic movements, agile whole-body coordination, and rapid reactions. A step toward athletic humanoid sports skills. Project: zzk273.github.io/LATENT/ Code: github.com/GalaxyGeneralR…

Physical Intelligence (@physical_int) 's Twitter Profile Photo

We developed an RL method for fine-tuning our models for precise tasks in just a few hours or even minutes. Instead of training the whole model, we add an “RL token” output to π-0.6, our latest model, which is used by a tiny actor and critic to learn quickly with RL.
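
The described setup — freezing the large model and training only tiny heads that read a dedicated output — resembles a small actor-critic on a fixed feature vector. A hedged toy sketch follows (entirely my own construction; π-0.6 and the actual "RL token" mechanism are not public code, so a random vector stands in for the token embedding and a two-armed bandit stands in for the task):

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen backbone's "RL token" feature: fixed, never updated.
feat = rng.normal(size=8)

actor_w = np.zeros((8, 2))             # tiny actor head: 2 discrete actions
critic_w = np.zeros(8)                 # tiny critic head: scalar value

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

for step in range(500):
    probs = softmax(feat @ actor_w)
    a = rng.choice(2, p=probs)
    reward = 1.0 if a == 1 else 0.0    # toy task: action 1 is correct
    value = feat @ critic_w
    adv = reward - value               # advantage estimate
    # Policy-gradient and value updates touch only the tiny heads,
    # which is what makes adaptation fast.
    grad_logp = -probs
    grad_logp[a] += 1.0
    actor_w += 0.1 * adv * np.outer(feat, grad_logp)
    critic_w += 0.1 * adv * feat

print(softmax(feat @ actor_w)[1])      # probability of the correct action
```

Because only the small actor and critic are trained, the parameter count touched by RL is tiny, loosely mirroring the "fine-tune in minutes" claim.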

Sander Dieleman (@sedielem) 's Twitter Profile Photo

"Diffusability" is all about the spectrum. arxiv.org/abs/2603.14645 If you enjoyed my blog post about diffusion as spectral autoregression, and are wondering how this relates to latent diffusion, give this paper a read!

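
The spectral view referenced here can be illustrated with a quick hedged experiment (my own toy, not from the paper or the blog post): natural-image-like signals concentrate power at low frequencies (roughly a 1/f spectrum), while Gaussian diffusion noise is spectrally flat, so high frequencies are drowned first and denoising proceeds coarse-to-fine.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1024
freqs = np.fft.rfftfreq(n, d=1.0)[1:]     # skip the DC bin

# Synthetic "natural" signal: amplitude spectrum falling off as 1/f.
amp = 1.0 / freqs
phase = rng.uniform(0, 2 * np.pi, size=freqs.size)
spectrum = np.concatenate([[0], amp * np.exp(1j * phase)])
signal = np.fft.irfft(spectrum, n)

noise = 0.05 * rng.normal(size=n)          # white noise: flat spectrum

# Per-frequency SNR: high frequencies drown first, which is the
# "diffusion as spectral autoregression" picture.
snr = np.abs(np.fft.rfft(signal))[1:] / np.abs(np.fft.rfft(noise))[1:]
print(snr[:8].mean() > snr[-8:].mean())    # prints True
```

Raising the noise level pushes the crossover frequency lower, i.e. more of the spectrum is "already noise", matching the intuition that latent design changes which frequencies diffusion has to model.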