Vu Tran (@vu0tran)'s Twitter Profile
Vu Tran

@vu0tran

Founder @ butterflies.ai

ID: 275876356

Joined: 02-04-2011 05:33:53

145 Tweets

9.9K Followers

47 Following

The interesting thing about Sora 2 is the big difference between t2v and i2v in terms of quality. The best analogy I can give is that Sora 2 seems to understand videos as complete "sentences" but its understanding of "words" isn't as comprehensive as other video models out there

Here are some technical insights looking at Sora 2. I'm pretty sure they split their temporal and spatial layers in their model for better efficiency. If you look closely at any Sora 2 video, you can see the background jitter between frames. This isn't a problem in other video

Wow. Sora 2 struggles with Image to Video. Left is Sora, Right is Boba Anime 1.4. Sora 2 also seems to crank up the "crazy and bizarre" filter lol. "woman places her tea cup down and hurdles over the chair behind her"

Here's a side by side of Sora 2 and Boba Anime for I2V. It's interesting how poorly Sora 2 does for I2V. I used a very complicated prompt this time: anime woman stands up, pulls out a sword, swings it over her head 3 times and it glows. She then points the glowing sword while

My bet will be less. 15%. It's hard. Social apps are all about social currency and attention. The double edged sword here is platform arbitrage. Bc the videos are 10 seconds long, the format is too fungible with TikTok, so the best videos will just get downloaded and cross posted

I'm 100% sure Grok Imagine is just using nearly off the shelf WAN, and they're also not implementing it well. How can I tell? 1. They're using a lightning step distilled version of WAN. You can tell because Grok Imagine often produces slow motion. It also crushes the expressive

I was being sarcastic. I like the guy! 100% of his portfolio is index funds. IMO, if we can grift and trick everyone into investing in only index funds, the world will be a better place lmao @imkevinxu

Wow! Reception on Boba Anime 1.4 has been HUGE! 1.4 greatly improves prompt coherence and allows for much more dynamic motion. I'm already training v1.5 which will improve action scenes 10x. Meanwhile, architecting v2.0 has already begun. It will go toe to toe with Sora.

I've been so deep in training that I had a dream the other day that I had to explain DiT (diffusion transformers) but it was to a bunch of 8 year olds. I got super pissed and started yelling at them: "THEY'RE TENSORS THEY'RE CALLED FUCKING TENSORS!!!" smh

Making good progress on motion. I think it's definitely better than Vidu or anything else out there minus Sora 2. Still a bit more training to go.