Bin Lin (@linbin46984)'s Twitter Profile
Bin Lin

@linbin46984

Peking University

ID: 1727640166117122048

Link: https://github.com/LinB203 | Joined: 23-11-2023 10:48:23

52 Tweets

1.1K Followers

82 Following

The Open-Sora Plan team has released arXiv papers covering the WF-VAE model, the diffusion model, training stability, data, prompt enhancement, I2V, and ControlNet. Open-Sora Plan: arxiv.org/abs/2412.00131 WF-VAE: arxiv.org/abs/2411.17459 Feel free to discuss, share, and cite.

Excited to share that our latest research, the Open-Sora Plan report, is being featured on the arXiv discussion forum alphaXiv. Akshat Shrivastava and I will be on alphaXiv to answer any questions you have on the paper. alphaxiv.org/abs/2412.00131…

👉👉👉A novel perspective uses the Monte Carlo Language Tree to analyze LLMs, revealing that training approximates the Data-Tree. This suggests LLM reasoning is probabilistic pattern-matching, explaining phenomena like hallucinations, CoT, and token bias.

🚀 SwapAnyone: End-to-end, seamless body-swapping—no more lighting glitches or unnatural blends! 🥇 EnvHarmony for smooth fusion 🥈 HumanAction-32K for diverse training 🥉 SOTA performance, open & closed models Page: pku-yuangroup.github.io/SwapAnyone/ GitHub: github.com/PKU-YuanGroup/…

🚨 Hot Take: GPT-4o might NOT be a purely autoregressive model! 🚨

There’s a high chance it has a diffusion head. 🤯 If true, this could be a game-changer for AI architecture. What do you think? 🤔👇

arxiv.org/pdf/2504.02782

📊Benchmarking: Evaluated 16 S2V models to reveal strengths and weaknesses in complex scenes. 🎥OpenS2V-5M: 5.4M 720p image-text-video triplets via cross-video linking & multi-view synthesis. 🚀Code & data are open-source. github.com/PKU-YuanGroup/…

🚀 Introducing FlashI2V: The game-changer in Image-to-Video generation! 🔥 Solving conditional image leakage with Latent Shifting & Fourier Guidance. 1.3B parameters, outperforms CogVideoX1.5-5B in speed, quality & generalization. 

github.com/PKU-YuanGroup/…