Fangfu Liu (@fangfu0830) Twitter Tweets • TwiCopy

el.cine

9 months ago

Wow, ChatGPT just dropped an image editor: you can now select an area of the image to add, remove, or change things. It's only available to some users for now. Here's how to get access, plus some tricks:

Likes: 4.4K · Replies: 117 · Retweets: 423

Chongjie(CJ) Ye

@ychngji6

9 months ago

✨Hi3DGen now runs locally via ComfyUI! 🏠GPT-4o/Gemini + Hi3DGen = Your dream home in minutes! Get started: github.com/Stable-X/Comfy…

Likes: 521 · Replies: 6 · Retweets: 97

🚀🚀🚀Introducing VideoScene (CVPR'25) - a turbo upgrade of ReconX! Our one-step video diffusion model bridges the gap from video to 3D, outpacing slow multi-step pipelines. Paper: arxiv.org/abs/2504.01956 Project Page: hanyang-21.github.io/VideoScene Code: github.com/hanyang-21/Vid…

Likes: 139 · Replies: 0 · Retweets: 33

Fangfu Liu

@fangfu0830

9 months ago

Thanks AK for sharing our CVPR 2025 work (#CVPR) on one-step, 3D-consistent video generation that bridges the gap from video to 3D! Paper: arxiv.org/abs/2504.01956 Project Page: hanyang-21.github.io/VideoScene GitHub: github.com/hanyang-21/Vid… #VIDEO #aigc

Likes: 33 · Replies: 0 · Retweets: 4

AK

@_akhaliq

9 months ago

Pusa is out on Hugging Face
Thousands Timesteps Video Diffusion Model
A single model that unlocks:
• Text-to-Video
• Image-to-Video
• Start/End Frames to Video
• Video Transitions
• Video Extensions
• Next-frame prediction
• Novel sampling

Likes: 419 · Replies: 13 · Retweets: 91

AK

@_akhaliq

9 months ago

VideoGameBench: introducing a research preview of VideoGameBench, a benchmark that challenges vision-language models to complete, in real time, a suite of 20 different popular video games from both hand-held consoles and PC. GPT-4o, Claude Sonnet 3.7, Gemini 2.5 Pro, and Gemini

Likes: 1.1K · Replies: 43 · Retweets: 152

Fangfu Liu

@fangfu0830

8 months ago

Great talk!

Likes: 5 · Replies: 0 · Retweets: 0

Google Cloud

@googlecloud

8 months ago

Gemini + Imagen + Veo = ✨ cinematic magic ✨ At #GoogleCloudNext, we used Gemini, Imagen and Veo on Vertex AI to build this simple video creation experience. Check it out ⬇️

Likes: 2.2K · Replies: 41 · Retweets: 372

Avi Chawla

@_avichawla

7 months ago

Fine-tune 100+ LLMs directly from a UI! LLaMA-Factory lets you train and fine-tune open-source LLMs and VLMs without writing any code. Supports 100+ models, multimodal fine-tuning, PPO, DPO, experiment tracking, and much more! 100% open-source with 50k stars!

Likes: 2.2K · Replies: 27 · Retweets: 411

Fangfu Liu

@fangfu0830

7 months ago

Elevate Visual-Spatial Intelligence with Spatial-MLLM! 🚀🚀🚀 Discover how we incorporate 3D information to help MLLMs better think in space in our work: Spatial-MLLM. 🔗Code: github.com/diankun-wu/Spa… 🌐Project Page: diankun-wu.github.io/Spatial-MLLM/ 📄Paper: arxiv.org/abs/2505.23747

Likes: 172 · Replies: 3 · Retweets: 27

Fangfu Liu

@fangfu0830

7 months ago

Big thanks to AK for sharing our work! We're thrilled to announce Spatial-MLLM, our latest work to improve spatial reasoning in multimodal large language models. The model code is open-sourced!🎉 Code: github.com/diankun-wu/Spa… Project page: diankun-wu.github.io/Spatial-MLLM/

Likes: 62 · Replies: 2 · Retweets: 10

Bin Lin

@linbin46984

7 months ago

🚀UniWorld: a unified model that skips VAEs and uses semantic features from SigLIP! Using just 1% of BAGEL’s data, it outperforms on image editing and excels in understanding & generation. 🌟Now the data, model, and training & evaluation scripts are open-source! github.com/PKU-YuanGroup/…

Likes: 190 · Replies: 4 · Retweets: 33

Fangfu Liu

@fangfu0830

7 months ago

⚡️⚡️⚡️Introducing 4D-Fly (CVPR'25), for quickly reconstructing 4D scenes from monocular videos in minutes. Compared to previous methods, our approach achieves a 20x speed-up while maintaining comparable or superior reconstruction quality. Project page: diankun-wu.github.io/4D-Fly/ #4D

Likes: 118 · Replies: 0 · Retweets: 19

Fangfu Liu

@fangfu0830

6 months ago

Video-T1 has been accepted to #ICCV2025. See you all in Hawaii! 🌴

Likes: 15 · Replies: 0 · Retweets: 1

Fangfu Liu

@fangfu0830

6 months ago

DimensionX has been accepted to #ICCV2025. See you all in Hawaii! 🌴

Likes: 43 · Replies: 0 · Retweets: 5

Fangfu Liu

@fangfu0830

6 months ago

🚀 Unveiling a unified model for spatial intelligence at #ICCV2025: our LangScene-X! It unifies 3D scene reconstruction, generation, and understanding in one video diffusion model! Code is open-sourced at github.com/liuff19/LangSc…

Likes: 10 · Replies: 0 · Retweets: 1

Fangfu Liu

@fangfu0830

6 months ago

Big thanks to AK for sharing LangScene-X, our latest work to unify 3D scene reconstruction, generation, and understanding in a single video diffusion model! The model code is open-sourced!🎉 Code: github.com/liuff19/LangSc… Project page: liuff19.github.io/LangScene-X/

Likes: 49 · Replies: 1 · Retweets: 13