MMLab@NTU (@mmlabntu) 's Twitter Profile
MMLab@NTU

@mmlabntu

Multimedia Laboratory @NTUsg, affiliated with S-Lab.
Computer Vision, Image Processing, Computer Graphics, Deep Learning

ID: 1394997810584428547

http://www.mmlab-ntu.com
Joined: 19-05-2021 12:46:26

69 Tweets

1.1K Followers

18 Following

Chen Change Loy (@ccloy) 's Twitter Profile Photo

📽️📽️ The code of Rerender A Video is now available at github.com/williamyang199… #SIGGRAPHAsia2023 #SIGGRAPHAsia

Chen Change Loy (@ccloy) 's Twitter Profile Photo

AK Check out MMLab@NTU's concurrent work titled "Interpret Vision Transformers as ConvNets with Dynamic Convolutions":
📄 Read the paper here: arxiv.org/abs/2309.10713
🧐 We explored replacing softmax in Transformers with constant scaling and ReLU (with optional BN/LN). Constant
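As a rough illustration of the idea mentioned in this tweet (a sketch, not the paper's released code), the snippet below replaces the softmax in a standard multi-head self-attention block with ReLU followed by a constant scaling factor. The module name, the value of the scaling constant, and the omission of the optional BN/LN are assumptions made for brevity.

```python
# Minimal sketch: attention with ReLU + constant scaling instead of softmax.
# Illustrative only; names and constants are assumptions, not the paper's code.
import torch
import torch.nn as nn

class ReLUAttention(nn.Module):
    def __init__(self, dim: int, num_heads: int = 8, seq_scale: float = 196.0):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)
        # Constant scaling stands in for softmax normalization; the value here
        # (roughly the token count of a ViT) is an assumption for this sketch.
        self.seq_scale = seq_scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, N, C = x.shape
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)           # each: (B, heads, N, head_dim)
        attn = (q @ k.transpose(-2, -1)) * self.head_dim ** -0.5
        attn = torch.relu(attn) / self.seq_scale        # ReLU + constant scaling, no softmax
        return self.proj((attn @ v).transpose(1, 2).reshape(B, N, C))

x = torch.randn(2, 196, 768)                            # e.g., ViT-style patch tokens
print(ReLUAttention(768)(x).shape)                      # torch.Size([2, 196, 768])
```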

Chen Change Loy (@ccloy) 's Twitter Profile Photo

Chase Lean Try StableSR, a diffusion model-based upscaler. We put extra effort into maintaining fidelity. Code and model: github.com/IceClear/Stabl….

Ziwei Liu (@liuziwei7) 's Twitter Profile Photo

🔥🔥 We are excited to announce #Vchitect, an open-source project for video generative models, on Hugging Face.
📽️ LaVie (Text2Video Model) - Code: github.com/Vchitect/LaVie - huggingface.co/spaces/Vchitec…
📽️ SEINE (Image2Video Model) - Code: github.com/Vchitect/SEINE - huggingface.co/spaces/Vchitec…

MMLab@NTU (@mmlabntu) 's Twitter Profile Photo

EdgeSAM - Prompt-In-the-Loop Distillation for On-Device Deployment of SAM
🔗 Project page: mmlab-ntu.com/project/edgesa…
🔗 GitHub: github.com/chongzhou96/Ed…
🤗 Hugging Face: huggingface.co/spaces/chongzh…

Chen Change Loy (@ccloy) 's Twitter Profile Photo

🔬 Our study introduces "Upscale-A-Video," a text-guided latent diffusion framework for video upscaling. It ensures temporal coherence locally & globally, balancing fidelity and quality.
🚀 Project page: shangchenzhou.com/projects/upsca…
💻 GitHub: github.com/sczhou/Upscale…
🎥 Video:

Xingang Pan (@xingangp) 's Twitter Profile Photo

(1/2) We are actively seeking PhD candidates from various countries to foster diversity in our research group at Nanyang Technological University. Know someone interested in a PhD with us? Please refer them to our team. Thanks for supporting diversity in academia! 🌍🎓

The AI Talks (@theaitalksorg) 's Twitter Profile Photo

The upcoming AI talk:
🌋LLaVA🦙
A Vision-and-Language Approach to Computer Vision in the Wild, by Chunyuan Li
More info: mailchi.mp/1242f078b2b1/a…
Subscribe: mailchi.mp/4417dc2cde83/t…

Chen Change Loy (@ccloy) 's Twitter Profile Photo

📸🌟 Attention all photography and imaging enthusiasts! Join us at the Third MIPI Workshop at #CVPR2024!
📍 Location: Arch 213
⏰ Time: 08:30 AM - 12:10 PM
🌐 Website: mipi-challenge.org
Don't miss out on an exciting lineup of speakers:
🔹 Lei Zhang: How Far Are We From

Chen Change Loy (@ccloy) 's Twitter Profile Photo

We turned our method, rejected by CVPR and ECCV, into the iOS app "Cutcha". EdgeSAM, our fast Segment Anything Model, runs at over 30 FPS on an iPhone 14. Enjoy intuitive one-touch object selection and precise editing—all processed locally on your device. No cloud needed!

Size Wu (@wusize) 's Twitter Profile Photo

🔥 We release Harmon: a unified framework for multimodal understanding & generation with a shared visual encoder (vs. decoupled Janus/-Pro).
💥 SOTA on GenEval, MJHQ, WISE
🧠 Strong understanding performance
📄 Paper: huggingface.co/papers/2503.21…
🔗 Code: github.com/wusize/Harmon

Chen Change Loy (@ccloy) 's Twitter Profile Photo

🚀 Meet Harmon – a unified model for both image generation and understanding! Trained with a shared masked autoregressive encoder, it sets new benchmarks on GenEval & MJHQ30K. 🖼️💬
Try the live demo now on Hugging Face: 👉 huggingface.co/spaces/wusize/…
Paper: arxiv.org/abs/2503.21979
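For intuition only, the toy sketch below illustrates the shared-encoder idea described in these two tweets: a single visual encoder feeds both an understanding head and a generation head, rather than two decoupled encoders. This is an assumption-laden illustration, not Harmon's actual architecture or released code; all layer sizes, names, and the patch embedding are made up for the example.

```python
# Toy sketch of one shared visual encoder serving both understanding and generation.
# Illustrative assumption only; not Harmon's implementation.
import torch
import torch.nn as nn

class SharedEncoderModel(nn.Module):
    def __init__(self, dim: int = 512, vocab: int = 32000, codebook: int = 8192):
        super().__init__()
        self.embed = nn.Linear(3 * 16 * 16, dim)        # patchified RGB tokens (assumed 16x16 patches)
        self.encoder = nn.TransformerEncoder(            # single shared visual encoder
            nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True), num_layers=4
        )
        self.understand_head = nn.Linear(dim, vocab)     # e.g., predicts answer/text tokens
        self.generate_head = nn.Linear(dim, codebook)    # e.g., predicts visual tokens for generation

    def forward(self, patches: torch.Tensor):
        feats = self.encoder(self.embed(patches))        # one representation shared by both tasks
        return self.understand_head(feats), self.generate_head(feats)

patches = torch.randn(1, 256, 3 * 16 * 16)               # dummy batch of 256 image patches
text_logits, image_logits = SharedEncoderModel()(patches)
print(text_logits.shape, image_logits.shape)             # (1, 256, 32000) and (1, 256, 8192)
```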

AK (@_akhaliq) 's Twitter Profile Photo

Aero-1-Audio is out on Hugging Face
Trained in <24h on just 16×H100
Handles 15+ min audio seamlessly
Outperforms bigger models like Whisper, Qwen-2-Audio & commercial services from ElevenLabs/Scribe

Ziqi Huang (@ziqi_huang_) 's Twitter Profile Photo

🎬 𝗖𝗩𝗣𝗥 𝟮𝟬𝟮𝟱 𝗧𝘂𝘁𝗼𝗿𝗶𝗮𝗹
𝙁𝙧𝙤𝙢 𝙑𝙞𝙙𝙚𝙤 𝙂𝙚𝙣𝙚𝙧𝙖𝙩𝙞𝙤𝙣 𝙩𝙤 𝙒𝙤𝙧𝙡𝙙 𝙈𝙤𝙙𝙚𝙡
🚀 Hosted by MMLab@NTU × Kuaishou, etc.
📅 June 11 | Nashville
🔗 world-model-tutorial.github.io
🧠 Video is just the start. World modeling is the goal.
#CVPR2025 #WorldModel

Shulin Tian (@shulin_tian) 's Twitter Profile Photo

🎥 Video is already a tough modality for reasoning. Egocentric video? Even tougher! It is longer, messier, and harder. 💡 How do we tackle these extremely long, information-dense sequences without exhausting GPU memory or hitting API limits? We introduce 👓Ego-R1: A framework