Chen Change Loy (@ccloy) Twitter Tweets • TwiCopy

青稞AI

@qingke_ai

5 months ago

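At 8 pm Beijing time on July 8, Penghao Wu (吴鹏浩), a PhD student at MMLab, Nanyang Technological University, will give a live talk on "GUI-Reflection: A Training Framework That Gives Multimodal GUI Agents the Ability to Reflect and Self-Correct."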

Likes: 5 · Replies: 0 · Retweets: 3

camenduru

👁️ ObjectClear: Complete Object Removal via Object-Effect Attention 🧹 Jupyter Notebook 🥳 Thanks to Jixin Zhao ❤ Shangchen Zhou ❤ Zhouxia Wang ❤ Peiqing Yang ❤ Chen Change Loy ❤ 🌐page: zjx0101.github.io/projects/Objec… 🧬code: github.com/zjx0101/Object… 📄paper: arxiv.org/abs/2505.22636

Likes: 96 · Replies: 4 · Retweets: 22

Gradio

@gradio

5 months ago

🔥🆕 ObjectClear is an object removal model that jointly eliminates the target object and its associated effects (shadows, etc.). ObjectClear app on @Huggingface: huggingface.co/spaces/jixin01…

Likes: 139 · Replies: 2 · Retweets: 28

Xingang Pan

@xingangp

4 months ago

Introducing 𝗦𝗧𝗿𝗲𝗮𝗺𝟯𝗥, a new 3D geometric foundation model for efficient 3D reconstruction from streaming input. Similar to LLMs, STream3R uses causal attention during training and a KV cache at inference. No need to worry about post-alignment or reconstructing from scratch.

Likes: 318 · Replies: 5 · Retweets: 58

AK

@_akhaliq

4 months ago

STream3R: Scalable Sequential 3D Reconstruction with Causal Transformer

Likes: 107 · Replies: 4 · Retweets: 13

Chen Change Loy

@ccloy

4 months ago

Our new preprint: “Next Visual Granularity Generation”, a novel framework for image generation that builds visuals hierarchically, from broad layout to fine detail. Achieves consistent FID improvements (e.g., 3.30 → 3.03, 2.57 → 2.44, 2.09 → 2.06) compared to VAR in

Likes: 19 · Replies: 0 · Retweets: 2

Yihang Luo

@theyihangluo

4 months ago

STream3R reformulates dense 3D reconstruction into a sequential registration task with causal attention. Just tried 3D reconstruction on a #GrokImagine video using #STream3R🫡! Check out STream3R on our GitHub for more👨‍💻: github.com/NIRVANALAN/STr…

Likes: 9 · Replies: 0 · Retweets: 2

AK

@_akhaliq

4 months ago

Next Visual Granularity Generation

Likes: 70 · Replies: 2 · Retweets: 8

DailyPapers

@huggingpapers

4 months ago

New from S-Lab, Nanyang Technological University & SenseTime Research: Next Visual Granularity Generation (NVG)! This novel framework progressively refines images from global layout to fine details, offering fine-grained control over generation. It outperforms the VAR series in

Likes: 66 · Replies: 4 · Retweets: 11

Kwang Moo Yi

@kwangmoo_yi

4 months ago

Lan and Luo et al., "STREAM3R: Scalable Sequential 3D Reconstruction with Causal Transformer". Yep, another streaming feed-forward 3D estimator. This time, with a DUSt3R backbone. The architecture is now getting pretty close to LLMs :) Are these going to become a 3D GPT?

Likes: 149 · Replies: 1 · Retweets: 23

Chen Change Loy

@ccloy

2 months ago

Congrats to Ziwei Liu!!

Likes: 31 · Replies: 1 · Retweets: 1

Shangchen Zhou

@shangchenzhou

2 months ago

📸Join us at #ICCV2025 for the Mobile Intelligent Photography & Imaging (MIPI) Workshop! ✨Leading keynotes: Profs. Song Han, Michal Irani, Boxin Shi, and Ming-Hsuan Yang - on intelligent photography and efficient GenAI. 🗓Oct 20, 8:50am–12:30pm HST 🔗mipi-challenge.org

Likes: 28 · Replies: 1 · Retweets: 10

AK

@_akhaliq

2 months ago

Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation

Likes: 237 · Replies: 3 · Retweets: 33

Kang Liao

@kangliao929

2 months ago

Introducing 𝐓𝐡𝐢𝐧𝐤𝐢𝐧𝐠 𝐰𝐢𝐭𝐡 𝐂𝐚𝐦𝐞𝐫𝐚📸, a unified multimodal model that integrates camera-centric spatial intelligence to interpret and create scenes from arbitrary viewpoints. Project Page: kangliao929.github.io/projects/puffi… Code: github.com/KangLiao929/Pu…

Likes: 147 · Replies: 14 · Retweets: 32

Min Choi

@minchoi

2 months ago

Thinking with Camera (Puffin) just dropped. This AI doesn't just see a picture, it reasons like a director. It predicts lens/pose, guides shots, and generates scenes across views. Simple breakdown:

Likes: 56 · Replies: 12 · Retweets: 6

#CVPR2025

@cvpr

2 months ago

The #CVPR2026 submission portal is now open!

Likes: 103 · Replies: 2 · Retweets: 16

Chen Change Loy

@ccloy

2 months ago

Congrats to Yuekun Dai and Ziang Cao, both from MMLab@NTU, for winning the prestigious Google PhD Fellowship! Yuekun: ykdai.github.io Ziang: ziangcao0312.github.io NTU Singapore

Likes: 39 · Replies: 1 · Retweets: 3

Ziwei Liu

@liuziwei7

a month ago

🔥One-Stop Training Engine for Unified Models🔥 ⚡️LMMs-Engine⚡️ is a lean and flexible unified model training engine built for hacking at scale * Support multimodal inputs and outputs, from AR, diffusion and linear models, to unified models like BAGEL 🏠github.com/EvolvingLMMs-L…

Likes: 190 · Replies: 6 · Retweets: 33
