Zhenxing Mi (@mifucius1) 's Twitter Profile
Zhenxing Mi

@mifucius1

PhD student @ HKUST

ID: 1421401082215866369

Link: https://mizhenxing.github.io · Joined: 31-07-2021 09:23:31

41 Tweets

100 Followers

588 Following

Zhenxing Mi (@mifucius1) 's Twitter Profile Photo

The code of our ICLR2023 paper "Switch-NeRF: Learning Scene Decomposition with Mixture of Experts for Large-scale Neural Radiance Fields" has been released, with Dan Xu. Code: github.com/MiZhenxing/Swi… Paper: openreview.net/forum?id=PQ2zo… Project page: mizhenxing.github.io/switchnerf

HanRong YE (@leoyerrrr) 's Twitter Profile Photo

#ICLR2023 Updates to TaskPrompter's codebase for joint 2D-3D multi-task understanding on Cityscapes-3D! We now predict disparity instead of depth, aligning with prevalent practices on the dataset. Please check github.com/prismformore/M… Thanks to Prof Dan Xu for valuable guidance!😃

HanRong YE (@leoyerrrr) 's Twitter Profile Photo

How to design generative models to help segmentation tasks?🧐 Introducing SegGen, our innovative approach for generating training data for image segmentation tasks, which greatly pushes the boundaries of performance for cutting-edge segmentation models. We creatively propose a…

Zhenxing Mi (@mifucius1) 's Twitter Profile Photo

Excited to share our new paper "ThinkDiff" on arXiv.

I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models

It can make diffusion models take "IQ tests"!

It empowers diffusion models with multimodal in-context understanding and reasoning.

Zhenxing Mi (@mifucius1) 's Twitter Profile Photo

The image generation of GPT-4o is amazing. It highlights image generation based on multimodal in-context learning.

Our paper ThinkDiff investigates this direction and shows promising results, although it is far less powerful than GPT-4o and Gemini. Check out our paper and post!

Dan Xu ✈️ CVPR2025 (@danxuhk) 's Twitter Profile Photo

We propose a high-fidelity talking head generation framework that supports both single-modal and multi-modal driving signals. More details: arXiv: arxiv.org/abs/2504.02542 Project page: harlanhong.github.io/publications/a… GitHub: github.com/harlanhong/ACT… HuggingFace: huggingface.co/papers/2504.02…