Qihang Zhang (@qihangzhang0224) 's Twitter Profile
Qihang Zhang

@qihangzhang0224

Ph.D. student in MMLab, CUHK. Prev intern at Apple MLR, Snap Creative Vision, and Shanghai AI Lab.

ID: 1442376420030779397

Website: https://zqh0253.github.io

Joined: 27-09-2021 06:32:04

30 Tweets

147 Followers

137 Following

Allen T. (@mr_allent) 's Twitter Profile Photo

Text-to-3D Scene Generation. Introducing SceneWiz3D: a new approach to creating high-fidelity 3D scenes from text and 3D object control, by Qihang Zhang and colleagues. Link and examples below:

Ceyuan Yang (@ceyuany) 's Twitter Profile Photo

Thanks to AK for sharing our work. As another plug-in module, CameraCtrl can be inserted into our #AnimateDiff to enable camera control. Code and models are available at github.com/hehao13/Camera…

Ceyuan Yang (@ceyuany) 's Twitter Profile Photo

A Hugging Face demo is available at huggingface.co/spaces/qihang/… You can also find the code and weights at github.com/zqh0253/BerfSc…

Jiatao Gu (@thoma_gu) 's Twitter Profile Photo

🚀Excited to introduce KaleidoDiffusion -- a new method that improves conditional diffusion model generation by incorporating autoregressive latent priors! This allows us to generate much more diverse outputs even with high CFG, just like a kaleidoscope🔭! (1/n)

Ceyuan Yang (@ceyuany) 's Twitter Profile Photo

A scaling law also exists for diffusion transformers; it tells us how large a model and how much data we need for a given compute budget. arXiv: arxiv.org/pdf/2410.08184

Jiatao Gu (@thoma_gu) 's Twitter Profile Photo

🚀Excited to introduce our recent work @ AppleMLR -- DART: Denoising AutoRegressive Transformer for Scalable Text-to-Image Generation! A transformer-based model that unifies Autoregressive and Diffusion with a non-Markovian diffusion framework: 🔗 arxiv.org/abs/2410.08159 (1/n)

Jiatao Gu (@thoma_gu) 's Twitter Profile Photo

Life update: Excited to share that I will be joining Penn Computer and Information Science Penn Engineering as an Assistant Professor in Fall 2025!🤯 I’m also seeking multiple PhD students passionate about Generative Intelligence and leveraging it to empower AI agents to interact with the Physical World🌟

Qihang Zhang (@qihangzhang0224) 's Twitter Profile Photo

I was fortunate to be mentored by Jiatao this summer at Apple MLR, and my experience couldn't have been better. Jiatao provided frequent and insightful discussions, covering everything from hands-on experiments to broader research ideas. Definitely consider this opportunity!

Jiatao Gu (@thoma_gu) 's Twitter Profile Photo

🚀Thrilled to share that our paper "DART" has been accepted to #ICLR2025! Congrats to my amazing collaborators Yuyang Wang, Yizhe Zhang, Qihang Zhang, Dinghuai Zhang 张鼎怀, Navdeep Jaitly, @jsusskin, and Shuangfei Zhai! Please also check out the updated version with more results at arxiv.org/abs/2410.08159

Jiatao Gu (@thoma_gu) 's Twitter Profile Photo

Very cool results! The idea of learning an additional motion space shares similar intuitions with our recent work on generating 3D-consistent videos by jointly predicting in XYZ space! arxiv.org/abs/2412.01821 Qihang Zhang, Shuangfei Zhai, @jsusskin

Shuangfei Zhai (@zhaisf) 's Twitter Profile Photo

We are looking for a summer research intern to work on improving TarFlow at Apple. You will be working with me and a great group of researchers, including Jiatao Gu, Preetum Nakkiran, and David Berthelot. If interested, send your CV to szhai at apple.com by the end of this week.

Ceyuan Yang (@ceyuany) 's Twitter Profile Photo

We propose Long Context Tuning (LCT) for scene-level video generation to bridge the gap between current single-shot generation and real-world narrative video productions. Homepage: guoyww.github.io/projects/long-… Report: arxiv.org/abs/2503.10589

Ceyuan Yang (@ceyuany) 's Twitter Profile Photo

Glad to share Seaweed-7B, a cost-effective foundation model for video generation. Our tech report highlights the key designs that significantly improve compute efficiency and performance given limited resources, achieving comparable quality against other industry-level models. To

Jiatao Gu (@thoma_gu) 's Twitter Profile Photo

Please drop by and check out our highlight poster tomorrow at #CVPR2025! ExHall D, Poster #60, Sun 15 Jun, 10:30 a.m. to 12:30 p.m. CDT. Great work by our Apple intern Qihang Zhang; looking forward to more exploration of explicit 3D generation! zqh0253.github.io/wvd/