Qihang Zhang (@qihangzhang0224) Twitter Tweets • TwiCopy

Allen T.

2 years ago

Text-to-3D Scene Generation. Introducing SceneWiz3D: a new approach to create high-fidelity 3D scenes from text and 3D object control, by Qihang Zhang and colleagues. Link and examples below:

Likes: 239 · Replies: 14 · Retweets: 47

Thanks to AK for sharing our work. As another plug-in module, CameraCtrl can be inserted into our #AnimateDiff to enable camera control. Code and models are available at github.com/hehao13/Camera…

Likes: 82 · Replies: 3 · Retweets: 13

Ceyuan Yang

@ceyuany

a year ago

A Hugging Face demo is available at huggingface.co/spaces/qihang/… You can also find the code and weights at github.com/zqh0253/BerfSc…

Likes: 17 · Replies: 0 · Retweets: 4

Jiatao Gu

@thoma_gu

a year ago

🚀Excited to introduce KaleidoDiffusion, a new method that improves conditional diffusion model generation by incorporating autoregressive latent priors! This allows us to generate much more diverse outputs even with high CFG, just like a kaleidoscope🔭! (1/n)

Likes: 175 · Replies: 3 · Retweets: 34

Ceyuan Yang

@ceyuany

a year ago

A scaling law also exists for diffusion transformers; it tells us how large a model and how much data we need for a given compute budget. arXiv: arxiv.org/pdf/2410.08184

Likes: 26 · Replies: 1 · Retweets: 4

Jiatao Gu

@thoma_gu

a year ago

🚀Excited to introduce our recent work @ AppleMLR -- DART: Denoising AutoRegressive Transformer for Scalable Text-to-Image Generation! A transformer-based model that unifies Autoregressive and Diffusion with a non-Markovian diffusion framework: 🔗 arxiv.org/abs/2410.08159 (1/n)

Likes: 358 · Replies: 5 · Retweets: 69

Qihang Zhang

@qihangzhang0224

a year ago

Check out our recent work on T2I generation that unifies autoregressive and diffusion models under a non-Markovian assumption!

Likes: 6 · Replies: 0 · Retweets: 0

Jiatao Gu

@thoma_gu

9 months ago

Life update: Excited to share that I will be joining Penn Computer and Information Science Penn Engineering as an Assistant Professor in Fall 2025!🤯 I’m also seeking multiple PhD students passionate about Generative Intelligence and leveraging it to empower AI agents to interact with the Physical World🌟

Likes: 713 · Replies: 94 · Retweets: 51

Qihang Zhang

@qihangzhang0224

9 months ago

I was fortunate to be mentored by Jiatao this summer at Apple MLR, and my experience couldn't have been better. Jiatao provided frequent and insightful discussions, covering everything from hands-on experiments to broader research ideas. Definitely consider this opportunity!

Likes: 2 · Replies: 1 · Retweets: 0

Jiatao Gu

@thoma_gu

7 months ago

🚀Thrilled to share our paper "DART" has been accepted by #ICLR2025! Congrats to my amazing collaborators Yuyang Wang Yizhe Zhang Qihang Zhang Dinghuai Zhang 张鼎怀 Navdeep Jaitly @jsusskin Shuangfei Zhai! Please also check the updated version with more results at arxiv.org/abs/2410.08159

Likes: 117 · Replies: 6 · Retweets: 26

Jiatao Gu

@thoma_gu

7 months ago

Very cool results! The idea of learning an additional motion space shares a similar intuition with our recent work on generating 3D-consistent videos by jointly predicting in XYZ space! arxiv.org/abs/2412.01821 Qihang Zhang Shuangfei Zhai @jsusskin

Likes: 63 · Replies: 2 · Retweets: 8

Shuangfei Zhai

@zhaisf

6 months ago

We are looking for a summer research intern to work on improving TarFlow at Apple. You will be working with me and a great group of researchers, including Jiatao Gu, Preetum Nakkiran, and David Berthelot. If interested, send your CV to szhai at apple.com by this week.

Likes: 33 · Replies: 0 · Retweets: 10

Ceyuan Yang

@ceyuany

5 months ago

We propose Long Context Tuning (LCT) for scene-level video generation to bridge the gap between current single-shot generation and real-world narrative video productions. Homepage: guoyww.github.io/projects/long-… Report: arxiv.org/abs/2503.10589

Likes: 103 · Replies: 4 · Retweets: 23

Ceyuan Yang

@ceyuany

4 months ago

Glad to share Seaweed-7B, a cost-effective foundation model for video generation. Our tech report highlights the key designs that significantly improve compute efficiency and performance given limited resources, achieving comparable quality against other industry-level models. To

Likes: 520 · Replies: 34 · Retweets: 100

Jiatao Gu

@thoma_gu

2 months ago

Please drop by and check out our highlight poster tomorrow at #CVPR2025! ExHall D, Poster #60, Sun 15 Jun, 10:30 a.m. to 12:30 p.m. CDT. Great work by our Apple intern Qihang Zhang, and we look forward to more exploration of explicit 3D generation! zqh0253.github.io/wvd/

Likes: 58 · Replies: 0 · Retweets: 7