Jaihoon Kim (@kimjaihoon) Twitter Tweets • TwiCopy

AK

@_akhaliq

8 months ago

ORIGEN: Zero-Shot 3D Orientation Grounding in Text-to-Image Generation

Likes: 192 · Replies: 5 · Reposts: 34

TuringPost

Inference-time scaling can work for flow models. KAIST AI proposed 3 key ideas to make it possible:

• SDE-based generation – Adding controlled randomness allows flow models to explore more outputs, like diffusion models do.

• VP interpolant conversion – Guides the model from

Likes: 28 · Replies: 1 · Reposts: 8

Minhyuk Sung

@minhyuksung

8 months ago

Unconditional Priors Matter! The key to improving CFG-based "conditional" generation in diffusion models actually lies in the quality of their "unconditional" prior. Replace it with a better one to improve conditional generation! 🌐 unconditional-priors-matter.github.io

Likes: 25 · Replies: 0 · Reposts: 4

Yunhong Min

@myh4832

8 months ago

🔥 Grounding 3D Orientation in Text-to-Image 🔥 🎯 We present ORIGEN — the first zero-shot method for accurate 3D orientation grounding in text-to-image generation! 📄 Paper: arxiv.org/abs/2503.22194 🌐 Project: origen2025.github.io

Likes: 92 · Replies: 3 · Reposts: 19

Jaihoon Kim

@kimjaihoon

8 months ago

🚀 Check out our inference-time scaling with FLUX. GPT-4o struggles to follow user prompts involving compositional logical relations. Our inference-time scaling enables efficient search to generate samples with precise alignment to the input text. 🔗 flow-inference-time-scaling.github.io

Likes: 9 · Replies: 0 · Reposts: 2

Minhyuk Sung

@minhyuksung

8 months ago

Introducing ORIGEN: the first orientation-grounding method for image generation with multiple open-vocabulary objects. It’s a novel zero-shot, reward-guided approach using Langevin dynamics, built on a one-step generative model like Flux-schnell. Project: origen2025.github.io

Likes: 30 · Replies: 0 · Reposts: 5

Jaihoon Kim

@kimjaihoon

8 months ago

🔥 KAIST Visual AI Group is hiring interns for 2025 Summer. ❓Can non-KAIST students apply? Yes! ❓Can international students who are not enrolled in any Korean institutions apply? Yes! More info at 🔗 visualai.kaist.ac.kr

Likes: 9 · Replies: 0 · Reposts: 1

AK

@_akhaliq

8 months ago

Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation

Likes: 153 · Replies: 5 · Reposts: 19

Jaihoon Kim

@kimjaihoon

8 months ago

How can VLMs reason from arbitrary perspectives? 🔥 Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation proposes a framework that enables VLMs to reason spatially from arbitrary perspectives

Likes: 7 · Replies: 0 · Reposts: 0

Jaihoon Kim

@kimjaihoon

8 months ago

🇸🇬 Attending #ICLR2025 ? Check out how we extend pretrained diffusion models to generate images in arbitrary spaces. 📌: Hall 3 + Hall 2B #103 📅: 10AM-12:30PM

Likes: 18 · Replies: 0 · Reposts: 3

Minhyuk Sung

@minhyuksung

8 months ago

#ICLR2025 Come join our StochSync poster (#103) this morning! We introduce a method that combines the best parts of Score Distillation Sampling and Diffusion Synchronization to generate high-quality and consistent panoramas and mesh textures. stochsync.github.io

Likes: 21 · Replies: 0 · Reposts: 7

Phillip (Yuseung) Lee

@yuseungleee

8 months ago

❗️Vision-Language Models (VLMs) struggle with even basic perspective changes! ✏️ In our new preprint, we aim to extend the spatial reasoning capabilities of VLMs to ⭐️arbitrary⭐️ perspectives. 📄Paper: arxiv.org/abs/2504.17207 🔗Project: apc-vlm.github.io 🧵[1/N]

Likes: 148 · Replies: 4 · Reposts: 37

Minhyuk Sung

@minhyuksung

8 months ago

I recently presented our work, “Inference-Time Guided Generation with Diffusion and Flow Models,” at HKUST (CVM 2025 keynote) and NTU (MMLab), covering three classes of guidance methods for diffusion models and their extensions to flow models. Slides: onedrive.live.com/?redeem=aHR0cH…

Likes: 109 · Replies: 0 · Reposts: 20

Jaihoon Kim

@kimjaihoon

8 months ago

📈 Can pretrained flow models generate images from complex compositional prompts—including logical relations and quantities—without further fine-tuning? 🚀 We have released our code for inference-time scaling for flow models: github.com/KAIST-Visual-A…

Likes: 29 · Replies: 0 · Reposts: 5

Jaihoon Kim

@kimjaihoon

6 months ago

🧐 Can we define a better initial prior for Sequential Monte Carlo in reward alignment? That's exactly what Ψ-Sampler 🔱 does. Check out the paper for details: 📌 arxiv.org/abs/2506.01320

Likes: 6 · Replies: 0 · Reposts: 0

Jaihoon Kim

@kimjaihoon

2 months ago

📢 Excited to share that our paper "Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing" has been accepted to #NeurIPS 2025 🔗 arxiv.org/pdf/2510.06046 📌 flow-inference-time-scaling.github.io

Likes: 9 · Replies: 0 · Reposts: 0

Phillip (Yuseung) Lee

@yuseungleee

2 months ago

🌴Happy to attend #ICCV2025 in Hawaii! I’ll be presenting our paper on enabling VLMs to perform spatial reasoning from arbitrary perspectives. 📔 Paper: arxiv.org/abs/2504.17207 🖥️ Project Page: apc-vlm.github.io ✔️ Poster: Oct 21 (Tue) Session 2 & Exhibit Hall, #858

Likes: 48 · Replies: 2 · Reposts: 7

Jaihoon Kim

@kimjaihoon

14 days ago

Headed to #NeurIPS2025 in San Diego (Dec 1-8)! 🧠 I'll be presenting a couple of posters on generative models. Currently looking for Research Internship opportunities in Generative AI. Let's connect for a chat or coffee ☕️ Please DM me.

Likes: 10 · Replies: 1 · Reposts: 2

Kunho Kim

@kunho_kim_

13 days ago

We present GOATex: Geometry & Occlusion-Aware Texturing in NeurIPS 2025. - Project Page: goatex3d.github.io - Paper: arxiv.org/abs/2511.23051

Likes: 6 · Replies: 1 · Reposts: 2
