Jihyeon Je (@jihyeonje) Twitter Tweets • TwiCopy

Nakayama George

a year ago

Do large multimodal models understand how to make dresses for your winter holiday party💃? We introduce AIpparel, a vision-language-garment model capable of generating and editing simulation-ready sewing patterns from text and images. Project page at georgenakayama.github.io/AIpparel/.

68 likes · 1 reply · 19 retweets

Phillip (Yuseung) Lee

@yuseungleee

a year ago

🇨🇦 Happy to present GrounDiT at #NeurIPS2024! Find out how we can obtain **precise spatial control** in DiT-based image generation! 📌 Poster: Fri 4:30PM - 7:30PM PST 💻 Our code is also released at: github.com/KAIST-Visual-A…

50 likes · 1 reply · 10 retweets

Aryaman Arora

@aryaman2020

10 months ago

new paper! 🫡 we introduce 🪓AxBench, a scalable benchmark that evaluates interpretability techniques on two axes: concept detection and model steering. we find that: 🥇prompting and finetuning are still best 🥈supervised interp methods are effective 😮SAEs lag behind

416 likes · 7 replies · 67 retweets

CLS

@chengleisi

10 months ago

Reasoning is all the rage these days. If you want to save some time and get to the crux of how to enable reasoning in LLMs, here’s a list of 10 recent papers that I find most informative, along with my notes: (Full thread in doc: docs.google.com/document/d/1TW…) 1/11

805 likes · 7 replies · 122 retweets

Ian Huang

@ianhuang3d

8 months ago

🏡Building realistic 3D scenes just got smarter! Introducing our #CVPR2025 work, 🔥FirePlace, a framework that enables Multimodal LLMs to automatically generate realistic and geometrically valid placements for objects into complex 3D scenes. How does it work?🧵👇

384 likes · 23 replies · 105 retweets

Ken Liu

@kenziyuliu

8 months ago

An LLM generates an article verbatim—did it “train on” the article? It’s complicated: under n-gram definitions of train-set inclusion, LLMs can complete “unseen” texts—both after data deletion and adding “gibberish” data. Our results impact unlearning, MIAs & data transparency🧵

290 likes · 10 replies · 79 retweets

Hansheng Chen

@hanshengch

8 months ago

Excited to share our work: Gaussian Mixture Flow Matching Models (GMFlow) github.com/lakonik/gmflow GMFlow generalizes diffusion models by predicting Gaussian mixture denoising distributions, enabling precise few-step sampling and high-quality generation.

122 likes · 1 reply · 31 retweets

Jihyeon Je

@jihyeonje

8 months ago

Excited to share our #CVPR2025 Highlight work, BlenderGym🏋️‍♂️, on benchmarking multimodal LLMs on graphics editing! Curious how inference compute allocation impacts performance, or how your favorite MLLM stacks up? Check out our paper and benchmark!👇

33 likes · 0 replies · 3 retweets

Mikaela Angelina Uy

@mikacuy

7 months ago

Check out our recent work at NVIDIA AI on feedforward, open-world 3D part segmentation, enabling various downstream applications! 🚀 w/ Minghua Liu, Donglai Xiang, Hao Su, Sanja Fidler, Nick Sharp, Jun Gao 🔗 Webpage: research.nvidia.com/labs/toronto-a…

38 likes · 0 replies · 8 retweets

Jihyeon Je

@jihyeonje

7 months ago

Can you rotate a dice 🎲 in your head? Mental imagery plays a key role in perspective reasoning for humans - but can it help VLMs reason spatially? We show that Abstract Perspective Change significantly improves VLM reasoning from unseen views. Check out our preprint for more:

96 likes · 0 replies · 14 retweets

Aryaman Arora

@aryaman2020

6 months ago

new paper! 🫡 why are state space models (SSMs) worse than Transformers at recall over their context? this is a question about the mechanisms underlying model behaviour: therefore, we propose using mechanistic evaluations to answer it!

641 likes · 11 replies · 84 retweets

Yunqi (Richard) Gu

@richard_yunqigu

5 months ago

We are presenting 17:00-19:00 today at Poster 267 in ExHall D for #CVPR25! Come and check out the first #VLM #3D #Graphics Benchmark! 📣📣📣

2 likes · 0 replies · 2 retweets

CLS

@chengleisi

5 months ago

Are AI scientists already better than human researchers? We recruited 43 PhD students to spend 3 months executing research ideas proposed by an LLM agent vs human experts. Main finding: LLM ideas result in worse projects than human ideas.

553 likes · 10 replies · 162 retweets

Jiaju Ma

@jama1017

4 months ago

We introduce MoVer, a Motion Verification DSL that automatically checks if AI-generated motion graphics animations match your text prompts! We make it easy for designers to specify and verify complex animations with LLM-powered iterative refinement. Catch our #SIGGRAPH2025 talk:

77 likes · 3 replies · 14 retweets

Phillip (Yuseung) Lee

@yuseungleee

4 months ago

Does GPT-5 understand perspective change? GPT-5(-Thinking) still often struggles with simple *perspective changes*, a basic visual/spatial reasoning skill that could be essential for VLMs to reach *human-level intelligence* 👀

18 likes · 2 replies · 3 retweets

miatang

@miamiamia0103

4 months ago

Calling all digital artists 🧑‍🎨 Have you ever forgotten to put objects on separate layers? Introducing InkLayer, a segmentation algorithm that makes scene sketches easy to edit. I’m presenting at #SIGGRAPH2025 on Wednesday, 11:45–11:55 am, West Building 118–120. See you there!

245 likes · 4 replies · 32 retweets

Yanzhe Zhang

@stevenyzzhang

3 months ago

Soon, AI agents will act for us—collaborating, negotiating, and sharing data. But can they truly protect our privacy? We simulate privacy-critical scenarios, using alternating search to evolve attacks and defenses, uncovering severe vulnerabilities and building protections.

77 likes · 2 replies · 26 retweets

Ken Liu

@kenziyuliu

3 months ago

New paper! We explore a radical paradigm for AI evals: assessing LLMs on *unsolved* questions. Instead of contrived exams where progress ≠ value, we eval LLMs on organic, unsolved problems via reference-free LLM validation & community verification. LLMs solved ~10/500 so far:

362 likes · 12 replies · 72 retweets

Dora Zhao

@dorazhao9

3 months ago

What if your "For You" feed actually represented the values you care about? We introduce Alexandria, a browser extension that allows you to re-rank your Twitter feed in real time based on values you choose.

27 likes · 0 replies · 3 retweets

Dora Zhao

@dorazhao9

2 months ago

LLMs are powerful, but they don't know your world. This knowledge gap can lead to generic, unhelpful, or incorrect responses. In our #UIST2025 paper, we explore how users can fill these gaps through creating a community knowledge ecosystem, giving models access to more specific

59 likes · 3 replies · 27 retweets