Jihyeon Je (@jihyeonje)'s Twitter Profile
Jihyeon Je

@jihyeonje

CS PhD @StanfordEng, @DukeEngineering alum

ID: 1433565191917092887

02-09-2021 22:59:17

41 Tweets

281 Followers

235 Following

Nakayama George (@georgenaka40190)

Do large multimodal models understand how to make dresses for your winter holiday party💃? We introduce AIpparel, a vision-language-garment model capable of generating and editing simulation-ready sewing patterns from text and images. Project page at georgenakayama.github.io/AIpparel/.

Phillip (Yuseung) Lee (@yuseungleee)

🇨🇦 Happy to present GrounDiT at #NeurIPS2024! Find out how we can obtain **precise spatial control** in DiT-based image generation! 📌 Poster: Fri 4:30PM - 7:30PM PST 💻 Our code is also released at: github.com/KAIST-Visual-A…

Aryaman Arora (@aryaman2020)

new paper! 🫡 we introduce 🪓AxBench, a scalable benchmark that evaluates interpretability techniques on two axes: concept detection and model steering. we find that: 🥇prompting and finetuning are still best 🥈supervised interp methods are effective 😮SAEs lag behind

CLS (@chengleisi)

Reasoning is all the rage these days. If you want to save some time and get to the crux of how to enable reasoning in LLMs, here’s a list of 10 recent papers that I find most informative, along with my notes: (Full thread in doc: docs.google.com/document/d/1TW…) 1/11

Ian Huang (@ianhuang3d)

🏡Building realistic 3D scenes just got smarter! Introducing our #CVPR2025 work, 🔥FirePlace, a framework that enables Multimodal LLMs to automatically generate realistic and geometrically valid placements for objects into complex 3D scenes. How does it work?🧵👇

Ken Liu (@kenziyuliu)

An LLM generates an article verbatim—did it “train on” the article? It’s complicated: under n-gram definitions of train-set inclusion, LLMs can complete “unseen” texts—both after data deletion and adding “gibberish” data. Our results impact unlearning, MIAs & data transparency🧵

Hansheng Chen (@hanshengch)

Excited to share our work: Gaussian Mixture Flow Matching Models (GMFlow) github.com/lakonik/gmflow GMFlow generalizes diffusion models by predicting Gaussian mixture denoising distributions, enabling precise few-step sampling and high-quality generation.

Jihyeon Je (@jihyeonje)

Excited to share our #CVPR2025 Highlight work, BlenderGym🏋️‍♂️, on benchmarking multimodal LLMs on graphics editing! Curious how inference compute allocation impacts performance, or how your favorite MLLM stacks up? Check out our paper and benchmark!👇

Mikaela Angelina Uy (@mikacuy)

Check out our recent work at NVIDIA AI on feedforward, open-world 3D part segmentation, enabling various downstream applications! 🚀 w/ Minghua Liu, Donglai Xiang, Hao Su, Sanja Fidler, Nick Sharp, Jun Gao 🔗 Webpage: research.nvidia.com/labs/toronto-a…

Jihyeon Je (@jihyeonje)

Can you rotate a dice 🎲 in your head? Mental imagery plays a key role in perspective reasoning for humans - but can it help VLMs reason spatially? We show that Abstract Perspective Change significantly improves VLM reasoning from unseen views. Check out our preprint for more:

Aryaman Arora (@aryaman2020)

new paper! 🫡 why are state space models (SSMs) worse than Transformers at recall over their context? this is a question about the mechanisms underlying model behaviour: therefore, we propose using mechanistic evaluations to answer it!

Yunqi (Richard) Gu (@richard_yunqigu)

We are presenting 17:00-19:00 today at Poster 267 in ExHall D for #CVPR25! Come and check out the first #VLM #3D #Graphics Benchmark! 📣📣📣

CLS (@chengleisi)

Are AI scientists already better than human researchers? We recruited 43 PhD students to spend 3 months executing research ideas proposed by an LLM agent vs human experts. Main finding: LLM ideas result in worse projects than human ideas.

Jiaju Ma (@jama1017)

We introduce MoVer, a Motion Verification DSL that automatically checks if AI-generated motion graphics animations match your text prompts! We make it easy for designers to specify and verify complex animations with LLM-powered iterative refinement. Catch our #SIGGRAPH2025 talk:

Phillip (Yuseung) Lee (@yuseungleee)

Does GPT-5 understand perspective change? GPT-5(-Thinking) still often struggles with simple *perspective changes*, a basic visual/spatial reasoning skill that could be essential for VLMs to reach *human-level intelligence* 👀

miatang (@miamiamia0103)

Calling all digital artists 🧑‍🎨 Have you ever forgotten to put objects on separate layers? Introducing InkLayer, a segmentation algorithm that makes scene sketches easy to edit. I’m presenting at #SIGGRAPH2025 on Wednesday, 11:45–11:55 am, West Building 118–120. See you there!

Yanzhe Zhang (@stevenyzzhang)

Soon, AI agents will act for us—collaborating, negotiating, and sharing data. But can they truly protect our privacy? We simulate privacy-critical scenarios, using alternating search to evolve attacks and defenses, uncovering severe vulnerabilities and building protections.

Ken Liu (@kenziyuliu)

New paper! We explore a radical paradigm for AI evals: assessing LLMs on *unsolved* questions. Instead of contrived exams where progress ≠ value, we eval LLMs on organic, unsolved problems via reference-free LLM validation & community verification. LLMs solved ~10/500 so far:

Dora Zhao (@dorazhao9)

What if your "For You" feed actually represented the values you care about? We introduce Alexandria, a browser extension that allows you to re-rank your Twitter feed in real time based on values you choose.

Dora Zhao (@dorazhao9)

LLMs are powerful, but they don't know your world. This knowledge gap can lead to generic, unhelpful, or incorrect responses. In our #UIST2025 paper, we explore how users can fill these gaps through creating a community knowledge ecosystem, giving models access to more specific