@cvenhoff00: 🔍 New paper: How do vision-language models actually align visual and language representations? We used sparse autoencoders to peek inside VLMs and found something surprising about when and where cross-modal alignment happens! Presented at XAI4CV Workshop @ CVPR 🧵 (1/6)

Constantin Venhoff

@cvenhoff00

ID: 1783494472476540928

25-04-2024 13:53:24

5 Tweets

16 Followers

40 Following

3 months ago

🔍 New paper: How do vision-language models actually align visual and language representations? We used sparse autoencoders to peek inside VLMs and found something surprising about when and where cross-modal alignment happens! Presented at XAI4CV Workshop @ CVPR 🧵 (1/6)

298 likes · 8 replies · 44 reposts
