Lluís Castrejón
@lluiscastrejon
ID: 3053101745
22-02-2015 16:32:33
3 Tweets
18 Followers
82 Following
Introducing HAMMR: hierarchical multimodal agents that handle a broad range of VQA tasks within a single system (counting, spatial reasoning, OCR, visual pointing, external knowledge, and more). arxiv.org/abs/2404.05465 Lluís Castrejón, @tejmensink, Howard Zhou, André Araujo, Jasper Uijlings
Breaking news from Text-to-Image Arena! 🖼️✨ Google DeepMind’s Imagen 3 debuts at #1, surpassing Recraft-v3 with a remarkable +70-point lead! Congrats to the Google Imagen team for setting a new bar! Try the best text2image at LMArena and cast your vote! More analysis👇