
nitzan guetta
@nitzanguetta
ID: 1680284922404110337
15-07-2023 18:35:39
23 Tweets
14 Followers
13 Following



We had a blast presenting #VisualRiddles at the D&B Track at #NeurIPS2024! Thanks to everyone who stopped by. We hope this benchmark will push models with better visual reasoning and world knowledge. Missed it? Learn more here: visual-riddles.github.io nitzan guetta


OpenAI O1, Gemini-2.0 and Gemini-2.0-thinking are on the #VisualRiddles leaderboard! Multiple Choice: Gemini-2.0-thinking hits 60% accuracy (84% with hints!) Open-Ended (Auto-Rating): O1 leads with 58% accuracy. Check it out: visual-riddles.github.io Yonatan Bitton

