Jacob Phillips (@jacob_dphillips)'s Twitter Profile
Jacob Phillips

@jacob_dphillips

Engineering Fellow @a16z, American Dynamism. prev ML @scale_AI, CTO @Themis_AI, AI @MIT

ID: 717016291383644160

Joined: 04-04-2016 15:49:40

199 Tweets

545 Followers

920 Following

The recent Sonnet release actually showed a small regression on MMMU, a visual reasoning benchmark, despite large advances in long-context reasoning for agentic coding and AIME. Excited to see better embodied reasoning benchmarks in the future!
