Antonis Antoniades (@anton_iades) 's Twitter Profile
Antonis Antoniades

@anton_iades

CS PhD student @ucsbNLP teaching machines to think like humans and humans to think like machines. Prev. @UCSBPhysics Guitar/Bouzouki. Cyprus/CA 🧠🤖🎸🌊

ID: 21491925

Link: https://a-antoniades.github.io
Joined: 21-02-2009 15:24:04

1.1K Tweets

524 Followers

953 Following

Antonis Antoniades (@anton_iades) 's Twitter Profile Photo

This paper is great. It shows why normal benchmarks cannot effectively convey the "goodness" of a certain model. I wondered if encouraging certain "words" may boost the performance of these reasoning chains. Language expressions are a proxy for the behaviors they investigate.

Antonis Antoniades (@anton_iades) 's Twitter Profile Photo

My hunch is that although the performance difference between base GPT-4 and 4.5 is (seemingly) small, it may actually lead to a quite significant difference with post-training.

Antonis Antoniades (@anton_iades) 's Twitter Profile Photo

if you see someone running a process on the GPUs for like 7+ days, they either know exactly what they're doing, or have no clue at all... (speaking from experience 😂)

Alfonso Amayuelas (@alfonamayuelas) 's Twitter Profile Photo

📜🚨 Check out our latest work on "Self-Resource Allocation in Multi-Agent LLM Systems" where we explore how LLMs can be used to optimize task allocation in multi-agent systems 🤖 🧵(1/3)

Antonis Antoniades (@anton_iades) 's Twitter Profile Photo

If you're at ICLR you may want to grab the opportunity to talk with my incredible co-authors Kexun Zhang and Yuxi XIE on everything Search + Agent related at our SWE-Search poster, on Thursday at 3:00pm, Hall 3 + Hall 2B #156. 😁 iclr.cc/virtual/2025/p…

Alfonso Amayuelas (@alfonamayuelas) 's Twitter Profile Photo

New paper 🚨📜🚀 Introducing “Agents of Change: Self-Evolving LLM Agents for Strategic Planning”! In this work, we show how LLM-powered agents can rewrite their own prompts & code to climb the learning curve in the board game Settlers of Catan 🎲 🧵👇

Antonis Antoniades (@anton_iades) 's Twitter Profile Photo

The thing people who dismiss LLMs don’t get is that even if they’re not the end game, they’ll be key to getting us there.

Antonis Antoniades (@anton_iades) 's Twitter Profile Photo

Great work on searching in complex environments. The authors identified many of the problems we faced in SWE-Search: agent scaffolding, environment reliability, and selecting the correct final solution (in SWE-Search we addressed the latter using a multi-agent debate verifier).

Antonis Antoniades (@anton_iades) 's Twitter Profile Photo

Elon's "search for the ultimate truth" paradigm for AI is kind of genius tbh. It's also very connected to RLVR training, where only the outcome matters, getting the "true" answer. :)