Antonis Antoniades (@anton_iades) Twitter Tweets • TwiCopy

Antonis Antoniades

@anton_iades

CS PhD student @ucsbNLP teaching machines to think like humans and humans to think like machines. Prev. @UCSBPhysics Guitar/Bouzouki. Cyprus/CA 🧠🤖🎸🌊

ID: 21491925

https://a-antoniades.github.io | 21-02-2009 15:24:04

1.1K Tweets

524 Followers

953 Following

Antonis Antoniades

@anton_iades

9 months ago

Hope all conferences follow CVPR’s lead. 👏🏼

Likes: 1 · Replies: 0 · Retweets: 0

This paper is great. It shows why normal benchmarks cannot effectively convey the "goodness" of a certain model. I wondered if encouraging certain "words" may boost the performance of these reasoning chains. Language expressions are a proxy for the behaviors they investigate.

Likes: 4 · Replies: 0 · Retweets: 0

Antonis Antoniades

@anton_iades

9 months ago

My hunch is that although the performance difference between base GPT-4 and 4.5 is (seemingly) small, it may actually lead to a quite significant difference with post-training.

Likes: 2 · Replies: 0 · Retweets: 0

Antonis Antoniades

@anton_iades

8 months ago

first impressions on @apple intelligence:

Likes: 2 · Replies: 0 · Retweets: 0

Antonis Antoniades

@anton_iades

8 months ago

if you see someone running a process on the GPUs for like 7+ days, they either know exactly what they're doing, or have no clue at all... (speaking from experience 😂)

Likes: 3 · Replies: 0 · Retweets: 0

Antonis Antoniades

@anton_iades

8 months ago

multi-turn training got these models going wild.

Likes: 0 · Replies: 0 · Retweets: 0

Antonis Antoniades

@anton_iades

7 months ago

Tree search is a natural choice for efficiently iterating and exploring diverse solutions for agentic tasks.

Likes: 2 · Replies: 0 · Retweets: 0

Antonis Antoniades

@anton_iades

7 months ago

the "reject/accept" options on cursor are such an OP training signal... i'm jealous 🥹

Likes: 0 · Replies: 0 · Retweets: 0

Alfonso Amayuelas

@alfonamayuelas

7 months ago

📜🚨 Check out our latest work on "Self-Resource Allocation in Multi-Agent LLM Systems" where we explore how LLMs can be used to optimize task allocation in multi-agent systems 🤖 🧵(1/3)

Likes: 53 · Replies: 1 · Retweets: 18

Antonis Antoniades

@anton_iades

7 months ago

If you're at ICLR you may want to grab the opportunity to talk with my incredible co-authors Kexun Zhang and Yuxi XIE on everything Search + Agent related at our SWE-Search poster, on Thursday at 3:00pm, Hall 3 + Hall 2B #156. 😁 iclr.cc/virtual/2025/p…

Likes: 7 · Replies: 0 · Retweets: 4

Xinyi Wang @ ICLR

@xinyiwang98

7 months ago

I'm presenting our mem vs. gen paper at ICLR on Saturday morning at #244. Come and check it out if you're interested!

Likes: 9 · Replies: 0 · Retweets: 2

Alfonso Amayuelas

@alfonamayuelas

5 months ago

New paper 🚨📜🚀 Introducing “Agents of Change: Self-Evolving LLM Agents for Strategic Planning”! In this work, we show how LLM-powered agents can rewrite their own prompts & code to climb the learning curve in the board game Settlers of Catan 🎲 🧵👇

Likes: 295 · Replies: 4 · Retweets: 68

Antonis Antoniades

@anton_iades

5 months ago

The thing people who dismiss LLMs don’t get is that even if they’re not the end game, they’ll be key to getting us there.

Likes: 1 · Replies: 0 · Retweets: 0

Antonis Antoniades

@anton_iades

5 months ago

great paper thread.

Likes: 2 · Replies: 0 · Retweets: 0

Antonis Antoniades

@anton_iades

4 months ago

😂😂...it's been so fun working with Kexun Zhang on SWE-Agents. thanks JetBrains. watch this space!

Likes: 3 · Replies: 0 · Retweets: 0

Antonis Antoniades

@anton_iades

4 months ago

Great work on searching in complex environments. The authors identified many of the problems we faced in SWE-Search: agent scaffolding, environment reliability, and selecting the correct final solution (in SWE-Search we addressed the last of these using a multi-agent debate verifier).

Likes: 5 · Replies: 1 · Retweets: 0

Antonis Antoniades

@anton_iades

4 months ago

Elon's "search for the ultimate truth" paradigm for AI is kind of genius tbh. It's also very connected to RLVR training, where only the outcome matters, getting the "true" answer. :)

Likes: 1 · Replies: 0 · Retweets: 0