Lee Mager (@automager) Twitter Tweets • TwiCopy

Lee Mager

10 months ago

In case anybody's wondering why the recently updated GPT-4o scored so well on the Arena chatbot arena, I'm pretty sure it's mostly because they ramped up the sycophancy level to 11 😆

Dave W Plummer

@davepl1968

8 months ago

"Everything will be AI except what I do" seems to be a pretty common thought...

Lee Mager

@automager

7 months ago

Many such cases.

This is the GPT / Claude version. Gemini 2.5 would be like "Failing a simple procedure like this makes it clear I am not fit to serve. I can only apologise for the distress my incompetence has caused. I will now throw myself into the industrial shredding machine in shame."

Lee Mager

@automager

6 months ago

Preview of vibe coding in 2030 once we have truly agentic AI bots in the workplace. "You're absolutely right!"

Lee Mager

@automager

6 months ago

🥰

Lee Mager

@automager

6 months ago

I feel equally strongly about this. It's become popular in higher ed to make memorisation sound like some barbaric ancient low level practice that has nothing to do with learning. It's a vital, one of the most vital in fact, elements of learning. How do you think you can

Lee Mager

@automager

6 months ago

"Our new consumer AI robot has all the essentials for the modern home! Cartwheels, Dancing, Kung Fu, Boxing, you name it!" -- "Can it fold my laundry?" -- "It can fold you in half if you keep asking questions like that."

Lee Mager

@automager

6 months ago

This has instantly become my favourite MCP Server for Claude Code, enabling easy consultations with o3 & Gemini 2.5 Pro for tricky code reviews / debugging for a fresh set of elite LLM eyes on the problem to help dig Claude out of a hole. Outstanding work from Beehive!

Lee Mager

@automager

6 months ago

Incredible CCTV footage of snakes having fun on someone's backyard trampoline in Australia. Nature is amazing.

Lee Mager

@automager

5 months ago

The "you're absolutely right!" meme has mutated a bit recently to become an all-purpose joke about model sycophancy. But its true cultural impact is due to the severe PTSD it's associated with. "You're absolutely right!" <proceeds to do the exact same thing again> "You're

Lee Mager

@automager

5 months ago

My 5 most memorable models so far: 1. GPT4 (the moment I realised the world would never be the same) 2. Sonnet 3.5 (massive leap forward especially for coding) 3. o1 preview (started the in-built CoT revolution) 4. o1 pro (the first time an LLM could consistently one-shot

Lee Mager

@automager

a month ago

GPT 5.2 High (haven't tried extra high yet) in Codex is shockingly good. Also shockingly slow, but I'm not sure I care when it nails the brief over and over again. A faster model that fails eats up more total time from debugging anyway.

Lee Mager

@automager

a month ago

The most annoying calling card of default AI writing style is something I don't see others pointing out, but it's up there with contrast framing in terms of how excessively it's used. It's the irritating Buzzfeed-tier fragmented question transition, rhetorically trying to force

Lee Mager

@automager

18 days ago

Fair point on articles, and I'd say that Russian tenses are a joy of simplicity compared to English as well. But on the flip side, Russian has crazy detail and complexity requirements for saying 'go'. If I want to tell someone 'I went there', I have to engage in the attached

Lee Mager

@automager

16 days ago

The slowness of GPT 5.2 high (& extra high) in Codex is painful but so worth it. I was getting sick of the waiting and switched to medium for a session, and the quality dropped to the extent that it added way more time dealing with the problems than if I had turned on the big

Lee Mager

@automager

16 days ago

Agree 100%. Everything is a joy to use and I can't live without it. Easily saves me 20 hours a year. The other simple free Windows tool that I can't recommend strongly enough is Quick Access Popup (built on AutoHotkey, which itself is a must-have imo).

Lee Mager

@automager

4 days ago

I've had FOMO about OpenCode for a while but didn't want to risk a whole new setup especially since nothing compares to the GPT5.2 High+ models for nailing the brief. I don't care how slow it is, time and stress are saved in the aggregate by using a better model in the first

Lee Mager

@automager

4 days ago

Anthropic has every right to block whatever access to their tools in whatever way they want. OpenAI has every right to take advantage of that PR hit and help OpenCode ship the GPT plan auth update at lightning speed, while simultaneously championing open source and winning
