Lee Mager (@automager)'s Twitter Profile
Lee Mager

@automager

AI Innovation Development Lead at LSE. Nuanced value maximiser that loves the good parts about AI and hates the bad.

ID: 1819710915077402624

03-08-2024 12:24:46

185 Tweets

76 Followers

81 Following

Lee Mager (@automager) 's Twitter Profile Photo

In case anybody's wondering why the recently updated GPT-4o scored so well on the Arena chatbot arena, I'm pretty sure it's mostly because they ramped up the sycophancy level to 11 😆

Lee Mager (@automager) 's Twitter Profile Photo

This is the GPT / Claude version. Gemini 2.5 would be like "Failing a simple procedure like this makes it clear I am not fit to serve. I can only apologise for the distress my incompetence has caused. I will now throw myself into the industrial shredding machine in shame."

Lee Mager (@automager) 's Twitter Profile Photo

I feel equally strongly about this. It's become popular in higher ed to make memorisation sound like some barbaric ancient low level practice that has nothing to do with learning. It's a vital, one of the most vital in fact, elements of learning. How do you think you can

Lee Mager (@automager) 's Twitter Profile Photo

"Our new consumer AI robot has all the essentials for the modern home! Cartwheels, Dancing, Kung Fu, Boxing, you name it!" -- "Can it fold my laundry?" -- "It can fold you in half if you keep asking questions like that."

Lee Mager (@automager) 's Twitter Profile Photo

This has instantly become my favourite MCP Server for Claude Code, enabling easy consultations with o3 & Gemini 2.5 Pro for tricky code reviews / debugging for a fresh set of elite LLM eyes on the problem to help dig Claude out of a hole. Outstanding work from Beehive Beehive!

Lee Mager (@automager) 's Twitter Profile Photo

The "you're absolutely right!" meme has mutated a bit recently to become an all-purpose joke about model sycophancy. But its true cultural impact is due to the severe PTSD it's associated with. "You're absolutely right!" <proceeds to do the exact same thing again> "You're

Lee Mager (@automager) 's Twitter Profile Photo

My 5 most memorable models so far: 1. GPT4 (the moment I realised the world would never be the same) 2. Sonnet 3.5 (massive leap forward especially for coding) 3. o1 preview (started the in-built CoT revolution) 4. o1 pro (the first time an LLM could consistently one-shot

Lee Mager (@automager) 's Twitter Profile Photo

GPT 5.2 High (haven't tried extra high yet) in Codex is shockingly good. Also shockingly slow, but I'm not sure I care when it nails the brief over and over again. A faster model that fails eats up more total time from debugging anyway.

Lee Mager (@automager) 's Twitter Profile Photo

The most annoying calling card of default AI writing style is something I don't see others pointing out, but it's up there with contrast framing in terms of how excessively it's used. It's the irritating Buzzfeed-tier fragmented question transition, rhetorically trying to force

Lee Mager (@automager) 's Twitter Profile Photo

Fair point on articles, and I'd say that Russian tenses are a joy of simplicity compared to English as well. But on the flip side, Russian has crazy detail and complexity requirements for saying 'go'. If I want to tell someone 'I went there', I have to engage in the attached

Lee Mager (@automager) 's Twitter Profile Photo

The slowness of GPT 5.2 high (& extra high) in Codex is painful but so worth it. I was getting sick of the waiting and switched to medium for a session, and the quality dropped to the extent that it added way more time dealing with the problems than if I had turned on the big

Lee Mager (@automager) 's Twitter Profile Photo

Agree 100%, Everything is a joy to use and I can't live without it. Easily saves me 20 hours a year. The other simple free Windows tool that I can't recommend strongly enough is Quick Access Popup (built on AutoHotKey which itself is a must-have imo).

Lee Mager (@automager) 's Twitter Profile Photo

I've had FOMO about OpenCode for a while but didn't want to risk a whole new setup especially since nothing compares to the GPT5.2 High+ models for nailing the brief. I don't care how slow it is, time and stress are saved in the aggregate by using a better model in the first

Lee Mager (@automager) 's Twitter Profile Photo

Anthropic has every right to block whatever access to their tools in whatever way they want. Open AI has every right to take advantage of that PR hit and help OpenCode ship the GPT plan auth update at lightning speed, while simultaneously championing open source and winning