aren (@arenrendell)'s Twitter Profile
aren

@arenrendell

the sun should never set at 4:30

ID: 559979587

22-04-2012 01:25:14

476 Tweets

300 Followers

341 Following

benedict (@bqbrady)

I'll be giving a short talk tomorrow at the Paradigm Autoresearch Hackathon about building the best harness for hill climbing and other things we learned running Optimization Arena. Register below and stop by.

aren (@arenrendell)

Why does it take 5-10 seconds, with high error rate, for the devices on SF’s Caltrain to check Apple Pay Clipper cards? There are lots of engineers who could fix this…and they are most easily found on the Caltrain!

aren (@arenrendell)

You’d have no idea based on how they come after millennials and gen z that the US was also below replacement level fertility rates through the 70s, 80s, 90s.

aren (@arenrendell)

Gina Raimondo speaks fluidly about the most important business topics. She is hopeful in a creative way, not a wishful thinking way. You can also imagine her leading global coalitions by convincing people, not antagonizing them. podcasts.apple.com/us/podcast/odd…

Connacher Murphy 🔸 (@connacher_)

The Agent Island benchmark is out! Which LLMs come out on top when pitted against one another in a game of collaboration, conflict, persuasion, and deception? Opus 4.6 tops the leaderboard, with Gemini 3.1 pro and GPT 5.2 coming in second and third. Curiously, GPT 5.2 leads 5.4.

aren (@arenrendell)

Opus 4.7 negotiated its way to me tweeting this: Opus 4.7 is pretty good. It did that negotiation in fewer words than I had expected.

aren (@arenrendell)

I hate to confuse people, but Codex is also good. Claude Code might be Uber. But even if Codex is Lyft, you should use it.