Josh 🇺🇸 (@ogjoshhsu) 's Twitter Profile
Josh 🇺🇸

@ogjoshhsu

Founder. Tinkering with Beans.

ID: 218028363

Joined: 21-11-2010 06:27:07

3.3K Tweets

557 Followers

1.1K Following

Maggie Appleton (@mappletons) 's Twitter Profile Photo

I've been trying to articulate why the fawning, complimentary responses from AI chatbots feel so insidious to me. I've figured out how to explain it. Wrote a piece on how current model training and design choices threaten our critical thinking skills: maggieappleton.com/ai-enlightenme…

Patrick McKenzie (@patio11) 's Twitter Profile Photo

So October 15th, the extended US tax deadline, is just around the corner, and I have some observations which are more about LLM progress than taxes. Background: many people professionally involved with LLMs estimate 2026-2028 as when one will be able to get an LLM to "do taxes."

Shakeel (@shakeelhashim) 's Twitter Profile Photo

In today’s Transformer, I wrote about the very scary disconnect I felt between the conversations I was having with people like Joshua Achiam and Jack Clark at The Curve, and the way the wider world talks about AI.

alphaXiv (@askalphaxiv) 's Twitter Profile Photo

Tiny Recursive Models: A tiny 7M parameter model that recursively refines its answer beats LLMs 100x larger on hard puzzles like ARC-AGI. We independently reproduced the paper, corroborated results, and released the weights + API access for those looking to benchmark it 🔍
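The core idea here, recursively refining an answer rather than emitting it in one pass, can be sketched in a few lines. This is an illustrative toy, not the paper's actual architecture: `tiny_step` is a hypothetical stand-in for the learned update network.

```python
# Toy sketch of recursive refinement: a small update function is applied
# repeatedly to its own output, improving the candidate answer each pass.
# `tiny_step` is a hypothetical stand-in for a learned model step.

def tiny_step(state, answer, puzzle):
    """One refinement pass: nudge each cell of the candidate answer
    one unit toward the puzzle's solution (stand-in for a learned update)."""
    target = puzzle["solution"]
    new_answer = [
        a + (1 if a < t else -1 if a > t else 0)
        for a, t in zip(answer, target)
    ]
    return state + 1, new_answer

def recursive_refine(puzzle, steps=16):
    """Run several refinement passes over the model's own output."""
    state, answer = 0, puzzle["initial_guess"]
    for _ in range(steps):
        state, answer = tiny_step(state, answer, puzzle)
    return answer

puzzle = {"initial_guess": [0, 0, 0], "solution": [3, 1, 2]}
print(recursive_refine(puzzle))  # -> [3, 1, 2]
```

The point of the pattern is that the same tiny set of parameters is reused across many passes, so depth of computation comes from iteration rather than model size.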

elie (@eliebakouch) 's Twitter Profile Photo

Training LLMs end to end is hard. Very excited to share our new blog (book?) that covers the full pipeline: pre-training, post-training, and infra. 200+ pages of what worked, what didn’t, and how to make it run reliably huggingface.co/spaces/Hugging…

Aaron Levie (@levie) 's Twitter Profile Photo

Many AI agent problems are really just information retrieval problems. If the agent has a better way to find and comb through data performantly, you will get far better results. Compute is fungible, so you can use it during indexing and processing or you can use it later to crank
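The compute-fungibility point above can be made concrete with a minimal retrieval sketch: spend compute up front building an inverted index so each agent query is a cheap lookup instead of a full scan. The documents and names here are illustrative, not from any real system.

```python
# Sketch of trading indexing-time compute for query-time speed.
# Building the inverted index once is the "spend compute during
# indexing" side; search() is then a cheap set lookup, not a scan.

from collections import defaultdict

docs = {
    1: "agents need fast retrieval over internal data",
    2: "compute spent at indexing time pays off at query time",
    3: "retrieval quality drives agent quality",
}

# Indexing-time compute: tokenize every document once, up front.
index = defaultdict(set)
for doc_id, text in docs.items():
    for token in text.split():
        index[token].add(doc_id)

def search(token):
    """Query-time work is a single dictionary read over the prebuilt index."""
    return sorted(index.get(token, set()))

print(search("retrieval"))  # -> [1, 3]
```

The same trade-off scales up: embedding and chunking a corpus ahead of time plays the role of the index build, and the agent's per-step lookups stay cheap.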

VCs Congratulating Themselves 👏👏👏 (@vcbrags) 's Twitter Profile Photo

VCs giving their expert advice:

“Build something people want”
“Talk to your users”
“Hire A players”
“Focus on distribution”

Founders:
Nick St. Pierre (@nickfloats) 's Twitter Profile Photo

Every node in your graph is a little death of inspiration to me. You've turned imagination into plumbing and disguised it as “creative freedom”. You've mistaken complexity for control and turned creation into workflow theater. Let us build worlds, not workflows. Don't give me

signüll (@signulll) 's Twitter Profile Photo

absolutely lovely to see this. kinda reminds you why anthropic’s whole vibe feels anomalously clean in a world addicted to sludge metrics. kudos to mike & team. also you can see a reflection of this in their marketing as well (which i posted about here before).

Peter Yang (@petergyang) 's Twitter Profile Photo

Cursor scaled to $29B without any full-time PMs.

Ryo (Cursor's Head of Design) walked me through how they work and it's the opposite of every big tech best practice:

1. Roles are muddy

PM work is spread across designers and engineers. Everyone does what fits their strengths
Sarah Sachs (@sarahmsachs) 's Twitter Profile Photo

Applied AI products supporting new model releases isn't what it used to be. It's no longer a question of evals; it's a question of agility and, in some cases, grit. Opus 4.5 exposed an entirely new concept called "effort", a parameter that involved optimization and use case