Josh McGrath (@j_mcgraph)'s Twitter Profile
Josh McGrath

@j_mcgraph

Post training @openai

ID: 969439214

25-11-2012 05:25:01

1.1K Tweets

1.1K Followers

872 Following

Effie Klimi (@effiebio)

“So. we stack layers where each does Wx + b followed by a nonlinearity e.g. ReLU. This builds a deep function. We train it by minimizing loss using backpropagation and gradient descen-”

Aidan McLaughlin (@aidan_mclau)

notable imo that the world's most valuable companies *are not* casinos, slop mobile games, or even tiktok/youtube; instead they're apple (tools that enhance creativity), meta (originally about connecting people), and nvidia (enabling ai to accelerate science)

Josh McGrath (@j_mcgraph)

If you’re working in quant and are tired of the lack of meaning in your work, we’re always hiring! I promise you, we have more than enough compute. P.S. we’re chaotic because building anything new is messy. Do it with us!

shyamal (@shyamalanadkat)

getting started with evals doesn't require too much. the pattern that we've seen work for small teams looks a lot like test‑driven development applied to AI engineering: 1/ anchor evals in user stories, not in abstract benchmarks: sit down with your product/design counterpart

Josh McGrath (@j_mcgraph)

I think we’re in a timeline where
> US builds general robots for manufacturing in the next two years
> doesn’t matter because of our land use regulation

rapha (@rapha_gl)

opus not being great at benchmarks (but having really good user testimonials) is further confirmation of the deep deep eval crisis we’re in you can max out any benchmark with enough RL, and that doesn’t translate into a good product. you can optimize for DAUs and glaze-hack it

Josh McGrath (@j_mcgraph)

Now that this is out I can finally say that we were building a vape but it turned out to be too dangerous, the tricks we prompted it to do were too sick.

Jason Wei (@_jasonwei)

A recent clarity that I gained is viewing AI research as a “max-performance domain”, which means that you can be world-class by being very good at only one part of your job. As long as you can create seminal impact (e.g., train the best model, start a new paradigm, or create

Josh McGrath (@j_mcgraph)

I'm a bit skeptical of the papers mentioned below, but I think it's a good thing to rerun the eval yourself and report whatever number you get. It's hard to replicate someone's eval setup!

Josh McGrath (@j_mcgraph)

The admin wants to be good at AI until it means either recognizing
> foreign talent
> forms of energy they don’t find masculine or something idk

Joshua Achiam (@jachiam0)

It feels somewhat astonishing that a third of the strategic bomber fleet of a great power can now be taken out in a daring drone attack - the foundations of war appear to be changing at speed. I don't know what this will lead to.

Duncan S. Campbell (@duncan__c)

We must manufacture batteries at scale here in the US if we want to have any shot at the very near future of defense, robotics, mobility, and energy. Basically everything that matters right now goes back to batteries.

Josh McGrath (@j_mcgraph)

No, it’s like if neuroscience had unlimited resolution imaging and perfect intervention. You know, two key bottlenecks in that field.