Tobias Mann (@tobias_writes)'s Twitter Profile
Tobias Mann

@tobias_writes

Systems Editor @TheRegister / @SitPub — hiker, animal lover, photographer, blogger, and tech journo.

I'm over on Mastodon now at @[email protected]

ID: 875460965495484416

https://connect.tobiasmann.com · 15-06-2017 21:12:14

3.3K Tweets

1.1K Followers

623 Following

Tim Prickett Morgan (@tdaytonpm)

Andy Bechtolsheim, a guru of both systems and networking, outlines how we boost XPU performance for AI workloads by 100X by 2028 – and how interconnects can keep up. $ANET $NVDA $AMD $INTC nextplatform.com/2024/08/26/bec…

Tobias Mann (@tobias_writes)

There's a lotta hype around "AI PCs," but if you're looking to kick the tires on LLMs and diffusion models at home, #NPU TOPS aren't as important as you might think. In my latest for The Register, I dive into the specs that actually matter for local #AI: theregister.com/2024/08/25/ai_…

Tobias Mann (@tobias_writes)

As promised, my write-up on #LLM tool calling is now live over on The Register. In the guide, I explore giving LLMs access to:
- user data 🪪
- a calculator 🧮
- a clock ⏰
- weather APIs 🌅
- IT systems like #Proxmox 🖥️
theregister.com/2024/08/26/ai_…

Tobias Mann (@tobias_writes)

I knew that throughput scaled with batch size, but the fact that an RTX 3090 can serve a small #LLM to this many folks is just downright impressive and a huge boon for anyone prototyping an AI-enhanced app. theregister.com/2024/08/23/309…

Tobias Mann (@tobias_writes)

That problem server has been solid for 48 hours now. This weekend I'll start working backward to see what exactly the culprit was. I'm thinking it may have been the beta BIOS for the ASRock Rack X470D4U.

Tobias Mann (@tobias_writes)

Cerebras gives waferscale chips inferencing twist, claims 1,800 token per sec generation rates. Faster than you can read? More like blink and you'll miss the hallucination. My latest for The Register: theregister.com/2024/08/27/cer…

Tobias Mann (@tobias_writes)

Want to build a computer with more than a handful of GPUs? At the speeds modern interconnects operate, copper will only get you so far. In my latest for The Register, I dive into Broadcom's efforts to co-package optical engines with "GPUs." theregister.com/2024/08/28/bro…

Tobias Mann (@tobias_writes)

More vRAM more better, right? With this addition I'm up to 80GB in the rack. Now I gotta worry about tripping my 15-amp breaker. 😬 Note the 3090 Ti is power capped at 300W...

Tobias Mann (@tobias_writes)

Wild how spiky inference can be. This is for a five-question summarization query. Spikes straight to 100% utilization and 300W. Admittedly, this is for a batch size of 1.
