t0mcruzz (@getvalver)'s Twitter Profile
t0mcruzz

@getvalver

sr. software engineer now
hw engineer past

ID: 183164413

Joined: 26-08-2010 09:03:36

1.1K Tweets

64 Followers

79 Following

John Carmack (@id_aa_carmack)

In the 90s, there were a dozen companies making graphics accelerators, and Nvidia wasn’t initially a clear winner. Their first product was terrible, and 3DFX, 3DLabs, Rendition, and others all had important pieces of the puzzle earlier. However, they relentlessly improved and

абстрактный мужик (@abstract_artem)

Story time, prompted by the insane xz vulnerability introduced through modified build scripts. Anyway, around 2017, while poking at JVM build systems for work (Gradle, Buck, Bazel, Maven), it dawned on me that Gradle differs from all the others in how it hooks up Java annotation processors, which is an API

Andrej Karpathy (@karpathy)

Congrats to AI at Meta on Llama 3 release!! 🎉 ai.meta.com/blog/meta-llam… Notes: Releasing 8B and 70B (both base and finetuned) models, strong-performing in their model class (but we'll see when the rankings come in @ LMSYS Org :)) 400B is still training, but already encroaching

Sebastian Raschka (@rasbt)

If you are looking for something to code & read this weekend, I uploaded a notebook to finetune a small GPT model to classify SPAM messages with ~96% accuracy: github.com/rasbt/LLMs-fro… (Fun fact: it's small enough to train it on your laptop; ~5 min on my M3 MacBook Air!)

Vrushank Desai (@vrushankdes)

I spent a couple of months at the beginning of this year learning about GPU programming by trying to optimize inference for Cheng Chi's awesome Diffusion Policy paper. I was able to improve inference time for the denoising U-Net by ~3.4x over PyTorch eager mode and ~2.65x over

the tiny corp (@__tinygrad__)

.AMD @amdradeon released some MES documentation today! (it's on GPUOpen) A good start, but we are bypassing the MES now in our "AMD" backend. We are even bypassing most of the MEC. Can you document the PM4 packets and what happens after you poke COMPUTE_DISPATCH_INITIATOR?

Tom Yeh (@proftomyeh)

Transformer by Hand✍️ To study the transformer architecture, it is like opening up the hood of a car and seeing all sorts of engine parts: embeddings, positional encoding, feed-forward network, attention weighting, self-attention, cross-attention, multi-head attention, layer

Tom Yeh (@proftomyeh)

llm.c by Hand✍️ C programming + matrix multiplication by hand. This combination is perhaps as low as we can get to explain how the Transformer works. Special thanks to Andrej Karpathy for encouraging early feedback and tetsuo - cRc for helping me understand the pragma magic. I hope

Tagir Valeev (@tagir_valeev)

Why it's good to have kids. 1. You hug your kid, and they're warm. 2. You can take your kid to the playground, spin on the merry-go-round, and go down the slide. 3. There's always someone at home to play board games with; no need to invite anyone. 4. They're funny.