🫡 *in monk mode* (@eric_ruleman)'s Twitter Profile
🫡 *in monk mode*

@eric_ruleman

ai & acrobatics

ID: 744967929045803008

Link: http://acrofestivals.org · Joined: 20-06-2016 18:59:30

13.13K Tweets

1.1K Followers

4.4K Following

🫡 *in monk mode* (@eric_ruleman):

Training-time compute can be amortized across all future inference calls, but inference-time compute is only valuable for one user. Inference-time compute will therefore dramatically increase compute demands.
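A back-of-the-envelope sketch of that amortization argument. All numbers and names below are made-up placeholders, not estimates for any real model:

```python
# Rough illustration of the amortization asymmetry.
# TRAIN_FLOPS, INFER_FLOPS_PER_QUERY, and REASONING_MULTIPLIER are
# hypothetical placeholders, not measurements of any real system.

TRAIN_FLOPS = 1e24            # one-time training cost, paid once
INFER_FLOPS_PER_QUERY = 1e12  # baseline per-query inference cost
REASONING_MULTIPLIER = 50     # extra "thinking" tokens spent at inference time

def cost_per_query(num_queries: int, reasoning: bool = False) -> float:
    """Total compute attributable to a single query."""
    amortized_training = TRAIN_FLOPS / num_queries   # shrinks as usage grows
    inference = INFER_FLOPS_PER_QUERY * (REASONING_MULTIPLIER if reasoning else 1)
    return amortized_training + inference             # inference part never amortizes

for n in (1e6, 1e9, 1e12):
    base = cost_per_query(int(n))
    reasoned = cost_per_query(int(n), reasoning=True)
    print(f"{n:.0e} queries: baseline {base:.2e} FLOPs/query, "
          f"with inference-time reasoning {reasoned:.2e} FLOPs/query")
```

As the query count grows, the training term vanishes from the per-query cost while the inference-time reasoning term stays fixed, which is the point of the tweet.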

🫡 *in monk mode* (@eric_ruleman):

Playing with the new Qwen QVQ "visual reasoning" model on Hugging Face. It comes back blank for Tank Man, Mao Zedong, Xi Jinping, and Jack Ma. It will identify Yao Ming, though! huggingface.co/spaces/Qwen/QV…

🫡 *in monk mode* (@eric_ruleman):

DeepSeek greatly decreased training costs by reducing attention from O(n^2) to O(n) via Latent Attention. Instead of computing every pairwise attention score, they project the tokens into a latent space of length L, giving an O(n*L) forward pass.
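A minimal toy sketch of the idea as described above: each of the n tokens attends over L latent summaries of the sequence instead of over all n tokens, so the score matrix is n x L rather than n x n. This is a Linformer/Perceiver-style illustration of the O(n*L) claim, not DeepSeek's actual Multi-head Latent Attention implementation; the shapes, projection, and names are assumptions made for the example.

```python
import numpy as np

# Toy latent-attention sketch (illustrative only).
# Full self-attention compares every token with every token: O(n^2 * d).
# Here each token attends over L latent vectors, so cost is O(n * L * d).

rng = np.random.default_rng(0)
n, d, L = 1024, 64, 16           # sequence length, model dim, number of latents

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

tokens = rng.normal(size=(n, d))        # token representations
W_latent = rng.normal(size=(n, L)) / n  # hypothetical learned projection to L latents

# Compress the sequence into L latent vectors (L x d instead of n x d).
latents = W_latent.T @ tokens

Q = tokens          # queries: one per token
K = V = latents     # keys/values: one per latent

scores = Q @ K.T / np.sqrt(d)   # n x L score matrix, not n x n
out = softmax(scores) @ V       # n x d output

print(out.shape)     # (1024, 64)
print(scores.shape)  # (1024, 16) -> O(n*L) scores instead of O(n^2)
```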