Matt | @mattstaff@hachyderm.io (@mattstaff) 's Twitter Profile
Matt | @mattstaff@hachyderm.io

@mattstaff

Person.
Migrating to Mastodon
@mattstaff@hachyderm.io

ID: 84802491

Link: http://www.pluralsight.com · Joined: 24-10-2009 08:54:53

3.3K Tweets

176 Followers

961 Following

Robin (@solarise_webdev) 's Twitter Profile Photo

I know you're all getting mighty tired of seeing typography on your timeline today! But here's a pretext.js demo that (hopefully) isn't a crime against justification and indentation.

Prince Canuma (@prince_canuma) 's Twitter Profile Photo

John is killing it! He has been pushing my TurboQuant MLX implementation to the max. In the video below he uses it to process 265K of context using Qwen3.5-4B 🚀 It achieves 500ms TTFT and a reasonable 40 tok/s.
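The two figures quoted above (500ms time-to-first-token, 40 tok/s decode) combine in the usual way: total latency is roughly TTFT plus remaining tokens divided by decode speed. A minimal sketch of that back-of-the-envelope model, using the tweet's numbers:

```python
# Rough latency model for the figures quoted above: 500 ms TTFT,
# then steady decoding at 40 tokens/s. The formula is generic;
# only the two parameter defaults come from the tweet.

def total_latency_s(n_tokens: int, ttft_s: float = 0.5, tok_per_s: float = 40.0) -> float:
    """Seconds to produce n_tokens: first-token latency plus steady decode."""
    return ttft_s + (n_tokens - 1) / tok_per_s

print(round(total_latency_s(1000), 2))  # a 1000-token reply takes ~25.48 s
```

So at these rates a long, 1000-token answer lands in about half a minute, with the 265K-token prompt ingestion reflected only in the TTFT term.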

Prince Canuma (@prince_canuma) 's Twitter Profile Photo

mlx-vlm v0.4.3 is here 🚀

Day-0 support:
🔥 Gemma 4 (vision, audio, MoE) by Google DeepMind
🦅 Falcon-OCR + Falcon Perception by Technology Innovation Institute
🪨 Granite Vision 4.0 by IBM Research

New models: 
🎯 SAM 3.1 with Object Multiplex by Facebook
🔍 RF-DETR detection & segmentation by
Prince Canuma (@prince_canuma) 's Twitter Profile Photo

Gemma 4 26B-A4B is now ~2x faster at 375K context with TurboQuant on MLX-VLM v0.4.4 🚀

The model's official max context is 262K but I pushed it to 375K anyway. That's roughly 5–6 full novels (the entire LOTR trilogy + The Hobbit).

Up to ~20K tokens they're neck and neck, but
exQUIZitely.com (@exquizitelycom) 's Twitter Profile Photo

An average picture that you save on your phone or PC is around 400 kilobytes. It doesn't do anything; it's just a static image. Now divide that by a factor of 10 and you drop to 40 kilobytes. That's the size of The Last Ninja, developed by System 3 and published in
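The comparison above is a one-line division, sketched here with the tweet's own numbers (the 400 KB photo and the factor of 10 are the tweet's figures, not measurements):

```python
# Size comparison from the tweet: an average saved photo (~400 KB)
# vs. the full Last Ninja game binary, one tenth of that.

photo_kb = 400
game_kb = photo_kb // 10  # "divide that by a factor of 10"

print(game_kb)  # 40 — the whole game fits in 40 KB
```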

Prince Canuma (@prince_canuma) 's Twitter Profile Photo

TurboQuant: Open Evals on MLX 🔥

Yesterday I launched mlx-vlm v0.4.4 with major TurboQuant performance improvements.

Today, the open benchmark results on MM-NIAH (val, 520 samples) using Gemma 4 26B IT by Google DeepMind on M3 Ultra:

→ 0 quality loss — 78% accuracy for both
Prince Canuma (@prince_canuma) 's Twitter Profile Photo

Alongside MM-NIAH I’m also running LongBench-V2 to truly showcase where TurboQuant shines: large context (above 60K).

The run will take around 24h to complete. Meanwhile, here is a sneak peek of 6 samples across different context sizes.

See you in a day or two 🫡
Bellolandia Studio (@bellolandia) 's Twitter Profile Photo

After a long stretch of intensive work, we have the cathartic pleasure of sharing this preview of the first episode of Electro Andes, a South American anime series of mythological cyberpunk set in a fantastical version of the Andes. Over the coming weeks we will