mzba (@limzba)'s Twitter Profile
mzba

@limzba

ID: 633814695

Joined: 12-07-2012 13:24:00

439 Tweets

335 Followers

140 Following

Georgi Gerganov (@ggerganov)'s Twitter Profile Photo

I just ran the gpt-oss eval suite with the large gpt-oss-120b on my M2 Ultra using vanilla llama.cpp and got the following scores:

- GPQA: 79.8%
- AIME25: 96.6%

These numbers are in line with those from various cloud providers:

Here are the steps:

github.com/ggml-org/llama…
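
The linked steps (URL truncated above) aren't reproduced here, but the usual pattern is to point an eval harness at llama.cpp's llama-server, which exposes an OpenAI-compatible /v1/chat/completions endpoint. A minimal sketch of the request body such a harness would send; the model name and prompt are placeholders, not taken from the linked instructions:

```python
import json

def build_chat_request(model: str, prompt: str, temperature: float = 0.0) -> str:
    """Build the JSON body for an OpenAI-compatible
    /v1/chat/completions endpoint (greedy decoding for evals)."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    })

# llama-server serves this endpoint on http://localhost:8080 by default.
body = build_chat_request("gpt-oss-120b", "What is 2 + 2?")
```
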
FFmpeg (@ffmpeg)'s Twitter Profile Photo

🚨 FFmpeg 8.0 has been released! 🚨 It has many new features and bugfixes such as APV and ProRes RAW decoding, numerous Vulkan encoders and decoders, VVC decoding features etc. We have also upgraded our project infrastructure. ffmpeg.org

mzba (@limzba)'s Twitter Profile Photo

I just canceled my Claude Max subscription. I'm not exactly sure what Anthropic did, but it's reached the point where Claude Code is actually causing more damage to my project than the benefit it used to provide.

Awni Hannun (@awnihannun)'s Twitter Profile Photo

I post about LLMs a lot, but MLX is much more than an LLM inference framework.

A good way to learn more about its many features is the intro video we made for WWDC 25:
N8 Programs (@n8programs)'s Twitter Profile Photo

Quick gist for evaluating perplexity that Ivan Fioravanti ᯅ often finds useful. Essentially just the mlx-lm eval_ppl util wrapped for command-line usage. gist.github.com/N8python/fe048…
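
For context, the perplexity such a utility reports is just the exponential of the mean per-token negative log-likelihood; a framework-free sketch of the computation (mlx-lm's eval_ppl computes this over real model logits, as I understand it):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(mean negative log-likelihood) over the
    per-token log-probabilities (natural log) a model assigns."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# A model that gives every token probability 0.25 has perplexity 4.
ppl = perplexity([math.log(0.25)] * 10)
```
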

mzba (@limzba)'s Twitter Profile Photo

Why does Qwen-Image need a negative prompt embedding to properly generate text in images? Why is even an empty negative prompt still required? Is this simply due to the way the model was trained? It really slows down the generation process, as we have to pass the prompt embedding
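
For what it's worth, the requirement is consistent with classifier-free guidance as commonly implemented in diffusion pipelines: every denoising step combines an unconditional (negative / empty-prompt) prediction with a conditional one, so the negative embedding is needed even when the prompt is empty. A minimal numeric sketch of the combination step, not Qwen-Image's actual code:

```python
def cfg_combine(uncond, cond, scale):
    """Classifier-free guidance: extrapolate from the unconditional
    (negative / empty-prompt) prediction toward the conditional one.
    Both predictions are needed at every denoising step, which is why
    even an empty negative prompt still costs an embedding pass."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]

# scale = 1.0 would recover the conditional prediction unchanged.
guided = cfg_combine([0.0, 1.0], [1.0, 1.0], 3.0)  # [3.0, 1.0]
```
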

mzba (@limzba)'s Twitter Profile Photo

I am convinced that Codex is better. Claude Code and DeepSeek V3.1 didn't get anywhere, wasting a day and $30 in tokens. Codex fixed the issue in 30 minutes. Next time, I'll try Ivan Fioravanti ᯅ's recommendations first to save a few bucks 😀.

Prince Canuma (@prince_canuma)'s Twitter Profile Photo

Introducing Marvis-TTS 🔥🚀 A new local-first TTS model Lucas Newman and I built for efficiency, accessibility, and real-time performance right on consumer devices like Apple Silicon, iPhones, iPads, and more. Traditional TTS models often demand full text inputs or sacrifice

mzba (@limzba)'s Twitter Profile Photo

This makes me think about how Stable Diffusion uses CLIP for the text encoder, Flux.1 uses CLIP and T5, and Qwen Image uses Qwen-VL as the text/image encoder. I'm wondering if image generation/editing models are now starting to incorporate more and more LLM architectures, which might

Awni Hannun (@awnihannun)'s Twitter Profile Photo

GPT-OSS uses MXFP4 quantization (which MLX now supports). 

There are two FP4 formats circulating right now: MXFP4 and NVFP4 (NV for Nvidia).

From looking at how GPT-OSS uses MXFP4, it is somewhat suboptimal. I'm thinking NVFP4 will be the more commonly used format in the
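
A hedged sketch of what MXFP4-style block quantization looks like, from my reading of the OCP MX spec (not llama.cpp's or MLX's actual kernels): 4-bit E2M1 element values sharing one power-of-two scale per block of 32, whereas NVFP4 uses an FP8 (E4M3) scale over blocks of 16.

```python
import math

FP4_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]  # E2M1 grid

def quantize_dequantize_block(block):
    """Round each value to the nearest representable E2M1 magnitude
    under a shared power-of-two scale, then map back to floats."""
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return [0.0] * len(block)
    # Smallest power-of-two scale with amax / scale <= 6 (E2M1 max)
    scale = 2.0 ** math.ceil(math.log2(amax / 6.0))
    out = []
    for x in block:
        mag = min(FP4_MAGNITUDES, key=lambda g: abs(g - abs(x) / scale))
        out.append(math.copysign(mag * scale, x))
    return out

print(quantize_dequantize_block([0.1, -0.4, 3.0, 6.0]))  # [0.0, -0.5, 3.0, 6.0]
```

The finer-grained, non-power-of-two scale is why NVFP4 is often argued to lose less accuracy per block, which seems to be the thrust of the (truncated) tweet above.
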
mzba (@limzba)'s Twitter Profile Photo

GLM 4.5 Air is the only model I can run locally that does a good job at coding. Please give it a try and support those OSS model providers :)