mzba (@limzba)'s Twitter Profile
mzba

@limzba

ID: 633814695

Joined: 12-07-2012 13:24:00

439 Tweets

335 Followers

140 Following

Georgi Gerganov (@ggerganov)'s Twitter Profile Photo

I just ran the gpt-oss eval suite with the large gpt-oss-120b on my M2 Ultra using vanilla llama.cpp and got the following scores:

- GPQA: 79.8%
- AIME25: 96.6%

These numbers are in line with those from various cloud providers:

Here are the steps:

github.com/ggml-org/llama…
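
The linked steps (URL truncated above) aren't reproduced here, but the usual pattern is to point an eval harness at llama.cpp's llama-server, which exposes an OpenAI-compatible /v1/chat/completions endpoint. A minimal sketch of the request body such a harness would send; the model name and prompt are placeholders, not taken from the linked instructions:

```python
import json

def build_chat_request(model: str, prompt: str, temperature: float = 0.0) -> str:
    """Build the JSON body for an OpenAI-compatible
    /v1/chat/completions endpoint (greedy decoding for evals)."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    })

# llama-server serves this endpoint on http://localhost:8080 by default.
body = build_chat_request("gpt-oss-120b", "What is 2 + 2?")
```
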
FFmpeg (@ffmpeg)'s Twitter Profile Photo

🚨 FFmpeg 8.0 has been released! 🚨 It has many new features and bugfixes such as APV and ProRes RAW decoding, numerous Vulkan encoders and decoders, VVC decoding features etc. We have also upgraded our project infrastructure. ffmpeg.org

mzba (@limzba)'s Twitter Profile Photo

I just canceled my Claude Max subscription. I'm not exactly sure what Anthropic did, but it's reached the point where Claude Code is actually causing more damage to my project than the benefit it used to provide.

Awni Hannun (@awnihannun)'s Twitter Profile Photo

I post about LLMs a lot, but MLX is much more than an LLM inference framework.

A good way to learn more about its many features is the intro video we made for WWDC 25:
N8 Programs (@n8programs)'s Twitter Profile Photo

Quick gist for evaluating perplexity that Ivan Fioravanti ᯅ often finds useful. Essentially just the mlx-lm eval_ppl util wrapped for command-line usage. gist.github.com/N8python/fe048…
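
For context, the perplexity such a utility reports is just the exponential of the mean per-token negative log-likelihood; a framework-free sketch of the computation (mlx-lm's eval_ppl computes this over real model logits, as I understand it):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(mean negative log-likelihood) over the
    per-token log-probabilities (natural log) a model assigns."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# A model that gives every token probability 0.25 has perplexity 4.
ppl = perplexity([math.log(0.25)] * 10)
```
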

mzba (@limzba)'s Twitter Profile Photo

Why does Qwen-Image need a negative prompt embedding to properly generate text in images? Why is even an empty negative prompt still required? Is this simply due to the way the model was trained? It really slows down the generation process, as we have to pass the prompt embedding
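
For what it's worth, the requirement is consistent with classifier-free guidance as commonly implemented in diffusion pipelines: every denoising step combines an unconditional (negative / empty-prompt) prediction with a conditional one, so the negative embedding is needed even when the prompt is empty. A minimal numeric sketch of the combination step, not Qwen-Image's actual code:

```python
def cfg_combine(uncond, cond, scale):
    """Classifier-free guidance: extrapolate from the unconditional
    (negative / empty-prompt) prediction toward the conditional one.
    Both predictions are needed at every denoising step, which is why
    even an empty negative prompt still costs an embedding pass."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]

# scale = 1.0 would recover the conditional prediction unchanged.
guided = cfg_combine([0.0, 1.0], [1.0, 1.0], 3.0)  # [3.0, 1.0]
```
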

mzba (@limzba)'s Twitter Profile Photo

I am convinced that Codex is better. Claude Code and DeepSeek V3.1 didn't get anywhere, wasting a day and $30 in tokens. Codex fixed the issue in 30 minutes. Next time, I'll try Ivan Fioravanti ᯅ's recommendations first to save a few bucks 😀.

Prince Canuma (@prince_canuma)'s Twitter Profile Photo

Introducing Marvis-TTS 🔥🚀 A new local-first TTS model Lucas Newman and I built for efficiency, accessibility, and real-time performance right on consumer devices like Apple Silicon, iPhones, iPads, and more. Traditional TTS models often demand full text inputs or sacrifice

mzba (@limzba)'s Twitter Profile Photo

This makes me think about how Stable Diffusion uses CLIP for the text encoder, Flux.1 uses CLIP and T5, and Qwen Image uses Qwen-VL as the text/image encoder. I'm wondering if image generation/editing models are now starting to incorporate more and more LLM architectures, which might

Awni Hannun (@awnihannun)'s Twitter Profile Photo

GPT-OSS uses MXFP4 quantization (which MLX now supports). 

There are two FP4 formats circulating right now: MXFP4 and NVFP4 (NV for Nvidia).

From looking at how GPT-OSS uses MXFP4, it is somewhat suboptimal. I'm thinking NVFP4 will be the more commonly used format in the
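
A hedged sketch of what MXFP4-style block quantization looks like, from my reading of the OCP MX spec (not llama.cpp's or MLX's actual kernels): 4-bit E2M1 element values sharing one power-of-two scale per block of 32, whereas NVFP4 uses an FP8 (E4M3) scale over blocks of 16.

```python
import math

FP4_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]  # E2M1 grid

def quantize_dequantize_block(block):
    """Round each value to the nearest representable E2M1 magnitude
    under a shared power-of-two scale, then map back to floats."""
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return [0.0] * len(block)
    # Smallest power-of-two scale with amax / scale <= 6 (E2M1 max)
    scale = 2.0 ** math.ceil(math.log2(amax / 6.0))
    out = []
    for x in block:
        mag = min(FP4_MAGNITUDES, key=lambda g: abs(g - abs(x) / scale))
        out.append(math.copysign(mag * scale, x))
    return out

print(quantize_dequantize_block([0.1, -0.4, 3.0, 6.0]))  # [0.0, -0.5, 3.0, 6.0]
```

The finer-grained, non-power-of-two scale is why NVFP4 is often argued to lose less accuracy per block, which seems to be the thrust of the (truncated) tweet above.
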
mzba (@limzba)'s Twitter Profile Photo

GLM 4.5 Air is the only model I can run locally that does a good job at coding. Please give it a try and support those OSS model providers :)