Stephen McConnachie (@mcnatch) Twitter Tweets • TwiCopy

Simon Willison

6 months ago

My LLM command-line tool and Python library now has support for tool calling! You can define tools as Python functions or bundle them in plugins, and LLM can then make them available to models. OpenAI, Anthropic, Gemini and Ollama are supported so far. simonwillison.net/2025/May/27/ll…

XiaomiMiMo

@xiaomimimo

6 months ago

Today, MiMo can see. We release MiMo-VL-7B-SFT and MiMo-VL-7B-RL, two powerful vision-language models delivering state-of-the-art performance in both general visual understanding and multimodal reasoning. MiMo-VL-7B-RL outperforms Qwen2.5-VL-7B on 35 out of 40 evaluated tasks,

Adina Yakup

@adinayakup

6 months ago

Video-XL-2 🔥 long video understanding model by BAAI and Shanghai Jiao Tong University huggingface.co/BAAI/Video-XL-2 ✨ Apache 2.0 ✨ Handles up to 10,000+ frames on a single GPU ✨ 2048-frame encoding in just 12s ✨ Efficient Chunk-based Prefilling & Bi-granularity KV decoding

ollama

@ollama

6 months ago

3 months ago, Stanford's Hazy Research lab introduced Minions, a project that connects Ollama to frontier cloud models to reduce cloud costs by 5-30x while achieving 98% of frontier model accuracy. Secure Minion turns an H100 into a secure enclave, where all memory and

merve

@mervenoyann

5 months ago

Past week was insanely packed for open AI! 😱 Luckily we picked some highlights for you ❤️ lfg! 💬 LLMs/VLMs > Deepseek 🐳 released DeepSeek-R1-0528, 38B model, only 0.2 and 1.4 points behind o3 in AIME 24/25 🤯 they also released an 8B distilled version based on Qwen3 (OS) >

merve

@mervenoyann

5 months ago

stop building parser pipelines 👋🏻 there's a new document parser that is small, fast, Apache 2.0 licensed and is better than all the other ones! 😱 MonkeyOCR is a 3B model that can parse everything (charts, formulas, tables etc) in a document 🤠

merve

@mervenoyann

3 months ago

we're all sleeping on this OCR model 🔥 dots.ocr is a new 3B model with sota performance, support for 100 languages & allowing commercial use! 🤯 single e2e model to extract image, convert tables, formula, and more into markdown 📝

Hynek Kydlíček

@hkydlicek

3 months ago

Chinese Instagram (RedNote) silently dropping the best VLM for general purpose OCR in just 1.7B size wasn't on my wish list 🎅

Teknium (e/λ)

@teknium1

3 months ago

All the details of OpenAI's new base model courtesy of HuggingFace update log. - Looks like NO base model (despite their oss model cookbook page saying it is) - 21B and 117B Total Param, 3.6B and 5.1B Active MoE Model sizes - Reasoning and Agentic capabilities - License: APACHE

Simon Willison

@simonw

3 months ago

The OpenAI open weight models just dropped - a 20B and a 120B, both under a proper open source Apache 2.0 license!

Etienne Bernard

@etiennebcp

3 months ago

We are releasing NuMarkdown-8B-Thinking, an open-source (MIT License) reasoning OCR model 🧠✨📄 NuMarkdown-8B-Thinking is apparently the first (!) reasoning VLM specialized in converting PDFs/Scans/Spreadsheets into Markdown files (typically used for RAG applications). It

Z.ai

@zai_org

3 months ago

Introducing GLM-4.5V: a breakthrough in open-source visual reasoning GLM-4.5V delivers state-of-the-art performance among open-source models in its size class, dominating across 41 benchmarks. Built on the GLM-4.5-Air base model, GLM-4.5V inherits proven techniques from

Georgi Gerganov

@ggerganov

3 months ago

whisper.cpp is coming to ffmpeg github.com/FFmpeg/FFmpeg/…

Xenova

@xenovacom

3 months ago

Google just released their smallest Gemma model ever: Gemma 3 270M! 🤯 🤏 Highly compact & efficient 🤖 Strong instruction-following capabilities 🔧 Perfect candidate for fine-tuning It's so tiny that it can even run 100% locally in your browser with Transformers.js! 🤗

merve

@mervenoyann

3 months ago

Meta released DINOv3 🔥 > 12 sota image models (ConvNeXT and ViT) in various sizes, trained on web and satellite data! > use for anything: image classification to segmentation, depth or even video tracking 🤯 > day-0 support from transformers 🤗 > allows commercial use! 😍

Xenova

@xenovacom

3 months ago

Simon Willison > I imagine this model will be particularly fun to play with directly in a browser using transformers.js. I built a fun little bedtime story generator with it 🤗

steven

@tu7uruu

3 months ago

HUGE RELEASE! Nvidia just dropped: > Granary: the largest open-source speech dataset for European languages 🗣️🇪🇺 > Canary-1b-v2: 25 languages, ASR + En↔X translation > Parakeet-tdt-0.6b-v3: SOTA multilingual ASR You can now train your ASR model to understand European

Piotr Żelasko

@piotrzelasko

3 months ago

You asked for it, and we listened. MULTILINGUAL Canary v2 and Parakeet v3!! 🌏 25 European languages 🏆 SotA on Multilingual Open ASR Leaderboard 🔥 600x and 2000x faster than real-time 🕰️ Timestamps! 🗣️ Speech translation (Canary) 🃏 Granary: all data is open, train it yourself!

FFmpeg

@ffmpeg

3 months ago

🚨 FFmpeg 8.0 has been released! 🚨 It has many new features and bugfixes such as APV and ProRes RAW decoding, numerous Vulkan encoders and decoders, VVC decoding features etc. We have also upgraded our project infrastructure. ffmpeg.org

OpenBMB

@openbmb

3 months ago

🚀 Introducing MiniCPM-V 4.5 8B: pushing the boundary of multimodal AI! ～ SOTA VL Capability: Surpasses GPT-4o, Gemini 2.0 Pro, Qwen2.5-VL 72B on OpenCompass! ～ "Eagle Eye" Video: 96x visual token compression for high refresh rate and long video understanding ～ Controllable
