Jingwei Zuo (@jingweizuo) 's Twitter Profile
Jingwei Zuo

@jingweizuo

Lead Researcher @tiiuae, Falcon LLM team huggingface.co/tiiuae

ID: 1065296158669582339

Link: https://jingweizuo.com · Joined: 21-11-2018 17:29:31

46 Tweets

61 Followers

77 Following

Hongyu Wang (@realhongyu_wang) 's Twitter Profile Photo

We just released the fine-tuning code and fine-tuned models of BitVLA on Hugging Face🔥🔥 Enjoy these hyper-efficient 1-bit VLA models! Code: github.com/ustcwhy/BitVLA Models: huggingface.co/collections/ho…

Jingwei Zuo (@jingweizuo) 's Twitter Profile Photo

We’re organizing the E2LM competition at #NeurIPS2025, focused on early-stage training evaluations of Large Language Models. Registration is now open — join us and help revolutionize how we evaluate LLMs! 🚀

Technology Innovation Institute (@tiiuae) 's Twitter Profile Photo

🚀 Exciting news! Falcon-H1 & Falcon-E are now on Oumi — the open-source platform for training, fine-tuning (SFT, LoRA, QLoRA), and deploying LLMs anywhere: laptops, cloud, or clusters. Start building: github.com/oumi-ai/oumi/t… #FalconH1 #FalconE #OpenSourceAI #LLM

🚀 Exciting news! Falcon-H1 & Falcon-E are now on Oumi — the open-source platform for training, fine-tuning (SFT, LoRA, QLoRA), and deploying LLMs anywhere: laptops, cloud, or clusters.

Start building: github.com/oumi-ai/oumi/t…

#FalconH1 #FalconE #OpenSourceAI #LLM
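
For reference, a minimal sketch of what a fine-tuning run through Oumi's CLI typically looks like; the YAML config path below is a placeholder for illustration, not the actual Falcon recipe behind the truncated link above:

# Install Oumi and launch an SFT run from a YAML recipe (config path hypothetical)
pip install oumi
oumi train -c configs/recipes/falcon_h1/sft/my_train.yaml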
Awni Hannun (@awnihannun) 's Twitter Profile Photo

Latest mlx-lm is out!

pip install -U mlx-lm

Bunch of new models:
- SmolLM3 (Hugging Face)
- Ernie family (Baidu)
- BitNet (Microsoft)
- Falcon-E (TII)
- Text-only Gemma3n (Google)
- MiniCPM4 (OpenBMB)
- AFM (Apple)

+Performance improvements for DWQ, dynamic quantization, and
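
To try one of the newly supported models from the command line, the same pattern shown in the next tweet applies; the exact Hugging Face repo id here is an assumption, adjust to the checkpoint you want:

# Update mlx-lm and run a quick generation (repo id assumed)
pip install -U mlx-lm
mlx_lm.generate --model HuggingFaceTB/SmolLM3-3B --prompt "Explain KV caching in one paragraph" --max-tokens 100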
younes (@younesbelkada) 's Twitter Profile Photo

Excited to have contributed into Falcon-E (Bitnet) integration with <a href="/Prince_Canuma/">Prince Canuma</a> <a href="/awnihannun/">Awni Hannun</a>  in mlx-lm
Falcon-E now fully supported in mlx-lm - as simple as `mlx_lm.generate --model tiiuae/Falcon-E-1B-Instruct --prompt "Implement bubble sort" --max-tokens 100 --temp 0.1` 🚀
Technology Innovation Institute (@tiiuae) 's Twitter Profile Photo

Falcon-H1 now runs natively on your device via llama.cpp—0.5B to 34B models, no server needed. Fast inference, long context, multilingual, tool-ready. Build, test, and go beyond. #FalconH1 #LocalLLM #AIOnDevice #EdgeAI #OpenSourceAI
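
A rough sketch of running a Falcon-H1 GGUF checkpoint locally with llama.cpp's llama-cli; the GGUF filename and quantization level are assumptions, substitute the actual converted file:

# Run a local GGUF checkpoint with llama-cli (filename/quant assumed)
llama-cli -m Falcon-H1-1.5B-Instruct-Q4_K_M.gguf -p "Write a haiku about hybrid attention" -n 128 -c 4096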

Rosinality (@rosinality) 's Twitter Profile Photo

Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance

Falcon's side-by-side attention-SSM hybrid model. Very detailed, from tokenizers to data preparation and optimization strategies.
Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) (@teortaxestex) 's Twitter Profile Photo

Falcon-H1 is a very dense research paper exploring the space of hybrid attention designs and tuning *every* hyperparameter there is. It's more interesting than models themselves. If you were intrigued by that «AlphaGo move» slop, this is the real thing.
Jingwei Zuo (@jingweizuo) 's Twitter Profile Photo

It’s concerning how recent model releases flex on a handful of benchmarks - but miss the bigger picture: world knowledge, nuance, common sense. Are we still building foundation models, or just performance models? VIBE CHECK: that’s all you need.

Si-ze Zheng (@deeplyignorant) 's Twitter Profile Photo

🎉 Excited to share: We’ve open-sourced Triton-distributed MegaKernel! A fresh, powerful take on MegaKernel for LLMs—built entirely on our Triton-distributed framework.
github.com/ByteDance-Seed…

Why it’s awesome:
🧩 Super programmable
⚡ Blazing performance
📊 Rock-solid precision
vLLM (@vllm_project) 's Twitter Profile Photo

🚀 Amazing community project!

vLLM CLI — a command-line tool for serving LLMs with vLLM:
✅ Interactive menu-driven UI &amp; scripting-friendly CLI
✅ Local + HuggingFace Hub model management
✅ Config profiles for perf/memory tuning
✅ Real-time server &amp; GPU monitoring
✅ Error
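
A tool like this manages vLLM servers for you; under the hood a deployment still boils down to vLLM's standard serve entrypoint, roughly as sketched below (the model id and flag values are illustrative, not taken from the project):

# Standard vLLM OpenAI-compatible server (model id and flags illustrative)
pip install vllm
vllm serve tiiuae/Falcon-H1-1.5B-Instruct --port 8000 --max-model-len 8192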
Ai2 (@allen_ai) 's Twitter Profile Photo

📢 New paper from Ai2: Signal & Noise asks a simple question—can language model benchmarks detect a true difference in model performance? 🧵

📢 New paper from Ai2: Signal &amp; Noise asks a simple question—can language model benchmarks detect a true difference in model performance? 🧵