Mark Kurtz (@markurtz_)'s Twitter Profile
Mark Kurtz

@markurtz_

AI @ @RedHat, former CTO @ @NeuralMagic (acquired), active ML researcher and contributor

ID: 855942805956554752

Joined: 23-04-2017 00:34:02

179 Tweets

356 Followers

95 Following

Mark Kurtz (@markurtz_)'s Twitter Profile Photo

5 years ago, I started an internal research project called neuralmagicML. Through a ton of engineering and hard work at @neuralmagic, those humble beginnings have evolved into a fantastic open-source framework within vLLM! I'm incredibly excited for the future of LLM Compressor!

Mark Kurtz (@markurtz_)'s Twitter Profile Photo

🌟 New Featured Community Model: FP8 Llama-3.1-Storm-8B! 🌟

Check out the model created with LLM Compressor in our HF collection. 3X faster with better accuracy than the base Llama 3.1 8B model!

Collection: huggingface.co/collections/ne…
LLM Compressor: github.com/vllm-project/l…
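
A checkpoint like this drops straight into vLLM. A minimal sketch, assuming the FP8 model ID from the collection above (swap in the exact name from the HF page):

```python
from vllm import LLM, SamplingParams

# Model ID is illustrative; use the exact checkpoint name from the collection.
llm = LLM(model="neuralmagic/Llama-3.1-Storm-8B-FP8-dynamic")

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Explain FP8 quantization in one paragraph."], params)
print(outputs[0].outputs[0].text)
```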
Red Hat AI (@redhat_ai)'s Twitter Profile Photo

Mark Kurtz, CTO of Neural Magic, shares insights on building a career in AI, from hands-on expertise to understanding AI’s limitations. He emphasizes mentorship and open-source projects in advancing education. Read his tips for impactful AI solutions!👇 medium.com/authority-maga…

Santiago (@svpino)'s Twitter Profile Photo

You can now optimize and make any open-source LLM faster:

1. pip install llmcompressor
2. apply quantization with 1 line of code

Two benefits:

1. Your LLM will run faster at inference time.
2. You will save a ton of money on hardware

Here are a couple of examples:

•
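
A minimal sketch of that flow, assuming LLM Compressor's oneshot API and an FP8 dynamic scheme (model ID and output directory are illustrative):

```python
# pip install llmcompressor
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import QuantizationModifier

# FP8 dynamic quantization needs no calibration data; "Linear" targets all
# linear layers, and lm_head is skipped to preserve output quality.
recipe = QuantizationModifier(targets="Linear", scheme="FP8_DYNAMIC", ignore=["lm_head"])

oneshot(
    model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",  # illustrative model ID
    recipe=recipe,
    output_dir="TinyLlama-1.1B-Chat-v1.0-FP8-dynamic",
)
```

The saved directory can then be served directly with vLLM, which is where the inference speedup and hardware savings show up.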
Mark Kurtz (@markurtz_)'s Twitter Profile Photo

It was a privilege to sit down with Chris Brandt from FUTR.tv and explore the state of AI today. We covered everything from the promise of smaller, specialized models to the real risks enterprises face when adopting AI.

Key points in the podcast:

1️⃣ Why is AI

Mark Kurtz (@markurtz_)'s Twitter Profile Photo

To paraphrase a popular post from Jasper (@zjasper666) on this platform: give me int4 or give me death!

In all seriousness, there is no reason why people should waste power, energy, and performance deploying 16-bit LLMs. These models are enormous, and with that, there's a ton of
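
For the int4 case specifically, a weight-only 4-bit pass can be sketched with LLM Compressor's GPTQ modifier (model and calibration dataset IDs are illustrative, not a prescribed setup):

```python
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import GPTQModifier

# W4A16: 4-bit weights, 16-bit activations. GPTQ uses a small calibration
# set to choose quantization scales that minimize layer-wise error.
oneshot(
    model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",  # illustrative
    dataset="open_platypus",                     # illustrative calibration set
    recipe=GPTQModifier(targets="Linear", scheme="W4A16", ignore=["lm_head"]),
    max_seq_length=2048,
    num_calibration_samples=512,
)
```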
Mark Kurtz (@markurtz_)'s Twitter Profile Photo

The results are in: Trade-offs between accuracy & performance in LLM quantization

After hundreds of thousands of evals and benchmarks from our research team at @neuralmagic, I'm excited to share our findings on LLM quantization—now available as a paper on arXiv:
brian stevens (@addvin)'s Twitter Profile Photo

I’m thrilled to announce that Neural Magic has signed a definitive agreement to join forces with Red Hat, Inc.

At Neural Magic our vision is that the future of AI is open, and we have been on a mission to enable enterprises to capture the powerful innovation from AI, while at
jamiegoldstein (@jamieagoldstein)'s Twitter Profile Photo

In 2018, we raised a glass to toast the start of our partnership with @neuralmagic. Nearly seven years later, we raised another to mark the successful sale of the company to Red Hat - a well-deserved milestone for an incredible team and groundbreaking technology.

Mark Kurtz (@markurtz_)'s Twitter Profile Photo

I’m a bit late in sharing this, but at the start of the year, I embarked on an exciting new journey—joining Red Hat following its acquisition of @neuralmagic, where I had the privilege of serving as CTO.

While waiting for the deal to finalize during a holiday trip, I saw this
Mark Kurtz (@markurtz_)'s Twitter Profile Photo

To quote a recent tweet: vLLM is the de facto standard open-source LLM inference engine for large-scale, enterprise-grade production deployments.

Red Hat AI (@redhat_ai)'s Twitter Profile Photo

Llama 4 Herd is here! 

It brings a lot of goodies, like MoE architecture and native multimodality, enabling developers to build personalized multimodal experiences.

With Day 0 support in vLLM, you can deploy Llama 4 with vLLM now!

Let's dig into it.

(a thread)
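
As a sketch of what Day 0 support means in practice, a Llama 4 checkpoint loads like any other vLLM model (the model ID is real, but the parallelism and context settings below are assumptions to size to your hardware):

```python
from vllm import LLM, SamplingParams

# Llama 4 checkpoints are large; tensor_parallel_size should match the
# number of GPUs on the node, and max_model_len bounds memory use.
llm = LLM(
    model="meta-llama/Llama-4-Scout-17B-16E-Instruct",
    tensor_parallel_size=8,
    max_model_len=8192,
)

outputs = llm.generate(
    ["Summarize the benefits of a mixture-of-experts architecture."],
    SamplingParams(temperature=0.7, max_tokens=200),
)
print(outputs[0].outputs[0].text)
```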
Red Hat AI (@redhat_ai)'s Twitter Profile Photo

Join us this Friday for Random Samples, a weekly AI talk series from the Red Hat AI Innovation Team.

Topic: The State of LLM Compression — From Research to Production

We’ll explore quantization, sparsity, academic vs. real-world benchmarks, and more.

Join details in comments 👇
Red Hat AI (@redhat_ai)'s Twitter Profile Photo

Missed the session? No worries!

Watch the recording on YouTube: youtube.com/live/T8XDkZuv7…
View the slides: docs.google.com/presentation/d…

Questions? Drop them in the comments and Mark Kurtz will get back to you.

Red Hat AI (@redhat_ai)'s Twitter Profile Photo

🚨 Introducing the Axolotl-LLM Compressor integration, designed to make fine-tuning sparse models easier and more efficient than ever! Now you can fine-tune sparse models for specific data while preserving their sparse structure and recovering any accuracy lost during pruning.
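
The underlying idea of sparsity-preserving fine-tuning can be sketched in plain PyTorch (this illustrates the technique, not the actual Axolotl integration API): capture each pruned weight's zero mask once, then re-apply it after every optimizer step so updates never repopulate pruned positions.

```python
import torch

def capture_sparsity_masks(model: torch.nn.Module) -> dict:
    # Record which weight entries are currently zero (the pruned positions).
    return {
        name: param.detach() != 0
        for name, param in model.named_parameters()
        if name.endswith("weight")
    }

@torch.no_grad()
def reapply_sparsity_masks(model: torch.nn.Module, masks: dict) -> None:
    # Zero any pruned positions the optimizer update repopulated.
    for name, param in model.named_parameters():
        if name in masks:
            param.mul_(masks[name].to(param.dtype))

# In the training loop: masks = capture_sparsity_masks(model) once, then
# after each optimizer.step(), call reapply_sparsity_masks(model, masks).
```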
Eldar Kurtic (@_eldarkurtic)'s Twitter Profile Photo

Our flagship paper on how far careful quantization can really go in practice got accepted as an oral at ACL 2025 (top 8%)! 🥳

Turns out, old-school methods like GPTQ, SmoothQuant, and RTN are quite good when tuned properly. 
All of the tricks are already in LLM-Compressor!
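
Those methods compose as a single recipe in LLM Compressor. A minimal W8A8 sketch along the lines of the project's examples (model and dataset IDs are illustrative):

```python
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import GPTQModifier
from llmcompressor.modifiers.smoothquant import SmoothQuantModifier

# SmoothQuant migrates activation outliers into the weights, then GPTQ
# quantizes both weights and activations to 8-bit (W8A8).
recipe = [
    SmoothQuantModifier(smoothing_strength=0.8),
    GPTQModifier(targets="Linear", scheme="W8A8", ignore=["lm_head"]),
]

oneshot(
    model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",  # illustrative
    dataset="open_platypus",                     # illustrative calibration set
    recipe=recipe,
    max_seq_length=2048,
    num_calibration_samples=512,
)
```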
Red Hat AI (@redhat_ai)'s Twitter Profile Photo

🚀 Thrilled to announce GuideLLM v0.3.0!

Highlights include a brand-new Web UI, containerized benchmarking, and powerful dataset preprocessing.

GuideLLM GitHub: github.com/vllm-project/g…

(Thread 👇)