Mark Kurtz (@markurtz_)'s Twitter Profile
Mark Kurtz

@markurtz_

AI @ @RedHat, former CTO @ @NeuralMagic (acquired), active ML researcher and contributor

ID: 855942805956554752

Joined: 23-04-2017 00:34:02

179 Tweets

356 Followers

95 Following

Mark Kurtz (@markurtz_)'s Twitter Profile Photo

5 years ago, I started an internal research project called neuralmagicML. Through a ton of engineering and hard work at @neuralmagic, those humble beginnings have evolved into a fantastic open-source framework within vLLM! I'm incredibly excited for the future of LLM Compressor!

Mark Kurtz (@markurtz_)'s Twitter Profile Photo

🌟 New Featured Community Model: FP8 Llama-3.1-Storm-8B! 🌟

Check out the model created with LLM Compressor in our HF collection. 3X faster with better accuracy than the base Llama 3.1 8B model!

Collection: huggingface.co/collections/ne…
LLM Compressor: github.com/vllm-project/l…
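
A checkpoint like this drops straight into vLLM. A minimal sketch, assuming the FP8 model ID from the collection above (swap in the exact name from the HF page):

```python
from vllm import LLM, SamplingParams

# Model ID is illustrative; use the exact checkpoint name from the collection.
llm = LLM(model="neuralmagic/Llama-3.1-Storm-8B-FP8-dynamic")

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Explain FP8 quantization in one paragraph."], params)
print(outputs[0].outputs[0].text)
```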
Red Hat AI (@redhat_ai)'s Twitter Profile Photo

Mark Kurtz, CTO of Neural Magic, shares insights on building a career in AI, from hands-on expertise to understanding AI’s limitations. He emphasizes mentorship and open-source projects in advancing education. Read his tips for impactful AI solutions!👇 medium.com/authority-maga…

Santiago (@svpino)'s Twitter Profile Photo

You can now optimize and make any open-source LLM faster:

1. pip install llmcompressor
2. apply quantization with 1 line of code

Two benefits:

1. Your LLM will run faster at inference time.
2. You will save a ton of money on hardware

Here are a couple of examples:

•
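
A minimal sketch of that flow, assuming LLM Compressor's oneshot API and an FP8 dynamic scheme (model ID and output directory are illustrative):

```python
# pip install llmcompressor
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import QuantizationModifier

# FP8 dynamic quantization needs no calibration data; "Linear" targets all
# linear layers, and lm_head is skipped to preserve output quality.
recipe = QuantizationModifier(targets="Linear", scheme="FP8_DYNAMIC", ignore=["lm_head"])

oneshot(
    model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",  # illustrative model ID
    recipe=recipe,
    output_dir="TinyLlama-1.1B-Chat-v1.0-FP8-dynamic",
)
```

The saved directory can then be served directly with vLLM, which is where the inference speedup and hardware savings show up.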
Mark Kurtz (@markurtz_)'s Twitter Profile Photo

It was a privilege to sit down with Chris Brandt from FUTR.tv and explore the state of AI today. We covered everything from the promise of smaller, specialized models to the real risks enterprises face when adopting AI.

Key points in the podcast:

1️⃣ Why is AI

Mark Kurtz (@markurtz_)'s Twitter Profile Photo

To paraphrase a popular post from Jasper (@zjasper666) on this platform: give me int4 or give me death!

In all seriousness, there is no reason why people should waste power, energy, and performance deploying 16-bit LLMs. These models are enormous, and with that, there's a ton of
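
For the int4 case specifically, a weight-only 4-bit pass can be sketched with LLM Compressor's GPTQ modifier (model and calibration dataset IDs are illustrative, not a prescribed setup):

```python
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import GPTQModifier

# W4A16: 4-bit weights, 16-bit activations. GPTQ uses a small calibration
# set to choose quantization scales that minimize layer-wise error.
oneshot(
    model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",  # illustrative
    dataset="open_platypus",                     # illustrative calibration set
    recipe=GPTQModifier(targets="Linear", scheme="W4A16", ignore=["lm_head"]),
    max_seq_length=2048,
    num_calibration_samples=512,
)
```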
Mark Kurtz (@markurtz_)'s Twitter Profile Photo

The results are in: Trade-offs between accuracy & performance in LLM quantization

After hundreds of thousands of evals and benchmarks from our research team at @neuralmagic, I'm excited to share our findings on LLM quantization—now available as a paper on arXiv:
brian stevens (@addvin)'s Twitter Profile Photo

I’m thrilled to announce that Neural Magic has signed a definitive agreement to join forces with Red Hat, Inc.

At Neural Magic our vision is that the future of AI is open, and we have been on a mission to enable enterprises to capture the powerful innovation from AI, while at
jamiegoldstein (@jamieagoldstein)'s Twitter Profile Photo

In 2018, we raised a glass to toast the start of our partnership with @neuralmagic. Nearly seven years later, we raised another to mark the successful sale of the company to Red Hat - a well-deserved milestone for an incredible team and groundbreaking technology.

Mark Kurtz (@markurtz_)'s Twitter Profile Photo

I’m a bit late in sharing this, but at the start of the year, I embarked on an exciting new journey—joining Red Hat following its acquisition of @neuralmagic, where I had the privilege of serving as CTO.

While waiting for the deal to finalize during a holiday trip, I saw this
Mark Kurtz (@markurtz_)'s Twitter Profile Photo

To quote a recent tweet: vLLM is the de facto standard open-source LLM inference engine for large-scale, enterprise-grade production deployments.

Red Hat AI (@redhat_ai)'s Twitter Profile Photo

Llama 4 Herd is here! 

It brings a lot of goodies, like MoE architecture and native multimodality, enabling developers to build personalized multimodal experiences.

With Day 0 support in vLLM, you can deploy Llama 4 with vLLM now!

Let's dig into it.

(a thread)
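
As a sketch of what Day 0 support means in practice, a Llama 4 checkpoint loads like any other vLLM model (the model ID is real, but the parallelism and context settings below are assumptions to size to your hardware):

```python
from vllm import LLM, SamplingParams

# Llama 4 checkpoints are large; tensor_parallel_size should match the
# number of GPUs on the node, and max_model_len bounds memory use.
llm = LLM(
    model="meta-llama/Llama-4-Scout-17B-16E-Instruct",
    tensor_parallel_size=8,
    max_model_len=8192,
)

outputs = llm.generate(
    ["Summarize the benefits of a mixture-of-experts architecture."],
    SamplingParams(temperature=0.7, max_tokens=200),
)
print(outputs[0].outputs[0].text)
```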
Red Hat AI (@redhat_ai)'s Twitter Profile Photo

Join us this Friday for Random Samples, a weekly AI talk series from the Red Hat AI Innovation Team.

Topic: The State of LLM Compression — From Research to Production

We’ll explore quantization, sparsity, academic vs. real-world benchmarks, and more.

Join details in comments 👇
Red Hat AI (@redhat_ai)'s Twitter Profile Photo

Missed the session? No worries!

Watch the recording on YouTube: youtube.com/live/T8XDkZuv7…
View the slides: docs.google.com/presentation/d…

Questions? Drop them in the comments and Mark Kurtz will get back to you.

Red Hat AI (@redhat_ai)'s Twitter Profile Photo

🚨 Introducing the Axolotl-LLM Compressor integration, designed to make fine-tuning sparse models easier and more efficient than ever! Now you can fine-tune sparse models for specific data while preserving their sparse structure and recovering any accuracy lost during pruning.
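
The underlying idea of sparsity-preserving fine-tuning can be sketched in plain PyTorch (this illustrates the technique, not the actual Axolotl integration API): capture each pruned weight's zero mask once, then re-apply it after every optimizer step so updates never repopulate pruned positions.

```python
import torch

def capture_sparsity_masks(model: torch.nn.Module) -> dict:
    # Record which weight entries are currently zero (the pruned positions).
    return {
        name: param.detach() != 0
        for name, param in model.named_parameters()
        if name.endswith("weight")
    }

@torch.no_grad()
def reapply_sparsity_masks(model: torch.nn.Module, masks: dict) -> None:
    # Zero any pruned positions the optimizer update repopulated.
    for name, param in model.named_parameters():
        if name in masks:
            param.mul_(masks[name].to(param.dtype))

# In the training loop: masks = capture_sparsity_masks(model) once, then
# after each optimizer.step(), call reapply_sparsity_masks(model, masks).
```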
Eldar Kurtic (@_eldarkurtic)'s Twitter Profile Photo

Our flagship paper on how far careful quantization can really go in practice got accepted as an oral at ACL 2025 (top 8%)! 🥳

Turns out, old-school methods like GPTQ, SmoothQuant, and RTN are quite good when tuned properly. 
All of the tricks are already in LLM-Compressor!
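
Those methods compose as a single recipe in LLM Compressor. A minimal W8A8 sketch along the lines of the project's examples (model and dataset IDs are illustrative):

```python
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import GPTQModifier
from llmcompressor.modifiers.smoothquant import SmoothQuantModifier

# SmoothQuant migrates activation outliers into the weights, then GPTQ
# quantizes both weights and activations to 8-bit (W8A8).
recipe = [
    SmoothQuantModifier(smoothing_strength=0.8),
    GPTQModifier(targets="Linear", scheme="W8A8", ignore=["lm_head"]),
]

oneshot(
    model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",  # illustrative
    dataset="open_platypus",                     # illustrative calibration set
    recipe=recipe,
    max_seq_length=2048,
    num_calibration_samples=512,
)
```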
Red Hat AI (@redhat_ai)'s Twitter Profile Photo

🚀 Thrilled to announce GuideLLM v0.3.0!

Highlights include a brand-new Web UI, containerized benchmarking, and powerful dataset preprocessing.

GuideLLM GitHub: github.com/vllm-project/g…

(Thread 👇)