Red Hat AI (@redhat_ai)'s Twitter Profile
Red Hat AI

@redhat_ai

Deliver AI value with the resources you have, the insights you own, and the freedom you need.

ID: 997536616481722369

Link: https://ai.redhat.com/
Joined: 18-05-2018 17:57:17

1.1K Tweets

6.6K Followers

1.1K Following


Sparse-Marlin is here and integrated into vLLM! This GPU-optimized kernel accelerates matrix multiplication with 4-bit quantized weights and 2:4 sparsity, achieving 5.3x speedups on NVIDIA GPUs (Ampere/Ada). Maintains efficiency with batch sizes up to 32. Links below.
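
For readers unfamiliar with the terms in the post, here is a minimal sketch of what "2:4 sparsity" and "4-bit quantized weights" mean for a single row of weights. It is not the Sparse-Marlin kernel (which runs on the GPU) and not vLLM code; the function names `prune_2_of_4` and `quantize_int4`, the magnitude-based pruning rule, and the symmetric per-row scaling are illustrative assumptions.

```python
def prune_2_of_4(row):
    """Zero the two smallest-magnitude values in every group of four (2:4 pattern)."""
    if len(row) % 4 != 0:
        raise ValueError("row length must be a multiple of 4")
    out = []
    for i in range(0, len(row), 4):
        group = row[i:i + 4]
        # Indices of the two largest-magnitude entries in this group.
        keep = sorted(range(4), key=lambda j: abs(group[j]), reverse=True)[:2]
        out.extend(v if j in keep else 0.0 for j, v in enumerate(group))
    return out


def quantize_int4(row):
    """Symmetric quantization of one row to integers in [-8, 7] (assumed scheme)."""
    max_abs = max(abs(v) for v in row)
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    q = [max(-8, min(7, round(v / scale))) for v in row]
    return q, scale


# Hypothetical example weights, chosen only for illustration.
weights = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.25, 0.01]
sparse = prune_2_of_4(weights)        # at most 2 nonzeros per group of 4
q, scale = quantize_int4(sparse)      # 4-bit integer codes plus one float scale
dequant = [v * scale for v in q]      # approximate reconstruction of `sparse`
```

In this format only the two surviving values per group (plus their positions) need to be stored, and each is a 4-bit code, which is the kind of compact layout a sparse, quantized matrix-multiplication kernel can exploit.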
