Elias Frantar (@elias_frantar) Twitter Tweets • TwiCopy

Elias Frantar

@elias_frantar

Researcher @OpenAI | prev. PhD @ISTAustria and intern @GoogleDeepmind | I also build super fast Lego Rubik's Cube robots.

ID: 3037495037

Link: https://efrantar.github.io/

Joined: 14-02-2015 20:01:25

76 Tweets

489 Followers

128 Following

Michael Goin

@mgoin_

2 years ago

Exciting news from our latest LLM compression research! 🚀 Together with ISTAustria and @neuralmagic, we’ve been exploring sparse finetuning for LLMs and achieved 7.7 tokens/second on a single core and 26.7 tokens/second on 4 cores of an AMD Ryzen CPU! (1/n)

149 Likes

5 Replies

40 Retweets

Dan Alistarh

@dalistarh

2 years ago

Happy to release QUIK, a new accurate post-training quantization method which processes the majority of weights and activations using 4-bit precision. [1/N] With Saleh Ashkboos, Elias Frantar, and Torsten Hoefler 🇨🇭. Paper: arxiv.org/abs/2310.09259 Code: github.com/IST-DASLab/QUIK Snapshot:

158 Likes

7 Replies

37 Retweets

efxmarty

@efxmarty

2 years ago

AutoGPTQ 0.7.0 is released and includes Elias Frantar's Marlin kernel for int4*fp16 matrix multiplication on Ampere GPUs. Check out github.com/AutoGPTQ/AutoG… - This is usable with any int4 quantized Transformers model (symmetric quantization, no act-order) directly from the Hub!🧵

24 Likes

1 Reply

8 Retweets

Dan Alistarh

@dalistarh

a year ago

Happy to release the write-up on the MARLIN kernel for fast LLM inference, now supporting 2:4 sparsity! Led by Elias Frantar & Roberto López Castro Paper: arxiv.org/abs/2408.11743 Code: github.com/IST-DASLab/Spa… MARLIN is integrated with vLLM thanks to @neuralmagic!

73 Likes

3 Replies

22 Retweets