Ella Charlaix
@ellacharlaix
ML Eng @huggingface
21-07-2015 10:20:40
14 Tweets
629 Followers
225 Following
You can now accelerate inference by applying quantization to models from the Hugging Face Hub 🔥
⚡️ With 🤗 Optimum, you can easily apply static and dynamic quantization on your model before exporting it to the ONNX format 🤯
Start here 👉 huggingface.co/docs/optimum/m…
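
For readers new to the terms, here is a rough, library-free sketch of how static and dynamic quantization differ. It is not Optimum's API or implementation: the helper names (`quant_params`, `quantize`, `dequantize`), the int8 affine scheme, and the sample numbers are all assumptions made for illustration. Static quantization fixes the value range ahead of time from calibration data, while dynamic quantization derives it from each input at inference time.

```python
from typing import List, Tuple

QMIN, QMAX = -128, 127  # signed int8 range (assumed for this sketch)


def quant_params(lo: float, hi: float) -> Tuple[float, int]:
    """Return (scale, zero_point) mapping the float range [lo, hi] onto int8."""
    # Widen the range to include 0.0 so that zero is exactly representable.
    lo, hi = min(lo, 0.0), max(hi, 0.0)
    scale = (hi - lo) / (QMAX - QMIN) or 1.0  # fall back to 1.0 for an all-zero range
    zero_point = round(QMIN - lo / scale)
    return scale, max(QMIN, min(QMAX, zero_point))


def quantize(xs: List[float], scale: float, zero_point: int) -> List[int]:
    """Map floats to int8 codes, clipping anything outside the range."""
    return [max(QMIN, min(QMAX, round(x / scale) + zero_point)) for x in xs]


def dequantize(qs: List[int], scale: float, zero_point: int) -> List[float]:
    """Map int8 codes back to approximate floats."""
    return [(q - zero_point) * scale for q in qs]


# Static: the range is fixed once, from hypothetical calibration batches.
calibration = [[-1.0, 0.5, 2.0], [-0.5, 1.5, 3.0]]
static_params = quant_params(
    min(min(batch) for batch in calibration),
    max(max(batch) for batch in calibration),
)

# Dynamic: the range is computed from the values seen at inference time.
activations = [-0.25, 0.75, 1.0]
dynamic_params = quant_params(min(activations), max(activations))

static_q = quantize(activations, *static_params)
dynamic_q = quantize(activations, *dynamic_params)

print("static :", static_q, dequantize(static_q, *static_params))
print("dynamic:", dynamic_q, dequantize(dynamic_q, *dynamic_params))
```

In this toy case the dynamic range is tighter than the calibrated one, so its scale is smaller and the round-trip error is lower, at the cost of computing the range on every call. Optimum handles these steps for you; the linked docs describe the actual workflow.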