Tijmen Blankevoort
@tirune
ID: 41835547
22-05-2009 15:39:21
471 Tweets
446 Followers
161 Following
Low-bit integers are the go-to format for #AI model efficiency. When does floating point perform better? Our NeurIPS-accepted paper "FP8 Quantization: The Power of the Exponent" tackles this question. Tijmen Blankevoort, Mart, Jorn Peters, Markus Nagel bit.ly/3V50VPL
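The trade-off the paper's title alludes to can be illustrated with a minimal sketch of an FP8-style number format with a configurable exponent/mantissa split. This is a conceptual illustration only, not the paper's actual quantizer (real FP8 formats also reserve codes for NaN/inf, which this sketch omits):

```python
def fp8_values(exp_bits=4, man_bits=3, bias=None):
    """Enumerate the non-negative values of a toy floating-point format
    with `exp_bits` exponent and `man_bits` mantissa bits (no NaN/inf)."""
    if bias is None:
        bias = 2 ** (exp_bits - 1) - 1
    vals = {0.0}
    for e in range(2 ** exp_bits):
        for m in range(2 ** man_bits):
            if e == 0:  # subnormal numbers: no implicit leading 1
                vals.add((m / 2 ** man_bits) * 2.0 ** (1 - bias))
            else:       # normal numbers
                vals.add((1 + m / 2 ** man_bits) * 2.0 ** (e - bias))
    return sorted(vals)

def quantize(x, grid):
    """Round x to the nearest representable magnitude, keeping the sign."""
    sign = -1.0 if x < 0 else 1.0
    return sign * min(grid, key=lambda v: abs(v - abs(x)))
```

More exponent bits buy dynamic range at the cost of precision: in this sketch, `max(fp8_values(4, 3))` is 480.0, while `max(fp8_values(2, 5))` is only 7.875 but with a much finer grid of values near zero.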
We propose a dynamic tokenizer for ViTs, where the scale at which an image is processed varies with the complexity of the image area: less compute for simple areas, more for complex, cluttered ones. Thanks to Amelie Royer, Jakob Havtorn, Tijmen Blankevoort
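The scale-adaptive idea can be sketched in a few lines: split the image into coarse patches, then re-split only the patches whose pixel variance suggests clutter. This is a toy single-level sketch with a made-up variance threshold, not the actual tokenizer from the paper:

```python
def tokenize_adaptive(img, coarse=4, thresh=0.01):
    """Emit (y, x, size) tokens: one coarse token per low-variance patch,
    four finer tokens per high-variance (cluttered) patch."""
    h, w = len(img), len(img[0])
    tokens = []
    for y in range(0, h, coarse):
        for x in range(0, w, coarse):
            patch = [img[y + dy][x + dx]
                     for dy in range(coarse) for dx in range(coarse)]
            mean = sum(patch) / len(patch)
            var = sum((p - mean) ** 2 for p in patch) / len(patch)
            if var <= thresh:
                tokens.append((y, x, coarse))        # simple area: 1 token
            else:
                half = coarse // 2                   # cluttered area: 4 tokens
                tokens += [(y + dy, x + dx, half)
                           for dy in (0, half) for dx in (0, half)]
    return tokens
```

On a flat 8x8 image this yields 4 coarse tokens; if one 4x4 patch is a high-contrast checkerboard, that patch alone is split, giving 7 tokens total.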
Our work on Vector Quantization for SOTA size-vs-accuracy trade-offs in LLMs is on arXiv! Thanks to co-authors Andrey Kuzmin, Markus Nagel, Peter Couperus, Cedric Bastoul, Eric Mahurin, Tijmen Blankevoort, and Paul Whatmough for their hard work. And thanks to AK for amplifying!
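The core mechanic of vector quantization (replacing weights with indices into a small shared codebook) can be sketched with a tiny 1-D k-means. The actual method quantizes multi-dimensional groups of weights jointly; this scalar version is only for intuition:

```python
def kmeans_1d(values, k, iters=20):
    """Fit a k-entry codebook to scalar values (toy Lloyd's algorithm)."""
    centroids = sorted(values)[:: max(1, len(values) // k)][:k]
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for v in values:
            i = min(range(len(centroids)), key=lambda j: abs(v - centroids[j]))
            clusters[i].append(v)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids

def vq_encode(values, codebook):
    """Store each value as the index of its nearest codebook entry."""
    return [min(range(len(codebook)), key=lambda j: abs(v - codebook[j]))
            for v in values]
```

With a 4-entry codebook, each weight costs only 2 index bits (plus the small shared codebook) instead of a 16-bit float, which is where the size-vs-accuracy trade-off comes from.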
Today, with Tim Dettmers, Hugging Face, & @mobius_labs, we're releasing FSDP/QLoRA, a new project that lets you efficiently train very large (70b) models on a home computer with consumer gaming GPUs. 1/🧵 answer.ai/posts/2024-03-…
Another week, another release: our PEFT method VeRA (LoRA, but with 10-100x fewer parameters thanks to random projections) is now in HF PEFT! So now's a good time to `pip install peft`. Thanks to Alex McKinney, Benjamin Bossan, Kopi Or Tea ?, and Tijmen Blankevoort huggingface.co/docs/peft/pack…
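The parameter savings follow from simple counting: LoRA trains two rank-r factors per adapted layer, while VeRA freezes shared random projections and trains only two small scaling vectors. A back-of-the-envelope sketch (the layer sizes below are illustrative, not from the paper):

```python
def lora_trainable(d_in, d_out, r):
    """Trainable parameters for one LoRA-adapted layer:
    factors B (d_out x r) and A (r x d_in)."""
    return r * (d_in + d_out)

def vera_trainable(d_out, r):
    """Trainable parameters for one VeRA-adapted layer: only the scaling
    vectors d (length r) and b (length d_out); the random projections
    themselves are frozen and shared across layers."""
    return r + d_out

# Illustrative 4096x4096 projection layer:
lora = lora_trainable(4096, 4096, r=16)   # 131072 trainable parameters
vera = vera_trainable(4096, r=256)        # 4352, roughly 30x fewer
```

Note that VeRA can afford a much larger rank (256 vs 16 here) because the rank only adds a length-r vector, not two rank-r matrices.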
🪩 The State of AI 2024 has landed! 🪩 Our seventh installment is our biggest and most comprehensive yet, covering everything you *need* to know about research, industry, safety and politics. As ever, here's my director’s cut (+ video tutorial!) 🧵
I recently made the news because of a doc I wrote in Meta’s GenAI organization. ‘The Information’ wrote about it as if I did a big raging ‘mic drop’ before leaving the company. Nothing could be further from the truth, so I’m setting the record straight here. open.substack.com/pub/blankevoor…