Tijmen Blankevoort (@tirune)'s Twitter Profile
Tijmen Blankevoort

@tirune

ID: 41835547

Joined: 22-05-2009 15:39:21

471 Tweets

446 Followers

161 Following

Qualcomm Research & Technologies (@qcomresearch):

Low-bit integers are the go-to format for #AI model efficiency. When does floating point perform better? Our NeurIPS-accepted paper "FP8 Quantization: The Power of the Exponent" tackles this question. Tijmen Blankevoort, Mart, Jorn Peters, Markus Nagel bit.ly/3V50VPL

Qualcomm Research & Technologies (@qcomresearch):

Wondering how it's possible to run very large #AI models such as Stable Diffusion and GPT on device? Our Qualcomm AI Research team has compared the most popular integer and floating point formats and weighs in on which one is most efficient: qualcomm.com/news/onq/2023/…

Davis Blalock (@davisblalock):

"FP8 versus INT8 for efficient deep learning inference" Is fp8 just plain better than int8? No. There are tradeoffs between the two at various levels of the stack, and this paper digs into their strengths and weaknesses. [1/11]

"FP8 versus INT8 for efficient deep learning inference"

Is fp8 just plain better than int8? 

 No. There are tradeoffs between the two at various levels of the stack, and this paper digs into their strengths and weaknesses. [1/11]
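To make the tradeoff in these threads concrete, here is a minimal sketch (not from the tweets; it assumes PyTorch 2.1+ for the `float8_e4m3fn` dtype): INT8's uniform grid suits well-behaved weight distributions, while FP8's exponent bits tolerate outliers at the cost of precision elsewhere. Real deployments also weigh per-channel scales and hardware cost, which this toy comparison ignores.

```python
import torch

def int8_roundtrip(x):
    # Symmetric per-tensor INT8: one scale, uniform spacing between levels.
    scale = x.abs().max() / 127.0
    return torch.clamp(torch.round(x / scale), -128, 127) * scale

def fp8_roundtrip(x):
    # Round-trip through PyTorch's 4-exponent/3-mantissa float8 dtype.
    return x.to(torch.float8_e4m3fn).to(torch.float32)

torch.manual_seed(0)
well_behaved = torch.randn(10_000)
with_outliers = well_behaved.clone()
with_outliers[::1000] *= 50.0  # a handful of large outliers, as in LLM layers

for name, x in [("well-behaved", well_behaved), ("with outliers", with_outliers)]:
    mse_int8 = torch.mean((x - int8_roundtrip(x)) ** 2).item()
    mse_fp8 = torch.mean((x - fp8_roundtrip(x)) ** 2).item()
    print(f"{name:>13}: INT8 MSE={mse_int8:.6f}  FP8 MSE={mse_fp8:.6f}")
```

With outliers present, the single INT8 scale stretches to cover them and the bulk of the values lose precision, while FP8's error stays roughly relative; on the well-behaved tensor the ranking flips.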
Babak Ehteshami Bejnordi (@babakeht):

We propose a dynamic tokenizer for ViTs, where the scale at which an image is processed varies based on the complexity of the image area. This means less compute for simple areas and more for complex, cluttered areas. Thanks to Amelie Royer, Jakob Havtorn, Tijmen Blankevoort
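
A toy, purely illustrative version of that idea (my own sketch, not the paper's tokenizer, which learns this decision with a trained module rather than a pixel-variance heuristic):

```python
import torch

def dynamic_patches(image, coarse=32, fine=16, var_threshold=0.01):
    """Tile the image into coarse patches; re-tile high-variance
    (cluttered) patches at a finer scale, so flat regions cost one
    token and busy regions cost several."""
    C, H, W = image.shape
    tokens = []
    for y in range(0, H, coarse):
        for x in range(0, W, coarse):
            patch = image[:, y:y + coarse, x:x + coarse]
            if patch.var() < var_threshold:
                tokens.append(patch)  # single coarse token for a flat area
            else:
                for fy in range(0, coarse, fine):  # fine tokens for clutter
                    for fx in range(0, coarse, fine):
                        tokens.append(patch[:, fy:fy + fine, fx:fx + fine])
    return tokens

img = torch.zeros(3, 224, 224)
img[:, :, 112:] = torch.rand(3, 224, 112)  # flat left half, noisy right half
print(len(dynamic_patches(img)), "tokens vs", (224 // 16) ** 2, "at a fixed fine scale")
```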

Tijmen Blankevoort (@tirune):

Talking in this Dutch podcast about how a general AI, something that can do many tasks just like a human (and perhaps better), might not be as far away as you might think. 😃 Specifically, RL+LLMs have the potential to supercharge current model performance.

Yuki (@y_m_asano):

Very happy to announce that VeRA is accepted at ICLR 2024 with scores 8,8,8,5! VeRA makes LoRA ~10x more parameter-efficient while retaining the same performance & also works for vision! Paper: arxiv.org/abs/2310.11454 Our very light-weight webpage😏: dkopi.github.io/vera/

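For readers wondering where the saving comes from, here is a minimal single-layer sketch of the mechanism from the paper (my simplification; the paper additionally shares A and B across layers): the projections are frozen random matrices, and only two small scaling vectors are trained.

```python
import torch
import torch.nn as nn

class VeRALinear(nn.Module):
    """Minimal sketch of a VeRA adapter (arxiv.org/abs/2310.11454):
    h = W x + Lambda_b B Lambda_d A x, with A and B frozen random
    projections and only the vectors d and b trainable."""

    def __init__(self, base: nn.Linear, r: int = 256):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False
        d_out, d_in = base.weight.shape
        # Frozen random projections, never updated.
        self.A = nn.Parameter(torch.randn(r, d_in) / d_in**0.5, requires_grad=False)
        self.B = nn.Parameter(torch.randn(d_out, r) / r**0.5, requires_grad=False)
        # Trainable per-dimension scales: only r + d_out parameters.
        self.d = nn.Parameter(torch.full((r,), 0.1))
        self.b = nn.Parameter(torch.zeros(d_out))  # zero init => no change at start

    def forward(self, x):
        return self.base(x) + self.b * ((self.d * (x @ self.A.T)) @ self.B.T)

layer = VeRALinear(nn.Linear(768, 768), r=256)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 256 + 768 = 1024 trainable parameters for this layer
```
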
Tycho van der Ouderaa (@tychovdo):

⭐️New paper⭐️ Excited to share 'The LLM Surgeon', accepted at ICLR 2024. We obtain SOTA pruning performance and even demonstrate structured LLM pruning of full rows and columns, with direct practical impact: compression of up to 20-30% with negligible loss in performance. 🧵1/9👇

Mart (@martvanbaalen):

Our work on Vector Quantization for SOTA size vs accuracy trade-offs in LLMs is on arXiv! Thanks to co-authors Andrey Kuzmin, Markus Nagel, Peter Couperus, Cedric Bastoul, Eric Mahurin, Tijmen Blankevoort and Paul Whatmough for their hard work. And thanks to AK for amplifying!
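
The core idea of vector quantization for weights, in a toy sketch (mine, not the paper's algorithm, which adds data-aware refinements on top of plain k-means): groups of weights are replaced by indices into a shared codebook.

```python
import torch

def vq_weights(W, dim=2, k=256, iters=20):
    """Quantize a weight matrix by splitting it into `dim`-sized
    sub-vectors and snapping each to its nearest k-means centroid.
    Each sub-vector is then stored as one 8-bit index (k=256),
    i.e. 4 bits per weight here, plus the small codebook."""
    vecs = W.reshape(-1, dim)
    codebook = vecs[torch.randperm(len(vecs))[:k]].clone()  # init from data
    for _ in range(iters):
        assign = torch.cdist(vecs, codebook).argmin(dim=1)  # nearest centroid
        for j in range(k):
            members = vecs[assign == j]
            if len(members):
                codebook[j] = members.mean(dim=0)           # recenter
    return codebook[assign].reshape(W.shape), assign

W = torch.randn(256, 256)
W_q, idx = vq_weights(W)
print("reconstruction MSE:", torch.mean((W - W_q) ** 2).item())
```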

Jeremy Howard (@jeremyphoward):

Today, with Tim Dettmers, Hugging Face, & @mobius_labs, we're releasing FSDP/QLoRA, a new project that lets you efficiently train very large (70b) models on a home computer with consumer gaming GPUs. 1/🧵 answer.ai/posts/2024-03-…
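
For context, the QLoRA half of the recipe looks roughly like this (a hedged sketch: the model name and hyperparameters are illustrative, and running it needs `bitsandbytes` plus a CUDA GPU; the FSDP sharding that makes the 70b-on-gaming-GPUs part work is the project's own wiring and lives in their repo):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 base weights with bf16 compute, then LoRA adapters on top.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-70b-hf",  # illustrative; any causal LM works
    quantization_config=bnb,
)
lora = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only the adapters train
```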

Tycho van der Ouderaa (@tychovdo):

Our paper 'The LLM Surgeon', accepted at ICLR 2024, achieves SOTA LLM pruning across unstructured, semi-structured, and the most challenging yet most effective setting: structured pruning that removes entire matrix rows/columns. Happy to share that the code is now publicly available.
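
What structured pruning of full rows and columns buys you, in a toy sketch (mine; the paper scores and updates weights with Kronecker-factored curvature information, whereas this uses a crude column-norm ranking just to show the mechanics): the weight tensor genuinely shrinks, so the speedup needs no sparse kernels.

```python
import torch
import torch.nn as nn

def prune_columns(layer, keep_frac=0.75):
    """Drop whole input columns of a Linear layer, returning a smaller
    dense layer plus the kept indices (the preceding layer must drop
    the matching output rows)."""
    W = layer.weight.data                       # shape (d_out, d_in)
    n_keep = int(W.shape[1] * keep_frac)
    keep = W.norm(dim=0).topk(n_keep).indices.sort().values
    pruned = nn.Linear(n_keep, W.shape[0], bias=layer.bias is not None)
    pruned.weight.data = W[:, keep].clone()
    if layer.bias is not None:
        pruned.bias.data = layer.bias.data.clone()
    return pruned, keep

layer = nn.Linear(1024, 1024)
pruned, keep = prune_columns(layer, keep_frac=0.75)
print(pruned.weight.shape)  # torch.Size([1024, 768]): a 25% smaller matmul
```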

Yuki (@y_m_asano):

Another week, another release: Our PEFT method VeRA (LoRA but 10-100x fewer parameters thanks to random projections) is now on HF PEFT! So now's a good time to `pip install peft`. Thx to Alex McKinney, Benjamin Bossan, Kopi Or Tea ? + Tijmen Blankevoort huggingface.co/docs/peft/pack…

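A hedged usage sketch (assuming a recent peft release with VeRA support; the model and hyperparameters here are illustrative, not from the tweet):

```python
from transformers import AutoModelForCausalLM
from peft import VeraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("gpt2")
# VeRA targets modules of matching shape so the frozen random
# projections can be shared across layers.
config = VeraConfig(r=256, target_modules=["c_attn"], task_type="CAUSAL_LM")
model = get_peft_model(model, config)
model.print_trainable_parameters()  # far fewer trainables than LoRA at equal r
```
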
Yuki (@y_m_asano):

Today we introduce Bidirectional Instruction Tuning (Bitune). It's a new way of adapting LLMs for the instruction->answering stage. It allows the model to process the instruction/question with bidirectional attention, while the answer generation remains causal.

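The attention pattern described above, as a minimal mask-construction sketch (my illustration of the mask idea only; the full method also combines causal and bidirectional features with learned mixing):

```python
import torch

def bitune_style_mask(prompt_len, total_len):
    """Boolean attention mask (True = may attend): instruction tokens
    attend to each other bidirectionally, answer tokens stay causal."""
    mask = torch.tril(torch.ones(total_len, total_len, dtype=torch.bool))
    mask[:prompt_len, :prompt_len] = True  # full attention within the prompt
    return mask

print(bitune_style_mask(prompt_len=3, total_len=6).int())
```
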
Nathan Benaich (@nathanbenaich):

🪩The State of AI 2024 has landed! 🪩 Our seventh installment is our biggest and most comprehensive yet, covering everything you *need* to know about research, industry, safety and politics. As ever, here's my director’s cut (+ video tutorial!) 🧵

Yuki (@y_m_asano):

So you think your ICLR 2026 rejection was surprising? We nearly fell out of our chairs when our Bitune paper, with a 7.25 average rating (10,8,6,5 -- i.e. top 4%), got rejected 😅. It's not like new points or problems surfaced... Just ¯\_(ツ)_/¯ I guess? Sharing this so that especially…

Zechun Liu (@zechunliu):

🚀 We're thrilled to announce that the SoTA low-bit quantization ParetoQ code is now open-source! 🌟 github.com/facebookresear…

🔍 What does this repo support?
🌟 State-of-the-art sub-4-bit quantization: a significant upgrade from our previous LLM-QAT repo, outperforming all…

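The basic mechanism behind sub-4-bit quantization-aware training, in a stand-alone sketch (a generic straight-through estimator, not ParetoQ's actual recipe, which tunes grids, scales, and training budgets per bit-width):

```python
import torch

class FakeQuant(torch.autograd.Function):
    """Straight-through estimator (STE) fake quantizer: the forward
    pass rounds weights to a symmetric low-bit grid, the backward pass
    passes gradients through unchanged so the full-precision weights
    keep learning despite the non-differentiable rounding."""

    @staticmethod
    def forward(ctx, w, bits=2):
        qmax = 2 ** (bits - 1) - 1          # 1 for 2-bit, 7 for 4-bit
        scale = w.abs().max() / qmax
        return torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale

    @staticmethod
    def backward(ctx, grad_out):
        return grad_out, None               # STE: identity gradient

w = torch.randn(64, 64, requires_grad=True)
loss = (FakeQuant.apply(w, 2) ** 2).sum()
loss.backward()
print(w.grad.abs().sum() > 0)               # gradients flow despite rounding
```
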
Tijmen Blankevoort (@tirune):

I recently made the news because of a doc I wrote in Meta’s GenAI organization. ‘The Information’ wrote about it as if I did a big raging ‘mic drop’ before leaving the company. Nothing could be further from the truth - so setting the record straight here. open.substack.com/pub/blankevoor…