Pavel Iakubovskii (@qubvelx) 's Twitter Profile
Pavel Iakubovskii

@qubvelx

ML Engineer @ 🤗 | Kaggle Competition Master | Creator of segmentation_models.pytorch

ID: 1262481185177047040

Link: http://github.com/qubvel · Joined: 18-05-2020 20:32:36

80 Tweets

242 Followers

121 Following

GeekyRakshit (e/mad) (@soumikrakshit96) 's Twitter Profile Photo

✨ CVPR 2025 highlight: in "A Distractor-Aware Memory for Visual Object Tracking with SAM2", the authors propose a new distractor-aware memory model for SAM2 and an introspection-based update strategy that jointly address segmentation accuracy and tracking robustness 🏡

Niels Rogge (@nielsrogge) 's Twitter Profile Photo

Another classic has made it into the Transformers library: LightGlue (ICCV '23)🔥 A deep neural network that learns to match local features across images. Faster and more efficient than SuperGlue. Adaptive computation based on difficulty🕵️ Now available in a few lines of code!

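To get a sense of the "few lines of code": a minimal sketch, assuming the `ETH-CVG/lightglue_superpoint` checkpoint name and that the matching API mirrors the existing SuperGlue integration (verify both against the docs):

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

# Checkpoint name is an assumption; look up the released LightGlue weights on the Hub
ckpt = "ETH-CVG/lightglue_superpoint"
processor = AutoImageProcessor.from_pretrained(ckpt)
model = AutoModel.from_pretrained(ckpt)

# One pair of images of the same scene to match
image1 = Image.open("view_a.jpg")
image2 = Image.open("view_b.jpg")

inputs = processor([image1, image2], return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Post-processing assumed to follow the SuperGlue-style API
image_sizes = [[(image1.height, image1.width), (image2.height, image2.width)]]
matches = processor.post_process_keypoint_matching(outputs, image_sizes, threshold=0.2)
print(matches[0]["keypoints0"], matches[0]["keypoints1"])
```
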
Lysandre (@lysandrejik) 's Twitter Profile Photo

BOOOM! transformers now has a baked-in HTTP server with an OpenAI-spec-compatible API. Launch it with `transformers serve` and connect your favorite apps. Here I'm running 👋 Jan with local transformers and hot-swappable models. There is preliminary tool-call support as well!
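
Since the server speaks the OpenAI spec, any OpenAI-compatible client should be able to talk to it. A minimal sketch, with the base URL and port as assumptions (use whatever address `transformers serve` prints on startup):

```python
# First, in a separate terminal: transformers serve
from openai import OpenAI

# Base URL is an assumption; point it at the address the server reports
client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

response = client.chat.completions.create(
    model="Qwen/Qwen3-4B",  # hot-swappable: any transformers-compatible model id
    messages=[{"role": "user", "content": "Hello from a local server!"}],
)
print(response.choices[0].message.content)
```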

Niels Rogge (@nielsrogge) 's Twitter Profile Photo

New model alert in Transformers: EoMT! EoMT greatly simplifies the design of ViTs for image segmentation 🙌 Unlike Mask2Former and OneFormer which add complex modules like an adapter, pixel decoder and Transformer decoder on top, EoMT is just a ViT with a set of query tokens ✅

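A minimal usage sketch: the class name follows the library's `<Model>ForUniversalSegmentation` pattern and the checkpoint id is a guess at the paper's released weights, so verify both in the model docs:

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, EomtForUniversalSegmentation

# Checkpoint id is an assumption; check the Hub for the released EoMT weights
ckpt = "tue-mps/coco_panoptic_eomt_large_640"
processor = AutoImageProcessor.from_pretrained(ckpt)
model = EomtForUniversalSegmentation.from_pretrained(ckpt)

image = Image.open("street.jpg")
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Post-processing in the same family as Mask2Former/OneFormer
panoptic = processor.post_process_panoptic_segmentation(
    outputs, target_sizes=[(image.height, image.width)]
)[0]
print(panoptic["segmentation"].shape)
```
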
João Gante (@joao_gante) 's Twitter Profile Photo

LET'S GO! Cursor using local 🤗 transformers models! You can now test ANY transformers-compatible LLM against your codebase. From hacking to production, it takes only a few minutes: anything `transformers` does, you can serve into your app 🔥 Here's a demo with Qwen3 4B:

SkalskiP (@skalskip92) 's Twitter Profile Photo

supervision-0.26.0 is out! we finally released support for the ViTPose and ViTPose++ pose estimation models from Hugging Face transformers. link: github.com/roboflow/super…
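
A sketch of the transformers side, assuming the `usyd-community` ViTPose checkpoints; ViTPose is a top-down estimator, so it expects person boxes from a detector (a whole-image box stands in here for brevity):

```python
import torch
from PIL import Image
from transformers import AutoProcessor, VitPoseForPoseEstimation

# Checkpoint id is an assumption; see the supervision release notes for tested ones
ckpt = "usyd-community/vitpose-base-simple"
processor = AutoProcessor.from_pretrained(ckpt)
model = VitPoseForPoseEstimation.from_pretrained(ckpt)

image = Image.open("players.jpg")
# One (x, y, w, h) box per person, normally produced by a person detector
boxes = [[[0.0, 0.0, float(image.width), float(image.height)]]]

inputs = processor(image, boxes=boxes, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One entry per person, with keypoint coordinates and confidence scores
poses = processor.post_process_pose_estimation(outputs, boxes=boxes)[0]
print(poses[0]["keypoints"].shape)
```

From there, the keypoints can be handed to supervision's annotators for drawing; the exact converter is documented in the linked 0.26.0 release.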

SkalskiP (@skalskip92) 's Twitter Profile Photo

re: the ViTPose++ demo I showed you yesterday. here it is against YOLOv8x-pose (a model that a lot of people use for pose estimation). massive difference in accuracy.

Ross Wightman (@wightmanr) 's Twitter Profile Photo

A joint OpenCLIP (3.0.0) and timm (1.0.18) release day today. It's been a quarter since the last OC release, so what's new? PE (Perception Encoder) Core support was the headline feature. Using the timm vision encoder for the PE models, I adapted the weights from AI at Meta so they
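
A sketch of loading a PE Core checkpoint through OpenCLIP's Hugging Face Hub loader; the hub id below is hypothetical, so check the release notes for the real model names:

```python
import torch
import open_clip
from PIL import Image

# Hypothetical hub id for a converted PE Core checkpoint; verify the actual name
hub_id = "hf-hub:timm/PE-Core-L-14-336"
model, preprocess = open_clip.create_model_from_pretrained(hub_id)
tokenizer = open_clip.get_tokenizer(hub_id)

image = preprocess(Image.open("cat.jpg")).unsqueeze(0)
text = tokenizer(["a photo of a cat", "a photo of a dog"])

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
```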

merve (@mervenoyann) 's Twitter Profile Photo

We have recently merged fast processors for many models, and the speed-up in the Qwen-VL series is 🔥 you get speed-ups of up to 3x on CPU and 26x on GPU 🤯 you don't have to do anything, this is enabled by default 🥳

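As the tweet says, no code changes are needed; still, a sketch of opting in explicitly and preprocessing on GPU, with the checkpoint id and the `device` argument as assumptions to verify:

```python
from PIL import Image
from transformers import AutoImageProcessor

# use_fast=True is now the default; passed here only to make the choice explicit
processor = AutoImageProcessor.from_pretrained(
    "Qwen/Qwen2.5-VL-7B-Instruct", use_fast=True
)

image = Image.open("chart.png")
# Fast processors work on tensors and can run the preprocessing on the GPU
inputs = processor(images=image, return_tensors="pt", device="cuda")
```
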
SkalskiP (@skalskip92) 's Twitter Profile Photo

ViTPose++ is crazy good! look at the interaction between the pink and green players. getting that right is really impressive. we are plugging it into basketball AI.

Matthew Carrigan (@carrigmat) 's Twitter Profile Photo

GPT OSS is out. It's OpenAI's first open-weights model release since GPT-2, and some of the technical innovations have huge implications. This is a thread about two of them: Learned attention sinks, and MXFP4 weights.

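A learned attention sink can be pictured as one extra learned logit per head appended before the softmax: it soaks up probability mass without contributing a value vector, letting a token effectively attend to nothing. A toy sketch of the idea, not OpenAI's implementation:

```python
import torch

def attention_with_learned_sink(q, k, v, sink_logit):
    """q, k, v: (batch, heads, seq, dim); sink_logit: one learned scalar per head."""
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5  # (b, h, s, s)
    sink = sink_logit.view(1, -1, 1, 1).expand(*scores.shape[:-1], 1)
    probs = torch.softmax(torch.cat([scores, sink], dim=-1), dim=-1)
    # The sink column receives weight but has no value: its mass is simply dropped
    return probs[..., :-1] @ v

q = k = v = torch.randn(1, 4, 8, 16)
out = attention_with_learned_sink(q, k, v, sink_logit=torch.zeros(4))
```
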
merve (@mervenoyann) 's Twitter Profile Photo

we have merged two new zero-shot detectors into transformers 🔥 LLMDet and MM GroundingDINO are out! these models can detect anything 🤯 to celebrate, here's an app to compare the models' inference results and latency ⤵️

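A sketch of running one of them, assuming a hypothetical checkpoint id and that both follow the Grounding DINO-style zero-shot detection API already in transformers (period-separated class prompts):

```python
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForZeroShotObjectDetection

# Hypothetical checkpoint id; look up the merged LLMDet / MM GroundingDINO weights
ckpt = "iSEE-Laboratory/llmdet_tiny"
processor = AutoProcessor.from_pretrained(ckpt)
model = AutoModelForZeroShotObjectDetection.from_pretrained(ckpt)

image = Image.open("street.jpg")
text = "a person. a bicycle. a traffic light."  # classes separated by periods

inputs = processor(images=image, text=text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Post-processing assumed to mirror Grounding DINO's
results = processor.post_process_grounded_object_detection(
    outputs, inputs.input_ids, box_threshold=0.4, text_threshold=0.3,
    target_sizes=[(image.height, image.width)],
)[0]
print(results["boxes"], results["labels"])
```
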
AI at Meta (@aiatmeta) 's Twitter Profile Photo

Introducing DINOv3: a state-of-the-art computer vision model trained with self-supervised learning (SSL) that produces powerful, high-resolution image features. For the first time, a single frozen vision backbone outperforms specialized solutions on multiple long-standing dense prediction tasks.
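
If the release follows the existing DINOv2 integration in transformers, pulling frozen dense features would look something like this sketch (the checkpoint id mirrors DINOv2's naming and is a guess; check the AI at Meta release for the real names):

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

# Hypothetical checkpoint id, patterned on the DINOv2 naming scheme
ckpt = "facebook/dinov3-vitb16"
processor = AutoImageProcessor.from_pretrained(ckpt)
model = AutoModel.from_pretrained(ckpt).eval()

image = Image.open("scene.jpg")
inputs = processor(images=image, return_tensors="pt")

# Frozen backbone: no gradients, just dense patch features for downstream heads
with torch.no_grad():
    outputs = model(**inputs)

patch_features = outputs.last_hidden_state  # (1, num_tokens, hidden_dim)
print(patch_features.shape)
```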