Pavel Iakubovskii (@qubvelx) 's Twitter Profile
Pavel Iakubovskii

@qubvelx

ML Engineer @ 🤗 | Kaggle Competition Master | Creator of segmentation_models.pytorch

ID: 1262481185177047040

Link: http://github.com/qubvel · Joined: 18-05-2020 20:32:36

80 Tweets

242 Followers

121 Following

GeekyRakshit (e/mad) (@soumikrakshit96) 's Twitter Profile Photo

✨ CVPR 2025 highlight: in "A Distractor-Aware Memory for Visual Object Tracking with SAM2", the authors propose a new distractor-aware memory model for SAM2 and an introspection-based update strategy that jointly address segmentation accuracy and tracking robustness 🏡

Niels Rogge (@nielsrogge) 's Twitter Profile Photo

Another classic has made it into the Transformers library: LightGlue (ICCV '23)🔥 A deep neural network that learns to match local features across images. Faster and more efficient than SuperGlue. Adaptive computation based on difficulty🕵️ Now available in a few lines of code!

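To get a sense of the "few lines of code": a minimal sketch, assuming the `ETH-CVG/lightglue_superpoint` checkpoint name and that the matching API mirrors the existing SuperGlue integration (verify both against the docs):

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

# Checkpoint name is an assumption; look up the released LightGlue weights on the Hub
ckpt = "ETH-CVG/lightglue_superpoint"
processor = AutoImageProcessor.from_pretrained(ckpt)
model = AutoModel.from_pretrained(ckpt)

# One pair of images of the same scene to match
image1 = Image.open("view_a.jpg")
image2 = Image.open("view_b.jpg")

inputs = processor([image1, image2], return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Post-processing assumed to follow the SuperGlue-style API
image_sizes = [[(image1.height, image1.width), (image2.height, image2.width)]]
matches = processor.post_process_keypoint_matching(outputs, image_sizes, threshold=0.2)
print(matches[0]["keypoints0"], matches[0]["keypoints1"])
```
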
Lysandre (@lysandrejik) 's Twitter Profile Photo

BOOOM! transformers now has a baked-in HTTP server with an OpenAI-spec-compatible API. Launch it with `transformers serve` and connect your favorite apps. Here I'm running 👋 Jan with local transformers and hot-swappable models. There is preliminary tool-call support as well!
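
Since the server speaks the OpenAI spec, any OpenAI-compatible client should be able to talk to it. A minimal sketch, with the base URL and port as assumptions (use whatever address `transformers serve` prints on startup):

```python
# First, in a separate terminal: transformers serve
from openai import OpenAI

# Base URL is an assumption; point it at the address the server reports
client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

response = client.chat.completions.create(
    model="Qwen/Qwen3-4B",  # hot-swappable: any transformers-compatible model id
    messages=[{"role": "user", "content": "Hello from a local server!"}],
)
print(response.choices[0].message.content)
```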

Niels Rogge (@nielsrogge) 's Twitter Profile Photo

New model alert in Transformers: EoMT! EoMT greatly simplifies the design of ViTs for image segmentation 🙌 Unlike Mask2Former and OneFormer which add complex modules like an adapter, pixel decoder and Transformer decoder on top, EoMT is just a ViT with a set of query tokens ✅

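A minimal usage sketch: the class name follows the library's `<Model>ForUniversalSegmentation` pattern and the checkpoint id is a guess at the paper's released weights, so verify both in the model docs:

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, EomtForUniversalSegmentation

# Checkpoint id is an assumption; check the Hub for the released EoMT weights
ckpt = "tue-mps/coco_panoptic_eomt_large_640"
processor = AutoImageProcessor.from_pretrained(ckpt)
model = EomtForUniversalSegmentation.from_pretrained(ckpt)

image = Image.open("street.jpg")
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Post-processing in the same family as Mask2Former/OneFormer
panoptic = processor.post_process_panoptic_segmentation(
    outputs, target_sizes=[(image.height, image.width)]
)[0]
print(panoptic["segmentation"].shape)
```
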
João Gante (@joao_gante) 's Twitter Profile Photo

LET'S GO! Cursor using local 🤗 transformers models! You can now test ANY transformers-compatible LLM against your codebase. From hacking to production, it takes only a few minutes: anything `transformers` does, you can serve into your app 🔥 Here's a demo with Qwen3 4B:

SkalskiP (@skalskip92) 's Twitter Profile Photo

supervision-0.26.0 is out! we finally released support for the ViTPose and ViTPose++ pose estimation models from Hugging Face transformers. link: github.com/roboflow/super…
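
A sketch of the transformers side, assuming the `usyd-community` ViTPose checkpoints; ViTPose is a top-down estimator, so it expects person boxes from a detector (a whole-image box stands in here for brevity):

```python
import torch
from PIL import Image
from transformers import AutoProcessor, VitPoseForPoseEstimation

# Checkpoint id is an assumption; see the supervision release notes for tested ones
ckpt = "usyd-community/vitpose-base-simple"
processor = AutoProcessor.from_pretrained(ckpt)
model = VitPoseForPoseEstimation.from_pretrained(ckpt)

image = Image.open("players.jpg")
# One (x, y, w, h) box per person, normally produced by a person detector
boxes = [[[0.0, 0.0, float(image.width), float(image.height)]]]

inputs = processor(image, boxes=boxes, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One entry per person, with keypoint coordinates and confidence scores
poses = processor.post_process_pose_estimation(outputs, boxes=boxes)[0]
print(poses[0]["keypoints"].shape)
```

From there, the keypoints can be handed to supervision's annotators for drawing; the exact converter is documented in the linked 0.26.0 release.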

SkalskiP (@skalskip92) 's Twitter Profile Photo

re: the ViTPose++ demo I showed you yesterday. here it is against YOLOv8x-pose (a model that a lot of people use for pose estimation). massive difference in accuracy.

Ross Wightman (@wightmanr) 's Twitter Profile Photo

A joint OpenCLIP (3.0.0) and timm (1.0.18) release day today. It's been a quarter since the last OC release, so what's new? PE (Perception Encoder) Core support was the headline feature. Using the timm vision encoder for the PE models, I adapted the weights from AI at Meta so they
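
A sketch of loading a PE Core checkpoint through OpenCLIP's Hugging Face Hub loader; the hub id below is hypothetical, so check the release notes for the real model names:

```python
import torch
import open_clip
from PIL import Image

# Hypothetical hub id for a converted PE Core checkpoint; verify the actual name
hub_id = "hf-hub:timm/PE-Core-L-14-336"
model, preprocess = open_clip.create_model_from_pretrained(hub_id)
tokenizer = open_clip.get_tokenizer(hub_id)

image = preprocess(Image.open("cat.jpg")).unsqueeze(0)
text = tokenizer(["a photo of a cat", "a photo of a dog"])

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
```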

merve (@mervenoyann) 's Twitter Profile Photo

We have recently merged fast processors for many models, and the speed-up in the Qwen-VL series is 🔥 you get speed-ups of up to 3x on CPU and 26x on GPU 🤯 you don't have to do anything, this is enabled by default 🥳

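As the tweet says, no code changes are needed; still, a sketch of opting in explicitly and preprocessing on GPU, with the checkpoint id and the `device` argument as assumptions to verify:

```python
from PIL import Image
from transformers import AutoImageProcessor

# use_fast=True is now the default; passed here only to make the choice explicit
processor = AutoImageProcessor.from_pretrained(
    "Qwen/Qwen2.5-VL-7B-Instruct", use_fast=True
)

image = Image.open("chart.png")
# Fast processors work on tensors and can run the preprocessing on the GPU
inputs = processor(images=image, return_tensors="pt", device="cuda")
```
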
SkalskiP (@skalskip92) 's Twitter Profile Photo

ViTPose++ is crazy good! look at the interaction between the pink and green players. getting that right is really impressive. we are plugging it into basketball AI.

Matthew Carrigan (@carrigmat) 's Twitter Profile Photo

GPT OSS is out. It's OpenAI's first open-weights model release since GPT-2, and some of the technical innovations have huge implications. This is a thread about two of them: Learned attention sinks, and MXFP4 weights.

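A learned attention sink can be pictured as one extra learned logit per head appended before the softmax: it soaks up probability mass without contributing a value vector, letting a token effectively attend to nothing. A toy sketch of the idea, not OpenAI's implementation:

```python
import torch

def attention_with_learned_sink(q, k, v, sink_logit):
    """q, k, v: (batch, heads, seq, dim); sink_logit: one learned scalar per head."""
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5  # (b, h, s, s)
    sink = sink_logit.view(1, -1, 1, 1).expand(*scores.shape[:-1], 1)
    probs = torch.softmax(torch.cat([scores, sink], dim=-1), dim=-1)
    # The sink column receives weight but has no value: its mass is simply dropped
    return probs[..., :-1] @ v

q = k = v = torch.randn(1, 4, 8, 16)
out = attention_with_learned_sink(q, k, v, sink_logit=torch.zeros(4))
```
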
merve (@mervenoyann) 's Twitter Profile Photo

we have merged two new zero-shot detectors into transformers 🔥 LLMDet and MM GroundingDINO are out! these models can detect anything 🤯 to celebrate, here's an app to compare the models' inference results and latency ⤵️

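A sketch of running one of them, assuming a hypothetical checkpoint id and that both follow the Grounding DINO-style zero-shot detection API already in transformers (period-separated class prompts):

```python
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForZeroShotObjectDetection

# Hypothetical checkpoint id; look up the merged LLMDet / MM GroundingDINO weights
ckpt = "iSEE-Laboratory/llmdet_tiny"
processor = AutoProcessor.from_pretrained(ckpt)
model = AutoModelForZeroShotObjectDetection.from_pretrained(ckpt)

image = Image.open("street.jpg")
text = "a person. a bicycle. a traffic light."  # classes separated by periods

inputs = processor(images=image, text=text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Post-processing assumed to mirror Grounding DINO's
results = processor.post_process_grounded_object_detection(
    outputs, inputs.input_ids, box_threshold=0.4, text_threshold=0.3,
    target_sizes=[(image.height, image.width)],
)[0]
print(results["boxes"], results["labels"])
```
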
AI at Meta (@aiatmeta) 's Twitter Profile Photo

Introducing DINOv3: a state-of-the-art computer vision model trained with self-supervised learning (SSL) that produces powerful, high-resolution image features. For the first time, a single frozen vision backbone outperforms specialized solutions on multiple long-standing dense prediction tasks.
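
If the release follows the existing DINOv2 integration in transformers, pulling frozen dense features would look something like this sketch (the checkpoint id mirrors DINOv2's naming and is a guess; check the AI at Meta release for the real names):

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

# Hypothetical checkpoint id, patterned on the DINOv2 naming scheme
ckpt = "facebook/dinov3-vitb16"
processor = AutoImageProcessor.from_pretrained(ckpt)
model = AutoModel.from_pretrained(ckpt).eval()

image = Image.open("scene.jpg")
inputs = processor(images=image, return_tensors="pt")

# Frozen backbone: no gradients, just dense patch features for downstream heads
with torch.no_grad():
    outputs = model(**inputs)

patch_features = outputs.last_hidden_state  # (1, num_tokens, hidden_dim)
print(patch_features.shape)
```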