Orr Zohar @ ICLR’25 (@orr_zohar) 's Twitter Profile
Orr Zohar @ ICLR’25

@orr_zohar

PhD Student @Stanford • Researching large multimodal models • @KnightHennessy scholar • Advised by @yeung_levy

ID: 1659236939088936961

Link: https://orrzohar.github.io/ · Joined: 18-05-2023 16:38:24

79 Tweets

279 Followers

169 Following

Pedro Cuenca (@pcuenq) 's Twitter Profile Photo

smol code update for HuggingSnap, huge impact: we added VoiceOver support so more people can use it more easily. A visual local assistant always in your pocket has many use cases, and it can also be a great help for people with low vision. Reminder: local, private, open source.

Alejandro Lozano (@ale9806_) 's Twitter Profile Photo

Earlier this year, we released the BIOMEDICA dataset, featuring 24 million unique image-caption pairs and 30 million image references derived from open-source biomedical literature. It's been great to see the community engaging with it; we're currently seeing around 6K downloads
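For readers who want to poke at the dataset, here is a minimal sketch of streaming it with the 🤗 datasets library. The repository id below is a placeholder, not the dataset's confirmed Hub name, and the record fields will depend on the actual release.

```python
# Minimal sketch: streaming a large image-caption dataset from the Hugging Face Hub.
# NOTE: "BIOMEDICA/biomedica" is a placeholder repo id, not a confirmed name.
from datasets import load_dataset

ds = load_dataset("BIOMEDICA/biomedica", split="train", streaming=True)

# Peek at a few records without downloading the full 24M-pair corpus.
for example in ds.take(3):
    # Field names depend on the release; we just inspect what each record contains.
    print({key: type(value).__name__ for key, value in example.items()})
```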
Peter Tong (@tongpetersb) 's Twitter Profile Photo

Vision models have been smaller than language models; what if we scale them up?

Introducing Web-SSL: A family of billion-scale SSL vision models (up to 7B parameters) trained on billions of images without language supervision, using VQA to evaluate the learned representation.
Orr Zohar @ ICLR’25 (@orr_zohar) 's Twitter Profile Photo

Excited to see SmolVLM powering BMC-SmolVLM in the latest BIOMEDICA update! At just 2.2B params, it matches 7-13B biomedical VLMs. Check out the full release: Hugging Face #smolvlm

Andi Marafioti (@andimarafioti) 's Twitter Profile Photo

We are so back with Hugging Face’s Smol models 🚀
Usage doubled 🔥 and we’re now at 110k+ MAU 📈
SmolLM, SmolVLM, SmolDocling — all coming together 💫
Huge thanks to everyone building with us 💛 Let’s keep it growing 💪✨
Andi Marafioti (@andimarafioti) 's Twitter Profile Photo

Today, we share the tech report for SmolVLM: Redefining small and efficient multimodal models.
🔥 Explaining how to design a tiny 256M VLM that uses less than 1GB of RAM and outperforms our 80B models from 18 months ago!

Here are the coolest insights from our experiments: ✨
merve (@mervenoyann) 's Twitter Profile Photo

SmolVLM paper is out 🔥 It's one of my favorite papers since it contains a ton of findings on training a good smol model 🤯 Andi Marafioti summarized it here ⤵️

Orr Zohar @ ICLR’25 (@orr_zohar) 's Twitter Profile Photo

🤗 The SmolVLM report is out, with all the experiments, findings, and insights that led to high performance at tiny sizes 🤏
📱 These models can run on most mobile/edge devices.
📖 Give it a look!
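As a concrete pointer for trying one of these checkpoints, here is a minimal inference sketch with the transformers library. The checkpoint id HuggingFaceTB/SmolVLM-256M-Instruct and the processor/chat-template calls follow the usual SmolVLM release pattern and are assumptions here, not details taken from the tweet.

```python
# Minimal sketch: running a small SmolVLM checkpoint with transformers.
# The model id and chat-template usage are assumptions, not from the tweet.
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq

model_id = "HuggingFaceTB/SmolVLM-256M-Instruct"  # assumed checkpoint name
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(model_id, torch_dtype=torch.bfloat16)

image = Image.open("example.jpg")
messages = [
    {"role": "user",
     "content": [{"type": "image"}, {"type": "text", "text": "Describe this image."}]},
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt")

with torch.no_grad():
    generated = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(generated, skip_special_tokens=True)[0])
```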
Andi Marafioti (@andimarafioti) 's Twitter Profile Photo

We are using the most popular open-source reproducible evaluation (OpenCompass). I actually reached out to moondream and asked them to update their eval since it's 6 months old and their internal evaluations claim to be way higher 🤷‍♂️
Andi Marafioti (@andimarafioti) 's Twitter Profile Photo

Eric Lee vik The values we report can be corroborated with the open-source evaluation from OpenCompass. The model in the table you're highlighting is SmolVLM2 (huggingface.co/spaces/opencom…). We don't know how moondream got those evaluations for SmolVLM; I guess they ran their own evals.

Luis (@lusxvr) 's Twitter Profile Photo

Today, we are open-sourcing nanoVLM, a pure PyTorch library to train a Vision-Language Model from scratch in 750 lines of code.
Training on one H100 for 6h, we get 35.3% on MMStar, matching SmolVLM-256M, which was trained with 100x more GPU hours. 👀
Even in a FREE Google Colab,
Thomas Wolf (@thom_wolf) 's Twitter Profile Photo

New open-source drop from the HF team - nanoVLM

A super tight codebase to learn/train VLMs with good performance - inspired by Andrej Karpathy's nanoGPT

750 lines of PyTorch code. Training a 222M-parameter nanoVLM for 6 hours on a single H100 reaches 35.3% on MMStar, matching the score of SmolVLM-256M, which was trained with 100x more GPU hours.
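To make the "vision encoder + projector + language decoder" recipe behind these small VLMs concrete, here is a heavily simplified PyTorch skeleton. It is an illustrative sketch only, not code from the nanoVLM repository; all class names, dimensions, and the assumed embed_tokens/inputs_embeds interfaces are made up for the example.

```python
import torch
import torch.nn as nn

class TinyVLM(nn.Module):
    """Illustrative skeleton of a small vision-language model:
    a vision encoder, a linear projector into the LM embedding space,
    and a causal language decoder. Not the actual nanoVLM code."""

    def __init__(self, vision_encoder, language_model, vision_dim=768, lm_dim=576):
        super().__init__()
        self.vision_encoder = vision_encoder            # e.g. a ViT returning patch features
        self.projector = nn.Linear(vision_dim, lm_dim)  # maps patch features into LM token space
        self.language_model = language_model            # assumed to accept inputs_embeds

    def forward(self, pixel_values, input_ids):
        # Encode the image into a sequence of patch embeddings.
        patch_feats = self.vision_encoder(pixel_values)             # (B, N_patches, vision_dim)
        image_tokens = self.projector(patch_feats)                  # (B, N_patches, lm_dim)

        # Embed the text tokens and prepend the projected image tokens.
        text_embeds = self.language_model.embed_tokens(input_ids)   # (B, N_text, lm_dim)
        inputs_embeds = torch.cat([image_tokens, text_embeds], dim=1)

        # The decoder predicts the next text token given image + text context.
        return self.language_model(inputs_embeds=inputs_embeds)
```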
Miquel Farré (@micuelll) 's Twitter Profile Photo

WE ARE COOKING!! I’m looking for a creative engineer to join the ride 🤩 If that’s you, send me a message 🚀 You should be someone who learns tools fast, builds scrappy hacks when needed, and focuses on what works. You might be working in the space of media, image/video