Andi Marafioti (@andimarafioti) 's Twitter Profile
Andi Marafioti

@andimarafioti

🤖 smol models @huggingface | multimodal and on device | 🇦🇷 in 🇨🇭

ID: 1513241024969203725

10-04-2022 19:42:57

1.1K Tweets

3.3K Followers

369 Following

merve (@mervenoyann) 's Twitter Profile Photo

officially written 28 pages over two weekends (only have around 3-4 pages left)
about everything post-training (RLHF, SFT, DPO, MPO, GRPO, also stuff like LoRA, DoRA, QLoRA, bnb)

and I did it while I was sick

you can just write things
Amir Mahla (@amir_mahla) 's Twitter Profile Photo

Let’s goooo! We’ve just launched a new Computer Use Agent (CUA) powered by open models, Hugging Face smolagents and @E2B for secure computer sandboxing! We're building something different. Open. Transparent. Yours. Check this out

Andi Marafioti (@andimarafioti) 's Twitter Profile Photo

Just dropped: 🎉 NVIDIA Nemotron-Parse v1.1 
Next-gen OCR for parsing PDFs & PPTs into structured, machine-ready output (text + bounding boxes + semantic classes). 
Ready for commercial use and to generate datasets🚀
Check the examples on Hugging face! huggingface.co/nvidia/NVIDIA-…
Andi Marafioti (@andimarafioti) 's Twitter Profile Photo

Late night coding, brain fried. Pasted logs + code to ChatGPT and Gemini. ChatGPT explained the issue and how to solve it. Gemini told me I must have pasted the wrong code because it had nothing to do with the logs. Gemini was right.

merve (@mervenoyann) 's Twitter Profile Photo

I'm keeping a track of real-time vision models (mostly detectors) on Hugging Face

we have RT-DETR, YOLO, RF-DETR and D-FINE for now

what other models should we add?
merve (@mervenoyann) 's Twitter Profile Photo

icymi Hugging Face dropped a computer use agent last week 🔥 built on various Qwen3-VL models & E2B sandbox, ask the app to do anything 🙌🏻 it exposes each thinking step to you, try different models with a neat UI🤩

Lucie-Aimée Kaffee (@frimelle) 's Twitter Profile Photo

For the first time, in 2025, Chinese model developers surpassed the U.S. in adoption, driven by DeepSeek and Qwen. The global race for AI isn’t settled.

David Louapre (@dlouapre) 's Twitter Profile Photo

Introducing "The Eiffel Tower Llama"!🗼

Remember Golden Gate Claude? Unfortunately Anthropic's viral demo was shut down after 24h, and key technical details remained hidden.

So we recreated it, uncovering key insights on steering LLMs using SAEs⚒️

Full blog post + live demo 👇
Quentin Gallouédec (@qgallouedec) 's Twitter Profile Photo

Not to diminish anyone’s achievements, but most of the things Jesus did were basically extensions of techniques Jürgen Schmidhuber published decades earlier. Turning water into wine? They had liquid-to-liquid transformation networks in ’78 BCE. As usual: great work, zero citations.

Lysandre (@lysandrejik) 's Twitter Profile Photo

Transformers v5's first release candidate is out 🔥 The biggest release of my life.

It's been five years since the last major (v4). From 20 architectures to 400, 20k daily downloads to 3 million.

The release is huge, w/ tokenization (no slow tokenizers!), modeling & processing.
Julien Chaumond (@julien_c) 's Twitter Profile Photo

We heard your feedback 🙏

Killer new feature on Hugging Face = duplicate any dataset to your account.

Thanks to Xet, duplicating a 1TB dataset takes 2 seconds

HF is going to be the best platform for large data storage