Alvaro Somoza (@ozzygt) 's Twitter Profile
Alvaro Somoza

@ozzygt

ML Engineer @HuggingFace

ID: 26274548

Joined: 24-03-2009 17:01:53

86 Tweets

270 Followers

176 Following

Sayak Paul (@risingsayak) 's Twitter Profile Photo

Yo! `Wan2.1-I2V-14B-480P` LoRAs are now supported in Diffusers. Go, fire it! In the frame, we have the popular Squish LoRA.
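Loading a LoRA into the Wan 2.1 image-to-video pipeline can be sketched roughly like this. This is a minimal, hedged example: the pipeline id comes from the tweet, but the LoRA repo id, input image, and prompt are placeholders rather than the actual Squish LoRA coordinates.

```python
import torch
from diffusers import WanImageToVideoPipeline
from diffusers.utils import load_image, export_to_video

# Pipeline id from the tweet; the LoRA repo below is a hypothetical
# placeholder -- substitute the Squish LoRA repo you actually want.
pipe = WanImageToVideoPipeline.from_pretrained(
    "Wan-AI/Wan2.1-I2V-14B-480P-Diffusers", torch_dtype=torch.bfloat16
)
pipe.load_lora_weights("some-user/wan-squish-lora")  # hypothetical repo id
pipe.enable_model_cpu_offload()  # keeps VRAM usage manageable

image = load_image("input.png")  # your starting frame
video = pipe(image=image, prompt="squish it", num_frames=81).frames[0]
export_to_video(video, "out.mp4", fps=16)
```

`load_lora_weights` is the same entry point used for image-model LoRAs, so existing LoRA workflows carry over to video pipelines.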

Alvaro Somoza (@ozzygt) 's Twitter Profile Photo

It's amazing that this old SD 1.5 fine-tuned model with images from Studio Ghibli is trending on the Hugging Face Hub thanks to OpenAI. huggingface.co/nitrosocke/Ghi…

clem 🤗 (@clementdelangue) 's Twitter Profile Photo

Before 2020, most of the AI field was open and collaborative. For me, that was the key factor that accelerated scientific progress and made the impossible possible—just look at the “T” in ChatGPT, which comes from the Transformer architecture openly shared by Google. Then came

Sayak Paul (@risingsayak) 's Twitter Profile Photo

You have had your GPT-4o Ghibli moments. Now switch your attention to the latest Diffusers release - 0.33.0 🔥 Bringing you a bunch of new image & video gen models, a wide suite of memory optimizations w/ caching, & `torch.compile()` support when hotswapping LoRAs. Thread ⬇️

Alvaro Somoza (@ozzygt) 's Twitter Profile Photo

Please stop comparing quantized HiDream with Flux or other models; quantization really hurts this model. Soon I'll post a guide on how to run it without quantization on 24GB and 16GB GPUs, at almost the same speed as with quantization, using Diffusers.
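One common way to fit a large model on a consumer GPU without quantizing it is model CPU offloading in Diffusers, which keeps weights in full precision and moves each component to the GPU only while it runs. This is a sketch of that pattern, not necessarily the technique the guide mentioned above uses; note that HiDream's Llama text encoder is loaded separately because it isn't bundled in the HiDream repo.

```python
import torch
from transformers import LlamaForCausalLM, PreTrainedTokenizerFast
from diffusers import HiDreamImagePipeline

# HiDream uses Llama-3.1-8B-Instruct as one of its text encoders.
llama_id = "meta-llama/Llama-3.1-8B-Instruct"
tokenizer_4 = PreTrainedTokenizerFast.from_pretrained(llama_id)
text_encoder_4 = LlamaForCausalLM.from_pretrained(
    llama_id, output_hidden_states=True, torch_dtype=torch.bfloat16
)

pipe = HiDreamImagePipeline.from_pretrained(
    "HiDream-ai/HiDream-I1-Full",
    tokenizer_4=tokenizer_4,
    text_encoder_4=text_encoder_4,
    torch_dtype=torch.bfloat16,
)
# Offload instead of quantize: components stay in bf16 and are swapped
# onto the GPU one at a time.
pipe.enable_model_cpu_offload()

image = pipe("a cat wearing a top hat").images[0]
image.save("hidream.png")
```

Offloading trades a little transfer latency for full-precision weights, which is why it can approach quantized speed while avoiding quantization's quality hit.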
Linoy Tsaban🎗️ (@linoy_tsaban) 's Twitter Profile Photo

HiDream Image LoRA fine-tuning just dropped and it's time to make it yours 🧨

HiDream's SOTA capabilities (and MIT license) bring a lot of potential to explore with fine-tunes 🔥

And of course I had to train a yarn art LoRA 🧶
- more upgrades and features soon!
- code, weights
apolinario 🌐 (@multimodalart) 's Twitter Profile Photo

Do you know what's as exciting as Veo 3? Running Wan 2.1 in just 4-8 steps 🏎️💨 I built a demo for Wan 2.1 + CausVid LoRA, which allows high quality video generation in just 4 steps 🤏
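Few-step generation with a distillation LoRA like CausVid follows the same load-then-infer pattern, but with a drastically reduced step count and guidance disabled. A hedged sketch, where the LoRA repo id is a placeholder (the demo in the tweet will point at the real weights):

```python
import torch
from diffusers import WanPipeline
from diffusers.utils import export_to_video

# Text-to-video Wan 2.1 plus a CausVid distillation LoRA; the LoRA
# repo id below is a placeholder, not the actual weight location.
pipe = WanPipeline.from_pretrained(
    "Wan-AI/Wan2.1-T2V-14B-Diffusers", torch_dtype=torch.bfloat16
)
pipe.load_lora_weights("some-user/wan21-causvid-lora")  # hypothetical
pipe.enable_model_cpu_offload()

# Few-step distillation: 4 steps instead of ~50, with guidance off.
video = pipe(
    "a racing car drifting around a corner",
    num_inference_steps=4,
    guidance_scale=1.0,
).frames[0]
export_to_video(video, "causvid.mp4", fps=16)
```

Distilled LoRAs bake classifier-free guidance into the weights, which is why `guidance_scale` is dropped to 1.0 alongside the step count.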

Sayak Paul (@risingsayak) 's Twitter Profile Photo

We present HeadHunter, a framework for principled analysis of perturbed attention guidance 🤖 Consequently, it enables deeply fine-grained control over the generation quality & visual attributes. Join in 🧵 for insights and "guidance". 1/12

Sayak Paul (@risingsayak) 's Twitter Profile Photo

Boy, we shipped, and we shipped hard 🧨 From new SoTA open models to improved support for torch.compile to features, inspiring more accessibility -- this Diffusers release is a blast! What's your favorite? Check out the notes for more!

Alvaro Somoza (@ozzygt) 's Twitter Profile Photo

With Flux Kontext, if you want a more precise shape and positioning you can actually do a bad drawing of what you want. Prompt: change the black drawing of a hat to a real hat.

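The sketch-to-edit trick above can be reproduced with the Flux Kontext pipeline in Diffusers. A minimal sketch, assuming you have already drawn the rough black hat onto the input image; the filenames are placeholders:

```python
import torch
from diffusers import FluxKontextPipeline
from diffusers.utils import load_image

pipe = FluxKontextPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Kontext-dev", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()

# The input is a photo with a crude black hat doodle drawn on it; the
# model replaces the doodle while respecting its shape and position.
image = load_image("photo_with_hat_doodle.png")  # placeholder filename
result = pipe(
    image=image,
    prompt="change the black drawing of a hat to a real hat",
).images[0]
result.save("hat.png")
```

The doodle acts as a spatial hint the text prompt alone can't provide, which is what makes the positioning precise.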
Alvaro Somoza (@ozzygt) 's Twitter Profile Photo

Created a new guide about image compositing with Diffusers and SDXL, for those of you who don't know how to do it and need an open-source and commercially usable solution. huggingface.co/blog/OzzyGT/di…

Alvaro Somoza (@ozzygt) 's Twitter Profile Photo

So it turns out we don't need to do anything special for the text encoder. This is with both the transformer and the text encoder quantized to 4-bit with bitsandbytes, using under 16GB of VRAM and running in ~1m40s on a 3090.
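The tweet doesn't name the model, so the sketch below uses Flux purely to illustrate the pattern: quantize the diffusion transformer with the Diffusers `BitsAndBytesConfig` and the large text encoder with the Transformers one, then assemble the pipeline from the quantized parts.

```python
import torch
from diffusers import FluxPipeline, FluxTransformer2DModel
from diffusers import BitsAndBytesConfig as DiffusersBnb
from transformers import T5EncoderModel
from transformers import BitsAndBytesConfig as TransformersBnb

repo = "black-forest-labs/FLUX.1-dev"  # illustrative model choice

# 4-bit quantize the diffusion transformer (diffusers-side config)...
transformer = FluxTransformer2DModel.from_pretrained(
    repo,
    subfolder="transformer",
    quantization_config=DiffusersBnb(load_in_4bit=True),
    torch_dtype=torch.bfloat16,
)
# ...and the big T5 text encoder (transformers-side config).
text_encoder_2 = T5EncoderModel.from_pretrained(
    repo,
    subfolder="text_encoder_2",
    quantization_config=TransformersBnb(load_in_4bit=True),
    torch_dtype=torch.bfloat16,
)

pipe = FluxPipeline.from_pretrained(
    repo,
    transformer=transformer,
    text_encoder_2=text_encoder_2,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()
```

Both libraries expose a class named `BitsAndBytesConfig`, hence the aliases; each component must be quantized with the config from its own library.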
Alvaro Somoza (@ozzygt) 's Twitter Profile Photo

If you want the code for running Qwen-Image on 24GB and 16GB GPUs using Diffusers, you can use the code in this PR while waiting for a merge: huggingface.co/Qwen/Qwen-Imag…

Alvaro Somoza (@ozzygt) 's Twitter Profile Photo

Qwen-Image-Lightning (8 steps) runs in 22s and uses less than 16GB on a 3090. You can find the models and the code to test it here: huggingface.co/OzzyGT/qwen-im…
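A rough sketch of the 8-step Lightning setup, assuming it is applied as a LoRA on top of the base Qwen-Image pipeline; the LoRA repo id is an assumption, so use the files linked in the tweet instead:

```python
import torch
from diffusers import QwenImagePipeline

pipe = QwenImagePipeline.from_pretrained(
    "Qwen/Qwen-Image", torch_dtype=torch.bfloat16
)
# Lightning distillation LoRA; this repo id is a placeholder.
pipe.load_lora_weights("some-user/Qwen-Image-Lightning")
pipe.enable_model_cpu_offload()  # helps stay under 16GB of VRAM

image = pipe(
    "a capybara reading a newspaper",
    num_inference_steps=8,  # Lightning 8-step schedule
    true_cfg_scale=1.0,     # distilled models typically skip CFG
).images[0]
image.save("qwen_lightning.png")
```

As with CausVid above, the speedup comes from a distilled schedule: eight denoising steps with guidance effectively disabled.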