Alvaro Somoza (@ozzygt) 's Twitter Profile
Alvaro Somoza

@ozzygt

ML Engineer @HuggingFace

ID: 26274548

Joined: 24-03-2009 17:01:53

86 Tweets

270 Followers

176 Following

Sayak Paul (@risingsayak) 's Twitter Profile Photo

Yo! `Wan2.1-I2V-14B-480P` LoRAs are now supported in Diffusers. Go, fire it! In the frame, we have the popular Squish LoRA.
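Loading a LoRA into the Wan 2.1 image-to-video pipeline can be sketched roughly like this. This is a minimal, hedged example: the pipeline id comes from the tweet, but the LoRA repo id, input image, and prompt are placeholders rather than the actual Squish LoRA coordinates.

```python
import torch
from diffusers import WanImageToVideoPipeline
from diffusers.utils import load_image, export_to_video

# Pipeline id from the tweet; the LoRA repo below is a hypothetical
# placeholder -- substitute the Squish LoRA repo you actually want.
pipe = WanImageToVideoPipeline.from_pretrained(
    "Wan-AI/Wan2.1-I2V-14B-480P-Diffusers", torch_dtype=torch.bfloat16
)
pipe.load_lora_weights("some-user/wan-squish-lora")  # hypothetical repo id
pipe.enable_model_cpu_offload()  # keeps VRAM usage manageable

image = load_image("input.png")  # your starting frame
video = pipe(image=image, prompt="squish it", num_frames=81).frames[0]
export_to_video(video, "out.mp4", fps=16)
```

`load_lora_weights` is the same entry point used for image-model LoRAs, so existing LoRA workflows carry over to video pipelines.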

Alvaro Somoza (@ozzygt) 's Twitter Profile Photo

It's amazing that this old SD 1.5 fine-tuned model with images from Studio Ghibli is trending on the Hugging Face Hub thanks to OpenAI. huggingface.co/nitrosocke/Ghi…

clem 🤗 (@clementdelangue) 's Twitter Profile Photo

Before 2020, most of the AI field was open and collaborative. For me, that was the key factor that accelerated scientific progress and made the impossible possible—just look at the “T” in ChatGPT, which comes from the Transformer architecture openly shared by Google. Then came

Sayak Paul (@risingsayak) 's Twitter Profile Photo

You have had your GPT-4o Ghibli moments. Now switch your attention to the latest Diffusers release - 0.33.0 🔥 Bringing you a bunch of new image & video gen models, a wide suite of memory optimizations w/ caching, & `torch.compile()` support when hotswapping LoRAs. Thread ⬇️

Alvaro Somoza (@ozzygt) 's Twitter Profile Photo

Please stop comparing quantized HiDream with Flux or other models; quantization really hurts this model. Soon I'll post a guide on how to run it without quantization on 24GB and 16GB GPUs, at almost the same speed as with quantization, using Diffusers.
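One common way to fit a large model on a consumer GPU without quantizing it is model CPU offloading in Diffusers, which keeps weights in full precision and moves each component to the GPU only while it runs. This is a sketch of that pattern, not necessarily the technique the guide mentioned above uses; note that HiDream's Llama text encoder is loaded separately because it isn't bundled in the HiDream repo.

```python
import torch
from transformers import LlamaForCausalLM, PreTrainedTokenizerFast
from diffusers import HiDreamImagePipeline

# HiDream uses Llama-3.1-8B-Instruct as one of its text encoders.
llama_id = "meta-llama/Llama-3.1-8B-Instruct"
tokenizer_4 = PreTrainedTokenizerFast.from_pretrained(llama_id)
text_encoder_4 = LlamaForCausalLM.from_pretrained(
    llama_id, output_hidden_states=True, torch_dtype=torch.bfloat16
)

pipe = HiDreamImagePipeline.from_pretrained(
    "HiDream-ai/HiDream-I1-Full",
    tokenizer_4=tokenizer_4,
    text_encoder_4=text_encoder_4,
    torch_dtype=torch.bfloat16,
)
# Offload instead of quantize: components stay in bf16 and are swapped
# onto the GPU one at a time.
pipe.enable_model_cpu_offload()

image = pipe("a cat wearing a top hat").images[0]
image.save("hidream.png")
```

Offloading trades a little transfer latency for full-precision weights, which is why it can approach quantized speed while avoiding quantization's quality hit.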
Linoy Tsaban🎗️ (@linoy_tsaban) 's Twitter Profile Photo

HiDream Image LoRA fine-tuning just dropped and it's time to make it yours 🧨

HiDream's SOTA capabilities (and MIT license) bring a lot of potential to explore with fine-tunes 🔥

And of course I had to train a yarn art LoRA 🧶
- more upgrades and features soon!
- code, weights
apolinario 🌐 (@multimodalart) 's Twitter Profile Photo

Do you know what's as exciting as Veo 3? Running Wan 2.1 in just 4-8 steps 🏎️💨 I built a demo for Wan 2.1 + CausVid LoRA, which allows high quality video generation in just 4 steps 🤏
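Few-step generation with a distillation LoRA like CausVid follows the same load-then-infer pattern, but with a drastically reduced step count and guidance disabled. A hedged sketch, where the LoRA repo id is a placeholder (the demo in the tweet will point at the real weights):

```python
import torch
from diffusers import WanPipeline
from diffusers.utils import export_to_video

# Text-to-video Wan 2.1 plus a CausVid distillation LoRA; the LoRA
# repo id below is a placeholder, not the actual weight location.
pipe = WanPipeline.from_pretrained(
    "Wan-AI/Wan2.1-T2V-14B-Diffusers", torch_dtype=torch.bfloat16
)
pipe.load_lora_weights("some-user/wan21-causvid-lora")  # hypothetical
pipe.enable_model_cpu_offload()

# Few-step distillation: 4 steps instead of ~50, with guidance off.
video = pipe(
    "a racing car drifting around a corner",
    num_inference_steps=4,
    guidance_scale=1.0,
).frames[0]
export_to_video(video, "causvid.mp4", fps=16)
```

Distilled LoRAs bake classifier-free guidance into the weights, which is why `guidance_scale` is dropped to 1.0 alongside the step count.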

Sayak Paul (@risingsayak) 's Twitter Profile Photo

We present HeadHunter, a framework for principled analysis of perturbed attention guidance 🤖 Consequently, it enables deeply fine-grained control over the generation quality & visual attributes. Join in 🧵 for insights and "guidance". 1/12

Sayak Paul (@risingsayak) 's Twitter Profile Photo

Boy, we shipped, and we shipped hard 🧨 From new SoTA open models to improved support for torch.compile to features, inspiring more accessibility -- this Diffusers release is a blast! What's your favorite? Check out the notes for more!

Alvaro Somoza (@ozzygt) 's Twitter Profile Photo

With Flux Kontext, if you want a more precise shape and positioning you can actually do a bad drawing of what you want. Prompt: change the black drawing of a hat to a real hat.

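The sketch-to-edit trick above can be reproduced with the Flux Kontext pipeline in Diffusers. A minimal sketch, assuming you have already drawn the rough black hat onto the input image; the filenames are placeholders:

```python
import torch
from diffusers import FluxKontextPipeline
from diffusers.utils import load_image

pipe = FluxKontextPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Kontext-dev", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()

# The input is a photo with a crude black hat doodle drawn on it; the
# model replaces the doodle while respecting its shape and position.
image = load_image("photo_with_hat_doodle.png")  # placeholder filename
result = pipe(
    image=image,
    prompt="change the black drawing of a hat to a real hat",
).images[0]
result.save("hat.png")
```

The doodle acts as a spatial hint the text prompt alone can't provide, which is what makes the positioning precise.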
Alvaro Somoza (@ozzygt) 's Twitter Profile Photo

Created a new guide about image compositing with Diffusers and SDXL, for those of you who don't know how to do it and need an open-source and commercially usable solution. huggingface.co/blog/OzzyGT/di…

Alvaro Somoza (@ozzygt) 's Twitter Profile Photo

So it turns out we don't need to do anything special for the text encoder. This is with both the transformer and the text encoder quantized to 4-bit with bitsandbytes, using under 16GB of VRAM and running in ~1m40s on a 3090.
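The tweet doesn't name the model, so the sketch below uses Flux purely to illustrate the pattern: quantize the diffusion transformer with the Diffusers `BitsAndBytesConfig` and the large text encoder with the Transformers one, then assemble the pipeline from the quantized parts.

```python
import torch
from diffusers import FluxPipeline, FluxTransformer2DModel
from diffusers import BitsAndBytesConfig as DiffusersBnb
from transformers import T5EncoderModel
from transformers import BitsAndBytesConfig as TransformersBnb

repo = "black-forest-labs/FLUX.1-dev"  # illustrative model choice

# 4-bit quantize the diffusion transformer (diffusers-side config)...
transformer = FluxTransformer2DModel.from_pretrained(
    repo,
    subfolder="transformer",
    quantization_config=DiffusersBnb(load_in_4bit=True),
    torch_dtype=torch.bfloat16,
)
# ...and the big T5 text encoder (transformers-side config).
text_encoder_2 = T5EncoderModel.from_pretrained(
    repo,
    subfolder="text_encoder_2",
    quantization_config=TransformersBnb(load_in_4bit=True),
    torch_dtype=torch.bfloat16,
)

pipe = FluxPipeline.from_pretrained(
    repo,
    transformer=transformer,
    text_encoder_2=text_encoder_2,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()
```

Both libraries expose a class named `BitsAndBytesConfig`, hence the aliases; each component must be quantized with the config from its own library.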
Alvaro Somoza (@ozzygt) 's Twitter Profile Photo

If you want the code for running Qwen-Image on 24GB and 16GB GPUs using Diffusers, you can use the code in this PR while waiting for a merge: huggingface.co/Qwen/Qwen-Imag…

Alvaro Somoza (@ozzygt) 's Twitter Profile Photo

Qwen-Image-Lightning (8 steps) runs in 22s and uses less than 16GB on a 3090. You can find the models and the code to test it here: huggingface.co/OzzyGT/qwen-im…
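A rough sketch of the 8-step Lightning setup, assuming it is applied as a LoRA on top of the base Qwen-Image pipeline; the LoRA repo id is an assumption, so use the files linked in the tweet instead:

```python
import torch
from diffusers import QwenImagePipeline

pipe = QwenImagePipeline.from_pretrained(
    "Qwen/Qwen-Image", torch_dtype=torch.bfloat16
)
# Lightning distillation LoRA; this repo id is a placeholder.
pipe.load_lora_weights("some-user/Qwen-Image-Lightning")
pipe.enable_model_cpu_offload()  # helps stay under 16GB of VRAM

image = pipe(
    "a capybara reading a newspaper",
    num_inference_steps=8,  # Lightning 8-step schedule
    true_cfg_scale=1.0,     # distilled models typically skip CFG
).images[0]
image.save("qwen_lightning.png")
```

As with CausVid above, the speedup comes from a distilled schedule: eight denoising steps with guidance effectively disabled.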