Anwesha Chowdhury (@anweshac1211)'s Twitter Profile
Anwesha Chowdhury

@anweshac1211

AI Research Engineer | Open Source Contributor |
CTO of an AI Fashion Startup based in UK

ID: 1658986663803228161

Link: https://achowdhury1211.github.io
Joined: 18-05-2023 00:03:55

163 Tweets

38 Followers

378 Following

Anwesha Chowdhury (@anweshac1211)'s Twitter Profile Photo

interesting read on parallelism
some takeaways:
- expert parallelism is the same as data parallelism for non-moe layers
- for dp: the input batches of data are divided among the GPUs
- for model parallelism: the input batches are replicated on each core.
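The batch-handling contrast in those takeaways can be sketched as follows. This is a minimal NumPy illustration, not an actual framework API: "devices" are simulated as list entries, and the function names are assumptions made up for the example.

```python
import numpy as np

def data_parallel_shards(batch: np.ndarray, n_devices: int):
    """Data parallelism: the input batch is divided among the devices."""
    return np.array_split(batch, n_devices)  # each device gets a slice

def model_parallel_shards(batch: np.ndarray, n_devices: int):
    """Model parallelism: the input batch is replicated on each device."""
    return [batch.copy() for _ in range(n_devices)]  # each device gets it all

batch = np.arange(8).reshape(8, 1)  # a batch of 8 examples

dp = data_parallel_shards(batch, 4)
mp = model_parallel_shards(batch, 4)

print([s.shape[0] for s in dp])  # each DP device holds 2 examples
print([s.shape[0] for s in mp])  # each MP device holds all 8 examples
```

Concatenating the data-parallel shards recovers the original batch, which is why gradients from the shards must be averaged before the weight update.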
A.I.Warper (@aiwarper)'s Twitter Profile Photo

Fun workflow I was playing with last night
1) Kontext to remove Thor from the shot
2) Photopea to place Shrek
3) Kontext + Relight lora to blend him into the shot
4) Wan2.2 i2V to animate
Very addicting... Prompts are written up in the corner. Wan 2.2 prompt below 👇

Ostris (@ostrisai)'s Twitter Profile Photo

Trained a sidechain LoRA to compensate for the quantization precision loss when quantizing Qwen Image to 3 bit. It works well. This can be active during training and should allow us to fine tune Qwen Image on <24GB of VRAM. This can be done to all models.
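The idea behind that sidechain LoRA can be sketched numerically: quantize a weight matrix to 3 bits, then fit a low-rank correction to the quantization error. This is a hedged toy illustration, not Ostris's actual method; the quantizer is a naive symmetric one, and a truncated SVD stands in for the trained LoRA factors `A` and `B`.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64)).astype(np.float32)  # a stand-in weight matrix

# Naive symmetric 3-bit quantization (8 levels); real schemes are per-group/per-channel.
levels = 2 ** 3
scale = np.abs(W).max() / (levels / 2 - 1)
W_q = np.round(W / scale).clip(-(levels // 2), levels // 2 - 1) * scale

# Low-rank approximation of the residual error via truncated SVD,
# standing in for a trained sidechain LoRA: W ≈ W_q + A @ B with rank r.
err = W - W_q
U, S, Vt = np.linalg.svd(err, full_matrices=False)
r = 8
A = U[:, :r] * S[:r]   # (64, r)
B = Vt[:r]             # (r, 64)

before = np.linalg.norm(err)
after = np.linalg.norm(W - (W_q + A @ B))
print(after < before)  # the low-rank correction reduces the reconstruction error
```

Because the correction lives in small low-rank factors, the 3-bit base weights stay frozen and cheap to hold in VRAM, which is what makes fine-tuning under 24GB plausible.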