Wildminder (@wildmindai)'s Twitter Profile
Wildminder

@wildmindai

AI Researcher & Developer | ComfyUI spaghetti whisperer

ID: 1865484067883409408

Joined: 07-12-2024 19:51:06

967 Tweets

177 Followers

49 Following

What? Where? Another new model? Geez, is it the New Year holiday already? Why is Santa out here delivering god-tier AI gifts like a drone on steroids?

Harmony: Tencent Hunyuan’s SOTA model for synchronized A/V generation, like InfiniteTalk for Wan;
- voice cloning + multi-language speech;
- high-fidelity vids from audio inputs.
sjtuplayer.github.io/projects/Harmo…

OMG just deep-dived the Z-Image docs and wow... there are some useful tips about prompting.

1. Talk like a robot: plain, cold, objective facts only
2. Want perfect text? Quote it like
3. Prompt order that slaps: Subject > clothes > place > lighting > camera > style
4. Use prompt
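
Tip 3 can be sketched as a tiny helper that always emits prompt parts in the suggested order. This is only an illustration: the function name, the field names, and the comma separator are assumptions, not anything taken from the Z-Image docs.

```python
# Hypothetical helper: the field names and the ", " separator are
# assumptions for illustration, not part of the Z-Image documentation.
PROMPT_ORDER = ("subject", "clothes", "place", "lighting", "camera", "style")


def build_prompt(**parts: str) -> str:
    """Join the given parts in the fixed order, skipping missing or empty ones."""
    unknown = set(parts) - set(PROMPT_ORDER)
    if unknown:
        raise ValueError(f"unknown prompt parts: {sorted(unknown)}")
    return ", ".join(parts[key].strip() for key in PROMPT_ORDER if parts.get(key))


# Keyword order at the call site does not matter; the output order is fixed.
prompt = build_prompt(
    style="35mm film photo",
    subject="a woman holding a paper cup",
    lighting="overcast daylight",
    place="a city sidewalk",
)
print(prompt)
```

Passing a part that is not in the list raises an error, which keeps typos like `lightning=` from silently dropping out of the prompt.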

Comfyui-PainterVRAM: A tiny utility node that reserves GPU VRAM to prevent OOM crashes; simple plug-and-play with no extra outputs.
github.com/princepainter/…

ComfyUI FlowMatch Euler Discrete Scheduler: provides an experimental CustomSampler node with defaults explicitly optimized for Z-Image-Turbo workflows.
github.com/erosDiffusion/…

Another RAM manager for ComfyUI; dynamically regulates cache purging based on system memory pressure; features a "RAM_PRESSURE" mode to auto-clean
github.com/Windecay/Comfy…

Z-Image Technically Color LoRA - Vibrant essence of classic film; 
- rich saturation & dramatic lighting; 
- enhances depth with lush greens, brilliant blues & dreamlike textures; 
- creates a distinct silver screen aesthetic.
huggingface.co/renderartist/T…

The party starts now! Z-Image Anime Style LoRA: turns generic screencaps into high-quality 2D illustrations; clean lineart & cel-shading.
Feels like the first day after Flux.1 was released.
huggingface.co/reverentelusar…

Vidi2: A 12B multimodal model for video understanding & creation; up to 30 mins; enables plot-aware auto-editing.
- Automated "Long-to-Short" Repurposing
- Smart Cropping
New level for TikTokers.
bytedance.github.io/vidi-website/

Phrasing harmful prompts as poetry acts as a universal jailbreak. I need to take a poetry class now.
Smaller models are immune to this attack: they are too dumb.
arxiv.org/abs/2511.15304…

Z-Image Lenovo UltraReal LoRA. I bet Neanderthals look totally real.
If history class had these pics instead of those blurry cave doodles, I would’ve paid attention.
civitai.com/models/1662740

ComfyUI Z-Image Utilities. The lazy way to perfect ZiT prompts; 
- OpenRouter+local+HF; 
- prompt enhancement; 
- bilingual & vision model support
github.com/Koko-boya/Comf…

Interesting project. CoDA: Whole-body manipulation of articulated objects from text; uses Basis Point Sets for precise finger placement; handles complex tasks like simultaneous walking & manipulation.
phj128.github.io/page/CoDA/inde…

Wanna build an Android bot farm? GELab-Zero-4B: StepFun’s plug-and-play GUI Agent infrastructure; runs a 4B model locally; automates ADB connections & task orchestration without cloud dependencies
opengelab.github.io/index.html

Video-R4: Reinforcing text-rich video reasoning via visual rumination; mimics human processing by iteratively zooming & re-encoding frames to catch transient text; a 7B LMM 
yunlong10.github.io/Video-R4/

We’ve got Z-Image, Flux.2… but there’s still a whole month until New Year’s. Is it possible we’ll get even more exciting models in 2026… maybe 4K videos on a consumer GPU in 5 min?

Z-Image Anime VAE. A finetuned decoder for illustrations; slightly reduces high-frequency artifacts & oversharpening; unifies colors and reduces noise. Generally not so useful.
huggingface.co/Anzhc/Z-Image_…

ComfyUI-WanVideoWrapper now supports SteadyDancer: a human image animation framework, like WanAnimate; produces high-fidelity, coherent motion.
huggingface.co/Kijai/WanVideo…

InfiniteTalk, MultiTalk... Now AnyTalker: multi-person talking video gen based on Wan2.1-Fun, 1.3B; uses a multi-stream structure to drive arbitrary identities from audio.
hkust-c4g.github.io/AnyTalker-home…