Rami Ben-Ari (@ramibenari1)'s Twitter Profile
Rami Ben-Ari

@ramibenari1

Principal Research Scientist

ID: 1539276767961153539

Website: http://www.benarirami.com/ · Joined: 21-06-2022 15:59:31

21 Tweets

4 Followers

21 Following

Rami Ben-Ari (@ramibenari1):

AAAI 2024 is over! Have a look at the two papers we presented there: (1) on Image Retrieval and (2) on Text-to-Image Generation.

Rami Ben-Ari (@ramibenari1):

1. "Data Roaming and Early Fusion for Composed Image Retrieval" by Matan Levy, Rami Ben-Ari, Nir Darshan, Dani Lischinski. arxiv.org/abs/2303.09429

Rami Ben-Ari (@ramibenari1):

2. "Generating images of rare concepts using pre-trained diffusion models" by Dvir Samuel, Rami Ben-Ari, Simon Raviv, Nir Darshan, Gal Chechik. arxiv.org/abs/2304.14530

Dvir Samuel (@dvir_samuel):

Our latest paper, “Regularized Newton-Raphson Inversion for Text-to-Image Models,” introduces RNRI, a fast and precise method for inverting images to their noise latents.

Dvir Samuel (@dvir_samuel):

🚀 High-quality inversion of text-to-image models in real time! Now you can do interactive image editing! 🎨
📄 Paper: arxiv.org/abs/2312.12540
🌐 Project Page & Demo: barakmam.github.io/rnri.github.io/

Dvir Samuel (@dvir_samuel):

🔍 RNRI highlights:
- Enables super-fast editing of real images.
- Improves the generation of rare concepts.
- Solves inversion as a root-finding problem of an implicit equation.
- Uses the Newton-Raphson numerical scheme for rapid convergence.
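
The root-finding framing above can be illustrated with plain Newton-Raphson iteration. This is a generic numerical sketch of the scheme the tweet names, not RNRI itself: `f`, `df`, and the toy equation are stand-ins for the implicit inversion equation and the diffusion model.

```python
# Generic Newton-Raphson root finding: the numerical scheme the tweet
# says RNRI applies to its implicit inversion equation. `f` and `df`
# below are toy stand-ins, not the actual diffusion-model residual.

def newton_raphson(f, df, x0, tol=1e-10, max_iter=50):
    """Solve f(x) = 0 via the update x <- x - f(x) / f'(x)."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:  # quadratic convergence: few iterations needed
            break
    return x

# Toy example: solve x**2 - 2 = 0, i.e. recover sqrt(2) from a rough guess.
root = newton_raphson(lambda x: x * x - 2.0, lambda x: 2.0 * x, x0=1.0)
```

The rapid (quadratic) convergence of this update is what makes a Newton-style scheme attractive for fast inversion compared with plain fixed-point iteration.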

Rami Ben-Ari (@ramibenari1):

Excited to share that our paper, "Active Learning via Classifier Impact and Greedy Selection for Interactive Image Retrieval", has been accepted to TMLR!
TMLR: openreview.net/pdf?id=b68QOen…
Project Page: github.com/barleah/Greedy…
Short Video Presentation: youtu.be/bHDARDpu8Fg

Rami Ben-Ari (@ramibenari1):

Happy to share that we have two papers accepted to #ICLR2025:
1. "Effective Foundation based Visual Place Recognition": arxiv.org/abs/2405.18065
2. "Guided Newton-Raphson Diffusion Inversion": arxiv.org/abs/2312.12540
🔗 Project page: barakmam.github.io/rnri.github.io/

Dvir Samuel (@dvir_samuel):

🚀 Excited to share OmnimatteZero: Training-Free Real-Time Omnimatte with Video Diffusion Models!
📄 Paper: arxiv.org/abs/2503.18033
🌐 Project: dvirsamuel.github.io/omnimattezero.…
🧵👇

Dvir Samuel (@dvir_samuel):

🎬 We propose a training-free method for Omnimatte that can remove objects along with their footprint (shadows, reflections) and seamlessly blend them into a different video, achieving SoTA in real time, using only a pre-trained video diffusion model and no optimization.

Dvir Samuel (@dvir_samuel):

The challenge: Omnimatte methods decompose videos into background and foreground layers, but current approaches are either computationally heavy due to per-video optimization or rely on curated datasets for training. Can we achieve high-quality, real-time Omnimatte without training?

Dvir Samuel (@dvir_samuel):

Why is this non-trivial?
🔹 Zero-shot image inpainting fails on videos due to temporal inconsistencies
🔹 Object inpainting must also remove shadows, reflections, and other visual effects
🔹 Existing video inpainting methods struggle with high-fidelity background reconstruction

Dvir Samuel (@dvir_samuel):

Our approach: We adapt zero-shot image inpainting for video by directly manipulating the spatio-temporal latent space of pre-trained video diffusion models.

Dvir Samuel (@dvir_samuel):

Our method leverages self-attention maps of video diffusion models to capture motion cues, enabling object removal with their effects. This works because elements moving together are inherently linked, as described by the common fate principle in Gestalt psychology. 🚀
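
The common-fate idea above can be sketched numerically. The following is a hypothetical illustration, not the paper's code: `effects_mask`, the token layout, and the attention values are all invented to show how averaging a self-attention map over an object's tokens can pull in its shadow, whose tokens attend strongly to the object.

```python
import numpy as np

# Hypothetical sketch of turning a self-attention map into an object+effects
# mask. The idea: tokens that move with the object (e.g. its shadow) attend
# strongly to the object's tokens, so averaging attention toward the object
# yields a mask covering both. All shapes and values are toy stand-ins for
# a real video diffusion model's spatio-temporal attention.

def effects_mask(attn, object_tokens, threshold=0.5):
    """attn: (tokens, tokens) self-attention matrix. Returns a boolean mask
    of tokens whose attention to the object's tokens is high."""
    # Average attention each token pays to the object's tokens.
    score = attn[:, object_tokens].mean(axis=1)
    # Normalize scores to [0, 1] and threshold into a hard mask.
    score = (score - score.min()) / (score.max() - score.min() + 1e-8)
    return score >= threshold

# Toy example: tokens 0-1 are the object; token 2 (its shadow) attends to
# them strongly, token 3 (background) does not.
attn = np.array([
    [0.60, 0.30, 0.05, 0.05],
    [0.30, 0.60, 0.05, 0.05],
    [0.45, 0.45, 0.05, 0.05],   # shadow: high attention to the object
    [0.05, 0.05, 0.10, 0.80],   # background: low attention to the object
])
mask = effects_mask(attn, object_tokens=[0, 1])
```

The shadow token lands inside the mask despite not being part of the object segmentation, which is the behavior the common-fate principle predicts.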

Dvir Samuel (@dvir_samuel):

Key Achievements:
✅ Removes and extracts objects with their effects (shadows, reflections)
✅ Top background reconstruction accuracy across benchmarks
✅ Fastest Omnimatte method: 24 FPS on an A100 GPU
✅ No training or optimization required