Sagie Benaim (@benaimsagie)'s Twitter Profile
Sagie Benaim

@benaimsagie

Assistant Professor @CseHuji | Computer Vision and Machine Learning.

ID: 852149937823506432

Link: https://sagiebenaim.github.io/ | Joined: 12-04-2017 13:22:32

94 Tweets

470 Followers

938 Following

AK (@_akhaliq)'s Twitter Profile Photo

Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation. Paper page: huggingface.co/papers/2309.16… We consider the task of generating diverse and realistic videos guided by natural audio samples from a wide variety of semantic classes. For this task, the videos …

Guy Yariv (@guy_yariv)'s Twitter Profile Photo

1/n Our paper, "Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation," has been accepted at #AAAI2024! This paper is part of the "Turning Sound into Sight" series, which addresses audio-to-visual generation through text-conditioned models.

MrNeRF (@janusch_patas)'s Twitter Profile Photo

DGD: Dynamic 3D Gaussians Distillation arxiv.org/abs/2405.19321 Project: isaaclabe.github.io/DGD-Website/ Method ⬇️

Dreaming Tulpa 🥓👑 (@dreamingtulpa)'s Twitter Profile Photo

PuTT is able to optimize the highly compact tensor train representation, making it possible to use tensor trains for image fitting, 3D fitting, and novel view synthesis. "Girl with a Pearl Earring" at 16k resolution achieves a compression factor of 200. Links ⬇️

Sebastian Loeschcke (@sloeschcke)'s Twitter Profile Photo

Excited to share PuTT, accepted to #ICML2024 🎉. PuTT efficiently learns high-quality visual representations with tensor trains, achieving SOTA performance in various 2D/3D compression tasks. Great collab with D. Wang, C. Leth-Espensen, Serge Belongie, Michael Kastoryano, Sagie Benaim

Yizhak Ben-Shabat (Itzik) 💔 (@sitzikbs)'s Twitter Profile Photo

I was deeply offended by a slide in a recent talk at #CVPR2024 that falsely accused my country of genocide. Such baseless political statements have no place in our scientific community. Let's keep our focus on advancing science and leave politics at the door. #CVPR2025

Guy Yariv (@guy_yariv)'s Twitter Profile Photo

1/ Commonsense reasoning needs multimodal knowledge, yet current LLMs focus mostly on text, limiting their integration of crucial visual information. We introduce vLMIG, a method that enhances LLMs' visual commonsense by integrating images into the decision-making process.

Sebastian Loeschcke (@sloeschcke)'s Twitter Profile Photo

Excited for the ICML Conference in Vienna! DM me if you want to chat. I'll be presenting:
- LoQT: Low Rank Adapters for Quantized Pre-Training (𝐎𝐑𝐀𝐋) - 27 Jul, 10:10-10:30, Hall A1
- Coarse-To-Fine Tensor Trains for Compact Visual Representations - 25 Jul, 11:30-13:00, Hall C 4-9 #203

Isaac Labe (@labe_isaac)'s Twitter Profile Photo

🎊 Excited to share our latest work: “DGD: Dynamic 3D Gaussians Distillation”! 🚀 To appear at #ECCV2024. DGD distills 2D semantic features into dynamic 3D Gaussians, enabling the reconstruction and semantic segmentation of dynamic objects in 3D using only a user click. 🔗

Sebastian Loeschcke (@sloeschcke)'s Twitter Profile Photo

Visit poster #203 today at 11:30 in Hall C at the ICML Conference: "Coarse-To-Fine Tensor Trains for Compact Visual Representations." We present a novel method for training Quantized Tensor Trains, achieving highly compressed and high-quality results for view synthesis and 2D/3D compression.

Sagie Benaim (@benaimsagie)'s Twitter Profile Photo

If you are at #ECCV2024 and interested in distilling semantics into dynamic 3D Gaussian splatting, please visit our poster #292 in the Thursday morning session.

Guy Yariv (@guy_yariv)'s Twitter Profile Photo

[1/8] Recent work has shown impressive Image-to-Video (I2V) generation results. However, accurately articulating multiple interacting objects and complex motions remains challenging. In our new work, we take a step toward addressing this challenge.

Guy Yariv (@guy_yariv)'s Twitter Profile Photo

I'm thrilled to announce that Through-The-Mask (TTM) has been accepted to #CVPR2025! TTM is an I2V generation framework that leverages mask-based motion trajectories to enhance object-specific motion and maintain consistency, especially in multi-object scenarios. More details 👇

Itay Chachy (@itaychachy)'s Twitter Profile Photo

[1/10]🚨 Introducing RewardSDS! 🚨 Standard SDS-based text-to-3D methods struggle with fine-grained alignment to user intent, often leading to artifacts or misaligned generations. Our solution 👇

Gal Fiebelman (@galfiebelman)'s Twitter Profile Photo

Excited to announce that "4-LEGS: 4D Language Embedded Gaussian Splatting" has been accepted to #Eurographics2025! 🎉 We connect language with a 4D Gaussian Splatting representation to enable spatiotemporal localization using just text prompts! tau-vailab.github.io/4-LEGS/ [1/7]

MrNeRF (@janusch_patas)'s Twitter Profile Photo

Let it Snow! Animating Static Gaussian Scenes With Dynamic Weather Effects
Contributions:
• A novel framework for incorporating physically-based global dynamic effects into static 3D Gaussian scenes.
• Ensuring realistic scene interaction and collision effects by producing …

Rana Hanocka (@ranahanocka)'s Twitter Profile Photo

We’ve been building something we’re 𝑟𝑒𝑎𝑙𝑙𝑦 excited about – LL3M: LLM-powered agents that turn text into editable 3D assets. LL3M models shapes as interpretable Blender code, making geometry, appearance, and style easy to modify. 🔗 threedle.github.io/ll3m 1/

Kwang Moo Yi (@kwangmoo_yi)'s Twitter Profile Photo

Dayani et al., "MV-RAG: Retrieval Augmented Multiview Diffusion". Retrieval augmentation for 3D reconstruction with multi-view diffusion models. Trains on both 3D and 2D assets, and uses 2D natural images at inference. Makes sense, I guess? Many things look alike!

Yosef Dayani (@yosefday)'s Twitter Profile Photo

[1/10] 🤔 What if you wanted to generate a 3D model of a “Bolognese dog” 🐕 or a “Labubu doll” 🧸? Try it with existing text-to-3D models → they collapse. Why? These concepts are rare or new, and the model has never seen them. 🚀 Our solution: MV-RAG See details below ⬇️
