Guy Yariv (@guy_yariv) 's Twitter Profile
Guy Yariv

@guy_yariv

Generative AI Researcher | Research Scientist Intern @ Meta | PhD Candidate @ HUJI

ID: 1596815042611232771

Link: https://guyyariv.github.io/ · Joined: 27-11-2022 10:36:03

99 Tweets

146 Followers

120 Following

William Lamkin (@williamlamkin) 's Twitter Profile Photo


Through-The-Mask: Mask-based Motion Trajectories for Image-to-Video Generation

website: guyyariv.github.io/TTM/
abs: arxiv.org/abs/2501.03059
Or Tal (@or__tal) 's Twitter Profile Photo

Release Announcement!📢💣 🥁🎷JASCO 🎶🪇 training & inference code + model weights are out! Paper📜: arxiv.org/abs/2406.10970 Samples🔊: pages.cs.huji.ac.il/adiyoss-lab/JA… Code🐍: github.com/facebookresear… Models🤗: huggingface.co/facebook/jasco… Alon Ziv Itai Gat Felix Kreuk Yossi Adi

Hila Chefer (@hila_chefer) 's Twitter Profile Photo

VideoJAM is our new framework for improved motion generation from AI at Meta. We show that video generators struggle with motion because the training objective favors appearance over dynamics. VideoJAM directly addresses this **without any extra data or scaling** 👇🧵

Matan Levy (@matanlvy) 's Twitter Profile Photo


Who recognizes where it is?

🔥 Excited to share that our paper, led by Issar Tzachor, "EffoVPR: Effective Foundation Model Utilization for Visual Place Recognition," has been accepted at #ICLR2025!

The VPR task aims to answer this question ⬇️
Gallil Maimon (@gallilmaimon) 's Twitter Profile Photo


🗣️🧠 Speech Language Models require lots of compute to train, right? 
In our new paper, we test whether it is possible to train an SLM on a single A5000 GPU in 24 hours.
The results may surprise you (they even surprised us)!
Tips, open source resources, full paper 👇🏻
Avishai Elmakies (@avishaielm37946) 's Twitter Profile Photo

🚨 New Paper Alert 🚨 Can you train a **good** speech language model using a single 24GB GPU in only one day? Yes! Yes you can! We show exactly how this can be done and release everything to the public, including code and datasets. More info can be found in the tweet below

Guy Yariv (@guy_yariv) 's Twitter Profile Photo

I'm thrilled to announce that Through-The-Mask (TTM) has been accepted to #CVPR2025! TTM is an I2V generation framework that leverages mask-based motion trajectories to enhance object-specific motion and maintain consistency, especially in multi-object scenarios More details👇

Guy Yariv (@guy_yariv) 's Twitter Profile Photo

Introducing RewardSDS, a text-to-3D score distillation method that enhances SDS by using reward-weighted sampling to prioritize noise samples based on alignment scores, achieving fine-grained user alignment. Led by the great Itay Chachy
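The reward-weighted sampling described above can be sketched roughly as follows. This is my illustration of the general idea, not the paper's code: `reward_weighted_update`, the shapes, and the toy numbers are all assumptions.

```python
# Hedged sketch of reward-weighted sampling: draw several noise candidates,
# score each with an alignment/reward model, and softmax-weight their
# per-sample gradients so well-aligned samples dominate the distillation
# update. All names and shapes here are illustrative.
import numpy as np

def reward_weighted_update(grads, rewards, temperature=1.0):
    """grads: (N, D) per-noise-sample gradients; rewards: (N,) alignment
    scores. Returns a (D,) softmax-weighted blend of the gradients."""
    w = np.exp(np.asarray(rewards, dtype=float) / temperature)
    w /= w.sum()                                  # normalize to a distribution
    return (w[:, None] * np.asarray(grads)).sum(axis=0)

grads = np.array([[1.0, 0.0], [0.0, 1.0]])        # two candidate gradients
rewards = [5.0, 0.0]                              # first sample aligns far better
update = reward_weighted_update(grads, rewards)   # dominated by sample 0
```

The softmax weighting is one natural reading of "prioritize noise samples based on alignment scores"; the actual weighting scheme may differ.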

Eliahu Horwitz | @ ICLR2025 (@eliahuhorwitz) 's Twitter Profile Photo


🚨 New paper alert! 🚨

Millions of neural networks now populate public repositories like Hugging Face 🤗, but most lack documentation. So, we decided to build an Atlas 🗺️

Project: horwitz.ai/model-atlas
Demo: huggingface.co/spaces/Eliahu/…

🧵👇🏻 Here's what we found:
Ori Yoran (@oriyoran) 's Twitter Profile Photo

New #ICLR2024 paper! The KoLMogorov Test: can CodeLMs compress data by code generation? The optimal compression for a sequence is the shortest program that generates it. Empirically, LMs struggle even on simple sequences, but can be trained to outperform current methods! 🧵1/7
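The compression criterion in the tweet above can be made concrete with a small sketch. This is a toy illustration of the idea (a sequence is "compressed" by a short program that regenerates it exactly); `compression_ratio` and `generate` are my names, not the paper's code.

```python
# Hedged sketch: a candidate compression of a sequence is a program that
# reproduces it exactly; its quality is program length vs. data length.

def compression_ratio(program: str, sequence: list) -> float:
    """len(program) / len(sequence rendered as text) if the program
    reproduces the sequence exactly, else infinity (failed compression)."""
    namespace = {}
    exec(program, namespace)              # program must define generate()
    if namespace["generate"]() != sequence:
        return float("inf")
    return len(program) / len(str(sequence))

# A simple arithmetic sequence and a program a code LM might emit for it:
seq = [2 * i for i in range(1, 51)]       # 2, 4, ..., 100
prog = "def generate():\n    return [2*i for i in range(1, 51)]"
ratio = compression_ratio(prog, seq)      # well below 1.0: real compression
```

A program that fails to reproduce the sequence scores infinity, which matches the tweet's point that LMs often struggle even on simple sequences.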

Gallil Maimon (@gallilmaimon) 's Twitter Profile Photo


Many modern SpeechLMs are trained with Speech-Text interleaving. How does this impact scaling trends?

In our new paper, we train several dozen SLMs, and show - quite a lot! So there is room for optimism 😊

Key insights, code, models, full paper 👇🏻
Michael Hassid (@michaelhassid) 's Twitter Profile Photo


The longer reasoning LLM thinks - the more likely to be correct, right?

Apparently not.

Presenting our paper: “Don’t Overthink it. Preferring Shorter Thinking Chains for Improved LLM Reasoning”.

Link: arxiv.org/abs/2505.17813

1/n
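The selection rule suggested by the title can be sketched in a few lines. This is a toy reading of "preferring shorter thinking chains" (sample several traces, answer with the shortest), with stand-in strings instead of real LLM calls; the paper's actual procedure may differ.

```python
# Hedged sketch: sample several reasoning traces for the same question and
# take the answer attached to the shortest trace, rather than the longest
# or a majority vote. `chains` stands in for repeated LLM samples.

def shortest_chain_answer(chains):
    """chains: list of (thinking_text, answer) pairs from repeated sampling.
    Returns the answer of the shortest thinking trace."""
    thinking, answer = min(chains, key=lambda c: len(c[0]))
    return answer

# Toy traces: the short, direct chain happens to be the correct one,
# matching the observation that longer thinking is not more accurate.
chains = [
    ("step1 step2 step3 step4 step5 step6 step7", "43"),
    ("step1 step2", "42"),
    ("step1 step2 step3 step4 step5", "43"),
]
answer = shortest_chain_answer(chains)    # -> "42"
```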
Iddo Yosha (@iddoyosha) 's Twitter Profile Photo

🚨 Happy to share our #Interspeech2025 paper! "WhiStress: Enriching Transcriptions with Sentence Stress Detection" Sentence stress is a word-level prosodic cue that marks contrast or intent. WhiStress detects it alongside transcription—no alignment needed. Paper, code, demo 👇

Hila Chefer (@hila_chefer) 's Twitter Profile Photo

Beyond excited to share FlowMo! We found that the latent representations by video models implicitly encode motion information, and can guide the model toward coherent motion at inference time Very proud of ariel shaulov Itay Hazan for this work! Plus, it’s open source! 🥳

Or Tal (@or__tal) 's Twitter Profile Photo


Which modeling to choose for text-to-music generation?
We run a head-to-head comparison to figure it out.
Same data, same architecture - AR vs FM.
👇 If you care about fidelity, speed, control, or editing see this thread.
🔗huggingface.co/spaces/ortal16…
📄arxiv.org/abs/2506.08570
1/6
Gallil Maimon (@gallilmaimon) 's Twitter Profile Photo


🎵💬 If you are interested in Audio Tokenisers, you should check out our new work!
We empirically analysed existing tokenisers from every angle - reconstruction, downstream tasks, LMs and more.

Grab yourself a ☕/🍺 and sit down for a read!
Itai Gat (@itai_gat) 's Twitter Profile Photo

Excited to share our recent work on corrector sampling in language models! A new sampling method that mitigates error accumulation by iteratively revisiting tokens in a window of previously generated text. With: Neta Shaul Uriel Singer Yaron Lipman Link: arxiv.org/abs/2506.06215

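The revisiting loop described in the tweet can be sketched as a toy. This is my illustration of the general mechanism (after each new token, resample tokens in a window of recently generated text), not the paper's algorithm; `propose` is a hypothetical stand-in for a language-model sampling step.

```python
# Hedged toy sketch of corrector-style sampling: emit a token, then revisit
# a window of recent tokens and let the model resample each one in context,
# mitigating error accumulation from a single left-to-right pass.
import random

def generate_with_correction(propose, length, window=4, seed=0):
    rng = random.Random(seed)
    tokens = []
    for _ in range(length):
        tokens.append(propose(tokens, rng))          # normal next-token step
        start = max(0, len(tokens) - window)
        for i in range(start, len(tokens)):          # revisit recent window
            tokens[i] = propose(tokens[:i], rng)     # resample token i in context

    return tokens

# Toy "model": next token is previous+1, with occasional noise; the
# corrector pass tends to overwrite noisy tokens with consistent ones.
def propose(context, rng):
    base = (context[-1] + 1) if context else 0
    return base if rng.random() > 0.1 else base + 5

out = generate_with_correction(propose, 8)
```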
Neta Shaul (@shaulneta) 's Twitter Profile Photo

[1/n] New paper alert! 🚀 Excited to introduce 𝐓𝐫𝐚𝐧𝐬𝐢𝐭𝐢𝐨𝐧 𝐌𝐚𝐭𝐜𝐡𝐢𝐧𝐠 (𝐓𝐌)! We're replacing short-timestep kernels from Flow Matching/Diffusion with... a generative model🤯, achieving SOTA text-2-image generation! Uriel Singer Itai Gat Yaron Lipman