Guy Yariv (@guy_yariv) 's Twitter Profile
Guy Yariv

@guy_yariv

Generative AI Researcher | Research Scientist Intern @ Meta | PhD Candidate @ HUJI

ID: 1596815042611232771

Link: https://guyyariv.github.io/ · Joined: 27-11-2022 10:36:03

99 Tweets

146 Followers

120 Following

William Lamkin (@williamlamkin) 's Twitter Profile Photo


Through-The-Mask: Mask-based Motion Trajectories for Image-to-Video Generation

website: guyyariv.github.io/TTM/
abs: arxiv.org/abs/2501.03059
Or Tal (@or__tal) 's Twitter Profile Photo

Release Announcement!📢💣 🥁🎷JASCO 🎶🪇 training & inference code + model weights are out! Paper📜: arxiv.org/abs/2406.10970 Samples🔊: pages.cs.huji.ac.il/adiyoss-lab/JA… Code🐍: github.com/facebookresear… Models🤗: huggingface.co/facebook/jasco… Alon Ziv Itai Gat Felix Kreuk Yossi Adi

Hila Chefer (@hila_chefer) 's Twitter Profile Photo

VideoJAM is our new framework for improved motion generation from AI at Meta. We show that video generators struggle with motion because the training objective favors appearance over dynamics. VideoJAM directly addresses this **without any extra data or scaling** 👇🧵

Matan Levy (@matanlvy) 's Twitter Profile Photo


Who recognizes where it is?

🔥 Excited to share that our paper, led by Issar Tzachor, "EffoVPR: Effective Foundation Model Utilization for Visual Place Recognition," has been accepted at #ICLR2025!

The VPR task aims to answer this question ⬇️
Gallil Maimon (@gallilmaimon) 's Twitter Profile Photo


🗣️🧠 Speech Language Models require lots of compute to train, right? 
In our new paper, we test whether it is possible to train an SLM on a single A5000 GPU in 24 hours.
The results may surprise you (they even surprised us)!
Tips, open source resources, full paper 👇🏻
Avishai Elmakies (@avishaielm37946) 's Twitter Profile Photo

🚨 New Paper Alert 🚨 Can you train a **good** speech language model using a single 24GB GPU in only one day? Yes! Yes you can! We show exactly how this can be done and release everything to the public, including code and datasets. More info can be found in the tweet below

Guy Yariv (@guy_yariv) 's Twitter Profile Photo

I'm thrilled to announce that Through-The-Mask (TTM) has been accepted to #CVPR2025! TTM is an I2V generation framework that leverages mask-based motion trajectories to enhance object-specific motion and maintain consistency, especially in multi-object scenarios More details👇

Guy Yariv (@guy_yariv) 's Twitter Profile Photo

Introducing RewardSDS, a text-to-3D score distillation method that enhances SDS by using reward-weighted sampling to prioritize noise samples based on alignment scores, achieving fine-grained user alignment. Led by the great Itay Chachy
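The reward-weighted sampling described above can be sketched roughly as follows. This is my illustration of the general idea, not the paper's code: `reward_weighted_update`, the shapes, and the toy numbers are all assumptions.

```python
# Hedged sketch of reward-weighted sampling: draw several noise candidates,
# score each with an alignment/reward model, and softmax-weight their
# per-sample gradients so well-aligned samples dominate the distillation
# update. All names and shapes here are illustrative.
import numpy as np

def reward_weighted_update(grads, rewards, temperature=1.0):
    """grads: (N, D) per-noise-sample gradients; rewards: (N,) alignment
    scores. Returns a (D,) softmax-weighted blend of the gradients."""
    w = np.exp(np.asarray(rewards, dtype=float) / temperature)
    w /= w.sum()                                  # normalize to a distribution
    return (w[:, None] * np.asarray(grads)).sum(axis=0)

grads = np.array([[1.0, 0.0], [0.0, 1.0]])        # two candidate gradients
rewards = [5.0, 0.0]                              # first sample aligns far better
update = reward_weighted_update(grads, rewards)   # dominated by sample 0
```

The softmax weighting is one natural reading of "prioritize noise samples based on alignment scores"; the actual weighting scheme may differ.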

Eliahu Horwitz | @ ICLR2025 (@eliahuhorwitz) 's Twitter Profile Photo


🚨 New paper alert! 🚨

Millions of neural networks now populate public repositories like Hugging Face 🤗, but most lack documentation. So, we decided to build an Atlas 🗺️

Project: horwitz.ai/model-atlas
Demo: huggingface.co/spaces/Eliahu/…

🧵👇🏻 Here's what we found:
Ori Yoran (@oriyoran) 's Twitter Profile Photo

New #ICLR2024 paper! The KoLMogorov Test: can CodeLMs compress data by code generation? The optimal compression for a sequence is the shortest program that generates it. Empirically, LMs struggle even on simple sequences, but can be trained to outperform current methods! 🧵1/7
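The compression criterion in the tweet above can be made concrete with a small sketch. This is a toy illustration of the idea (a sequence is "compressed" by a short program that regenerates it exactly); `compression_ratio` and `generate` are my names, not the paper's code.

```python
# Hedged sketch: a candidate compression of a sequence is a program that
# reproduces it exactly; its quality is program length vs. data length.

def compression_ratio(program: str, sequence: list) -> float:
    """len(program) / len(sequence rendered as text) if the program
    reproduces the sequence exactly, else infinity (failed compression)."""
    namespace = {}
    exec(program, namespace)              # program must define generate()
    if namespace["generate"]() != sequence:
        return float("inf")
    return len(program) / len(str(sequence))

# A simple arithmetic sequence and a program a code LM might emit for it:
seq = [2 * i for i in range(1, 51)]       # 2, 4, ..., 100
prog = "def generate():\n    return [2*i for i in range(1, 51)]"
ratio = compression_ratio(prog, seq)      # well below 1.0: real compression
```

A program that fails to reproduce the sequence scores infinity, which matches the tweet's point that LMs often struggle even on simple sequences.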

Gallil Maimon (@gallilmaimon) 's Twitter Profile Photo


Many modern SpeechLMs are trained with Speech-Text interleaving. How does this impact scaling trends?

In our new paper, we train several dozen SLMs, and show - quite a lot! So there is room for optimism 😊

Key insights, code, models, full paper 👇🏻
Michael Hassid (@michaelhassid) 's Twitter Profile Photo


The longer reasoning LLM thinks - the more likely to be correct, right?

Apparently not.

Presenting our paper: “Don’t Overthink it. Preferring Shorter Thinking Chains for Improved LLM Reasoning”.

Link: arxiv.org/abs/2505.17813

1/n
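The selection rule suggested by the title can be sketched in a few lines. This is a toy reading of "preferring shorter thinking chains" (sample several traces, answer with the shortest), with stand-in strings instead of real LLM calls; the paper's actual procedure may differ.

```python
# Hedged sketch: sample several reasoning traces for the same question and
# take the answer attached to the shortest trace, rather than the longest
# or a majority vote. `chains` stands in for repeated LLM samples.

def shortest_chain_answer(chains):
    """chains: list of (thinking_text, answer) pairs from repeated sampling.
    Returns the answer of the shortest thinking trace."""
    thinking, answer = min(chains, key=lambda c: len(c[0]))
    return answer

# Toy traces: the short, direct chain happens to be the correct one,
# matching the observation that longer thinking is not more accurate.
chains = [
    ("step1 step2 step3 step4 step5 step6 step7", "43"),
    ("step1 step2", "42"),
    ("step1 step2 step3 step4 step5", "43"),
]
answer = shortest_chain_answer(chains)    # -> "42"
```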
Iddo Yosha (@iddoyosha) 's Twitter Profile Photo

🚨 Happy to share our #Interspeech2025 paper! "WhiStress: Enriching Transcriptions with Sentence Stress Detection" Sentence stress is a word-level prosodic cue that marks contrast or intent. WhiStress detects it alongside transcription—no alignment needed. Paper, code, demo 👇

Hila Chefer (@hila_chefer) 's Twitter Profile Photo

Beyond excited to share FlowMo! We found that the latent representations by video models implicitly encode motion information, and can guide the model toward coherent motion at inference time Very proud of ariel shaulov Itay Hazan for this work! Plus, it’s open source! 🥳

Or Tal (@or__tal) 's Twitter Profile Photo


Which modeling to choose for text-to-music generation?
We run a head-to-head comparison to figure it out.
Same data, same architecture - AR vs FM.
👇 If you care about fidelity, speed, control, or editing see this thread.
🔗huggingface.co/spaces/ortal16…
📄arxiv.org/abs/2506.08570
1/6
Gallil Maimon (@gallilmaimon) 's Twitter Profile Photo


🎵💬 If you are interested in Audio Tokenisers, you should check out our new work!
We empirically analysed existing tokenisers from every angle - reconstruction, downstream tasks, LMs and more.

Grab yourself a ☕/🍺 and sit down for a read!
Itai Gat (@itai_gat) 's Twitter Profile Photo

Excited to share our recent work on corrector sampling in language models! A new sampling method that mitigates error accumulation by iteratively revisiting tokens in a window of previously generated text. With: Neta Shaul Uriel Singer Yaron Lipman Link: arxiv.org/abs/2506.06215

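The revisiting loop described in the tweet can be sketched as a toy. This is my illustration of the general mechanism (after each new token, resample tokens in a window of recently generated text), not the paper's algorithm; `propose` is a hypothetical stand-in for a language-model sampling step.

```python
# Hedged toy sketch of corrector-style sampling: emit a token, then revisit
# a window of recent tokens and let the model resample each one in context,
# mitigating error accumulation from a single left-to-right pass.
import random

def generate_with_correction(propose, length, window=4, seed=0):
    rng = random.Random(seed)
    tokens = []
    for _ in range(length):
        tokens.append(propose(tokens, rng))          # normal next-token step
        start = max(0, len(tokens) - window)
        for i in range(start, len(tokens)):          # revisit recent window
            tokens[i] = propose(tokens[:i], rng)     # resample token i in context

    return tokens

# Toy "model": next token is previous+1, with occasional noise; the
# corrector pass tends to overwrite noisy tokens with consistent ones.
def propose(context, rng):
    base = (context[-1] + 1) if context else 0
    return base if rng.random() > 0.1 else base + 5

out = generate_with_correction(propose, 8)
```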
Neta Shaul (@shaulneta) 's Twitter Profile Photo

[1/n] New paper alert! 🚀 Excited to introduce 𝐓𝐫𝐚𝐧𝐬𝐢𝐭𝐢𝐨𝐧 𝐌𝐚𝐭𝐜𝐡𝐢𝐧𝐠 (𝐓𝐌)! We're replacing short-timestep kernels from Flow Matching/Diffusion with... a generative model🤯, achieving SOTA text-2-image generation! Uriel Singer Itai Gat Yaron Lipman