Yasser Benigmim (@yasserbenigmim) Twitter Tweets • TwiCopy

Yasser Benigmim

@yasserbenigmim

PhD student at Télécom Paris, @DeepLearning, @ComputerVision

ID: 2369825181

Link: https://yasserben.github.io/
Joined: 27-02-2014 21:31:21

46 Tweets

122 Followers

1.1K Following

Robin Courant

Happy to present E.T. the Exceptional Trajectories: Text-to-Camera-Trajectory Generation with Character Awareness. ECCV2024 with Nicolas DUFOUR, Xi WANG, Marc Christie and Vicky Kalogeiton Paper: arxiv.org/pdf/2407.01516 Webpage: lix.polytechnique.fr/vista/projects…

Likes: 16 · Replies: 2 · Retweets: 5

rid

@ridouaneg_

a year ago

(1/8) 🎬 Introducing the Short Film Dataset (SFD), a long video QA benchmark with 1k short films and 5k questions. Why another videoQA dataset? 📖 Story-level QAs 🎥 Publicly available videos 🔒 Minimal data leakage ⏳ Long temporal context questions shortfilmdataset.github.io

Likes: 24 · Replies: 2 · Retweets: 12

Subhankar Roy

@sroy907

a year ago

Less is more! Continual Learning with task-specific ViTs is computationally expensive. To afford task-specific ViTs we propose to summarize the patch tokens. The reduction in token length through patch summarization reduces MSA operations w/o hurting performance. More info👇

Likes: 8 · Replies: 1 · Retweets: 1

Jhony H. Giraldo

@jhonyhgiraldo

a year ago

The internship recruitment season has started! We are pleased to announce several internship positions in the broader field of geometric deep learning, available at CentraleSupélec, Centre Inria de Saclay, Université Paris-Saclay, Télécom Paris, and Institut Polytechnique de Paris.

Likes: 96 · Replies: 2 · Retweets: 18

Hugo

@mldhug

a year ago

You want to give audio abilities to your VLM without compromising its vision performance? You want to align your audio encoder with a pretrained image encoder without suffering from the modality gap? Check our #NeurIPS2024 paper with Michel Olvera Stéphane LATHUILIÈRE and Slim Essid

Likes: 19 · Replies: 1 · Retweets: 3

Xi WANG

@xiwang92

a year ago

🎥 AKiRa provides control over camera motion and optics (focal length, distortion, aperture) in video diffusion, enabling cinematic effects like fisheye, focus shifts, and dolly zoom. 📄 Paper: arxiv.org/abs/2412.14158 👉 Project Page: lix.polytechnique.fr/vista/projects… 🧵👇

Likes: 52 · Replies: 2 · Retweets: 17

Gianni Franchi

@giannifranchi10

7 months ago

🚨 New survey published! 🔍 Explainability & Vision Foundation Models dives into the intersection of #XAI and #FoundationModels in vision. We present: ✅ A novel taxonomy ✅ Key challenges ✅ Foundation Models 📖 Read it here 👉 shorturl.at/8S4eD #AI #ComputerVision

Likes: 8 · Replies: 0 · Retweets: 3

Imad

@imadmarouf3

6 months ago

I’m building CurriboxAI — an AI SaaS to help recruiters & ESNs protect their consultants from client bypass. Here’s the tech stack powering it so far — built for speed & automation. Always open to feedback & curious what you would’ve done differently 👇 #buildinpublic #saas

Likes: 2 · Replies: 1 · Retweets: 1

Junyu Xie

@junyuxiearthur

5 months ago

Movies are more than just video clips, they are stories! 🎬 We’re hosting the 1st SLoMO Workshop at #ICCV2025 to discuss Story-Level Movie Understanding & Audio Descriptions! Website: slomo-workshop.github.io Competition: huggingface.co/spaces/SLoMO-W…

Likes: 40 · Replies: 1 · Retweets: 14
