Gedas Bertasius (@gberta227)'s Twitter Profile
Gedas Bertasius

@gberta227

Assistant Professor at @unccs, previously a postdoc at @facebookai, PhD from @Penn, a basketball enthusiast.

ID: 1276571011152900096

Link: https://www.gedasbertasius.com/ · Joined: 26-06-2020 17:41:10

462 Tweets

1.1K Followers

952 Following

Mohaiminul (Emon) Islam (@mmiemon)'s Twitter Profile Photo

🚀 On the job market! Final-year PhD @ UNC Chapel Hill working on computer vision, video understanding, multimodal LLMs & AI agents. 2x Research Scientist Intern Meta 🔍 Seeking Research Scientist/Engineer roles! 🔗 md-mohaiminul.github.io 📧 mmiemon [at] cs [dot] unc [dot] edu

Ziyang Wang (@ziyangw00)'s Twitter Profile Photo

🚨Introducing Video-RTS: Resource-Efficient RL for Video Reasoning with Adaptive Video TTS! 

While RL-based video reasoning with LLMs has advanced, the reliance on large-scale SFT with extensive video data and long CoT annotations remains a major bottleneck.

Video-RTS tackles …

Mohaiminul (Emon) Islam (@mmiemon)'s Twitter Profile Photo

Check out our new paper: Video-RTS 🎥 A data-efficient RL method for complex video reasoning tasks.
🔹 Pure RL w/ output-based rewards.
🔹 Novel sparse-to-dense Test-Time Scaling (TTS) to expand input frames via self-consistency.
💥 96.4% less training data! More in the thread👇
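
The sparse-to-dense TTS idea is easy to caricature in code. A minimal sketch (`model.generate`, the thresholds, and the doubling schedule are my own illustrative assumptions, not the paper's API): start from sparse frames, sample several answers, and densify only when the answers disagree.

```python
from collections import Counter

def answer_with_tts(frames, question, model, k=8, n_start=16, n_max=128, agree=0.75):
    """Sketch of sparse-to-dense test-time scaling via self-consistency:
    begin with sparsely sampled frames and add frames only when the
    model's sampled answers disagree with one another."""
    n = n_start
    while True:
        # Uniformly subsample n frames from the full video.
        step = max(len(frames) // n, 1)
        subset = frames[::step][:n]
        # Sample k answers; self-consistency = majority agreement.
        answers = [model.generate(subset, question, temperature=1.0) for _ in range(k)]
        best, votes = Counter(answers).most_common(1)[0]
        if votes / k >= agree or n >= n_max:
            return best        # consistent enough (or budget spent): stop
        n *= 2                 # inconsistent: densify the frames and retry
```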

Ziyang Wang (@ziyangw00)'s Twitter Profile Photo

🎉Our Video-RTS paper has been accepted at #EMNLP2025 Main!! We propose a novel video reasoning approach that combines data-efficient reinforcement learning (GRPO) with video-adaptive test-time scaling, improving reasoning performance while maintaining efficiency on multiple …
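
GRPO, for reference, is the policy-gradient variant that replaces a learned value baseline with group-relative reward normalization: sample several responses per prompt and standardize each reward against the group. A minimal sketch of that step (illustrative, not the paper's code):

```python
def grpo_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: standardize each sampled response's reward
    against the other responses to the same prompt (no critic needed)."""
    mean = sum(rewards) / len(rewards)
    std = (sum((r - mean) ** 2 for r in rewards) / len(rewards)) ** 0.5
    return [(r - mean) / (std + eps) for r in rewards]

# e.g. outcome rewards (1 = correct answer) for 4 sampled answers:
print(grpo_advantages([1.0, 0.0, 0.0, 1.0]))  # ≈ [1.0, -1.0, -1.0, 1.0]
```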

Junlin (Hans) Han (@han_junlin)'s Twitter Profile Photo

Excited to share our new work: “Learning to See Before Seeing”! 🧠➡️👀 We investigate an interesting phenomenon: how do LLMs, trained only on text, learn about the visual world?
Project page: junlinhan.github.io/projects/lsbs/
Zaid Khan (@codezakh)'s Twitter Profile Photo

How can an agent reverse engineer the underlying laws of an unknown, hostile & stochastic environment in “one life”, without millions of steps + human-provided goals / rewards? In our work, we: 1️⃣ infer an executable symbolic world model (a probabilistic program capturing …
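
The thread is cut off above, but the one-life recipe it gestures at can be caricatured as program induction: propose candidate symbolic transition rules and keep whichever best explains the single observed trajectory. A toy sketch (the world, the rules, and the scoring are all invented for illustration):

```python
def score(rule, trajectory):
    """Count how many observed transitions a candidate rule reproduces."""
    return sum(rule(s, a) == s_next for s, a, s_next in trajectory)

# Toy 1-D world: state is a position, action is a signed step.
candidates = {
    "move":   lambda s, a: s + a,       # actions shift the agent
    "sticky": lambda s, a: s,           # hypothesis: actions do nothing
    "double": lambda s, a: s + 2 * a,   # hypothesis: actions move twice as far
}
trajectory = [(0, 1, 1), (1, 1, 2), (2, -1, 1)]   # (state, action, next_state)
best = max(candidates, key=lambda name: score(candidates[name], trajectory))
print(best)  # "move" explains every observed transition
```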

Gedas Bertasius (@gberta227)'s Twitter Profile Photo

Can AI models teach you to shoot like Steph Curry? 🏀 Come to my talk on Challenges in Expert-Level Skill Analysis at 4:30 pm in Room 318-A tomorrow (Sunday) to find out! sauafg-workshop.github.io #ICCV2025

Homanga Bharadhwaj (@mangahomanga)'s Twitter Profile Photo

I'll be joining the faculty at Johns Hopkins University late next year as a tenure-track assistant professor in JHU Computer Science.

Looking for PhD students to join me tackling fun problems in robot manipulation, learning from human data, understanding+predicting physical interactions, and beyond!
Zaid Khan (@codezakh)'s Twitter Profile Photo

🥳 Honored and grateful to be awarded an NDSEG Fellowship in Computer Science! 💫🇺🇸 Big thanks to my advisor Mohit Bansal for his guidance, and shoutout to my lab mates at UNC AI, collaborators, internship advisors, and mentors for their support 🤗 Excited to continue …

Jaehong Yoon (on the faculty job market) (@jaeh0ng_yoon)'s Twitter Profile Photo

🎉 Excited to share that 5/5 of my papers (3 main, 2 findings) have been accepted at #EMNLP2025, in video/multimodal reasoning, instructional video editing, and efficient LLM adaptation & reasoning!

🚨 I’m recruiting Ph.D. students to join the Multimodal AI Group at NTU College …
Mohit Bansal (@mohitban47)'s Twitter Profile Photo

🚨 Check out our awesome students/postdocs' papers at #EMNLP2025 and say hi to them 👋! 

Also, I will give a keynote (virtually) on "Attributable, Conflict-Robust, and Multimodal Summarization with Multi-Source Retrieval" at the NewSumm workshop.

-- Jaehong (in-person) finished …
Yiyang Zhou (@aiyiyangz)'s Twitter Profile Photo

🚨 BREAKING: AI Can't Actually See Videos.
New benchmark shows mainstream LVLMs barely hit 60% accuracy—while humans reach 94.82%.
This isn’t a glitch—it’s a fundamental failure in video understanding. LVLMs are doing visual theater, not real comprehension.
Tanveer Hannan (@hannan_tanveer)'s Twitter Profile Photo

Our latest paper, DocSLM, developed during my internship at Microsoft, is now on arXiv: arxiv.org/abs/2511.11313. It is an efficient & compact Vision-Language Model that processes long & complex documents while operating on resource-constrained edge devices like mobiles & laptops.

Eivinas Butkus (@eivinasbutkus)'s Twitter Profile Photo

1/ Can causal models and causal inference engines emerge through next-token prediction? Judea Pearl and others (Matej Zečević) have argued no. We present behavioral and mechanistic evidence that this is possible. #neurips2025 #NeurIPS

Tyler Zhu (@tyleryzhu)'s Twitter Profile Photo

Today seems to be a fitting day for Google DeepMind news, so I'm excited to announce our new preprint!

Prior work suggests that text & img repr's are converging, albeit weakly. We found these same models actually have strong alignment; the inputs were too impoverished to see it!
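
One standard way to quantify this kind of cross-model alignment (my assumption; the tweet doesn't name the metric) is mutual k-nearest-neighbor overlap between the two models' embeddings of the same items:

```python
import numpy as np

def mutual_knn_alignment(emb_a, emb_b, k=10):
    """Average k-NN overlap between two embedding sets over the same items:
    closer to 1 means the two models arrange the data more similarly."""
    def knn(x):
        x = x / np.linalg.norm(x, axis=1, keepdims=True)  # cosine similarity
        sim = x @ x.T
        np.fill_diagonal(sim, -np.inf)                    # exclude self-matches
        return np.argsort(-sim, axis=1)[:, :k]
    nn_a, nn_b = knn(emb_a), knn(emb_b)
    return float(np.mean([len(set(r) & set(s)) / k for r, s in zip(nn_a, nn_b)]))

# Toy check: two noisy views of the same latent structure align strongly.
rng = np.random.default_rng(0)
z = rng.normal(size=(100, 64))
text_emb = z + 0.1 * rng.normal(size=z.shape)
image_emb = z + 0.1 * rng.normal(size=z.shape)
print(mutual_knn_alignment(text_emb, image_emb))  # high overlap (near 1)
```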