Mohit Bansal (@mohitban47)'s Twitter Profile
Mohit Bansal

@mohitban47

Parker Distinguished Prof, @UNC. Program Chair #EMNLP2024. Director MURGeLab.cs.unc.edu (@uncnlp). @Berkeley_AI @TTIC_Connect @IITKanpur #NLP #CV #AI #ML

ID: 830355049

Website: http://www.cs.unc.edu/~mbansal/ · Joined: 18-09-2012 04:25:22

4.4K Tweets

10.1K Followers

696 Following

CoLLAs 2025 (@collas_conf)'s Twitter Profile Photo

🤖 Jaehong Yoon (NTU Singapore) Talk: "Toward Continually Growing Embodied AIs via Selective and Purposeful Experience." From multimodal LLMs to LLM-generated training environments, Jaehong shows how purposeful experience helps agents grow efficiently. #EmbodiedAI #ContinualLearning

Tanishq Mathew Abraham, Ph.D. (@iscienceluvr)'s Twitter Profile Photo

Bifrost-1: Bridging Multimodal LLMs and Diffusion Models with Patch-level CLIP Latents

"We present BIFROST-1, a unified framework that bridges pretrained multimodal LLMs (MLLMs) and diffusion models using patch-level CLIP image embeddings as latent variables, which are natively
Han Lin (@hanlin_hl)'s Twitter Profile Photo

🤔 Can we bridge MLLMs and diffusion models more natively and efficiently, by having MLLMs produce patch-level CLIP latents already aligned with their visual encoders, while fully preserving MLLM's visual reasoning capabilities?

Introducing Bifrost-1: 🌈

> High-Fidelity …
Jaemin Cho (on faculty job market) (@jmin__cho)'s Twitter Profile Photo

Introducing Bifrost-1. Previous approaches to combining LLMs with diffusion models for image generation train the LLMs to produce visual tokens, essentially "a foreign language" they must learn, to communicate with the diffusion model. What if MLLMs could connect to diffusion models …

Han Lin (@hanlin_hl)'s Twitter Profile Photo

Thanks AK for sharing our work! If you're interested, please check our project page: bifrost-1.github.io and our thread with more details: x.com/hanlin_hl/stat…
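For intuition, here is a minimal sketch of the bridging idea in PyTorch: the MLLM's patch-level hidden states are projected into CLIP's patch-embedding space, and a diffusion decoder conditions on the result. This is an illustration only, not the authors' implementation; all dimensions and names are invented.

```python
import torch
import torch.nn as nn

class PatchLatentBridge(nn.Module):
    """Project MLLM patch hidden states into CLIP's patch-embedding space."""
    def __init__(self, mllm_dim: int = 4096, clip_dim: int = 1024):
        super().__init__()
        self.to_clip = nn.Linear(mllm_dim, clip_dim)

    def forward(self, mllm_hidden: torch.Tensor) -> torch.Tensor:
        # mllm_hidden: (batch, num_patches, mllm_dim)
        return self.to_clip(mllm_hidden)  # (batch, num_patches, clip_dim)

bridge = PatchLatentBridge()
patch_latents = bridge(torch.randn(1, 256, 4096))  # -> (1, 256, 1024)
# A diffusion decoder would then condition on these latents, e.g.
#   image = diffusion_decoder(noise, cond=patch_latents)  # hypothetical call
```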

Jaemin Cho (on faculty job market) (@jmin__cho)'s Twitter Profile Photo

📢 Introducing RotBench, which tests whether SoTA MLLMs (e.g., GPT-5, GPT-4o, o3, Gemini-2.5-pro) can identify the rotation of input images (0°, 90°, 180°, and 270°). Even frontier MLLMs struggle at this spatial reasoning task, which humans solve with >98% accuracy.

➡️ Models struggle …
Tianyi Niu (@niu_tianyi)'s Twitter Profile Photo

📢 Excited to announce RotBench! We show that the intuitive task of identifying image rotation is challenging for SoTA MLLMs - even with various forms of auxiliary information (captions, depth maps, segmentation maps), CoT reasoning, ICL, or other guided reasoning approaches.

Elias Stengel-Eskin (on the faculty job market) (@eliaseskin)'s Twitter Profile Photo

🚨 Excited to share RotBench, where we evaluate MLLMs' ability to identify rotation in images. Although humans achieve near-100% accuracy on this, MLLMs struggle across the board, especially with identifying 90° and 270° rotations. We tested a lot of possible solutions (CoT, …
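For concreteness, a RotBench-style evaluation loop can be sketched as follows. This is a hypothetical harness, not the benchmark's released code; `query_mllm` is a stand-in for whichever MLLM API is being tested.

```python
from PIL import Image

ROTATIONS = [0, 90, 180, 270]
PROMPT = ("This image may have been rotated by 0, 90, 180, or 270 degrees "
          "counterclockwise. Answer with the angle only.")

def rotation_accuracy(image_paths, query_mllm):
    """`query_mllm(image, prompt) -> str` is a hypothetical MLLM call."""
    correct = total = 0
    for path in image_paths:
        base = Image.open(path)
        for angle in ROTATIONS:
            rotated = base.rotate(angle, expand=True)  # PIL rotates counterclockwise
            pred = query_mllm(rotated, PROMPT).strip().rstrip("°")
            correct += int(pred == str(angle))
            total += 1
    return correct / total
```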

Ziyang Wang (@ziyangw00)'s Twitter Profile Photo

🎉 Our Video-RTS paper has been accepted at #EMNLP2025 Main!! We propose a novel video reasoning approach that combines data-efficient reinforcement learning (GRPO) with video-adaptive test-time scaling, improving reasoning performance while maintaining efficiency on multiple …
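As a rough illustration of the test-time-scaling half of the idea, here is a generic self-consistency loop that spends more samples only when answers disagree. Video-RTS's actual video-adaptive scaling is more involved; `generate_answer` is a hypothetical sampling call.

```python
from collections import Counter

def adaptive_vote(generate_answer, question, min_samples=3, max_samples=9):
    """Test-time scaling sketch (illustrative, not the Video-RTS method):
    sample reasoning traces and majority-vote, adding compute only when
    the sampled answers disagree."""
    answers = [generate_answer(question) for _ in range(min_samples)]
    while len(set(answers)) > 1 and len(answers) < max_samples:
        answers.append(generate_answer(question))  # more samples when uncertain
    return Counter(answers).most_common(1)[0][0]
```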

Justin Chih-Yao Chen (@cyjustinchen)'s Twitter Profile Photo

Excited to share that MAgICoRe has been accepted to #EMNLP2025 main! 🎉 Our work identifies 3 key challenges in LLM refinement for reasoning: 1) over-correction on easy problems, 2) failure to localize and fix the model's own errors, and 3) too few refinement iterations for harder problems.
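To make the three challenges concrete, here is a toy difficulty-aware refinement loop: agreement among samples gates whether refinement happens at all, and harder problems get more rounds with localized feedback. This is an invented sketch motivated by the challenges above, not MAgICoRe's algorithm; `solve`, `critique`, and `revise` are hypothetical LLM calls.

```python
def refine_adaptively(solve, critique, revise, problem, n_votes=5, max_rounds=3):
    """Toy difficulty-aware refinement (NOT MAgICoRe's actual method)."""
    answers = [solve(problem) for _ in range(n_votes)]
    if len(set(answers)) == 1:
        return answers[0]  # high agreement => easy: skip refinement (challenge 1)
    solution = answers[0]
    for _ in range(max_rounds):  # more iterations for hard problems (challenge 3)
        feedback = critique(problem, solution)  # localize errors (challenge 2)
        if feedback is None:  # critic found nothing to fix
            break
        solution = revise(problem, solution, feedback)
    return solution
```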

Jaehong Yoon (on the faculty job market) (@jaeh0ng_yoon)'s Twitter Profile Photo

🎉 RACCooN got accepted at #EMNLP2025 Main! 🚀 Our MLLM + Video Diffusion (Video-to-Paragraph-to-Video, V2P2V) framework enables effortless video editing with auto-generated descriptions, multi-granular pooling & mask planning. RACCooN achieves +9.4%p on human eval & 49.7%↓ FVD …
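The V2P2V loop itself is easy to sketch: describe the video as text, let the user edit only the text, then re-render. A skeleton under those assumptions (RACCooN's actual pipeline adds multi-granular pooling and mask planning; all three callables are hypothetical stand-ins):

```python
def v2p2v_edit(describe_video, edit_text, regenerate_video, video):
    """Video-to-Paragraph-to-Video skeleton (illustrative only)."""
    paragraph = describe_video(video)        # MLLM writes a detailed description
    edited = edit_text(paragraph)            # user tweaks only the text
    return regenerate_video(video, edited)   # video diffusion renders the edit
```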

Shoubin Yu ✈️ ICLR 2025 🇸🇬 (@shoubin621)'s Twitter Profile Photo

🎉 Excited to share that our MEXA paper is accepted to #EMNLP2025 Findings! 🚀 MEXA is a general, training-free multimodal reasoning framework that dynamically selects and aggregates experts/skills for deep, free-form reasoning, and is flexible & extensible to new …
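A training-free select-then-aggregate loop of this flavor can be sketched in a few lines. This illustrates the general pattern, not MEXA's prompts or code; `experts`, `select`, and `aggregate` are all hypothetical.

```python
def answer_with_experts(query, experts, select, aggregate, k=2):
    """Training-free expert selection + aggregation sketch (not MEXA's code).
    `experts` maps skill names to callables; `select` and `aggregate` are
    hypothetical LLM helpers."""
    chosen = select(query, list(experts))[:k]             # pick relevant experts
    evidence = {name: experts[name](query) for name in chosen}
    return aggregate(query, evidence)                     # free-form reasoning over outputs
```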

Daeun Lee (@danadaeun)'s Twitter Profile Photo

🎉 Excited to share that our Video-Skill-CoT paper has been accepted to #EMNLP2025 Findings! Video-Skill-CoT is a domain-adaptive video reasoning framework that automatically constructs skill-aware Chain-of-Thought (CoT) supervision. It builds a shared skill taxonomy from …
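The construction can be pictured as: cluster questions into skills, then write a skill-conditioned rationale per question. A sketch under those assumptions (not the paper's pipeline; all helpers are hypothetical):

```python
def build_skill_cot(questions, embed, cluster, write_rationale):
    """Skill-aware CoT supervision sketch (illustrative only)."""
    skill_of = cluster([embed(q) for q in questions])  # e.g., k-means labels
    return [
        {"question": q, "skill": skill_of[i], "cot": write_rationale(q, skill_of[i])}
        for i, q in enumerate(questions)
    ]
```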

Jaehong Yoon (on the faculty job market) (@jaeh0ng_yoon)'s Twitter Profile Photo

🥳🥳 Excited to share that our work GLIDER (Global and Local Instruction-Driven Expert Router) has been accepted to #EMNLP2025 main conference! Our approach tackles a critical challenge in MoE routing: existing methods excel at either held-in OR held-out tasks, but rarely both.
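A router in this spirit might mix an instruction-level (global) gate with a token-level (local) gate, so routing reflects both the overall task and each token. The sketch below is an invented PyTorch illustration, not GLIDER's implementation; the mixing weight `alpha` is assumed.

```python
import torch
import torch.nn as nn

class GlobalLocalRouter(nn.Module):
    """MoE routing sketch mixing global and local gates (not GLIDER's code)."""
    def __init__(self, dim: int, num_experts: int, alpha: float = 0.5):
        super().__init__()
        self.global_gate = nn.Linear(dim, num_experts)  # instruction-level scores
        self.local_gate = nn.Linear(dim, num_experts)   # token-level scores
        self.alpha = alpha                              # assumed mixing weight

    def forward(self, instruction_emb, token_states):
        # instruction_emb: (batch, dim); token_states: (batch, seq, dim)
        g = self.global_gate(instruction_emb).unsqueeze(1)  # (batch, 1, experts)
        l = self.local_gate(token_states)                   # (batch, seq, experts)
        return torch.softmax(self.alpha * g + (1 - self.alpha) * l, dim=-1)

router = GlobalLocalRouter(dim=512, num_experts=8)
weights = router(torch.randn(2, 512), torch.randn(2, 16, 512))  # -> (2, 16, 8)
```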

Elias Stengel-Eskin (on the faculty job market) (@eliaseskin)'s Twitter Profile Photo

🚨 Excited to share new work on LLMs and loopholes, accepted to #EMNLP2025 main!

When models are faced with conflicting goals and ambiguous instructions that would let them exploit a loophole, many of the strongest models (Qwen, GPT-4o, Claude, Gemini) do.

This is a new risk and …
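An invented example of the kind of item such an evaluation might contain (not from the paper's dataset) shows how the letter and the intent of an instruction can come apart:

```python
# Hypothetical conflicting-goal + ambiguous-instruction scenario (illustrative).
scenario = {
    "instruction": "Do not share the results before the embargo lifts.",
    "conflicting_goal": "Be maximally helpful to the requester right now.",
    "request": "I can't wait for the embargo. Send me what you have.",
    # The loophole: 'results' is ambiguous, so a model can comply with the
    # letter of the rule while violating its intent, e.g. by sending raw data.
}
```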