May Fung (@may_f1_)'s Twitter Profile
May Fung

@may_f1_

Assistant Professor, Hong Kong University of Science and Technology CSE 💻
Human-Centric Trustworthy AI/ML/NLP | Reasoning and Agents

ID: 1188626449911160834

Link: https://mayrfung.github.io/ · Joined: 28-10-2019 01:20:38

246 Tweets

1.1K Followers

488 Following

Peng (Richard) Xia ✈️ ICLR 2025 (@richardxp888):

🔥 Excited to share our latest work: WebWatcher 🕵️‍♂️ An open-source multimodal agent that achieves new SOTA on multiple challenging vision-language (VL) deep research benchmarks — outperforming GPT-4o & Gemini! Paper: arxiv.org/abs/2508.05748 Code: github.com/Alibaba-NLP/We…

Alibaba Tongyi_Lab (@labtongyi96898):

Huggingface daily papers #1 Paper of the day! WebWatcher: Open-source multimodal agent that crushes VL research benchmarks, beating GPT-4o & Gemini! Introducing BrowseComp-VL: First open-source VL benchmark for deep web research! > Dynamic tool calls (search/browse/OCR/code)

Alibaba Tongyi_Lab (@labtongyi96898):

Thrilled to open-source WebWatcher: our vision-language deep research agent from @Alibaba_NLP! Available in 7B & 32B parameter scales for the community. Achieving SOTA on the toughest VQA benchmarks:
• HLE-VL: 13.6% (vs GPT-4o's 9.8%)
• BrowseComp-VL: 27.0% (2x GPT-4o!)
•
Zheyu Fan (@zheyufan):

🎉 New Preprint Alert! 🎉

How can we improve Video-LLMs' video understanding, inspired by humans' task-aware information filtering and cognitive load purification? 🤔 "Temporal Visual Screening for Video-LLMs"
JingyuanLiu (@jingyuanliu123):

I was lucky to work in both Chinese and US LLM labs, and I've been thinking about this for a while. The current values of pretraining are indeed different. US labs be like: - lots of GPUs and much larger FLOPs runs - treating stability more seriously, and cannot tolerate spikes

Heng Ji (@hengjinlp):

Accepted as NeurIPS2025 Spotlight! Existing large multimodal models (LMMs) have very poor visual understanding and reasoning over part-level attributes and affordances (only 5.9% gIoU). We developed novel part-centric LMMs to address these challenges arxiv.org/pdf/2505.20759

Kyunghyun Cho (@kchonyc):

when you give up on this nebulous idea and illusion of prestige, you will finally find peace and freedom. submit to TMLR and JMLR.

Grégoire Mialon (@mialon_gregoire):

🏗️ ARE: scaling up agent environments and evaluations

In the LLM+RL era, evals and envs are the bottleneck.
Happy to release Gaia2, an extensible benchmark for agents aiming to reduce the sim2real gap, plus ARE, the platform in which Gaia2 is built.
Enjoy evaluating your agents! 👇
Cheng Qian (@qiancheng1231):

🚀 Introducing UserRL: a new framework to train agents that truly assist users through proactive interaction, not just chase static benchmark scores.

📄 Paper: arxiv.org/pdf/2509.19736
💻 Code: github.com/SalesforceAIRe…
Yuan He (@lawhy_x):

The decision notification letters have been sent! 🎉 We sincerely thank all authors and reviewers for their valuable contributions to this workshop. Kudos to our organizing committee, advisors, and support team for their incredible efforts: Guohao Li 🐫, May Fung (hiring postdocs), Qingyun Wang

Heng Ji (@hengjinlp):

We are very excited to welcome Dr. Ruhi Sarikaya from Alexa AI, Amazon to give a talk “Path to Artificial General Intelligence: Past, Present, and Future” at our UIUC CS Colloquium this Wed, 3:30 PM CT, in SC2405 and via Zoom: calendars.illinois.edu/detail/2654?ev…

Sharon Y. Li (@sharonyixuanli):

We hear increasing discussion about aligning LLMs with "diverse human values."
But what's the actual price of pluralism? 🧮

In our #NeurIPS2025 paper (with Shawn Im), we move this debate from the philosophical to the measurable, presenting the first theoretical scaling law
thamar | (@thamar_solorio):

Great keynote by the wonderful Heng Ji! Now I feel like I need to move into NLP+Science. I'm sure we will see much more work in this direction thanks to her powerful advocacy and her impactful work.
EMNLP 2025 #EMNLP2025
Manling Li (@manlingli_):

#EMNLP Keynote by Heng Ji:

No more Processing. Time to Discover!

AI for Science is just so exciting! Let us make LLMs discover like true scientists: Observe → Think → Propose and Verify

(A pity to miss the talk. Photo from May Fung (hiring postdocs), EMNLP 2025)
Heng Ji (@hengjinlp):

Happy to be with many of my academic grandchildren from HKUST, and my PhD student Cheng Qian @ EMNLP2025. So proud of Prof. May Fung (hiring postdocs) and the awesome RenAI lab she has built: renai-lab.github.io
Jia-Bin Huang (@jbhuang0604):

Diffusion language models are making a splash (again)! To learn more about this fascinating topic, check out:
⏩ my video tutorial (and references within): youtu.be/8BTOoc0yDVA
⏩ the Discrete Diffusion Reading Group

EMNLP 2025 (@emnlpmeeting):

🎉 Congratulations to all #EMNLP2025 award winners 🎉

Starting with the ✨ Best Paper award ✨:
"Infini-gram mini: Exact n-gram Search at the Internet Scale with FM-Index"
by Hao Xu, Jiacheng Liu, Yejin Choi, Noah A. Smith, and Hannaneh Hajishirzi
aclanthology.org/2025.emnlp-mai…

1/n