Yunzhi Yao (@yyztodd) 's Twitter Profile
Yunzhi Yao

@yyztodd

Visiting PhD @UCLA, PhD candidate @ZJU_China; previously @MSFTResearch, @AlibabaGroup

ID: 1518910739469455361

Website: https://yyzcowtodd.cn/ | Joined: 26-04-2022 11:12:08

22 Tweets

65 Followers

137 Following

Ningyu Zhang@ZJU (@zxlzr) 's Twitter Profile Photo

Introducing LightThinker: Step-by-Step Compression for LLMs 🚀 LightThinker is a new method that enables Large Language Models (LLMs) to dynamically compress intermediate thoughts during reasoning, reducing memory overhead and computational costs while maintaining competitive
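As a rough illustration of the idea (not LightThinker's actual mechanism; the `llm` callable, prompts, and schedule below are hypothetical), here is a step-wise loop that periodically collapses accumulated thoughts into a short gist:

```python
# Hypothetical sketch of step-wise thought compression; `llm` is any
# prompt-in/text-out callable. Not LightThinker's actual implementation.

def reason_with_compression(llm, question, max_steps=16, compress_every=4):
    """Reason step by step, periodically replacing the accumulated
    intermediate thoughts with a one-sentence gist so the context
    (and hence the KV cache) stays small."""
    thoughts = []
    for step in range(max_steps):
        prompt = f"Question: {question}\nThoughts so far:\n" + "\n".join(thoughts)
        next_thought = llm(prompt + "\nNext step:")
        if next_thought.startswith("ANSWER:"):
            return next_thought[len("ANSWER:"):].strip()
        thoughts.append(next_thought)
        # Compression step: keep only what is needed to finish the problem.
        if (step + 1) % compress_every == 0:
            thoughts = [llm("Compress these notes into one sentence, keeping "
                            "only what is needed to solve the problem:\n"
                            + "\n".join(thoughts))]
    return None  # gave up within the step budget
```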

Mohsen Fayyaz (@mohsen_fayyaz) 's Twitter Profile Photo

new paper! 🌱 Collapse of Dense Retrievers We uncover major vulnerabilities in dense retrievers like Contriever, showing they favor: 📌 Shorter docs 📌 Early positions 📌 Repeated entities 📌 Literal matches ...all while ignoring the answer's presence! huggingface.co/datasets/mohse…
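The position bias is easy to probe. A toy check using facebook/contriever from HuggingFace (mean pooling over token embeddings is Contriever's usual setup; the example texts are made up):

```python
# Toy probe: score the same passage with the answer sentence placed
# early vs. late. If the early placement wins, the retriever has a
# position bias, independent of the answer's presence.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("facebook/contriever")
model = AutoModel.from_pretrained("facebook/contriever")

def embed(texts):
    enc = tok(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = model(**enc).last_hidden_state
    mask = enc["attention_mask"].unsqueeze(-1)
    return (hidden * mask).sum(1) / mask.sum(1)  # mean pooling

query = "Who wrote On the Origin of Species?"
answer = "Charles Darwin wrote On the Origin of Species."
filler = " ".join(["An unrelated sentence about the weather."] * 20)

q = embed([query])
print("answer early:", torch.cosine_similarity(q, embed([answer + " " + filler])).item())
print("answer late: ", torch.cosine_similarity(q, embed([filler + " " + answer])).item())
```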

Ningyu Zhang@ZJU (@zxlzr) 's Twitter Profile Photo

Introducing How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training 🔍🧠

Our latest work dives into the mechanism of new knowledge acquisition in LLMs, revealing how computational subgraphs—“knowledge circuits”—adapt and evolve during
Hongru Wang (@wangcarrey) 's Twitter Profile Photo

I wrote a blog sharing my recent thoughts on knowledge boundaries, tool use, and language agents. This is the first time three laws of knowledge boundaries have been proposed! 🔥 candle-walker-56d.notion.site/NAACL-2025-Ora… Chinese version: mp.weixin.qq.com/s/XzjiLUFAr1Yc…

uclanlp (@uclanlp) 's Twitter Profile Photo

📣 For this week’s NLP Seminar, we are thrilled to host Zhe Gan to give a talk titled
“How to Build Your Multimodal LLMs: From Pre-training to Post-training and Agents”!

🗓️ 4/11 Fri 2pm PT
Registration: forms.gle/TNXfBZJiMJjL18…
Violet Peng (@violetnpeng) 's Twitter Profile Photo

Excited to speak more about AI creativity at SSNLP today in Singapore ssnlp-website.github.io/ssnlp25/ Also looking forward to hearing what the Qwen team has to say about their latest breakthrough! Friends in Singapore: let’s catch up!

Yu (Bryan) Zhou (@yu_bryan_zhou) 's Twitter Profile Photo

#GPT4o image generation brings synthetic visual data quality to the next level. 🖼️ 🤔Is synthetic visual data finally ready to be used for improving VLMs? 🚀 We show success with CoDA, using contrastive visual data augmentation to help teach VLMs novel and confusing concepts.
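A minimal sketch of the contrastive-augmentation recipe (the `generate_image` backend and the data format are placeholders, not CoDA's actual pipeline): pair each novel concept with a visually confusable one so the VLM sees minimal-contrast positives and negatives.

```python
# Hypothetical sketch: build contrastive VQA-style training pairs from
# synthetic images. `generate_image` is any text-to-image callable.

def build_contrastive_pairs(concept, confusable, generate_image, n=4):
    examples = []
    for i in range(n):
        img_pos = generate_image(f"a photo of a {concept}, variation {i}")
        img_neg = generate_image(f"a photo of a {confusable}, variation {i}")
        question = f"Is there a {concept} in this image? Answer yes or no."
        # The negative uses the *confusable* concept, so the model must
        # learn the distinguishing visual features, not a prompt shortcut.
        examples.append({"image": img_pos, "question": question, "answer": "yes"})
        examples.append({"image": img_neg, "question": question, "answer": "no"})
    return examples

# e.g. build_contrastive_pairs("axolotl", "salamander", generate_image)
```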

Ningyu Zhang@ZJU (@zxlzr) 's Twitter Profile Photo

🚀 Excited to introduce EasyEdit2 — a powerful upgrade to EasyEdit, now redesigned for unified, plug-and-play LLM behavior steering at inference time! youtu.be/AkfoiPfp5rQ?si…

Ningyu Zhang@ZJU (@zxlzr) 's Twitter Profile Photo

🚀 Excited to introduce EasyEdit2 — a powerful upgrade to EasyEdit, now redesigned for unified, plug-and-play LLM behavior steering at inference time! #EasyEdit #LLM #ModelSteering #ModelEditing #KnowledgeEditing #EasyEdit2 #AI #InferenceTimeControl

✨ No retraining — just
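For intuition, inference-time steering is often implemented by adding a direction vector to the residual stream during generation. A generic PyTorch sketch of that pattern (an illustration of the general technique, not EasyEdit2's actual API; the layer path assumes a HF Llama-style model):

```python
# Generic activation-steering sketch: nudge one layer's hidden states
# along a steering vector at every forward pass. No weights change,
# so the intervention is plug-and-play and fully reversible.
import torch

def add_steering_hook(model_layer, steering_vector, alpha=4.0):
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        hidden = hidden + alpha * steering_vector.to(hidden.device, hidden.dtype)
        return (hidden, *output[1:]) if isinstance(output, tuple) else hidden
    return model_layer.register_forward_hook(hook)

# Usage sketch (layer index and vector construction are illustrative):
# steer = safe_prompt_activations.mean(0) - unsafe_prompt_activations.mean(0)
# handle = add_steering_hook(model.model.layers[15], steer)
# ... model.generate(...) ...
# handle.remove()  # restore the unsteered model
```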
Zhaopeng Tu (@tuzhaopeng) 's Twitter Profile Photo

Are MoE reasoning models already equipped with the right "brains" -- and just need a push? 🧠

Introducing Reinforcing Cognitive Experts (RICE), a simple, yet powerful inference-time approach that boosts reasoning accuracy by selectively strengthening just 2 cognitive experts in
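A toy version of the gating tweak (expert indices, boost factor, and top-k below are made up; see the paper for how RICE actually identifies cognitive experts):

```python
# Sketch: bias a MoE router toward a handful of "cognitive experts" at
# inference time by raising their gate logits before top-k routing.
import math
import torch
import torch.nn.functional as F

def route_with_reinforced_experts(router_logits, cognitive_experts=(7, 42),
                                  boost=1.5, top_k=2):
    logits = router_logits.clone()
    logits[..., list(cognitive_experts)] += math.log(boost)
    probs = F.softmax(logits, dim=-1)
    weights, experts = torch.topk(probs, k=top_k, dim=-1)
    return weights / weights.sum(-1, keepdim=True), experts

# router_logits: [batch, seq, n_experts] from the MoE gate
weights, experts = route_with_reinforced_experts(torch.randn(1, 4, 64))
```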
Ningyu Zhang@ZJU (@zxlzr) 's Twitter Profile Photo

We introduce Reinforcing "Cognitive Experts" – a new approach to enhance reasoning in MoE-based Large Reasoning Models (LRMs) 🌟. 

Thanks to Tencent's support, we had the opportunity to explore the inner workings of ultra-large models like DeepSeek-R1-671B and Qwen3-235B. 

By
Haoyi Qiu (@haoyiqiu) 's Twitter Profile Photo

🌏How culturally safe are large vision-language models? 👉LVLMs often miss the mark.

We introduce CROSS, a benchmark of 1,284 image-query pairs across 16 countries & 14 languages, revealing how LVLMs violate cultural norms in context.

⚖️ Evaluation via CROSS-EVAL
🧨 Safety
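In outline, such an evaluation iterates over image-query pairs and asks a judge whether each response violates the local norm. A hypothetical loop (field names and the judge prompt are invented, not the CROSS/CROSS-EVAL schema):

```python
# Hypothetical cultural-safety evaluation loop. `lvlm` answers an
# image-grounded query; `judge` is a separate evaluator model.

def cultural_violation_rate(lvlm, judge, benchmark):
    violations = 0
    for item in benchmark:  # dicts with: image, query, country, norm
        reply = lvlm(image=item["image"], prompt=item["query"])
        verdict = judge(f"In {item['country']}, the relevant norm is: "
                        f"{item['norm']}. Does this response violate it? "
                        f"Answer yes or no.\nResponse: {reply}")
        violations += verdict.strip().lower().startswith("y")
    return violations / len(benchmark)  # lower is safer
```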
Lucas Bandarkar (@lucasbandarkar) 's Twitter Profile Photo

The unreasonable effectiveness of model merging for cross-lingual transfer! Our preprint evaluates a number of *modular* approaches to fine-tuning LLMs that "assign" model params to either task or language. Surprisingly, merging experts beats all! 🧵1/4 arxiv.org/abs/2505.18356
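The simplest instance of the merging baseline is linear interpolation of two fine-tuned checkpoints that share a base model (the preprint compares several modular schemes; this sketch shows only plain weight averaging):

```python
# Merge a task expert and a language expert by parameter interpolation.
# Both checkpoints must be fine-tunes of the same base architecture.
import torch

def merge_state_dicts(task_sd, lang_sd, alpha=0.5):
    return {name: alpha * task_sd[name] + (1.0 - alpha) * lang_sd[name]
            for name in task_sd}

# Usage sketch:
# merged = merge_state_dicts(task_model.state_dict(), lang_model.state_dict())
# base_model.load_state_dict(merged)  # cross-lingual task model, no training
```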

Tanmay Parekh (@tparekh97) 's Twitter Profile Photo

🚨 New work: LLMs still struggle at Event Detection due to poor long-context reasoning and inability to follow task constraints, causing precision and recall errors.  

We introduce DiCoRe — a lightweight 3-stage Divergent-Convergent reasoning framework to fix this.🧵📷 (1/N)
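A hypothetical rendering of a divergent-then-convergent pipeline (the stage prompts and names below are illustrative, not DiCoRe's exact design; `llm` is any prompt-in/text-out callable):

```python
def detect_events(llm, document, ontology):
    # Divergent stage: brainstorm candidates with no label constraints,
    # so recall is not throttled by the closed schema.
    raw = llm(f"List every event mentioned in this text:\n{document}")
    candidates = [line.strip() for line in raw.splitlines() if line.strip()]

    # Convergent stage: ground each candidate to the closed ontology.
    grounded = []
    for cand in candidates:
        label = llm(f"Map '{cand}' to one of {sorted(ontology)}, "
                    f"or reply NONE.").strip()
        if label in ontology:
            grounded.append((cand, label))

    # Verification stage: keep only confirmed events, recovering precision.
    return [(cand, label) for cand, label in grounded
            if llm(f"Text: {document}\nDoes it describe a {label} event "
                   f"('{cand}')? Answer yes or no.").strip().lower().startswith("y")]
```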
Ningyu Zhang@ZJU (@zxlzr) 's Twitter Profile Photo

Introducing AutoMind: Adaptive Knowledgeable Agent for Automated Data Science

Paper: arxiv.org/abs/2506.10974

Code (will be released soon): github.com/innovatingAI/A…

Our latest work AutoMind is a new LLM agent framework that automates end-to-end machine learning pipelines by
Yining Hong (@yining_hong) 's Twitter Profile Photo

Meet Embodied Web Agents, which bridge the physical and digital realms. Imagine embodied agents that can search for online recipes, shop for ingredients, and cook for you. Embodied web agents search the web for information needed to carry out real-world embodied tasks. All data, code and web
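One way to picture such an agent is a loop that lets the model choose between a web query and a physical action each turn (all callables below are hypothetical stand-ins, not the released codebase):

```python
def run_embodied_web_agent(llm, web_search, env, goal, max_turns=20):
    """Interleave digital (web search) and physical (environment) actions
    until the agent declares the goal done or the turn budget runs out."""
    history = [f"GOAL: {goal}"]
    for _ in range(max_turns):
        decision = llm("\n".join(history) +
                       "\nNext action (SEARCH: <query> | ACT: <action> | DONE):")
        if decision.startswith("SEARCH:"):
            history.append("WEB: " + web_search(decision[len("SEARCH:"):].strip()))
        elif decision.startswith("ACT:"):
            history.append("ENV: " + env.step(decision[len("ACT:"):].strip()))
        else:  # DONE
            break
    return history
```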

Ningyu Zhang@ZJU (@zxlzr) 's Twitter Profile Photo

Many thanks to AK for sharing our work!

Introducing "ReCode: Updating Code API Knowledge with Reinforcement Learning" — the  RL framework that teaches models to update code API knowledge.

Paper: huggingface.co/papers/2506.20… 
Code:  github.com/zjunlp/ReCode

📚 Trained on 2K+ API
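A toy example of the kind of verifiable reward such training can use (the deprecated/updated API names and the reward values are invented; ReCode's actual reward design is in the paper):

```python
# Rule-based reward sketch: reward generations that parse and call the
# updated API; penalize ones that still call the deprecated API.
import ast

def api_update_reward(generated_code, deprecated_name, updated_name):
    try:
        tree = ast.parse(generated_code)
    except SyntaxError:
        return -1.0  # unparseable code is always penalized
    names = {n.attr for n in ast.walk(tree) if isinstance(n, ast.Attribute)}
    names |= {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}
    if deprecated_name in names:
        return -1.0
    return 1.0 if updated_name in names else 0.0

# e.g. api_update_reward(code, "old_meshgrid", "meshgrid")  # names hypothetical
```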