Wenhao Zhu (@wenhao_nlp) Twitter Tweets • TwiCopy

Wenhao Zhu

@wenhao_nlp

PhD candidate @NJUNLP, visiting PhD student @EdinburghNLP, interested in multilingual LLM and machine translation.

ID: 1183979856226181120

Link: https://owennju.github.io

Joined: 15-10-2019 05:36:19

88 Tweets

472 Followers

672 Following

Zixian Huang

@njuhuangzx

a year ago

🤗We propose MindMerger, which merges pre-trained multilingual encoder with LLM to better utilize the built-in multilingual capabilities of LLM and boost non-English reasoning. MindMerger boosts average accuracy by 6.7% on the MGSM dataset!🤗 Paper: arxiv.org/pdf/2405.17386

11 likes, 2 replies, 5 retweets

FeYuan

@t_feyuan

a year ago

🚀Exciting news! Introducing LLaMAX, a powerful LLM with enhanced translation performance across all 101 languages. 🔥 LLaMAX provides a better starting point for multilingual tasks and lots of analysis on the multilingual continual pre-training. huggingface.co/papers/2407.05…

143 likes, 2 replies, 47 retweets

Vila

@anniodance

a year ago

Make a demo for LLaMAX, an LLM for 101 languages. huggingface.co/spaces/vilarin…

51 likes, 3 replies, 19 retweets

Wenhao Zhu

@wenhao_nlp

a year ago

In the upcoming ACL, I will present my work on transferring LLM's English expertise to non-English. If you're interested in large language model / multilinguality / reasoning, feel free to reach out and let's discuss their future. Looking forward to seeing you all in Bangkok!🇹🇭

57 likes, 2 replies, 12 retweets

Lei Li

@lileics

a year ago

My group has 7 papers (including 1 demo) at #EMNLP2024. Topics include multilingual LLM, evaluation, LLM alignment, multimodal LLM. My students Danqing Wang, André Duarte, Wenda Xu, and Chinmay Dandekar will present onsite. You are welcome to stop by

48 likes, 0 replies, 6 retweets

Wenhao Zhu

@wenhao_nlp

a year ago

Check out Kanzhi's latest work on multi-model reasoning

7 likes, 0 replies, 0 retweets

Jiao Sun

@sunjiao123sun_

a year ago

Mitigating racial bias from LLMs is a lot easier than removing it from humans! Can’t believe this happened at the best AI conference, NeurIPS Conference (@NeurIPSConf). We have ethical reviews for authors, but missed it for invited speakers? 😡

3.3K likes, 184 replies, 837 retweets

Junxian He

@junxian_he

10 months ago

We replicated the DeepSeek-R1-Zero and DeepSeek-R1 training on 7B model with only 8K examples, the results are surprisingly strong. 🚀 Starting from Qwen2.5-Math-7B (base model), we perform RL on it directly. No SFT, no reward model, just 8K MATH examples for verification, the

3.3K likes, 70 replies, 656 retweets

Unitree

@unitreerobotics

10 months ago

What Dance Would You Like to Perform with Unitree G1? With the upgraded algorithm, G1 can learn any dance. Leave a comment to tell us what dance you'd like to see！😘 #Unitree #AGI #EmbodiedAI #SpringFestivalGalaRobot #AI #Humanoid #Bipedal #WorldModel #Dance

5.5K likes, 548 replies, 928 retweets

Wenhao Zhu

@wenhao_nlp

10 months ago

Tired of mGSM & multilingual MMLU? Saturated performance, limited task types & complexity... Academic researchers and industry LLM teams both need a better way to comprehensively evaluate LLM multilingual capabilities. Introducing BenchMAX! Maximizing the spectrum of

32 likes, 0 replies, 9 retweets

Nathan Godey

@nthngdy

9 months ago

🚀 New Paper Alert! 🚀 We introduce Q-Filters, a training-free method for efficient KV Cache compression! It is compatible with FlashAttention and can compress along generation which is particularly useful for reasoning models ⚡ ⬇️R1-Distill-Llama-8B with 128 KV pairs ⬇️ 🧵

185 likes, 4 replies, 37 retweets

Wenhao Zhu

@wenhao_nlp

8 months ago

😕Feeling frustrated with this round of ACL Rolling Review (February). The interaction between reviewers and authors seems to have deteriorated compared to previous rounds. - As an Area Chair, I noticed almost no reviewers responded or updated their reviews after the rebuttal

23 likes, 0 replies, 1 retweet

Pinzhen "Patrick" Chen

@pinzhen_chen

7 months ago

📢Participate in *WMT25 terminology task* to showcase how you customise translations! What's new? More languages, more domains, sent/doc-level, and Pareto optimal of term accuracy and overall quality. Don't miss it cuz it only happens once every two years. statmt.org/wmt25/terminol…

17 likes, 0 replies, 4 retweets