Wenhao Zhu (@wenhao_nlp)'s Twitter Profile
Wenhao Zhu

@wenhao_nlp

PhD candidate @NJUNLP, visiting PhD student @EdinburghNLP, interested in multilingual LLM and machine translation.

ID: 1183979856226181120

Link: https://owennju.github.io

Joined: 15-10-2019 05:36:19

88 Tweets

472 Followers

672 Following

FeYuan (@t_feyuan)

🚀 Exciting news! Introducing LLaMAX, a powerful LLM with enhanced translation performance across all 101 languages. 🔥 LLaMAX provides a better starting point for multilingual tasks and lots of analysis on the multilingual continual pre-training. huggingface.co/papers/2407.05…

Wenhao Zhu (@wenhao_nlp)

In the upcoming ACL, I will present my work on transferring LLM's English expertise to non-English. If you're interested in large language model / multilinguality / reasoning, feel free to reach out and let's discuss their future. Looking forward to seeing you all in Bangkok!🇹🇭

Lei Li (@lileics)

My group has 7 papers (including 1 demo) at #EMNLP2024. Topics include multilingual LLM, evaluation, LLM alignment, multimodal LLM. My students Danqing Wang (@dqwang122), André Duarte (@avduarte3333), Wenda Xu (@WendaXu2), and Chinmay Dandekar will present onsite. You are welcome to stop by.

Jiao Sun (@sunjiao123sun_)

Mitigating racial bias from LLMs is a lot easier than removing it from humans! Can’t believe this happened at the best AI conference, NeurIPS (@NeurIPSConf). We have ethical reviews for authors, but missed it for invited speakers? 😡

Junxian He (@junxian_he)

We replicated the DeepSeek-R1-Zero and DeepSeek-R1 training on a 7B model with only 8K examples, and the results are surprisingly strong. 🚀 Starting from Qwen2.5-Math-7B (base model), we perform RL on it directly. No SFT, no reward model, just 8K MATH examples for verification, the

Unitree (@unitreerobotics)

What Dance Would You Like to Perform with Unitree G1? With the upgraded algorithm, G1 can learn any dance. Leave a comment to tell us what dance you'd like to see!😘 #Unitree #AGI #EmbodiedAI #SpringFestivalGalaRobot #AI #Humanoid #Bipedal #WorldModel #Dance

Wenhao Zhu (@wenhao_nlp)

Tired of mGSM & multilingual MMLU? Saturated performance, limited task types & complexity... Academic researchers and industry LLM teams both need a better way to comprehensively evaluate LLM multilingual capabilities. Introducing BenchMAX! Maximizing the spectrum of

Nathan Godey (@nthngdy)

🚀 New Paper Alert! 🚀 We introduce Q-Filters, a training-free method for efficient KV Cache compression! It is compatible with FlashAttention and can compress along generation which is particularly useful for reasoning models ⚡ ⬇️R1-Distill-Llama-8B with 128 KV pairs ⬇️ 🧵

Wenhao Zhu (@wenhao_nlp)

😕Feeling frustrated with this round of ACL Rolling Review (February). The interaction between reviewers and authors seems to have deteriorated compared to previous rounds. - As an Area Chair, I noticed almost no reviewers responded or updated their reviews after the rebuttal

Pinzhen "Patrick" Chen (@pinzhen_chen)

📢Participate in *WMT25 terminology task* to showcase how you customise translations! What's new? More languages, more domains, sent/doc-level, and Pareto optimal of term accuracy and overall quality. Don't miss it cuz it only happens once every two years. statmt.org/wmt25/terminol…