Di Wu (@diwunlp) 's Twitter Profile
Di Wu

@diwunlp

PhD candidate in MT/NLP/ML @UvA_Amsterdam, working with @c_monz.

ID: 1145373242233720832

Link: https://moore3930.github.io/
Joined: 30-06-2019 16:47:15

166 Tweets

140 Followers

336 Following

Marzena Karpinska (@mar_kar_) 's Twitter Profile Photo

Can #LLMs truly reason over loooong context? 🤔

NoCha asks LLMs to verify claims about *NEW* fictional books 🪄 📚

⛔ LLMs that solve needle-in-the-haystack (~100%) struggle on NoCha!
⛔ None of the 11 tested LLMs reaches human performance (97%). The best, #GPT-4o, gets only 55.8%.
Kyunghyun Cho (@kchonyc) 's Twitter Profile Photo

modern LM research seems to be the exact repetition of MT research. here goes the prediction; someone will reinvent minimum Bayes risk decoding but will call it super-aligned, super-reasoning majority voting of galaxy-of-thoughts.
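For context, minimum Bayes risk (MBR) decoding is the classic MT technique referenced here: sample several candidate outputs and return the one with the highest expected utility against the other samples; majority voting is the special case where the utility is exact match. A minimal sketch follows, with a toy unigram-F1 utility standing in for BLEU/chrF; the sampling and metric choices are illustrative, not from any specific paper.

```python
# Minimal sketch of minimum Bayes risk (MBR) decoding.
# Majority voting is the special case where utility(hyp, ref) is exact match.
from collections import Counter

def utility(hyp: str, ref: str) -> float:
    """Toy utility: unigram F1 between two strings (stand-in for BLEU/chrF)."""
    h, r = Counter(hyp.split()), Counter(ref.split())
    overlap = sum((h & r).values())
    if overlap == 0:
        return 0.0
    p, rec = overlap / sum(h.values()), overlap / sum(r.values())
    return 2 * p * rec / (p + rec)

def mbr_decode(candidates: list[str]) -> str:
    """Pick the candidate with the highest expected utility against all samples."""
    def expected_utility(c: str) -> float:
        return sum(utility(c, other) for other in candidates) / len(candidates)
    return max(candidates, key=expected_utility)

# Hypothetical samples drawn from a model for one source sentence.
samples = [
    "the cat sat on the mat",
    "the cat sat on a mat",
    "a cat is sitting on the mat",
    "the cat sat on the mat",
]
print(mbr_decode(samples))  # the most "consensual" sample wins
```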

Evgeniia Tokarchuk (@evgtokarchuk) 's Twitter Profile Photo

Next week I'll be in Vienna at ICML Conference!

Want to learn more about how to explicitly model embeddings on the hypersphere and encourage dispersion during training? Come to the Gram Workshop poster session 2 on 27.07.

Shoutout to my collaborators Hua Chang Bakker and timorous bestie 😷 💫
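As a rough sketch of the general idea (not necessarily the authors' exact formulation): embeddings are L2-normalized so they live on the unit hypersphere, and a dispersion term pushes them apart. The uniformity-style loss (following Wang & Isola, 2020), the temperature, and the toy embedding table below are illustrative assumptions.

```python
# Sketch: project embeddings onto the unit hypersphere and add a dispersion
# (uniformity) term that encourages them to spread out over the sphere.
import torch
import torch.nn.functional as F

def to_hypersphere(emb: torch.Tensor) -> torch.Tensor:
    """L2-normalize rows so every embedding lies on the unit hypersphere."""
    return F.normalize(emb, p=2, dim=-1)

def dispersion_loss(emb: torch.Tensor, t: float = 2.0) -> torch.Tensor:
    """Uniformity-style term: lower when unit vectors are well dispersed."""
    sq_dists = 2.0 - 2.0 * emb @ emb.T            # ||x - y||^2 for unit vectors
    n = emb.size(0)
    off_diag = sq_dists[~torch.eye(n, dtype=torch.bool)]
    return torch.log(torch.exp(-t * off_diag).mean())

embedding = torch.nn.Embedding(1000, 64)   # toy vocabulary of 1000 tokens
ids = torch.randint(0, 1000, (256,))
vecs = to_hypersphere(embedding(ids))
loss = dispersion_loss(vecs)               # in practice, added to the task loss
loss.backward()
print(loss.item())
```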
David Stap (@davidstap) 's Twitter Profile Photo

1/4 #ACL2024 Excited to share our new paper on the impact of fine-tuning on the qualitative advantages of LLMs in machine translation! 🤖 Our work highlights the importance of preserving LLM capabilities during fine-tuning. arxiv.org/abs/2405.20089

LTL-UvA (@ltl_uva) 's Twitter Profile Photo

Language Technology Lab got four papers accepted for #EMNLP2024! Congrats to authors Kata Naszadi, Shaomu Tan, Baohao Liao, and Di Wu 🥳🥳

Di Wu (@diwunlp) 's Twitter Profile Photo

We show that a grammar book provides little or even no help for translation in LLMs, calling into question the recent claim of "truly zero-shot translation" --- no data, no gain, still 🧐

Benjamin Marie (@bnjmn_marie) 's Twitter Profile Photo

Unsloth has identified and fixed the gradient accumulation issue I reported last week. The problem turned out to be more significant than I expected, impacting multi-GPU training as well. This means we’ve likely been training models that didn’t perform as well as they could.
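For background, the gradient accumulation pitfall is that averaging each micro-batch's mean cross-entropy over the accumulation steps weights micro-batches equally even when they contain different numbers of non-padding tokens, which no longer matches true large-batch training. Below is a minimal sketch of the corrected normalization; it is my own illustration (assuming a Hugging Face-style causal LM interface with pre-aligned labels), not Unsloth's actual patch.

```python
# Sketch: sum per-token losses per micro-batch, then normalize by the total
# non-padding token count of the whole accumulated batch.
import torch
import torch.nn.functional as F

def accumulated_loss(model, micro_batches, ignore_index=-100):
    """micro_batches: list of (input_ids, labels) pairs forming one logical batch."""
    total_tokens = sum((labels != ignore_index).sum() for _, labels in micro_batches)
    for input_ids, labels in micro_batches:
        logits = model(input_ids).logits          # assumes HF-style output object
        # Sum (not mean) over tokens, then divide by the global token count,
        # so long and short micro-batches contribute proportionally.
        loss = F.cross_entropy(
            logits.view(-1, logits.size(-1)),
            labels.view(-1),
            ignore_index=ignore_index,
            reduction="sum",
        ) / total_tokens
        loss.backward()   # gradients accumulate correctly across micro-batches
```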

John Nguyen (@__johnnguyen__) 's Twitter Profile Photo

🥪New Paper! 🥪Introducing the Byte Latent Transformer (BLT) - a tokenizer-free model that scales better than BPE-based models, with better inference efficiency and robustness. 🧵

Longyue Wang (@wangly0229) 's Twitter Profile Photo

🎯 ComfyUI-Copilot (AIGC Assistant) is now open-source, brought to you by Alibaba International! 🎉
🍀 Enhance ComfyUI workflow design and optimization with LLM-Agent
✨ Empowering AIGC and exploring Multimodal Agents
🚀 Stay tuned for more features like dynamic parameter
Dan Deutsch (@_danieldeutsch) 's Twitter Profile Photo

🚨New machine translation dataset alert! 🚨We expanded the language coverage of WMT24 from 9 to 55 en->xx language pairs by collecting new reference translations for 46 languages in a dataset called WMT24++

Paper: arxiv.org/abs/2502.12404…
Data: huggingface.co/datasets/googl…
HPLT (@hplt_eu) 's Twitter Profile Photo

We are happy to announce the second release of the HPLT bilingual datasets:

- 50 English-centric language pairs = 380M parallel sentences (HPLT) 🤩
- 1,275 non-English-centric language pairs = 16.7B parallel sentences (MultiHPLT) 😮

Available at the HPLT dataset catalogue and OPUS.

Taku Kudo (@taku910) 's Twitter Profile Photo

Whitespace-ignoring tokenization is a fundamental feature of SentencePiece, implemented since its early stages (around 2017). Using whitespace yielded better results on MT. It would be helpful if you could mention this. github.com/google/sentenc…
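A small sketch of SentencePiece's whitespace handling, in case the point is unclear from the thread: the input is treated as a raw character stream and spaces are escaped to the meta symbol ▁, so tokenization is reversible without external pre-tokenization; one reading of "whitespace-ignoring" is the training flag split_by_whitespace=False, which lets pieces span word boundaries. The toy corpus, vocab size, and the choice of flag are my assumptions, not taken from the linked issue.

```python
# Sketch of SentencePiece whitespace handling on a toy corpus.
import sentencepiece as spm

with open("toy_corpus.txt", "w", encoding="utf-8") as f:
    f.writelines(line + "\n" for line in [
        "the cat sat on the mat",
        "machine translation needs good tokenization",
        "whitespace is part of the text, not a delimiter",
    ])

spm.SentencePieceTrainer.train(
    input="toy_corpus.txt",
    model_prefix="toy",
    vocab_size=60,
    hard_vocab_limit=False,     # allow a smaller vocab on this tiny corpus
    split_by_whitespace=False,  # pieces may span whitespace ("whitespace-ignoring")
)

sp = spm.SentencePieceProcessor(model_file="toy.model")
pieces = sp.encode("the cat sat on the mat", out_type=str)
print(pieces)            # spaces survive as the ▁ meta symbol inside pieces
print(sp.decode(pieces)) # round-trips to the original string
```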

Zirui Liu (@ziruirayliu) 's Twitter Profile Photo

🔥Excited to share our new work on reproducibility challenges in reasoning models caused by numerical precision. Ever run the same prompt twice and get completely different answers from your LLM under greedy decoding? You're not alone. Most LLMs today default to BF16 precision,
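A toy demonstration of the root cause (my own illustration, not the paper's experiments): floating-point addition is not associative, and at BF16 precision a change in accumulation order, as happens across batch sizes or kernels, can shift a sum by more than the gap between two nearly tied logits, flipping the greedy argmax.

```python
# Sketch: the same numbers summed in two orders give different BF16 results.
import torch

def sequential_sum(values: torch.Tensor) -> torch.Tensor:
    """Accumulate one element at a time, rounding to the tensor dtype each step."""
    acc = torch.zeros((), dtype=values.dtype)
    for v in values:
        acc = acc + v
    return acc

torch.manual_seed(0)
x = (torch.randn(2048) * 1e-2).to(torch.bfloat16)

fwd = sequential_sum(x)           # one accumulation order
rev = sequential_sum(x.flip(0))   # same numbers, reversed order
print(fwd.item(), rev.item())     # typically not identical in BF16

# If two candidate tokens' logits differ by less than this gap, "deterministic"
# greedy decoding can pick different tokens across runs, batch sizes, or kernels.
print("order-dependent gap:", abs(fwd.item() - rev.item()))
```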

Jingcheng (Frank) Niu (@frankniujc) 's Twitter Profile Photo

📢 Next week, I will be presenting our paper "Llama See, Llama Do: A Mechanistic Perspective on Contextual Entrainment and Distraction in LLMs" at ACL 2025!

Paper: arxiv.org/abs/2505.09338
Blog Post: frankniujc.github.io/publications/a…
Talk: youtube.com/watch?v=XcsKon…
Rohan Paul (@rohanpaul_ai) 's Twitter Profile Photo

Beautiful Google Research paper.

LLMs can learn in context from examples in the prompt and pick up new patterns while answering, yet their stored weights never change.

That behavior looks impossible if learning always means gradient descent.

The mechanisms through which this