Liang Ding (@liangdingnlp)'s Twitter Profile
Liang Ding

@liangdingnlp

NLP/ML Researcher (working on developing LLMs and exploring their applications) & Ex-@JD_Corporate @TencentGlobal @Sydney_Uni. Opinions are my own.

ID: 1056872928631877636

Link: http://liamding.cc · Joined: 29-10-2018 11:38:36

386 Tweets

604 Followers

1.1K Following

Liang Ding (@liangdingnlp)

Unfortunately, I couldn't attend COLING 2024 in person, but here's a shoutout to our two contributions! 🎉📖 #COLING2024 #ResearchHighlights 

1. Take Care of Your Prompt Bias! Investigating and Mitigating Prompt Bias in Factual Knowledge Extraction. [paper]
Liang Ding (@liangdingnlp)

Cool work! Purely end-to-end multimodal transformation may be hard, but this kind of integration of different cross-modal encoders is absolutely a practical solution. 🔥

Liang Ding (@liangdingnlp)

Nice work by Minghao Wu! Mimicking human expert behaviours in specific tasks (like last year, when we encouraged LLMs to evaluate translations in a human-like way by analyzing errors; see x.com/liangdingNLP/s…) and implementing them with carefully designed LLM agents is a

Liang Ding (@liangdingnlp)

Thanks so much for your post, Aran Komatsuzaki! In this work, we present a 🔥zero-shot prompting strategy🔥 to enhance 🔥math & reasoning🔥 performance, pushing the SOTA performance of several of the best LLMs.
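The tweet does not show the prompt itself; as a rough illustration of what a zero-shot prompting strategy for reasoning can look like, the sketch below appends a reasoning trigger phrase to a raw question. The trigger wording and the helper function are assumptions for illustration, not the paper's actual method.

```python
# Minimal sketch of zero-shot prompting for math reasoning.
# The trigger phrase is illustrative; the actual strategy may differ.

REASONING_TRIGGER = "Let's think step by step."

def build_zero_shot_prompt(question: str) -> str:
    """Append a reasoning trigger to a raw question, with no few-shot examples."""
    return f"Q: {question}\nA: {REASONING_TRIGGER}"

prompt = build_zero_shot_prompt("If 3 pens cost $6, how much do 7 pens cost?")
print(prompt)
```

The resulting string would then be sent to the LLM as-is; no task-specific examples are included, which is what makes the strategy zero-shot.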

Jungo Kasai 笠井淳吾 (@jungokasai)

Liang Ding Indeed! Well, you never know what becomes relevant. I believe diffusion models and multimodal tokenization techniques are non-autoregressive generation as well. Speculative decoding reminds me of techniques that came up in this context too. Let's keep moving and building models!

Longyue Wang (@wangly0229)

🌸Thanks for highlighting our TransAgents work in your GitHub repo, Andrew Ng! 🪐We're also committed to integrating Language Agents into real-world translation services. Paper: arxiv.org/pdf/2405.11804 Our demo is also on the way: github.com/minghao-wu/tra…

Liang Ding (@liangdingnlp)

🔥🔥🔥We release our new work "Demystifying the Compression of Mixture-of-Experts Through a Unified Framework" (with code), where we aim to find the best recipe for compressing the MoE in LLMs. In this work, using the combined strategy of Expert Slimming and Expert Trimming
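As a rough illustration of the Expert Trimming idea named above (dropping whole MoE experts, e.g. those with low routing usage), here is a toy numpy sketch. The usage-based criterion, function name, and threshold are assumptions about the general technique, not the paper's exact method.

```python
import numpy as np

# Toy "Expert Trimming": keep only the MoE experts with the highest
# routing usage. Usage counts and the keep ratio are illustrative.

def trim_experts(expert_weights, usage_counts, keep_ratio=0.5):
    """Keep the top keep_ratio fraction of experts by routing usage."""
    n = len(expert_weights)
    n_keep = max(1, int(n * keep_ratio))
    keep_idx = np.argsort(usage_counts)[::-1][:n_keep]  # most-used first
    keep_idx = np.sort(keep_idx)                        # preserve original order
    return [expert_weights[i] for i in keep_idx], keep_idx

experts = [np.ones((4, 4)) * i for i in range(4)]  # 4 dummy expert matrices
usage = np.array([100, 5, 80, 2])                  # routing counts per expert
kept, idx = trim_experts(experts, usage, keep_ratio=0.5)
print(idx)  # experts 0 and 2 survive
```

Expert Slimming, by contrast, would compress the weights inside each surviving expert (e.g. by pruning or quantization) rather than removing experts outright.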

Liang Ding (@liangdingnlp)

Transformers have long been criticized for the computational complexity of their attention mechanisms. Are these computations worth it? Please check out Shwai He's new work 🎊

Daniel Han (@danielhanchen)

My analysis for Llama 3.1

1. 15.6T tokens, Tools & Multilingual
2. Llama arch + new RoPE
3. fp16 & static fp8 quant for 405b
4. Dedicated pad token
5. <|python_tag|><|eom_id|> for tools?
6. Roberta to classify good quality data
7. 6-stage 800B-token long-context expansion
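Item 2 above mentions a new RoPE variant. As a minimal, generic sketch of rotary position embeddings (not Llama 3.1's specific frequency scaling), each consecutive pair of dimensions is rotated by an angle that grows with the token's position:

```python
import numpy as np

# Generic rotary position embedding (RoPE) sketch: rotate each pair of
# dimensions by a position-dependent angle. Llama 3.1's actual RoPE adds
# a modified frequency scaling not shown here.

def rope(x, pos, base=10000.0):
    """Apply RoPE to a 1-D vector x at integer position pos."""
    d = x.shape[0]
    half = d // 2
    freqs = base ** (-np.arange(half) * 2.0 / d)  # per-pair rotation frequencies
    angles = pos * freqs
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[0::2], x[1::2]                     # even/odd dims form the pairs
    out = np.empty_like(x)
    out[0::2] = x1 * cos - x2 * sin               # 2-D rotation of each pair
    out[1::2] = x1 * sin + x2 * cos
    return out

v = np.array([1.0, 0.0, 1.0, 0.0])
print(rope(v, pos=0))  # position 0: no rotation
```

Because each pair is rotated rather than shifted, the dot product between two rotated vectors depends only on their relative positions, which is what makes RoPE attractive for attention.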
Jeff Dean (@🏡) (@jeffdean)

AI System Achieves Silver Medal-level score in IMO

The International Mathematical Olympiad (IMO) is the oldest, largest & most prestigious competition for young mathematicians. Every year, countries send their top young mathematicians to take a six-problem test spanning two days.
Liang Ding (@liangdingnlp)

Sorry to miss ACL, but welcome to check out our work on using LLMs to evaluate translation with an Error Analysis prompt (EA is one of the earliest works to incorporate human experience such as MQM into LM-based NLG eval, and was published as an ACL'23 oral). Feel free to talk with
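The tweet describes prompting an LLM to evaluate translations via error analysis in the spirit of MQM. A hedged sketch of such a prompt builder is below; the template wording, severity labels, and function name are illustrative stand-ins, not the paper's exact prompt.

```python
# Illustrative MQM-style error-analysis prompt builder for LLM-based
# translation evaluation. The template text is an assumption, not the
# actual ACL'23 prompt.

TEMPLATE = (
    "Source: {src}\n"
    "Translation: {hyp}\n"
    "Identify translation errors, label each as major or minor "
    "(accuracy, fluency, terminology), then give an overall score from 0 to 100."
)

def build_error_analysis_prompt(src: str, hyp: str) -> str:
    return TEMPLATE.format(src=src, hyp=hyp)

print(build_error_analysis_prompt("Der Hund schläft.", "The dog is sleep."))
```

The idea is that asking the model to enumerate and grade concrete errors first, MQM-style, grounds the final score, rather than asking for a single number directly.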

Chunting Zhou (@violet_zct)

Introducing *Transfusion* - a unified approach for training models that can generate both text and images. arxiv.org/pdf/2408.11039

Transfusion combines language modeling (next token prediction) with diffusion to train a single transformer over mixed-modality sequences. This
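The mixed-modality training described above can be read as a combined objective: next-token cross-entropy on text positions plus a diffusion-style regression loss (MSE on predicted noise) on image positions. The toy numpy loss below is a schematic under that reading of the abstract, with an assumed weighting coefficient `lam`, not the paper's implementation.

```python
import numpy as np

# Schematic Transfusion-style objective: language-modeling cross-entropy
# on text tokens plus an MSE diffusion loss on image-patch noise
# predictions, summed with a weighting coefficient.

def cross_entropy(logits, targets):
    """Mean next-token cross-entropy over text positions."""
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    logp = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    return -logp[np.arange(len(targets)), targets].mean()

def diffusion_mse(pred_noise, true_noise):
    """Mean squared error of the model's noise prediction."""
    return ((pred_noise - true_noise) ** 2).mean()

def transfusion_loss(text_logits, text_targets, pred_noise, true_noise, lam=1.0):
    return cross_entropy(text_logits, text_targets) + lam * diffusion_mse(
        pred_noise, true_noise
    )

rng = np.random.default_rng(0)
loss = transfusion_loss(
    rng.normal(size=(5, 10)), rng.integers(0, 10, size=5),  # text positions
    rng.normal(size=(4, 16)), rng.normal(size=(4, 16)),     # image patches
)
print(float(loss))
```

A single transformer would produce both sets of predictions from one mixed-modality sequence; only the per-position loss differs by modality.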