Tianyu Liu (@t_y_liu)'s Twitter Profile
Tianyu Liu

@t_y_liu

PhD Student @ ETH Zürich

ID: 1241011234146619392

Joined: 20-03-2020 14:40:49

33 Tweets

292 Followers

212 Following

Yifan Hou (@yyyyyyyyifan):

Adding 2% parameters can enhance your multilingual language model and turn it into a knowledge base. Happy to share our #EMNLP2022 findings paper "Adapters for Enhanced Modeling of Multilingual Knowledge and Text", which uses a set of adapters for the enhancement. #AI #NLProc

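For context, a bottleneck adapter is the standard way to add a small fraction of trainable parameters to a frozen transformer. The sketch below is a generic illustration of the idea, not the paper's implementation; the sizes are hypothetical.

```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Generic bottleneck adapter (illustrative; not the paper's code).

    With hidden_size=768 and bottleneck=48, the adapter adds roughly
    74k parameters per layer -- on the order of a percent of a
    BERT-sized backbone, in the spirit of the tweet's "2%" figure.
    """

    def __init__(self, hidden_size: int = 768, bottleneck: int = 48):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.GELU()

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # Residual form: with near-zero initialization of `up`, the
        # adapter starts as an identity and the frozen model is unchanged.
        return h + self.up(self.act(self.down(h)))
```
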
Leo Du (@leoduw):

Did you know that, when you add up the probabilities your language model gives to every possible string, you might come up with less than 1?! 🤯 Turns out, this can happen! 1/n
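
A toy construction showing how this happens (my own example, not one from the paper): if the per-step probability of emitting EOS shrinks fast enough, the model keeps generating forever with positive probability, so the total mass on finite strings falls short of 1.

```python
# Toy autoregressive model over {a, EOS}: at step t it emits EOS with
# probability 2**-(t + 2) and continues otherwise. Because those EOS
# probabilities are summable, the chance of never stopping is positive,
# and the probability mass over all finite strings sums to less than 1.

def total_finite_mass(steps: int = 200) -> float:
    mass, survive = 0.0, 1.0
    for t in range(steps):
        p_eos = 2.0 ** -(t + 2)   # probability of stopping right now
        mass += survive * p_eos   # contributes a finite string of length t
        survive *= 1.0 - p_eos    # probability of still generating
    return mass

print(total_finite_mass())  # ~0.42, well below 1
```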

Zhaofeng Wu @ ICLR (@zhaofeng_wu):

Language models show impressive performance on a wide variety of tasks, but are they overfitting to evaluation instances and specific task instantiations seen in their pretraining? How much of this performance represents general task/reasoning abilities? 1/4

Afra Amini (@afra_amini):

Are you a big fan of structure? Have you ever wanted to apply the latest and greatest large language model out-of-the-box to parsing? Are you a secret connoisseur of linear-time dynamic programs? If you answered yes, our outstanding #ACL2023NLP paper may be just right for you!

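For readers who haven't met one, the textbook example of a dynamic program that runs in time linear in the input length is Viterbi decoding for sequence labeling. This is a generic illustration of the concept, not the algorithm from the paper.

```python
def viterbi(emit, trans, init):
    """Best tag sequence under an HMM/CRF-style score.

    emit:  T x K emission log-scores
    trans: K x K transition log-scores
    init:  K initial log-scores
    Runtime is O(T * K^2): linear in the sequence length T.
    """
    T, K = len(emit), len(init)
    score = [init[k] + emit[0][k] for k in range(K)]
    back = []
    for t in range(1, T):
        ptrs, new = [], []
        for k in range(K):
            j = max(range(K), key=lambda j: score[j] + trans[j][k])
            ptrs.append(j)
            new.append(score[j] + trans[j][k] + emit[t][k])
        back.append(ptrs)
        score = new
    # Backtrace from the best final tag.
    k = max(range(K), key=lambda j: score[j])
    path = [k]
    for ptrs in reversed(back):
        k = ptrs[k]
        path.append(k)
    return path[::-1]
```
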
Ethan Gotlieb Wilcox (@wegotlieb):

Thank you to #EMNLP2023 chairs for the 😱 two 😱 outstanding paper awards! I am so grateful to have worked on these projects with wonderful colleagues — Tiago Pimentel (who is the first author on one of the papers!), Clara Isabel Meister, Kyle Mahowald and @ryandcotterell

Songlin Yang (@songlinyang4):

Data-dependent decay and state-dimension expansion are key to Mamba/GLA matching Transformers! 🚀 Also excited to present my NeurIPS spotlight paper [arxiv.org/abs/2311.04823] this Wednesday, which also shows the crucial role of data-dependent decay. Come and chat about RNNs!
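
A minimal sketch of the recurrence in question (illustrative, not the paper's implementation): the matrix-valued state is expanded relative to a scalar RNN state, and it is decayed by a gate computed from the data rather than by a fixed constant.

```python
import torch

def gated_linear_attention(q, k, v, decay):
    """Naive sequential form of a GLA-style recurrence (for clarity;
    real implementations use chunked, parallel kernels).

    q, k:  (T, d_k) queries and keys
    v:     (T, d_v) values
    decay: (T, d_k) data-dependent gates in (0, 1), e.g. produced by
           a small sigmoid-activated projection of the input
    """
    d_k, d_v = k.shape[1], v.shape[1]
    S = torch.zeros(d_k, d_v)  # expanded matrix-valued state
    outputs = []
    for t in range(q.shape[0]):
        # Decay the state per key dimension, then write the new pair.
        S = decay[t].unsqueeze(-1) * S + torch.outer(k[t], v[t])
        outputs.append(q[t] @ S)  # read out with the query
    return torch.stack(outputs)
```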

Zhaofeng Wu @ ICLR (@zhaofeng_wu):

💡We find that models “think” 💭 in English (or in general, their dominant language) when processing distinct non-English or even non-language data types 🤯 like texts in other languages, arithmetic expressions, code, visual inputs, & audio inputs ‼️ 🧵⬇️arxiv.org/abs/2411.04986

Jirui Qi (@jirui_qi):

[1/8] Seeking a faster approach to assess prompt quality in RAG QA? Our latest work may be a good fit for you. We find that prompt quality can be measured using question likelihoods, a computation that’s parallelizable on the input side of LLMs! 📄arxiv.org/abs/2411.07773 #NLProc

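A minimal sketch of the scoring idea as described in the tweet (assumptions: Hugging Face transformers, gpt2 as a stand-in checkpoint; this is not the paper's code): the likelihood of the question given the prompt comes from a single teacher-forced forward pass, so every question token is scored in parallel with no autoregressive decoding.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")           # stand-in model
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def question_loglik(prompt: str, question: str) -> float:
    """Log p(question | prompt) from one forward pass."""
    p_ids = tok(prompt, return_tensors="pt").input_ids
    q_ids = tok(question, return_tensors="pt").input_ids
    ids = torch.cat([p_ids, q_ids], dim=1)
    with torch.no_grad():
        logits = model(ids).logits
    # Logits at position i predict token i + 1, so the question tokens
    # are scored by the positions p_len - 1 .. end - 1.
    logprobs = logits[0, p_ids.shape[1] - 1 : -1].log_softmax(-1)
    return logprobs.gather(1, q_ids[0].unsqueeze(1)).sum().item()

# Higher values suggest the prompt makes the question more predictable.
print(question_loglik("Context: The Eiffel Tower is in Paris.\n",
                      "Where is the Eiffel Tower?"))
```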