Xinyan Velocity Yu (@xinyanvyu)'s Twitter Profile
Xinyan Velocity Yu

@xinyanvyu

#NLProc PhD @usc, BS/MS @uwcse | Previously @Meta @Microsoft @Pinterest | Doing random walks in Seattle

ID: 1288519465756315648

Link: https://velocitycavalry.github.io · Joined: 29-07-2020 16:59:25

117 Tweets

802 Followers

662 Following

Leo Du (@leoduw)'s Twitter Profile Photo

Following up a weekend effort with another weekend effort: llama2.rs 🦀 github.com/leo-du/llama2.…
In a single Rust file w/
* zero dependencies (i.e. custom rng w/ PCG)
* zero lines of `unsafe` code (very 🦀!)
* support for user prompts
* (almost) the same performance
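
For readers curious about the "custom rng w/ PCG" line: below is a minimal PCG32 (XSH-RR) sketch in Python for illustration only. llama2.rs implements this idea in Rust; the class and names here are mine, not code from the repo.

```python
# Illustration only: a minimal PCG32 (XSH-RR) generator in Python.
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1

class PCG32:
    MULTIPLIER = 6364136223846793005  # standard PCG 64-bit LCG multiplier

    def __init__(self, seed, seq=0):
        self.inc = ((seq << 1) | 1) & MASK64  # increment must be odd
        self.state = 0
        self.next_u32()                       # mix in the increment
        self.state = (self.state + seed) & MASK64
        self.next_u32()                       # mix in the seed

    def next_u32(self):
        old = self.state
        self.state = (old * self.MULTIPLIER + self.inc) & MASK64
        xorshifted = (((old >> 18) ^ old) >> 27) & MASK32
        rot = old >> 59                       # top 5 bits pick the rotation
        return ((xorshifted >> rot) | (xorshifted << ((32 - rot) & 31))) & MASK32

    def random(self):
        # uniform float in [0, 1), enough for temperature sampling
        return self.next_u32() / (1 << 32)

rng = PCG32(seed=42)
print([rng.next_u32() for _ in range(3)])
```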

Xinyan Velocity Yu (@xinyanvyu)'s Twitter Profile Photo

Very cool work! Happy to see it confirm our findings in CREPE (arxiv.org/abs/2211.17257): false presuppositions/premises are still challenging for LLMs 😆, and integrating a search engine makes the answers better!

Fangyuan Xu (@brunchavecmoi)'s Twitter Profile Photo

🔌Enhancing language models with retrieval boosts performance but demands more compute for encoding the retrieved documents. Do we need all the documents for the gains?
We present 𝐑etrieve 𝐂𝐨𝐦press 𝐏repend (𝐑𝐄𝐂𝐎𝐌𝐏)
arxiv.org/abs/2310.04408 (w/ Weijia Shi, Eunsol Choi)
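
A minimal sketch of the retrieve-compress-prepend loop the tweet describes; `retriever`, `compressor`, and `lm` are hypothetical stand-ins, not the paper's actual components.

```python
# A sketch of Retrieve-Compress-Prepend under assumed interfaces.
def recomp_generate(question, retriever, compressor, lm, k=5):
    # 1. Retrieve: fetch the top-k documents for the question.
    docs = retriever.search(question, k=k)
    # 2. Compress: distill the documents into a short summary, so the LM
    #    only pays encoding cost for content that actually helps. The
    #    compressor may return an empty string when retrieval isn't useful.
    summary = compressor.summarize(question, docs)
    # 3. Prepend: condition the LM on the compressed context.
    prompt = f"{summary}\n\nQuestion: {question}\nAnswer:"
    return lm.generate(prompt)
```
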
Kaiser Sun (@kaiserwholearns)'s Twitter Profile Photo

🤔 How much do compositional generalization datasets agree with each other?
We compare common compositional generalization benchmarks and find that they rank modeling approaches differently (❗) 🧵👇
#CoNLL2023 arxiv.org/abs/2310.17514
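
A sketch of the kind of agreement check this motivates: rank the modeling approaches on each benchmark and compare rankings with Kendall's tau. The benchmark names are real compositional generalization datasets, but the scores below are invented for illustration, and scipy is assumed to be available.

```python
from itertools import combinations
from scipy.stats import kendalltau

# Made-up accuracies per (benchmark, modeling approach).
scores = {
    "COGS": {"seq2seq": 0.35, "tree": 0.80, "pretrained": 0.88},
    "SCAN": {"seq2seq": 0.10, "tree": 0.95, "pretrained": 0.60},
}
models = ["seq2seq", "tree", "pretrained"]

for b1, b2 in combinations(scores, 2):
    r1 = [scores[b1][m] for m in models]
    r2 = [scores[b2][m] for m in models]
    tau, _ = kendalltau(r1, r2)  # tau = 1.0 means identical model rankings
    print(f"{b1} vs {b2}: Kendall tau = {tau:.2f}")
```
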
Deqing Fu (@deqingfu)'s Twitter Profile Photo

Do multimodal foundation models treat every modality equally?

Hint: Humans have picture superiority. How about machines?

Introducing IsoBench, a benchmark for multimodal models with isomorphic inputs.

🔗 IsoBench.github.io
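
A hypothetical sketch of the isomorphic-input protocol: feed the same underlying problem to a model in several representations and compare per-representation accuracy. `model.answer` and the problem schema are assumptions, not IsoBench's API.

```python
def isobench_eval(model, problems):
    # Each problem carries isomorphic inputs (e.g. an image rendering and
    # a LaTeX/text rendering of the same function) plus one gold label.
    hits = {"image": 0, "text": 0}
    for p in problems:
        for modality in hits:
            if model.answer(p["inputs"][modality]) == p["label"]:
                hits[modality] += 1
    n = len(problems)
    # A gap between the two numbers means the model does not treat
    # the modalities equally.
    return {m: hits[m] / n for m in hits}
```
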
Zhaofeng Wu (@zhaofeng_wu)'s Twitter Profile Photo

Want to train an aligned LM in a new language 🌏 but don’t have preference data for training the reward model (RM)?

💡 Just use an RM for another language: it often works well, sometimes even BETTER than if you had an RM in your target language! 🤯 arxiv.org/abs/2404.12318
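
A minimal best-of-n sketch of the recipe: score target-language samples with a reward model trained on another language's preference data. `policy_lm` and `source_lang_rm` are hypothetical stand-ins, not the paper's code.

```python
def best_of_n(prompt, policy_lm, source_lang_rm, n=8):
    # Sample n candidate responses in the target language.
    candidates = [policy_lm.generate(prompt, temperature=0.8) for _ in range(n)]
    # The cross-lingual RM scores them despite the language mismatch.
    scored = [(source_lang_rm.score(prompt, c), c) for c in candidates]
    return max(scored, key=lambda sc: sc[0])[1]
```
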
Xinyan Velocity Yu (@xinyanvyu)'s Twitter Profile Photo

My takeaways from figuring out living arrangements: (1) PhD students need to be paid better, since 50-75% of my salary goes to rent and commuting; (2) accessible and affordable on-campus housing should be provided; and (3) learn to drive early and live in less sketchy places. 🥲

Xinyan Velocity Yu (@xinyanvyu)'s Twitter Profile Photo

It has been a great pleasure working with Ting-Rui and others on this project to understand retrieval augmentation and LM training a little better!

Yushi Hu (@huyushi98)'s Twitter Profile Photo

Humans draw to facilitate reasoning and communication. Why not let LLMs do so?

🚀We introduce ✏️Sketchpad, which gives multimodal LLMs a sketchpad to draw on and facilitate reasoning! arxiv.org/abs/2406.09403

Sketchpad gives GPT-4o great boosts on many vision and math tasks 📈 The…
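
A hypothetical sketch of the loop the tweet describes: the model alternates between emitting drawing code and reasoning over the rendered result. `mm_lm` and `render` are stand-ins, not the paper's interface.

```python
def sketchpad_solve(mm_lm, render, task, max_turns=5):
    context = [task]
    for _ in range(max_turns):
        step = mm_lm.step(context)              # one model turn as a dict
        if step["type"] == "draw":
            image = render(step["code"])        # run the drawing program
            context += [step["code"], image]    # model sees its own sketch
        else:                                   # step["type"] == "answer"
            return step["text"]
    return None  # gave up within the turn budget
```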

Xinyan Velocity Yu (@xinyanvyu)'s Twitter Profile Photo

So happy to meet new and old friends at NAACL ❤️! I’ll be presenting our work BUFFET 🎉:
⏰ Monday, June 17th at 14:00
📍 Don Alberto 4
If you’re into multilinguality and seeking a benchmark for fair comparison across both languages & methods, don’t miss it! 🤩 #NAACL2024

Michael Saxon (in Seattle) (@m2saxon)'s Twitter Profile Photo

<a href="/sintelion/">Venelin Kovatchev</a> <a href="/BenZhou96/">Ben Zhou</a> <a href="/muhao_chen/">🌴Muhao Chen🌴</a> Awesome analysis of what KNN-LM says abt training:

Is the seeming "free lunch" of KNN-LM (replacing top LM layers with embedding store and KNN lookup) due to a weakness of the LM objctve? Seems no!

Training a replacement MLP on the KNN does better! 🤔

aclanthology.org/2024.naacl-sho…
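
For context, a minimal sketch of the standard kNN-LM interpolation the paper analyzes: neighbor lookups in an embedding datastore turned into a next-token distribution and mixed with the parametric LM. Shapes and helper names below are assumed, not the paper's code.

```python
import numpy as np

# keys:   (N, d) array of stored context embeddings
# values: (N,)   array of the next-token id recorded at each key
def knn_lm_probs(query_emb, keys, values, lm_probs, vocab_size, k=16, lam=0.25):
    # Nearest neighbors by squared L2 distance in embedding space.
    dists = np.sum((keys - query_emb) ** 2, axis=1)
    nn = np.argsort(dists)[:k]
    # Softmax over negative distances gives neighbor weights.
    s = -dists[nn]
    w = np.exp(s - s.max())
    w /= w.sum()
    # Aggregate weight onto the token stored at each neighbor.
    p_knn = np.zeros(vocab_size)
    for weight, tok in zip(w, values[nn]):
        p_knn[tok] += weight
    # Interpolate with the parametric LM's distribution.
    return lam * p_knn + (1 - lam) * lm_probs
```
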
Mukund Srinath @ NAACL (@mukundsrinath3)'s Twitter Profile Photo

#NAACL2024
<a href="/naaclmeeting/">NAACL HLT 2024</a>
Reasoning or Reciting? Exploring the Capabilities and Limitations of Language Models Through Counterfactual Tasks
Zhaofeng Wu (<a href="/zhaofeng_wu/">Zhaofeng Wu @ ACL</a>)
arxiv.org/pdf/2307.02477
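
A tiny worked example of the paper's counterfactual setup: the same skill (two-digit addition) under the default base-10 condition and a counterfactual base-9 condition. A model that truly reasons should handle both; one that recites memorized base-10 patterns should drop on base 9.

```python
def to_base(n, b):
    digits = ""
    while n:
        digits = str(n % b) + digits
        n //= b
    return digits or "0"

def addition_prompt(x, y, base):
    return (f"In base-{base}: {to_base(x, base)} + {to_base(y, base)} = "
            f"{to_base(x + y, base)}")

print(addition_prompt(27, 35, 10))  # "In base-10: 27 + 35 = 62"
print(addition_prompt(27, 35, 9))   # "In base-9: 30 + 38 = 68"
```
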
Belinda Li (@belindazli)'s Twitter Profile Photo

As the world changes, documents go out of date. How can we adapt RAG systems to a stream of changing world data?

We introduce ERASE, a way of updating and propagating facts within knowledge bases, and CLARK, a dataset targeting these update problems

arxiv.org/abs/2406.11830…

1/
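
A hypothetical sketch of the update-and-propagate idea: on each incoming document, retract contradicted facts, retract facts inferred from them, then add the new assertions. `contradicts`, `entails_from`, and `extract_facts` are illustrative stand-ins, not ERASE's actual components.

```python
def update_kb(kb, new_doc, contradicts, entails_from, extract_facts):
    # 1. Retract stored facts the incoming document contradicts.
    retracted = {f for f in kb if contradicts(new_doc, f)}
    kb = [f for f in kb if f not in retracted]
    # 2. Propagate: facts inferred from a retracted fact are stale too.
    #    entails_from(f) returns the set of facts f was derived from.
    kb = [f for f in kb if not (entails_from(f) & retracted)]
    # 3. Add the facts asserted by the new document.
    kb.extend(extract_facts(new_doc))
    return kb
```
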
Xinyan Velocity Yu (@xinyanvyu)'s Twitter Profile Photo

CodeRAG-Bench is an extremely meaningful effort! We experiment with different retrievers, types of retrieval source documents, code generation tasks, and language models to find out how retrieval can help! For more, please read our exciting paper 👉👉

CLS (@chengleisi)'s Twitter Profile Photo

Automating AI research is exciting! But can LLMs actually produce novel, expert-level research ideas?

After a year-long study, we obtained the first statistically significant conclusion: LLM-generated ideas are more novel than ideas written by expert human researchers.