Zhiyuan Zeng (@zhiyuanzeng_)'s Twitter Profile
Zhiyuan Zeng

@zhiyuanzeng_

PhD-ing @uwnlp @uwcse | Prev. @Tsinghua_Uni @TsinghuaNLP @princeton_nlp

ID: 1650962310880714753

Link: http://zhiyuan-zeng.github.io

Date: 25-04-2023 20:37:54

174 Tweets

417 Followers

216 Following

Can we use LLMs to evaluate open-ended instruction following generations? Introducing LLMBar, a benchmark for evaluating LLM evaluators
🧐LLMBar is manually curated, objective, and adversarial😈
🤯Most LLM evaluators cannot beat random guess!
📜arxiv.org/abs/2310.07641

[1/n]