Zirui Cheng (@zirui_cheng_) 's Twitter Profile
Zirui Cheng

@zirui_cheng_

MS Student at @UofIllinois | Prev Undergrad @Tsinghua_Uni | Machine Learning, Human-Computer Interaction

ID: 1338850987679698944

Link: http://chengzr01.github.io | Joined: 15-12-2020 14:18:55

14 Tweets

104 Followers

523 Following

Zirui Cheng (@zirui_cheng_) 's Twitter Profile Photo

After spending a wonderful day with my best friend studying at ETH Zurich, I am finally heading for #CHI2023! Can’t wait for my first CHI! 😍
Tsinghua CS (@thudcst) 's Twitter Profile Photo

Congrats to our HCI Lab for winning CHI2023 Honorable Mention Award as top-5% papers! It introduces Voice-Accompanying Hand-to-Face Gesture (VAHF) as a parallel channel for smarter voice interaction on wearable devices. Check out more at bit.ly/3NhtUyX. #CHI2023
Zhiyuan Zeng (@zhiyuanzeng_) 's Twitter Profile Photo

Can we use LLMs to evaluate open-ended instruction following generations? Introducing LLMBar, a benchmark for evaluating LLM evaluators
🧐LLMBar is manually curated, objective, and adversarial😈
🤯Most LLM evaluators cannot beat random guess!
📜arxiv.org/abs/2310.07641

[1/n]
WikiResearch (@wikiresearch) 's Twitter Profile Photo

"Wikibench: Community-Driven Data Curation for AI Evaluation on Wikipedia" (for e.g. vandalism-detecting models such as ORES) arxiv.org/html/2402.1414… meta.wikimedia.org/wiki/Research:…

"Wikibench: Community-Driven Data Curation for AI Evaluation on Wikipedia" (for e.g. vandalism-detecting models such as ORES) arxiv.org/html/2402.1414… meta.wikimedia.org/wiki/Research:…
Tzu-Sheng Kuo 郭子生 (@tzushengkuo) 's Twitter Profile Photo

✨New #CHI2024 Paper

How might we empower communities to curate evaluation datasets for AI that impacts them?

We present Wikibench, a system that enables communities to collaboratively curate AI datasets, while navigating ambiguities and disagreements through discussion. (1/9)
Jiaxuan You (@youjiaxuan) 's Twitter Profile Photo

🚀 Excited to announce "How Far Are We From AGI?", the first paper summarizing current and future research directions toward #AGI. We hope it inspires researchers to achieve AGI responsibly as a #community.

📄 Paper: arxiv.org/abs/2405.10313
💻 GitHub (PR🤗): github.com/ulab-uiuc/AGI-…
Jiaxuan You (@youjiaxuan) 's Twitter Profile Photo

We sincerely appreciate the successful organization of the ICLR 2024 AGI Workshop, the most popular workshop at ICLR with 800+ attendees. Keynotes by Yoshua Bengio, Oriol Vinyals, Yejin Choi, Andrew G Wilson, and Song Han are summarized in our paper. 
Web: agiworkshop.github.io
Jiaxuan You (@youjiaxuan) 's Twitter Profile Photo

(1/n)
The human research community is far from perfect. Frustrated with NeurIPS results? Research Town simulates the community as a graph of LLM agents and knowledge. It helps you find ideas, receive reviews, refine proposals, get metareviews - essentially running "NeurIPS" with LLMs.
Lifan Yuan (@lifan__yuan) 's Twitter Profile Photo

Wanna train PRMs, but process labels, annotated manually or automatically, sound too expensive to you😖?
Introducing Implicit PRM🚀 – get free process rewards for your model by training an ORM on cheaper response-level data, with a simple parameterization at no additional cost💰!
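A minimal sketch of the idea the tweet describes, under the assumption (not stated in the tweet itself) that the reward is parameterized as a β-scaled log-likelihood ratio between the trained model and a frozen reference model, so that per-step process rewards fall out as partial sums of token-level ratios. The function name, β value, and toy log-probs below are all illustrative:

```python
def implicit_process_rewards(policy_logprobs, ref_logprobs, beta=1.0):
    """Per-token process rewards from log-probs under the trained model and a
    frozen reference model. The outcome (response-level) reward is the sum over
    all tokens; the process reward at step t is the partial sum up to t."""
    ratios = [beta * (p - r) for p, r in zip(policy_logprobs, ref_logprobs)]
    process, total = [], 0.0
    for delta in ratios:
        total += delta
        process.append(total)  # cumulative reward through this step
    return process

# Toy example: a token where the policy is more confident than the reference
# contributes positively; a much less confident token drags the reward down.
policy = [-0.1, -0.2, -1.5]
ref = [-0.5, -0.2, -0.5]
rewards = implicit_process_rewards(policy, ref, beta=1.0)
```

The point of the sketch is that no step-level labels appear anywhere: only response-level training of the ORM is needed, and the per-step signal is recovered from the parameterization itself.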
Zhiyuan Zeng (@zhiyuanzeng_) 's Twitter Profile Photo

Is a single accuracy number all we can get from model evals?🤔
🚨Does NOT tell where the model fails
🚨Does NOT tell how to improve it
Introducing EvalTree🌳
🔍identifying LM weaknesses in natural language
🚀weaknesses serve as actionable guidance
(paper&demo 🔗in🧵)

[1/n]

Beyza Bozdag @ NAACL’25 (@nbbozdag) 's Twitter Profile Photo

Thrilled to announce our new survey that explores the exciting possibilities and troubling risks of computational persuasion in the era of LLMs 🤖💬
📄Arxiv: arxiv.org/pdf/2505.07775 
💻 GitHub: github.com/beyzabozdag/Pe…
Sagnik Mukherjee (@saagnikkk) 's Twitter Profile Photo

🚨 Paper Alert: “RL Finetunes Small Subnetworks in Large Language Models”

From DeepSeek V3 Base to DeepSeek R1 Zero, a whopping 86% of parameters were NOT updated during RL training 😮😮
And this isn’t a one-off. The pattern holds across RL algorithms and models.
🧵A Deep Dive
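A figure like the 86% in the tweet can be checked mechanically by comparing two checkpoints element-wise. A minimal sketch of that measurement (the dict-of-lists checkpoint format and the exact-equality tolerance here are simplifying assumptions; real checkpoints would be tensors loaded from disk):

```python
def frac_unchanged(params_before, params_after, tol=0.0):
    """Fraction of scalar parameters whose value did not move between two
    checkpoints, given as dicts mapping parameter names to flat float lists."""
    unchanged = total = 0
    for name, before in params_before.items():
        after = params_after[name]
        for b, a in zip(before, after):
            total += 1
            if abs(b - a) <= tol:
                unchanged += 1
    return unchanged / total

# Toy example: 3 of 4 scalars identical across checkpoints -> 0.75
before = {"w": [1.0, 2.0], "b": [0.5, -0.5]}
after = {"w": [1.0, 2.0], "b": [0.5, -0.4]}
```

A nonzero `tol` matters in practice: mixed-precision training can perturb values that are effectively untouched, so "not updated" usually means "unchanged up to numerical noise".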