LLM Evals Workshop @NeurIPS (@llm_eval) Twitter Tweets • TwiCopy

LLM Evals Workshop @NeurIPS

@llm_eval

NeurIPS 2025 Workshop. Evaluating the Evolving LLM Lifecycle: Benchmarks, Emergent Abilities, and Scaling

ID: 1945760836083216384

Link: https://sites.google.com/corp/view/llm-eval-workshop

Date: 17-07-2025 08:22:36

18 Tweets

88 Followers

16 Following

Ahmad Beirami @ ICLR 2025

@abeirami

4 months ago

We are still in an Evaluation crisis!

112 likes · 8 replies · 9 retweets

Ahmad Beirami ✈️ NeurIPS

Indeed! In our NeurIPS workshop LLM Evals Workshop @NeurIPS, we’ll tackle the most pressing evaluation challenges. Join us to discuss how we should design the next generation of evaluations with experts in the field! More details: sites.google.com/view/llm-eval-…

24 likes · 0 replies · 3 retweets

Berivan Isik

@berivanisik

4 months ago

Deadline for LLM Evals Workshop @NeurIPS submissions is September 4th! If you’re interested in helping us as a reviewer, please volunteer 👇

7 likes · 1 reply · 1 retweet

Berivan Isik

@berivanisik

4 months ago

Volunteer to be a reviewer: docs.google.com/forms/d/1JPQB2…

2 likes · 0 replies · 1 retweet

LLM Evals Workshop @NeurIPS

@llm_eval

3 months ago

Last 2 days to the deadline! If you're interested in helping with the reviewing process, please consider volunteering using the following form. There will be best reviewer awards! docs.google.com/forms/d/e/1FAI…

7 likes · 0 replies · 2 retweets

Riccardo Cadei

@riccardocadeii

2 months ago

The Narcissus Hypothesis: Recursive training on semi-synthetic corpora enforcing human alignment induces a Social Desirability Bias: world-models (Narcissus) aim to please rather than represent, polluting data lakes and charming us (Echo) into hanging on their every word.

6 likes · 1 reply · 4 retweets

Riccardo Cadei

@riccardocadeii

2 months ago

Sketched on a few Parisian summer nights with a friend, Christian Internò. If you care about (causal) identification in a semi-synthetic future, we’d value your read and critique. Preprint: arxiv.org/pdf/2509.17999 Accepted at the LLM Evals Workshop @NeurIPS, NeurIPS Conference

3 likes · 0 replies · 1 retweet

Berivan Isik

@berivanisik

2 days ago

I’ll be at NeurIPS Conference all week and would love to connect on LLM data, evaluation, benchmarking, and scaling laws. If you’re working on related problems, feel free to reach out. PS: Don’t miss our one-of-a-kind workshop on LLM evaluation: sites.google.com/view/llm-eval-…

93 likes · 6 replies · 4 retweets