Roy Bar Haim (@roybarhaim)'s Twitter Profile
Roy Bar Haim

@roybarhaim

Senior Technical Staff Member, NLP at IBM Research AI. Opinions are my own

ID: 1058255708

Link: https://research.ibm.com/people/roy-bar-haim | Joined: 03-01-2013 16:54:36

24 Tweets

42 Followers

36 Following

Roy Bar Haim (@roybarhaim)'s Twitter Profile Photo

NAACL'21 main conference is starting today! Meet our researchers and recruiting team at the IBM Research virtual booth: underline.io/events/122/exp…, and learn more about IBM Research's presence at @NAACLHLT, careers and booth schedule at ibm.biz/naacl2021

Orith Toledo-Ronen (@orithtoledo)'s Twitter Profile Photo

Interested in TARGETED #SentimentAnalysis beyond restaurant reviews? In #NAACL2022 we suggest a robust multi-domain model relying on self-training, with no extra annotation -- arxiv.org/abs/2205.03804 Orith Toledo-Ronen matan orbach Yoav Katz Noam Slonim 🟢 #NLProc #IBMResearch (1/5)

Roy Bar Haim (@roybarhaim)'s Twitter Profile Photo

NAACL 2022 is starting on Sunday! Visit our website ibm.biz/naacl2022 to learn about the exciting NLP work from IBM Research that will be presented at this conference. IBM Research NAACL HLT 2027 #NAACL2022

Avi Sil (@aviaviavi__)'s Twitter Profile Photo

Welcome PrimeQA at #NAACL2022! Replicate the state-of-the-art on multilingual open QA quickly! Here’s a new open-source repo in collaboration with Stanford NLP Group, Hugging Face, Uni Stuttgart, and @NLPIllinois1. Link: github.com/primeqa/primeqa Talk to me or read: research.ibm.com/blog/primeqa-f… 🧵

Eyal Shnarch (@eyalshnarch)'s Twitter Profile Photo

Want to build a text classifier in a few hours?

Even if you don’t have any:
labeled data
#machineLearning knowledge
programming skills

Label Sleuth (ibm.biz/label-sleuth), a new open-source, no-code system for annotations 🧵 IBM Research University of Notre Dame Stanford Human-Computer Interaction Group UT Dallas #NLProc
Arie Cattan (@ariecattan)'s Twitter Profile Photo

Curious to see how we can summarize opinions beyond plain text summaries? Check out our #ACL2023 paper: From Key Points to Key Point Hierarchy: Structured and Expressive Opinion Summarization, with Lilach Eden, Yoav Kantor, and Roy Bar Haim from IBM Research. IBM BIU NLP >>

Argument Mining (@argminingorg)'s Twitter Profile Photo

We are excited to announce that the Argument Mining workshop will take place at #ACL2024 in Bangkok, Thailand. For more info see our website at argmining-org.github.io/2024/

Argument Mining (@argminingorg)'s Twitter Profile Photo

We are happy to announce two shared tasks for ArgMining 2024: 1) Perspective Argument Retrieval organized by Neele Falk and Andreas Waldis. 2) DialAM-2024 organized by Ramon Ruiz-Dolz, John Lawrence, Ella Schad, and Chris Reed.

Argument Mining (@argminingorg)'s Twitter Profile Photo

The call for papers for the 11th Workshop on Argument Mining #argmining_2024 is now out: argmining-org.github.io/2024/index.htm…

Argument Mining (@argminingorg)'s Twitter Profile Photo

ArgMining 2024 ended with a great photo of its wonderful community. Kudos to all of your great ideas, contributions, and help in organizing.

Ariel Gera (@arielgera2)'s Twitter Profile Photo

Say I want to compare system qualities - pick between 2 configurations, or rank a whole bunch of models. I'll use LLM-as-a-judge, right? 🧑🏻‍⚖️ But how do I know the LLM judge is up to the task? Who is a good judge for ranking systems? Enter our new paper!✨🧵 arxiv.org/abs/2412.09569

Asaf Yehudai (@asafyehudai)'s Twitter Profile Photo

New preprint! ✨ Interested in LLM-as-a-Judge? Want to get the best judge for ranking your system? Our new work is just for you: "JuStRank: Benchmarking LLM Judges for System Ranking" 🕺💃 arxiv.org/abs/2412.09569

Asaf Yehudai (@asafyehudai)'s Twitter Profile Photo

Survey on Evaluation of LLM-based Agents 🤖 Our paper is the first to provide a comprehensive overview of LLM-based agent evaluation 📜 Paper: arxiv.org/pdf/2503.16416

Asaf Yehudai (@asafyehudai)'s Twitter Profile Photo

Interested in Agent Evaluation? 🤖 We’re excited to launch our new repo: “Evaluation of LLM-based Agents: A Reading List” 📚 Browse benchmarks, methods, and frameworks from our recent survey. 👉 Explore & Contribute: github.com/Asaf-Yehudai/L… #LLMAgents #AgentEvaluation

Noy Sternlicht (@noysternlicht)'s Twitter Profile Photo

🔔 New Paper! We propose a challenging new benchmark for LLM judges: Evaluating debate speeches. Are they comparable to humans? Well... it’s debatable. 🤔 noy-sternlicht.github.io/Debatable-Inte… 👇 Here are our findings:

elvis (@omarsar0)'s Twitter Profile Photo

Evaluating LLM-based Agents This report has a comprehensive list of methods for evaluating AI Agents. Don't ignore evals. If done right, they are a game-changer. Highly recommend it to AI devs. (bookmark it)

Asaf Yehudai (@asafyehudai)'s Twitter Profile Photo

🚨 Benchmarks tell us which model is better — but not why it fails. For developers, this means tedious, manual error analysis. We're bridging that gap. Meet CLEAR: an open-source tool for actionable error analysis of LLMs. 🧵👇
