Jesse Vig (@jesse_vig) 's Twitter Profile
Jesse Vig

@jesse_vig

AI Researcher

ID: 36762530

linkhttps://jessevig.com calendar_today30-04-2009 20:27:17

460 Tweet

2,2K Followers

1,1K Following

Ian Tenney (@iftenney@sigmoid.social) (@iftenney) 's Twitter Profile Photo

Excited to announce v0.5 of the Google AI Learning Interpretability Tool (🔥LIT), an interactive platform to debug, validate, and understand ML model behavior. v0.5 includes exciting features and a new name! pair-code.github.io/lit/ #NLProc #googlePAIR (1/7)

Excited to announce v0.5 of the <a href="/GoogleAI/">Google AI</a> Learning Interpretability Tool (🔥LIT), an interactive platform to debug, validate, and understand ML model behavior. v0.5 includes exciting features and a new name! pair-code.github.io/lit/ #NLProc #googlePAIR (1/7)
Yonatan Belinkov (@boknilev) 's Twitter Profile Photo

People have been asking for slides of our ACL 2020 tutorial w/ Sebastian Gehrmann Ellie Pavlick Brown NLP on Interpretability and analysis of #nlproc. Thanks to ACL Anthology team it’s now here: aclanthology.org/2020.acl-tutor… Hopefully still useful though much has changed in the field since.

Alex Fabbri (@alexfabbri4) 's Twitter Profile Photo

🚨🆕📄🚨 How gold is your human evaluation? We seek the answer, and its implications in the GPT3 era, in our preprint “Revisiting the Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation” Paper: arxiv.org/abs/2212.07981 Equal contribution Yixin Liu

🚨🆕📄🚨
How gold is your human evaluation? We seek the answer, and its implications in the GPT3 era, in our preprint “Revisiting the Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation”
 
Paper:  arxiv.org/abs/2212.07981 
Equal contribution <a href="/YixinLiu17/">Yixin Liu</a>
Alex Fabbri (@alexfabbri4) 's Twitter Profile Photo

You can explore the ACU annotations in Rose🌹along with protocol results on our demo page and start using our dataset! Repo: github.com/Yale-LILY/ROSE Demo page: yale-lily.github.io/ROSE/ Dataset: huggingface.co/datasets/Sales…

Wojciech Kryściński (@iam_wkr) 's Twitter Profile Photo

Very excited to have the opportunity to present research done at Salesforce AI Research on Automatic Text Summarization at Zespół Inżynierii Lingwistycznej IPI PAN „Long Story Short: A Talk about Text Summarization” will cover the current state of the field, existing challenges, and future directions.

Jesse Vig (@jesse_vig) 's Twitter Profile Photo

How can NLP help us understand the diversity of news coverage of a topic? Check out the latest work from Philippe Laban et al. appearing at #CHI2023 this week.

Yixin Liu (@yixinliu17) 's Twitter Profile Photo

Delighted to announce our paper has been accepted for an oral presentation at #ACL2023 oral! In this work we emphasize the intricate complexity of human evaluation while it is becoming even more crucial for both model training and evaluation in the LLM era.

Caiming Xiong (@caimingxiong) 's Twitter Profile Photo

Finding a document too dense to decipher? 🤔Content a bit convoluted? Essay too esoteric? Check how we simplify and improve document readability using SWiPE. Join us in making knowledge accessible to all! 🌐 🔗Paper: arxiv.org/abs/2305.19204 🔗Github: github.com/salesforce/sim…

Caiming Xiong (@caimingxiong) 's Twitter Profile Photo

By aligning Wikipedia articles to their simplified versions on Simple Wikipedia, we reconstruct the process by which human editors simplify whole documents, in contrast to prior work focused on sentence-level simplification.

By aligning Wikipedia articles to their simplified versions on Simple Wikipedia, we reconstruct the process by which human editors simplify whole documents, in contrast to prior work focused on sentence-level simplification.
WikiResearch (@wikiresearch) 's Twitter Profile Photo

"SWIPE: A Dataset for Document-Level Simplification of Wikipedia Pages" leveraging the entire revision history when pairing enwiki/simplewiki pages, to identify simplification edits. (Laban et al, 2023) arxiv.org/pdf/2305.19204… Wojciech Kryściński

"SWIPE: A Dataset for Document-Level Simplification of Wikipedia Pages"  leveraging the entire revision history when pairing enwiki/simplewiki pages, to identify simplification edits.

(Laban et al, 2023)

arxiv.org/pdf/2305.19204…
<a href="/iam_wkr/">Wojciech Kryściński</a>
Caiming Xiong (@caimingxiong) 's Twitter Profile Photo

🤔Which words in your prompt are most helpful to language models? In our #ACL2023NLP paper, we explore which parts of task instructions are most important for model performance. 🔗 arxiv.org/abs/2306.01150 Code: github.com/fanyin3639/Ret…

🤔Which words in your prompt are most helpful to language models? In our #ACL2023NLP paper, we explore which parts of task instructions are most important for model performance.
🔗 arxiv.org/abs/2306.01150
Code: github.com/fanyin3639/Ret…
Caiming Xiong (@caimingxiong) 's Twitter Profile Photo

Excited to share a new preprint on the 🩴FlipFlop Effect. We prompt LLMs with a classification task, and challenge the model by following up with “Are you sure?”. The model can confirm or flip its answer. The results? More flips than a gymnastics competition! 🤸‍♂️ 1/N

Excited to share a new preprint on the 🩴FlipFlop Effect.
We prompt LLMs with a classification task, and challenge the model by following up with “Are you sure?”. The model can confirm or flip its answer. The results? More flips than a gymnastics competition! 🤸‍♂️ 1/N
Philippe Laban (@philippelaban) 's Twitter Profile Photo

Excited to share this fun new work on the 🩴FlipFlop Effect. In short: if you ask models if they're sure of their answers, they tend to change their minds (and severely degrade accuracy). What's mindblowing is how universal the effect is across LLMs (GPTs, Gemini, Claudes, …).