Ola Piktus
@olapiktus
ID: 1102704797403238400
04-03-2019 22:58:05
70 Tweet
1,1K Followers
405 Following
The Roots Search Tool from Hugging Face allowed me to quickly find XNLI examples in the BLOOM pre-training data - probably true for other benchmarks too. This 🔥 tool really highlights open-data as key to making the study of LLM capabilities a science. huggingface.co/spaces/bigscie…
📢 New blog post: the attribution problem with generative AI #NLProc AI & Society TLDR: Some argue that publicly available data is fair game for commercial models bc human text/art also has sources. But unlike models, we know when attribution is due... hackingsemantics.xyz/2022/attributi…
ROOTS Search Tool now shinier and better with both exact search and BM25-based sparse retrieval ☀️🌸 Check out Anna Rogers 's thread for details 🚀
[#FridayWiMLDSPaper 📜 curated by Marie Sacksick] "The ROOTS Search Tool: Data Transparency for LLMs", by Ola Piktus, Christopher Akiki, Paulo Villegas, Hugo Laurençon, @ggdupont, Sasha Luccioni, PhD 🦋🌎✨🤗, Yacine Jernite and Anna Rogers arxiv.org/abs/2302.14035
Every researcher's path is a rollercoaster filled with numerous lows. With unspeakable joy, today I want to share a high point: our paper, “Improving Wikipedia Verifiability with AI,” has been published in the prestigious Nature Portfolio. Dive in nature.com/articles/s4225… [1/4]
Our work on multi-epoch scaling laws with the amazing Niklas Muennighoff (applying to grad school!) and Sasha Rush Teven Le Scao Ola Piktus Nouamane Tazi TurkuNLP Thomas Wolf Colin Raffel won runner-up outstanding paper award! See Niklas Muennighoff 's thread x.com/Muennighoff/st…
Awesome poster presentation by Niklas Muennighoff for the paper "Scaling Data-Constrained Language Models" at #NeurIPS2023 Kudos Sasha Rush Boaz Barak Teven Le Scao Ola Piktus Nouamane Tazi Sampo Pyysalo Thomas Wolf Colin Raffel