Instruction Workshop, NeurIPS 2023 (@itif_workshop)'s Twitter Profile

The official account of the 1st Workshop on Instruction Tuning and Instruction Following (ITIF), colocated with NeurIPS, in December 2023.

ID: 1689312241542430721

Link: https://an-instructive-workshop.github.io/

Joined: 09-08-2023 16:27:08

185 Tweets

257 Followers

28 Following

Shayne Longpre (@shayneredford)

Thrilled to collaborate on the launch of 📚 CommonPile v0.1 📚 ! Introducing the largest openly-licensed LLM pretraining corpus (8 TB), led by Nikhil Kandpal Brian Lester Colin Raffel. 📜: arxiv.org/pdf/2506.05209 📚🤖 Data & models: huggingface.co/common-pile 1/

Shayne Longpre (@shayneredford)

Copyrighted 🚧, private 🛑, and sensitive ☢️ data remain major challenges for AI. FlexOlmo introduces an architectural mechanism to flexibly opt-in/opt-out segments of data in the training weights, **at inference time**. (Prior common solutions were to filter your data once