Instruction Workshop, NeurIPS 2023 (@itif_workshop)'s Twitter Profile
Instruction Workshop, NeurIPS 2023

@itif_workshop

The official account of the 1st Workshop on Instruction Tuning and Instruction Following (ITIF), colocated with NeurIPS, in December 2023.

ID: 1689312241542430721

Link: https://an-instructive-workshop.github.io/
Joined: 09-08-2023 16:27:08

185 Tweets

257 Followers

28 Following

Shayne Longpre (@shayneredford)

Thrilled to collaborate on the launch of 📚 CommonPile v0.1 📚!

Introducing the largest openly-licensed LLM pretraining corpus (8 TB), led by Nikhil Kandpal (@kandpal_nikhil), Brian Lester (@blester125), and Colin Raffel (@colinraffel).

📜: arxiv.org/pdf/2506.05209
📚🤖 Data & models: huggingface.co/common-pile
1/
Shayne Longpre (@shayneredford)

Copyrighted 🚧, private 🛑, and sensitive ☢️ data remain major challenges for AI.

FlexOlmo introduces an architectural mechanism to flexibly opt-in/opt-out segments of data in the training weights, **at inference time**.

(Prior common solutions were to filter your data once