Daniel van Strien (@vanstriendaniel)'s Twitter Profile
Daniel van Strien

@vanstriendaniel

Machine Learning Librarian @huggingface 🤗
I like datasets.

ID: 2828117077

Link: https://danielvanstrien.xyz/
Joined: 23-09-2014 13:43:54

4.4K Tweets

4.4K Followers

1.1K Following

Alex Strick van Linschoten (@strickvl)

I've been building a research tool that automatically extracts entities from historical document collections to create structured knowledge databases. (It's called 'hinbox' (i.e. 'historian in a box') via McChrystal's framing IYKYK 🫠) The project connects my historian

Daniel van Strien (@vanstriendaniel)

Over 1.5M models on Hugging Face… How do you pick the right one for your needs? 🔍 Try this semantic search prototype with size filters (0-1B to 70B+): 🔗 huggingface.co/spaces/librari…

Edwin Rijgersberg (@e_rijgersberg)

At the NFI, our AI can be both incredibly nerdy and incredibly awesome! As one of the first acts of the newly founded NFI Digital Forensics Datalab, we're open sourcing a new binary code embedding model for ARM64-code: 🦾 ARM64BERT-embedding 🦾 huggingface.co/NetherlandsFor…

Dongfu Jiang (@dongfujiang)

🚀 New benchmark alert! StructEval evaluates how well LLMs generate structured outputs - from JSON APIs to React components to scientific diagrams. We tested 12 models and found some interesting results… 🧵 Why structured outputs matter (and why they're hard): - Real apps

Raphaël Sourty (@raphaelsrty)

I'm thrilled to announce the release of FastPlaid! 🚀🚀 FastPlaid is a high-performance engine for multi-vector search, built from the ground up in Rust (with the help of Torch C++)⚡️ You can view FastPlaid as the counterpart of Faiss for multi vectors.

vLLM (@vllm_project)

Congrats on the launch! vLLM is proud to support the new Qwen3 embedding models, check it out 👉🏻 github.com/QwenLM/Qwen3-E…

Ryan Marten (@ryanmart3n)

Announcing OpenThinker3-7B, the new SOTA open-data 7B reasoning model: improving over DeepSeek-R1-Distill-Qwen-7B by 33% on average over code, science, and math evals. We also release our dataset, OpenThoughts3-1.2M, which is the best open reasoning dataset across all data

Sebastian Majstorovic (@storytracer)

Common Pile v0.1 is only the beginning. At EleutherAI we will publish open datasets on a regular basis from now on, using the toolkit we launched together with Mozilla to extract high-quality data from openly licensed content: blog.mozilla.org/en/mozilla/ai/…. Stay tuned for more!

Colaboratory (@googlecolab)

🤗The future of ML is accessible & collaborative 🤝 We’ve partnered with Hugging Face to add “Open in Colab” support for all models on the Hugging Face Hub. Now you can directly launch a Colab notebook from any model card, making it easier than ever to experiment with and