Sinatras (@myainotez)'s Twitter Profile
Sinatras

@myainotez

Entropy Preservation Officer
Bs CS&EE , AI/ML Engineer in Automotive

ID: 1795764095041302529

Link: http://sinatras.dev · Joined: 29-05-2024 10:28:32

939 Tweets

751 Followers

253 Following

Luis (@lusxvr)'s Twitter Profile Photo

FineVision is not only bigger and more diverse than 3 popular open-source alternatives, models trained on it also perform significantly better.

Check out all the details in the Blog Post: huggingface.co/spaces/Hugging…
Omar Sanseviero (@osanseviero)'s Twitter Profile Photo

Introducing EmbeddingGemma🎉

🔥With only 308M params, this is the top open model under 500M
🌏Trained on 100+ languages
🪆Flexible embeddings (768 to 128 dims) with Matryoshka
🤗Works with your favorite open tools
🤏Runs with as little as 200MB

developers.googleblog.com/en/introducing…
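The Matryoshka trick mentioned above means the leading coordinates of an embedding carry the most information, so you can truncate a 768-dim vector to 128 dims and re-normalize it, rather than retraining a smaller model. A minimal sketch of that truncation step (the 768-dim vector here is a random stand-in, not real model output):

```python
import math
import random

def truncate_embedding(vec, dim):
    """Matryoshka-style truncation: keep the first `dim` coordinates,
    then re-normalize so cosine similarity still behaves."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head

# Stand-in for a 768-dim embedding from a model like EmbeddingGemma.
random.seed(0)
full = [random.gauss(0, 1) for _ in range(768)]

small = truncate_embedding(full, 128)
print(len(small))  # 128
```

In practice, libraries that support Matryoshka embeddings expose this as an option at encode time; the point is that the truncated vector remains a valid unit-length embedding.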
DailyPapers (@huggingpapers)'s Twitter Profile Photo

Meta FAIR just unveiled VLWM on Hugging Face! This Vision Language World Model is a new foundation for planning with reasoning directly on natural videos. It combines reactive (system-1) and reflective (system-2) planning for SoTA performance!

OpenAI (@openai)'s Twitter Profile Photo

By popular request: you can now branch conversations in ChatGPT, letting you more easily explore different directions without losing your original thread.

Available now to logged-in users on web.
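Conceptually, branching like this just means the conversation is a tree rather than a list: each message points at its parent, so two branches share history up to the fork point and neither mutates the other. A hypothetical sketch (names are illustrative, not OpenAI's actual data model):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Message:
    text: str
    parent: Optional["Message"] = None

def thread(leaf):
    """Walk parent pointers to recover the full history of one branch."""
    out = []
    while leaf is not None:
        out.append(leaf.text)
        leaf = leaf.parent
    return list(reversed(out))

root = Message("What is entropy?")
a1 = Message("Entropy measures disorder...", parent=root)
# Branch the same thread in two directions without losing the original:
b1 = Message("Explain it thermodynamically.", parent=a1)
b2 = Message("Explain it information-theoretically.", parent=a1)

print(thread(b1))  # shares its first two messages with thread(b2)
```

Because branches only hold parent pointers, forking is O(1) and the original thread stays untouched.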
Qwen (@alibaba_qwen)'s Twitter Profile Photo

Big news: Introducing Qwen3-Max-Preview (Instruct) — our biggest model yet, with over 1 trillion parameters! 🚀

Now available via Qwen Chat & Alibaba Cloud API.

Benchmarks show it beats our previous best, Qwen3-235B-A22B-2507. Internal tests + early user feedback confirm:
Prime Intellect (@primeintellect)'s Twitter Profile Photo

Lean 4 Theorem Proving

Multi-turn formal theorem proving in Lean 4, where models alternate between reasoning, sketching proof code, and receiving feedback. Ideal for search-guided RL, process rewards, and curriculum design.

By Sinatras app.primeintellect.ai/dashboard/envi…
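To give a flavor of a single turn in such an environment: the model is handed a goal, emits a candidate proof term or tactic script, and the Lean checker replies with success or an error message the model can react to. A toy goal of that kind, closed with a core-library lemma:

```lean
-- The environment poses a goal; the model sketches a proof and the
-- Lean 4 kernel checks it, returning errors on a failed attempt.
theorem add_comm' (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```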

Guilherme Penedo (@gui_penedo)'s Twitter Profile Photo

> we've hit a data wall
> pretraining is dead

Is it?

Today we are releasing 📄 FinePDFs: 3T tokens of new text data for pre-training that until now had been locked away inside PDFs. 
It is the largest permissively licensed corpus sourced exclusively from PDFs.
elie (@eliebakouch)'s Twitter Profile Photo

Freshly curated open dataset with 3T multilingual tokens from PDFs.

> containing about 3 trillion tokens across 475 million documents in 1,733 languages.
> new source of data, with a knowledge cutoff in february 2025
> sota performance when mixed with fineweb-edu/dclm.
Sinatras (@myainotez)'s Twitter Profile Photo

Test it for free. I got some impressive eval results for its size on a couple of domain-specific tasks this past week. Will revisit it with SFT in the near future.

Sinatras (@myainotez)'s Twitter Profile Photo

Official reminder that your favorite coding assistant will be quantized to 1.58 bits in ~10 minutes. Have fun with the smart model while it lasts. Until next time, cheers.
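The "1.58 bits" in the joke refers to ternary weights: with each weight restricted to {-1, 0, +1}, the information content is log2(3) ≈ 1.585 bits per weight, as in BitNet-style quantization. A minimal absmean-style sketch of that rounding step (illustrative only, not any vendor's actual kernel):

```python
import math

def quantize_ternary(weights):
    """Scale by the mean absolute value, then snap each weight to -1/0/+1."""
    scale = sum(abs(w) for w in weights) / len(weights) or 1.0
    q = [max(-1, min(1, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Approximate reconstruction: each ternary value times the shared scale."""
    return [v * scale for v in q]

w = [0.8, -0.3, 0.05, -1.2]
q, s = quantize_ternary(w)
print(q)  # every entry lies in {-1, 0, 1}
print(round(math.log2(3), 3))  # 1.585 bits per ternary weight
```

Small weights collapse to 0 and everything else to ±1 times a single shared scale, which is exactly why a "smart" full-precision model feels blunter after the squeeze.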