Milan Kryl (@mikr)'s Twitter Profile
Milan Kryl

@mikr

Data Scientist at @AlmaCareer_CZ
Training models for work and pleasure.
#python #huggingface #whisper #mistral #llama #LLM #hardware #technology

ID: 14515944

Link: http://www.milankryl.cz/
Joined: 24-04-2008 18:53:03

12.12K Tweets

668 Followers

318 Following

Marktechpost AI Research News ⚡ (@marktechpost)

LLMs No Longer Require Powerful Servers: Researchers from MIT, KAUST, ISTA, and Yandex Introduce a New AI Approach to Rapidly Compress Large Language Models without a Significant Loss of Quality The Yandex Research team, together with researchers from the Massachusetts Institute

Sergei Nozdrenkov (@nozdrenkov)

We’re open-sourcing 352GB of Coral Reef pics (13 sites, 90k pics) from Indonesia under CC-BY-4.0 🌏🪸 3D photogrammetry data to accelerate research/conservation, no strings attached 🤗 🔵 Why? Coral reefs are so precious, beautiful, incredibly complex and threatened ecosystems.

Aurick Qiao (@aurickq)

Excited to share our work on Speculative Decoding @Snowflake AI Research!

🚀 4x faster LLM inference for coding agents like OpenHands (All Hands AI)
💬 2.4x faster LLM inference for interactive chat
💻 Open-source via Arctic Inference as a plugin for vLLM

🧵

clem 🤗 (@clementdelangue)

The LeRobot hackathon is now scheduled to happen in 44 different locations at the same time. Which city is missing? London (UK) - Cotonou (Benin) - Toulouse, Paris & 2 in Lyon (France) - Antwerp (Belgium) - Santiago (Chile) - Isfahan (Iran) - Aachen, Berlin & Munich (Germany)

Neel Kant (@_neel_kant)

🎉 Factorio Learning Environment 0.2.0 released!
📖 Details: jackhopkins.github.io/factorio-learn…

New Features:
- Multi-agent support
- Reasoning models + MCP
- Reflection & backtracking
- Vision-augmented inputs
and more frontier model results!

The initial release of FLE was met with great

Crémieux (@cremieuxrecueil)

Someone slapped together a big dataset of password leaks today. Here's the distribution of pin numbers from a few times those got leaked a while back.

Black Forest Labs (@bfl_ml)

High quality image editing no longer needs closed models

We release FLUX.1 Kontext [dev] - an open weights model for proprietary-level image editing performance. Runs on consumer chips.

✓ Open weights available
✓ Best in-class performance
✓ Self-serve commercial licensing

elie (@eliebakouch)

New variant of attention by Meta going beyond the standard bilinear form. It changes the beta coefficient in scaling laws (which is a big deal), and there is an efficient Triton implementation. Huge.

机器之心 JIQIZHIXIN (@synced_global)

Another attention!

Introducing Power Attention, a breakthrough in efficient attention mechanisms!

This novel linear-cost attention layer features tunable state size, completely decoupled from model parameters.

Why it stands out:
⚡ Blazing-fast GPU kernels with fused operations

Stanford Online (@stanfordonline)

Our latest CS336 Language Modeling from Scratch lectures are now available! View the entire playlist here: youtube.com/playlist?list=…

Philipp Schmid (@_philschmid)

Interesting new Memory framework and paper released! MemOS claims to outperform competitors by treating memory as an OS-like framework using a three-layer system: Interface, Operation, and Infrastructure. MemOS is an open-source library that claims to differentiate itself by:

1.

Dmitry Krotov (@dimakrotov)

In physics there is an elegant method for computing correlation functions, called the generating function. The idea is simple: instead of computing correlators one by one, you define a function of a parameter and compute the average of that new function. Individual correlators

elie (@eliebakouch)

The Kimi team just trained a state-of-the-art open-source model (32B active parameters / 1T total) with 0 training instabilities, thanks to MuonClip. This is amazing.

Sukjun (June) Hwang (@sukjun_hwang)

Tokenization has been the final barrier to truly end-to-end language models. We developed the H-Net: a hierarchical network that replaces tokenization with a dynamic chunking process directly inside the model, automatically discovering and operating over meaningful units of data

Tiezhen WANG (@xianbao_qian)

Why did Kimi.ai switch from closed source to open source and release K2?

- Reputation. If K2 were API-only, it might have ended up like Grok-4: clearly well built yet still taking a lot of flak.
- Work with the whole ecosystem. Less than 24 hours after release, the

👋 Jan (@jandotai)

Mistral released 2 open-source speech models: Voxtral (24B) & Voxtral Mini (3B). Both beat Whisper v3, GPT-4o-mini, and Scribe on ASR across 7+ languages, while supporting Q&A, summarization, and function-calling directly from voice. huggingface.co/mistralai

Mistral AI (@mistralai)

In our continued commitment to open-science, we are releasing the Voxtral Technical Report: arxiv.org/abs/2507.13264 The report covers details on pre-training, post-training, alignment and evaluations. We also present analysis on selecting the optimal model architecture, which

Black Forest Labs (@bfl_ml)

Today we are releasing FLUX.1 Krea [dev] - a new state-of-the-art open-weights FLUX model, built for photorealism. Developed in collaboration with KREA AI, this model is focused on images with unique aesthetics. No “AI look”, no blown-out highlights, just natural detail.