Kirill Solodskikh (@garchfather) Twitter Tweets • TwiCopy

Kirill Solodskikh

@garchfather

Almost PhD, Almost Founder, Almost Team Lead, Almost Successful, married.

@TheStageAI Co-founder, CEO/CTO, ex Huawei P50 AI cameras

ID: 1577397163189014538

Link: http://thestage.ai | Joined: 04-10-2022 20:36:27

143 Tweets

215 Followers

757 Following

Kirill Solodskikh

@garchfather

5 months ago

We updated TheWhisper, our open source speech-to-text engine for self-hosted/on-device use. It now supports NVIDIA H100, L40S, RTX 4090, and RTX 5090. Benchmarks vs other Whisper libs show the best Time to First Token and Real-Time Factor. Try it

Likes: 357 | Replies: 15 | Reposts: 17

Kirill Solodskikh

@garchfather

5 months ago

Genius of content 😂

Likes: 4 | Replies: 0 | Reposts: 0

TheStage AI

@thestageai

5 months ago

Significant speed and size gains in model inference are possible without hurting output quality. ANNA is our PyTorch framework for automated model acceleration, a new way to think about MLOps. Smaller ckpts, lower cost, faster inference, no retrain. Test demo or request access

Likes: 146 | Replies: 1 | Reposts: 8

Kirill Solodskikh

@garchfather

3 months ago

Hey Grok, please summarize this TheWhisper repo: github.com/TheStageAI/The…

Likes: 2 | Replies: 1 | Reposts: 0

Kirill Solodskikh

@garchfather

3 months ago

There are a lot of ASR releases! One of them is open-weight and comes with optimized Apple inference engines. github.com/TheStageAI/The…

Likes: 5 | Replies: 0 | Reposts: 0

TheStage AI

@thestageai

3 months ago

We know what you mean Adele

Likes: 25 | Replies: 1 | Reposts: 8

Kirill Solodskikh

@garchfather

3 months ago

Good weekend! I spent time testing our releases more extensively and writing usage guides during my tests. Suddenly Akshat Bubna and Charles 🎉 Frye from Modal liked my notebook. While testing TheWhisper with Azim K, I found that Mati Staniszewski started following me! Quietly motivating!

Likes: 7 | Replies: 0 | Reposts: 0

TheStage AI

@thestageai

3 months ago

Are you a big fan of jacket potato? This is an open-source, real-time multilingual ASR for live speech. It stays robust in heavy noise – even at SNR 0 dB. That’s why it understands speech where people struggle to hear. Use it for transcription, research, and multilingual apps

Likes: 363 | Replies: 2 | Reposts: 32

TheStage AI

@thestageai

2 months ago

Proud to team up with Brilliant Labs and Neuphonic on Halo’s on-device privacy engine. Coming to Brilliant Labs’ Halo smart glasses: real-time voice + vision, POV stays private. ANNA + GPU/NPU SDK + memory manager for wake word, STT, TTS, diarization. SDK demo 👇

Likes: 23 | Replies: 5 | Reposts: 10

TheStage AI

@thestageai

a month ago

How do you make text-to-music run in real time in production? The model has to keep audio generation ahead of playback. Our new case study with Mirelo.AI shows how inference optimization delivered up to 2.4x higher throughput. See the full case study ↓

Likes: 8 | Replies: 0 | Reposts: 4
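The requirement that generation stay ahead of playback can be illustrated with a small timing model. This is a minimal sketch, not taken from the case study: the function name `count_late_chunks`, the 2-second chunk length, and the per-chunk generation times are all hypothetical values chosen for illustration. It assumes chunks are generated one after another at a fixed cost and that playback starts once a fixed number of chunks is buffered.

```python
def count_late_chunks(num_chunks, chunk_seconds, gen_seconds_per_chunk, prebuffer_chunks=1):
    """Count chunks that are not yet generated when the player needs them.

    Assumes sequential generation at a constant cost per chunk, and that
    playback starts as soon as `prebuffer_chunks` chunks are ready.
    """
    playback_start = prebuffer_chunks * gen_seconds_per_chunk
    late = 0
    for i in range(num_chunks):
        ready_at = (i + 1) * gen_seconds_per_chunk       # when chunk i finishes generating
        needed_at = playback_start + i * chunk_seconds   # when the player reaches chunk i
        if ready_at > needed_at:
            late += 1
    return late


# Hypothetical numbers: 2.0 s of audio per chunk, 2.4 s to generate each chunk.
slow = count_late_chunks(num_chunks=10, chunk_seconds=2.0, gen_seconds_per_chunk=2.4)

# The same workload with 2.4x higher throughput (2.4 s / 2.4 = 1.0 s per chunk).
fast = count_late_chunks(num_chunks=10, chunk_seconds=2.0, gen_seconds_per_chunk=2.4 / 2.4)

print(slow, fast)
```

In this model, playback never stalls as long as generating a chunk takes no longer than playing it back. A larger prebuffer only delays the first stall when generation is slower than real time.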

Kirill Solodskikh

@garchfather

25 days ago

Open-source experiments dashboard for AI researchers. Cool comparison overlays across modalities. What should we add next? S3 integration, authentication, model registry? github.com/TheStageAI/Spi…

Likes: 3 | Replies: 0 | Reposts: 0

Kirill Solodskikh

@garchfather

24 days ago

Actually, comparing 1-bit with 16-bit makes no sense. Everyone is using 4-bit weights with MLX, and the speed will be around 150-180 tok/s on M4 Pro. Moreover, 4-bit quantization in MLX can be done as block quantization, which preserves quality in most cases.

Likes: 4 | Replies: 0 | Reposts: 0
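The point about block quantization can be sketched in plain Python. This is an illustrative sketch only, not MLX's implementation or API: the helper names (`quantize_groups`, `dequantize_groups`, `mean_abs_error`) are hypothetical, and the group size of 64 and the synthetic weights are assumptions. Each block of weights gets its own scale and offset, so one outlier only coarsens the block it sits in rather than the whole tensor.

```python
import random


def quantize_groups(weights, group_size=64, bits=4):
    """Affine-quantize a flat list of floats, one (scale, offset) pair per group."""
    levels = (1 << bits) - 1  # 15 distinct steps for 4-bit codes
    groups = []
    for start in range(0, len(weights), group_size):
        block = weights[start:start + group_size]
        lo, hi = min(block), max(block)
        scale = (hi - lo) / levels or 1.0  # fall back to 1.0 for a constant block
        codes = [min(levels, max(0, round((w - lo) / scale))) for w in block]
        groups.append((scale, lo, codes))
    return groups


def dequantize_groups(groups):
    """Map integer codes back to approximate float weights."""
    return [lo + scale * code for scale, lo, codes in groups for code in codes]


def mean_abs_error(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(256)]  # synthetic weights (assumption)
weights[0] = 1.0  # a single large outlier

per_group = dequantize_groups(quantize_groups(weights, group_size=64))
per_tensor = dequantize_groups(quantize_groups(weights, group_size=len(weights)))

print("per-group error: ", mean_abs_error(weights, per_group))
print("per-tensor error:", mean_abs_error(weights, per_tensor))
```

With a single scale for the whole tensor, the outlier stretches the quantization grid for every weight. With per-group scales, only the first group pays that cost, which is why the per-group reconstruction error comes out lower here.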

Kirill Solodskikh

@garchfather

16 days ago

Self-hosted AGI starts with inference infra teams can actually run. Well. Elastic Models v0.2.0 is much more self-serve: world’s fastest whisper-large-v3-turbo, Wan2.2 generating 5s of video in 34s on H100, and instant FLUX LoRA switching. Explore v0.2.0

Likes: 6 | Replies: 0 | Reposts: 0

TheStage AI

@thestageai

14 days ago

Beyoncé heard cursing. TheWhisper heard Arsenal. The fastest Whisper in the world. Open-source real-time ASR. Top 5 on OpenASR benchmarks. 1800 RTFx. Built for live captions, transcription, and voice apps. See the repo

Likes: 182 | Replies: 4 | Reposts: 19
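For readers unfamiliar with the RTFx figure, here is a minimal sketch of how such numbers are usually computed, assuming the common definitions: real-time factor (RTF) is processing time divided by audio duration, and RTFx is its inverse. The function names and the one-hour example are illustrative and are not taken from the TheWhisper benchmarks.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """RTF: seconds of compute per second of audio (lower is faster)."""
    return processing_seconds / audio_seconds


def rtfx(processing_seconds, audio_seconds):
    """RTFx: seconds of audio transcribed per second of compute (higher is faster)."""
    return audio_seconds / processing_seconds


# Illustrative example: one hour of audio transcribed in two seconds.
audio_seconds = 3600.0
processing_seconds = 2.0

print(rtfx(processing_seconds, audio_seconds))
print(real_time_factor(processing_seconds, audio_seconds))
```

Under these definitions, an RTFx of 1800 corresponds to transcribing an hour of audio in about two seconds of compute.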