TheStage AI (@thestageai) 's Twitter Profile
TheStage AI

@thestageai

A full-stack AI platform 👽 Trusted voice in AI, we grindin', no sleep ✨

ID: 1655522257555320832

Link: https://www.thestage.ai/
Joined: 08-05-2023 10:38:05

43 Tweet

392 Followers

24 Following

TheStage AI (@thestageai) 's Twitter Profile Photo

Wrong model = slow app. We help you pick the right one for your GPU. You can now explore a new Models section on our platform, with performance-optimized versions of open-source models like Qwen, Mistral, Llama, DeepSeek, and Flux. These models are tuned for real tasks: →

TheStage AI (@thestageai) 's Twitter Profile Photo

🥐 Bon appétit, developers. New Mistral AI models for self-hosting accelerated by TheStage AI:
- New LLM: Mistral Small 24B
- New VLM: Mistral Small 3.1 24B
- Achieves speeds up to 90 tok/s on a single H100!
- Available in our standard 4 tiers: S, M, L, XL
Models follow

TheStage AI (@thestageai) 's Twitter Profile Photo

Bonjour, Paris 🇫🇷 Just wrapped 2 amazing days at @NVIDIA #GTCParis at Viva Technology: AI infra, agentic systems, and robots walking around. Great convos with ElevenLabs, @mistralai, Nebius, Recraft & more. Still in town. DM us if you wanna talk AI (IRL in Paris ☕🥐)

Kirill Solodskikh (@garchfather) 's Twitter Profile Photo

▚▞▚▞ DATA LOG: AI EUROPE ▚▞▚▞ For years, AI talk was all Silicon Valley. After @NVIDIA #GTCParis, one thing became clear: Europe’s AI ecosystem has already kicked into high gear. 🇫🇷 Mistral AI’s dropping open weights that actually run. 🇩🇪 Aleph Alpha building native

Kirill Solodskikh (@garchfather) 's Twitter Profile Photo

⌁ EUROPE SIGNAL: ACTIVE ⌁ ↳ Want to accelerate your model’s inference? ↳ These guys sure do. ✦ Berlin: mapped next steps with our investors Christophe Maire and Lukas Erbguth of Atlantic Labs. ✦ Paris: NVIDIA GTC showed us what’s possible. ✦ Germany: more investor talks

⌁ EUROPE SIGNAL: ACTIVE ⌁

↳ Want to accelerate your model’s inference?
↳ These guys sure do.

✦ Berlin: mapped next steps with our investors Christophe Maire and Lukas Erbguth of Atlantic Labs.
✦ Paris: <a href="/NVIDIAGTC/">NVIDIA GTC</a> showed us what’s possible.
✦ Germany: more investor talks
Kirill Solodskikh (@garchfather) 's Twitter Profile Photo

Meet Elastic MusicGen Large, our optimized fork of AI at Meta's MusicGen, powered by ANNA (TheStage AI’s Automated Neural Network Accelerator): huggingface.co/TheStageAI/Ela… Ye used AI for vocals on "Bully," calling it the "next Auto-Tune." He switched up later, but tracks

TheStage AI (@thestageai) 's Twitter Profile Photo

🥗 What if you could generate 10,000+ AI images for $1, each in just 1.2 seconds? We made it happen: 2.4× faster than most RTX 4090 pipelines, at a fraction of the cost. Check it out here: app.thestage.ai/models/FLUX.1-… ⟿ How? We tuned Black Forest Labs's FLUX.1 [schnell] model with our ANNA

TheStage AI (@thestageai) 's Twitter Profile Photo

AI engineers and researchers can now use our Quantization API to run accelerated LLMs, VLMs, and diffusion on NVIDIA and edge. Faster, cheaper, same quality. Built by our research lab. Docs & API live; early testers welcome.

Kirill Solodskikh (@garchfather) 's Twitter Profile Photo

Our TheStage AI team was happy to gain early access to the NVIDIA B200 from Nebius and establish benchmarking for our optimized diffusion models. We now fully support inference of optimized models on B200 across various AI applications - LLMs, VLMs, Text-to-Image,

Kirill Solodskikh (@garchfather) 's Twitter Profile Photo

Our research team took AI at Meta LLaMA-8B, quantized it with QLIP using post-training int8, applied SmoothQuant, and used pre-defined compiler-compatible NVIDIA configs. Why do this? Up to 2× fewer weights and 3.6× faster on one GPU. Try it with our simple Jupyter Notebook.
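For readers unfamiliar with the terms in the post above, here is a minimal pure-Python sketch of the general SmoothQuant idea followed by symmetric int8 quantization. It is an illustration only and does not use or represent the QLIP API. The function names, the `alpha=0.5` default, and the toy matrices are all assumptions made for this example. The key point is that dividing each activation channel by a scale and multiplying the matching weight row by the same scale leaves the product unchanged while shrinking activation outliers before quantization.

```python
# Illustrative sketch only: generic SmoothQuant-style rescaling plus
# symmetric per-tensor int8 quantization. Not the QLIP API; all names
# and values here are hypothetical.

def smooth_scales(X, W, alpha=0.5):
    """Per-input-channel scale s_j = max|X[:, j]|**alpha / max|W[j, :]|**(1 - alpha)."""
    scales = []
    for j in range(len(W)):
        act_max = max(max(abs(row[j]) for row in X), 1e-8)  # guard against zeros
        w_max = max(max(abs(v) for v in W[j]), 1e-8)
        scales.append((act_max ** alpha) / (w_max ** (1 - alpha)))
    return scales

def apply_smoothing(X, W, scales):
    """Divide activation columns by s_j and multiply weight rows by s_j."""
    Xs = [[x / s for x, s in zip(row, scales)] for row in X]
    Ws = [[w * s for w in row] for row, s in zip(W, scales)]
    return Xs, Ws

def quantize_int8(M):
    """Symmetric per-tensor quantization to integers in [-127, 127]."""
    max_abs = max(max(abs(v) for row in M for v in row), 1e-8)
    step = max_abs / 127.0
    q = [[max(-127, min(127, round(v / step))) for v in row] for row in M]
    return q, step

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def int8_matmul(X, W):
    """Quantize both operands, multiply in integers, then rescale to floats."""
    qx, sx = quantize_int8(X)
    qw, sw = quantize_int8(W)
    return [[v * sx * sw for v in row] for row in matmul(qx, qw)]

def max_abs_error(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

# Toy data: the second activation channel carries large outliers.
X = [[0.1, 20.0], [-0.2, -18.0], [0.3, 19.0]]   # activations (tokens x channels)
W = [[0.5, -0.4], [0.02, 0.03]]                 # weights (channels x outputs)

reference = matmul(X, W)
Xs, Ws = apply_smoothing(X, W, smooth_scales(X, W))

print("int8 error without smoothing:", max_abs_error(reference, int8_matmul(X, W)))
print("int8 error with smoothing:   ", max_abs_error(reference, int8_matmul(Xs, Ws)))
```

Real pipelines compute the activation statistics from calibration data and fold the scales into the preceding layer, which this toy version does not attempt.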

Kirill Solodskikh (@garchfather) 's Twitter Profile Photo

Can LLMs recognize ASCII art? Our tests show accelerated Elastic Models analyze line-by-line features and combine them using statistical patterns. Try it yourself with DeepSeek-Qwen-14B – 120 tok/s on H100, 40 tok/s on L40s, up to 3× faster. Free API token!

Kirill Solodskikh (@garchfather) 's Twitter Profile Photo

Self-hosted text-to-image on H100 with TheStage AI Elastic Models, accelerated from FLUX.1-schnell Black Forest Labs. Our fastest model S generates a high-quality image in 0.5 s. Precompiled and ready-to-deploy – minimal cold start. Tutorial + access token inside if you want to try.

TheStage AI (@thestageai) 's Twitter Profile Photo

Imagine paying $30 for 10k images when Salad Cloud + ANNA does it for $1 💀
FLUX.1-schnell ~1.2 s/image, high-quality output
ANNA auto-tunes models to balance speed and quality
OpenAI-compatible API, fully self-hosted.
Quick guide shows how to run your own endpoint
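As a rough sanity check on the figures in the post above ($1 for 10k images at about 1.2 s/image), the sketch below works out the GPU-hour price those numbers imply. It assumes images are generated one at a time on a single GPU with no idle time, batching, or other overhead. That assumption and the function name are ours, not something the post states.

```python
# Back-of-the-envelope check, assuming sequential generation on one GPU
# with no idle time or overhead (an assumption, not stated in the post).

def implied_gpu_hourly_rate(total_cost_usd, num_images, seconds_per_image):
    """Return the USD-per-GPU-hour price implied by a cost and a latency."""
    gpu_hours = num_images * seconds_per_image / 3600.0
    return total_cost_usd / gpu_hours

rate = implied_gpu_hourly_rate(1.0, 10_000, 1.2)
print(f"{rate:.2f} USD per GPU-hour")  # prints "0.30 USD per GPU-hour"
```

Under the same assumption, 10,000 images at 1.2 s each take about 3.3 GPU-hours, so the $1 claim corresponds to roughly $0.30 per GPU-hour.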

Kirill Solodskikh (@garchfather) 's Twitter Profile Photo

Quantization delivers speedup but can reduce quality. Our researchers prepared a tutorial showing how ANNA automatically quantizes Flux and accelerates it 2× while keeping quality high. Orig. model latency: 6.4 s. Check the link. DM or comment for early access.

TheStage AI (@thestageai) 's Twitter Profile Photo

For AI builders and researchers: get early access to QLIP + ANNA for DNN optimization and acceleration – cloud, self-host, edge. Get a free commercial license. Collaborate with us on research, integrate your algorithms, or simplify deployment. Limited spots – apply today ↓

Kirill Solodskikh (@garchfather) 's Twitter Profile Photo

🚀 Early access to ANNA: Automated NNs Accelerator now available! ✨
Get your access here: app.thestage.ai/contact
Questions? DM or comment below! 💬
With ANNA, you can:
🔄 Simply upload your model, data, and desired metrics
🎛️ Fine-tune model size, latency, and quality with

Kirill Solodskikh (@garchfather) 's Twitter Profile Photo

How to measure the quality of text-to-image models? Our research team TheStage AI put together a comprehensive guide to check perceptual quality, sharpness, color, prompt alignment, and more. All the tricky image quality questions researchers usually ask are covered here ↓

Azim K (@quaz1m) 's Twitter Profile Photo

Validation is a key step when compressing or accelerating models. It shows if the network still performs well. Our research team TheStage AI shared evaluation methods for sharpness, tone, color, object placement, and more

TheStage AI (@thestageai) 's Twitter Profile Photo

Excited to share our MLPerf Inference v5.1 results (MLCommons). We ran Stability AI SDXL on 8×H100 via Nebius with our stack, ANNA. 18.1 img/s in target quality range. Fast, reproducible, world-class performance from our team, submitted alongside top AI players ↓