Pankaj Gupta (@defpan)'s Twitter Profile
Pankaj Gupta

@defpan

Co-founder @basetenco working on ML model performance

ID: 372020096

Link: https://www.baseten.co/author/pankaj-gupta/ · Joined: 11-09-2011 23:55:54

525 Tweets

259 Followers

906 Following

Baseten (@basetenco)'s Twitter Profile Photo

We're thrilled to be included in the #ForbesAI50! 🎉 Congratulations to everyone who made it, it's great to see so many of our customers and partners here too!

Baseten (@basetenco)'s Twitter Profile Photo

We have day 0 support for #Qwen3 by Alibaba Qwen on Baseten using SGLang.

Qwen 3 235B's architecture benefits from both Tensor Parallelism and Expert Parallelism to run Attention and Sparse MoE efficiently across 4 or 8 H100 GPUs depending on quantization.  

More in 🧵
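The "4 or 8 H100 GPUs depending on quantization" figure follows from simple weight-memory arithmetic. A minimal back-of-envelope sketch, assuming roughly 1 byte/param at FP8 vs. 2 bytes/param at BF16 and 80 GB per H100, and ignoring KV cache and activation overhead (which add real headroom requirements on top of weights):

```python
# Why Qwen 3 235B fits on 4 H100s at FP8 but needs 8 at BF16.
# Weights only; KV cache and activations are not counted here.

H100_MEM_GB = 80

def min_gpus(params_b: float, bytes_per_param: float) -> int:
    """Smallest power-of-two GPU count whose combined memory holds the weights."""
    weights_gb = params_b * bytes_per_param  # 1e9 params * bytes/param = GB
    gpus = 1
    while gpus * H100_MEM_GB < weights_gb:
        gpus *= 2
    return gpus

print(min_gpus(235, 1))  # FP8:  ~235 GB of weights -> 4 GPUs (320 GB total)
print(min_gpus(235, 2))  # BF16: ~470 GB of weights -> 8 GPUs (640 GB total)
```

Tensor and expert parallelism then shard those weights across the chosen GPU count: attention layers split via tensor parallelism, and the sparse MoE experts split via expert parallelism.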
Baseten (@basetenco)'s Twitter Profile Photo

“This is the thing about AI — you gotta burn the boats.”

Our CEO Tuhin Srivastava sat down with Emma Cosgrove and the Business Insider team to discuss keeping pace with the constant hardware, software, and AI model drops.
Elias (@eliasfiz)'s Twitter Profile Photo

People told us they want Orpheus TTS in production.

So we partnered with Baseten as our preferred inference provider!

Baseten runs Orpheus with:

• Low latency (<200 ms TTFB)
• High throughput (up to 48 real-time streams per H100)
• Secure, worldwide infra
Philip Kiely (@philip_kiely)'s Twitter Profile Photo

Deploying and vibe checking Orpheus TTS, an open-source model for generating speech. Our implementation supports up to 48 concurrent real-time users per H100 GPU!

Baseten (@basetenco)'s Twitter Profile Photo

Congrats to our friends at Patronus AI on the new AI agent launch, Percival! Percival can fix other agents across 20+ common failure modes, a very necessary tool in the growing agent landscape. Check it out.

Baseten (@basetenco)'s Twitter Profile Photo

🚀 We've been heads down for months, and now it's finally launch week. Today, we’re releasing our new brand. We believe inference is the foundation of all AI going forward. That's what our new look is all about: 𝗕𝗮𝘀𝗲𝘁𝗲𝗻 𝗶𝘀 𝘁𝗵𝗲 𝗯𝘂𝗶𝗹𝗱𝗶𝗻𝗴 𝗯𝗹𝗼𝗰𝗸𝘀 𝗼𝗳

Baseten (@basetenco)'s Twitter Profile Photo

🚀 Our "technical" marketer might not be looped in, but today is our biggest launch day yet. We're introducing two new products to serve the inference lifecycle: Model APIs and Training. Model APIs are frontier models running on the Baseten Inference Stack, purpose-built for

Baseten (@basetenco)'s Twitter Profile Photo

Our secret sauce? The Baseten Inference Stack. 

It consists of two core layers: the Inference Runtime and Inference-optimized Infrastructure. Our engineers break down all the levers we pull to optimize each layer in our new white paper.
Baseten (@basetenco)'s Twitter Profile Photo

Congrats to our friends at Retool! Agents are a game-changer for automating repetitive tasks (and Retool has automated over 100M hours of labor already). We're thrilled to power Retool Agents with our Model APIs, which support tool usage out of the box!

rime (@rimelabs)'s Twitter Profile Photo

Big news: Rime has raised a $5.5M seed round! 💸💸

We're building the most expressive, lifelike AI voices for real-time conversations, voices that sound truly human.

Led by Unusual Ventures with support from Founders You Should Know, Cadenza, and incredible angels like Michael
Baseten (@basetenco)'s Twitter Profile Photo

New DeepSeek just dropped.

Proud to serve the fastest DeepSeek R1 0528 inference on OpenRouter (#1 on TTFT and TPS) with our Model APIs.
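The two leaderboard metrics named here are time to first token (TTFT) and tokens per second (TPS). Both fall out of the timestamps of a streamed response; a minimal sketch with illustrative timings, not measurements from any real endpoint:

```python
# TTFT = delay from sending the request to receiving the first token.
# TPS = tokens generated per second of streaming (first token to last).

def ttft_and_tps(request_sent: float, token_times: list[float]) -> tuple[float, float]:
    """Compute TTFT and TPS from a request timestamp and per-token arrival times."""
    ttft = token_times[0] - request_sent
    duration = token_times[-1] - token_times[0]
    tps = (len(token_times) - 1) / duration if duration > 0 else float("inf")
    return ttft, tps

# Example: request at t=0.0 s, first token at 0.25 s, then 100 more
# tokens arriving every 10 ms.
times = [0.25 + 0.01 * i for i in range(101)]
ttft, tps = ttft_and_tps(0.0, times)
print(round(ttft, 2), round(tps))  # 0.25 100
```

TTFT dominates perceived responsiveness for chat-style products, while TPS governs how fast long generations complete, which is why leaderboards track both.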
Wispr Flow (@wisprflow)'s Twitter Profile Photo

It's official — Wispr Flow is now live on the iPhone App Store! We built the first immersive voice keyboard that lets you dictate with incredible accuracy anywhere — 5x faster than typing. Delightful. Effortless. Intelligent. Our mission is to change how people interact with

Baseten (@basetenco)'s Twitter Profile Photo

We’re excited to partner with oxen.ai on their fine-tuning launch. It’s almost too easy — zero-code fine-tuning, from dataset to custom model in a few clicks.

Ian Cairns (@cairns)'s Twitter Profile Photo

🎙️ New Deployed episode with Zed founder Nathan Sobo is live! Nathan's been building better code editors for 10+ years. Now Zed has some of the most impressive agent AI editing features (including real-time streaming edits). His number one piece of advice: "Automate your

Google Cloud (@googlecloud)'s Twitter Profile Photo

AI inference matters. Baseten's revolutionary AI infrastructure platform, built on Google Cloud, optimizes processing even for massive models, gets your AI products to market 50% faster, and slashes costs with 90% savings compared to endpoint vendors ↓

Baseten (@basetenco)'s Twitter Profile Photo

Our customers run AI products where every millisecond and request matter. Over the years, we found fundamental limitations in traditional deployment approaches — single points of failure, regional and cloud-specific capacity constraints, and the operational headache of managing

Baseten (@basetenco)'s Twitter Profile Photo

Forward deployed engineers (FDEs) are core to our company. They work directly with customers, contribute to product development, and shape our roadmap.

Vlad, our Head of FDE, wrote a blog to break down what makes FDE special, when to use FDEs, and how to scale a successful team.
Baseten (@basetenco)'s Twitter Profile Photo

We're excited to introduce the Baseten Performance Client, a new open-source Python library for up to 12x higher throughput for high-volume embedding tasks!

Stand up a new vector database, preprocess text, and run massive workloads in <2 minutes (vs. 15+ with AsyncOpenAI).
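Throughput gains like this typically come from keeping many embedding requests in flight at once instead of awaiting them one at a time. A minimal sketch of that bounded-concurrency pattern with asyncio; `embed_batch` is a stubbed stand-in for a network call, not the Performance Client's actual API:

```python
# Bulk-embedding pattern: split texts into batches, keep up to
# `max_concurrency` requests in flight, preserve input order.
# `embed_batch` simulates a remote embedding endpoint.

import asyncio

async def embed_batch(texts: list[str]) -> list[list[float]]:
    await asyncio.sleep(0.01)  # simulate network latency
    return [[float(len(t))] for t in texts]  # dummy 1-d "embeddings"

async def embed_all(texts: list[str], batch_size: int = 8,
                    max_concurrency: int = 32) -> list[list[float]]:
    sem = asyncio.Semaphore(max_concurrency)  # cap in-flight requests

    async def worker(batch: list[str]) -> list[list[float]]:
        async with sem:
            return await embed_batch(batch)

    batches = [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]
    results = await asyncio.gather(*(worker(b) for b in batches))
    return [vec for batch in results for vec in batch]  # flatten, in order

vectors = asyncio.run(embed_all([f"doc {i}" for i in range(100)]))
print(len(vectors))  # 100
```

With serial awaits, total time scales with the number of batches times latency; with this pattern it scales with batches divided by the concurrency cap, which is where order-of-magnitude speedups on high-volume workloads come from.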