Phil Howes (@saltyph) Twitter Tweets • TwiCopy

Software Engineering Daily

5 years ago

BaseTen: Creating Machine Learning APIs with Tuhin Srivastava and Amir Haghighat Tuhin Srivastava softwareengineeringdaily.com/2021/05/19/bas…

Likes: 6, Replies: 0, Retweets: 1

Phil Howes

@saltyph

5 years ago

Finally revealing the result of our efforts over the past year and change. It's been so amazing building this with everyone on the Baseten team

Likes: 4, Replies: 1, Retweets: 0

Today we're launching BaseTen. We've been working hard over the last 18 months to build the tool that we wished we had to operationalize our models. Please check it out, and if you're interested, we'd love to give you early access!

Likes: 111, Replies: 10, Retweets: 15

Yad

@yadkonrad

5 years ago

Great convo between Jeff Meyerson, Amir Haghighat et al.

I like BaseTen, that's what I would imagine near future of deployed ML models will be leaning towards to, given the amount of overhead needed to put together:

- hosting a model
- integrating other types of logic
- usable interface

Likes: 2, Replies: 0, Retweets: 2

Baseten

@basetenco

3 years ago

Here’s another sneak peek at our Blueprint progress 👀

Meet our Web IDE

Think of it as the single place for building, testing, and deploying API endpoints with generative AI models, in your browser.

Likes: 18, Replies: 3, Retweets: 6

Amir Haghighat

@amiruci

3 years ago

We launched Blueprint today: an easy way for engineers to fine-tune and serve open source foundation models. 🧵

Likes: 105, Replies: 2, Retweets: 13

Tuhin Srivastava

@tuhinone

2 years ago

We keep getting asked by users if they can use the 70B parameter model in production. We're serving the chat variant of Llama-2 70B on 2xA100 and getting pretty great throughput — it's cooking!

Likes: 90, Replies: 4, Retweets: 14

Phil Howes

@saltyph

2 years ago

Repurposing Tuhin Srivastava's Llama v2 truss, got FreeWilly 2 up in under a minute. `:s/meta-llama\/Llama-2-70b-chat-hf/stabilityai\/FreeWilly2`. 275GB of weights later we're running at 23 tok/s out of the box.

Likes: 48, Replies: 1, Retweets: 11
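The vim substitution quoted in the tweet above swaps one Hugging Face model ID for another. A minimal Python sketch of the same edit follows. The config line and the `model_id` key are hypothetical, since the tweet does not show the truss file itself; only the two model IDs come from the tweet.

```python
import re

# Hypothetical config snippet; the real truss file is not shown in the tweet.
config_text = 'model_id: "meta-llama/Llama-2-70b-chat-hf"\n'

# The two model IDs are taken from the substitution in the tweet.
OLD_ID = "meta-llama/Llama-2-70b-chat-hf"
NEW_ID = "stabilityai/FreeWilly2"


def swap_model_id(text: str) -> str:
    """Replace the first occurrence of OLD_ID with NEW_ID.

    Like vim's :s without the g flag, only the first match is replaced.
    re.escape keeps the slash and dashes in the ID literal.
    """
    return re.sub(re.escape(OLD_ID), NEW_ID, text, count=1)


updated = swap_model_id(config_text)
print(updated)
```

Escaping the old ID plays the same role as the `\/` escapes in the vim command: the pattern is matched as literal text, not as regex syntax.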

Baseten

@basetenco

2 years ago

Ready to try open source LLMs? Switch from GPT to Mistral 7B in the smallest refactor you'll ever ship: just 3 tiny code changes. If you're making the jump, DM us for $1,000 in free credits. baseten.co/blog/gpt-vs-mi…

Likes: 15, Replies: 0, Retweets: 8

Phil Howes

@saltyph

2 years ago

every day i get to work with a world class team supporting customers with world class products. today we get to dream a little bigger

Likes: 11, Replies: 0, Retweets: 0

Phil Howes

@saltyph

2 years ago

when i tell people working in infra is like being a plumber people assume it’s because of lots of pipe connecting, when in fact it’s because i spend most of my day digging through shit

Likes: 7, Replies: 0, Retweets: 0

abu

@aqaderb

2 years ago

2 things. 1. i have loved working on this team. model performance is so much fun and so rewarding. 2. persistence is key. we started working on model performance end of 2023 and watching us slowly become better and better has been an incredible experience.

Likes: 20, Replies: 1, Retweets: 3

Conviction

@conviction

2 years ago

Congrats to Conviction and Embed companies Baseten, Figure, Harvey, LangChain, Mistral AI, @sierraplatform, Pika (and our many pioneering friends) for making the #ForbesAI50 list!

Ground floor of the revolution that will lead to many massive companies.

Likes: 26, Replies: 2, Retweets: 7

Baseten

@basetenco

2 years ago

The models are available at the following links:

- Llama 3 8B Instruct: baseten.co/library/llama-…
- Llama 3 70B Instruct: baseten.co/library/llama-…

Likes: 6, Replies: 0, Retweets: 3

Phil Howes

@saltyph

10 months ago

hit new peak demand today, 3 million RPS. thanks for stress testing our infra anon internet friend

Likes: 2, Replies: 0, Retweets: 0

Michael Feil

@feilsystem

10 months ago

New Qwen-QWQ running at 90tokens/s generation speed on a single H100 Baseten using a new spec-dec stack. Around 2x more than the rest of the leaderboard (artificialanalysis.ai/leaderboards/p…).

Likes: 29, Replies: 1, Retweets: 11
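For readers unfamiliar with the "spec-dec" (speculative decoding) mentioned in the tweet above, here is a toy sketch of the greedy variant. It is not Baseten's stack: `target_next` and `draft_next` are made-up stand-ins for a large and a small model, and production systems typically verify all drafted tokens in one batched forward pass and may use sampling-based acceptance instead of exact matching.

```python
from typing import Callable, List

NextToken = Callable[[List[int]], int]


def target_next(tokens: List[int]) -> int:
    """Toy stand-in for the slow, accurate model (hypothetical rule)."""
    return (tokens[-1] * 3 + 1) % 11


def draft_next(tokens: List[int]) -> int:
    """Toy stand-in for the fast draft model; it disagrees with the
    target whenever the context length is a multiple of 4."""
    guess = (tokens[-1] * 3 + 1) % 11
    return guess if len(tokens) % 4 else (guess + 1) % 11


def greedy_decode(next_token: NextToken, prompt: List[int], n_new: int) -> List[int]:
    """Plain one-token-at-a-time greedy decoding."""
    tokens = list(prompt)
    for _ in range(n_new):
        tokens.append(next_token(tokens))
    return tokens


def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: List[int], n_new: int, k: int = 4) -> List[int]:
    """Greedy speculative decoding: the draft proposes k tokens, the target
    checks them, and the longest agreeing prefix is kept."""
    tokens = list(prompt)
    goal = len(prompt) + n_new
    while len(tokens) < goal:
        # Draft model proposes k tokens autoregressively.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(tokens + proposal))
        # Target checks each proposed position in order.
        for i in range(k):
            expected = target(tokens + proposal[:i])
            if proposal[i] != expected:
                # Keep the agreed prefix, then the target's own token.
                tokens.extend(proposal[:i])
                tokens.append(expected)
                break
        else:
            # Every proposed token matched the target.
            tokens.extend(proposal)
    return tokens[:goal]


spec_out = speculative_decode(target_next, draft_next, [1], 20)
base_out = greedy_decode(target_next, [1], 20)
print(spec_out == base_out)
```

Because every kept token is one the target model would have produced itself, the output matches plain greedy decoding with the target; any speedup comes from accepting several drafted tokens per verification round.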

Phil Howes

@saltyph

7 months ago

you can just do things faster

Likes: 11, Replies: 1, Retweets: 0

Phil Howes

@saltyph

5 months ago

💪🫡 still plenty of juice to squeeze out of this one

Likes: 10, Replies: 1, Retweets: 0

Phil Howes

@saltyph

2 months ago

speculation, in this case a eagle-3, remains one of the biggest levers to go from good to great. amazing job to leapfrog the market and get the most out of our GPUs

Likes: 1, Replies: 0, Retweets: 0

Phil Howes

@saltyph

2 months ago

so much potential in this model and abu coming out of the gates just ripping the landscape on perf

Likes: 3, Replies: 0, Retweets: 1