Vansh Singh (@vanshcsingh) 's Twitter Profile
Vansh Singh

@vanshcsingh

Finetuning @DbrxMosaicAI. Previously @Stripe.

ID: 934296125695180800

Joined: 25-11-2017 05:42:09

102 Tweets

218 Followers

623 Following

Jonathan Frankle (@jefrankle) 's Twitter Profile Photo

Meet DBRX, a new sota open llm from Databricks. It's a 132B MoE with 36B active params trained from scratch on 12T tokens. It sets a new bar on all the standard benchmarks, and - as an MoE - inference is blazingly fast. Simply put, it's the model your data has been waiting for.

Vansh Singh (@vanshcsingh) 's Twitter Profile Photo

Yes, we dropped the best open-source model as of today. Yes, you can train your own SOTA models too. Train on Databricks! databricks.com/blog/introduci…

Databricks (@databricks) 's Twitter Profile Photo

Meta Llama 3 models are being rolled out across all Databricks Model Serving regions over the next few days. Once available, they can be accessed via the UI, API, or SQL interfaces. dbricks.co/3Qcwffh

Naveen Rao (@naveengrao) 's Twitter Profile Photo

I don’t think everyone has comprehended the massive disruption and distortion that is going to happen in the Gen AI market due to Llama3. Moats will be destroyed and investments will go to zero. Just like everything in Gen AI, this will all happen fast.

Sam Havens (@sam_havens) 's Twitter Profile Photo

@SnowflakeDB Awesome work training such a big model with a permissive license!

I think you had a mistake in your IFEval implementation, your reported number is less than 2x what we observe (though it does vary with inference server and sampling parameters). You should see in the high 60s

Davis Blalock (@davisblalock) 's Twitter Profile Photo

Most ML folks don't realize how different a beast it is to serve enterprise. Like being a "leader" in the Gartner Magic Quadrant is legit more important than your MMLU score. This isn't enterprise ignorance. It's the opposite. Businesses are machines strictly more complex than

Naveen Rao (@naveengrao) 's Twitter Profile Photo

It’s official! The biggest venture round in history. And I feel like we’re just getting started… databricks.com/company/newsro…

will o’brien (@willobri) 's Twitter Profile Photo

Software creates soft men
Soft men create hard times
Hard times create hard men
Hard men create hardware
Hardware creates good times
Good times creates software

Sam Altman (@sama) 's Twitter Profile Photo

we will cross well over 1 million GPUs brought online by the end of this year! very proud of the team but now they better get to work figuring out how to 100x that lol

Sam Altman (@sama) 's Twitter Profile Photo

thank you to our partners at microsoft, nvidia, oracle, google, and coreweave for making this possible! lots and lots of GPUs working overtime.