Richard Kuzma (@rskuzma)'s Twitter Profile
Richard Kuzma

@rskuzma

GenAI @googlecloud, ex-LLMs @CerebrasSystems, ML @USSOCOM, Tech for Public Good @DIU_x, and Harvard @Kennedy_School

ID: 887464277615013888

Joined: 19-07-2017 00:09:06

233 Tweets

438 Followers

1.1K Following

Richard Kuzma (@rskuzma)'s Twitter Profile Photo

Why train LLMs for only 1 epoch? Do we have to stop there? Demand is growing for small-parameter LLMs trained on more data. Internet-scale data isn't available for all languages or for domain-specific (e.g. finance, life science) text. Training for more epochs seems advisable!
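
A minimal sketch of what multi-epoch training looks like mechanically, assuming a PyTorch-style loop; the model, data, and loss below are toy stand-ins for an LLM and a tokenized corpus, not anything from the tweet:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-ins: in practice the model would be a small LLM and the
# dataset a tokenized domain-specific corpus (e.g. finance text).
model = torch.nn.Linear(512, 512)
data = torch.randn(1024, 512)
targets = torch.randn(1024, 512)
loader = DataLoader(TensorDataset(data, targets), batch_size=32, shuffle=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

NUM_EPOCHS = 4  # more than the conventional single pass over the data
for epoch in range(NUM_EPOCHS):
    for x, y in loader:
        optimizer.zero_grad()
        loss = torch.nn.functional.mse_loss(model(x), y)  # placeholder loss
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch + 1}/{NUM_EPOCHS} done")
```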

Richard Kuzma (@rskuzma)'s Twitter Profile Photo

Great work by my colleagues Daria Soboleva, Nolan Dey, and others to further clean and deduplicate RedPajama to make SlimPajama, a (still massive) extremely high-quality dataset that gives practitioners more control over how much (if any) duplication they want!
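
As a quick sketch, assuming the Hugging Face `datasets` library, SlimPajama can be streamed straight from the Hub without downloading the full corpus:

```python
from datasets import load_dataset

# Stream SlimPajama from the Hugging Face Hub rather than downloading
# all ~627B tokens up front; "cerebras/SlimPajama-627B" is the public repo.
ds = load_dataset("cerebras/SlimPajama-627B", split="train", streaming=True)

for example in ds.take(3):
    print(example["text"][:200])  # each record carries a raw "text" field
```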

Cerebras (@cerebrassystems)'s Twitter Profile Photo

πŸ“£ Today we are announcing Condor Galaxy-1: a 4 exaflop AI supercomputer built in partnership with G42. Powered by 64 Cerebras CS-2 systems, 54M cores, and 82TB of memory – it's the largest AI supercomputer we've ever built. But that's not all: CG-1 is just the start..

OpenΟ„ensor FoundaΟ„ion (@opentensor)'s Twitter Profile Photo

The Opentensor Foundation and Cerebras are pleased to announce Bittensor Language Model (BTLM), a new state-of-the-art 3 billion parameter language model that achieves breakthrough accuracy across a dozen AI benchmarks

Cerebras (@cerebrassystems)'s Twitter Profile Photo

Introducing BTLM-3B-8K: an open, state-of-the-art 3B parameter model with 7B level performance. When quantized, it fits in as little as 3GB of memory 🀯. It runs on iPhone, Google Pixel, even Raspberry Pi. BTLM goes live on Bittensor later this week! πŸ§΅πŸ‘‡ buff.ly/3Q5dtY5

Richard Kuzma (@rskuzma)'s Twitter Profile Photo

Announcing BTLM-3B-8k-base!
- 7B performance in a 3B model βœ…
- 8k context length βœ…
- quantize to fit in 3GB of memory βœ…
- trained on high-quality data βœ…
- Apache 2.0 license βœ…
huggingface.co/cerebras/btlm-…
Great work by my colleagues Daria Soboleva, Nolan Dey, Faisal Al-khateeb, and others πŸ‘
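
A minimal sketch of loading the model from the Hub, assuming the `transformers` library with `bitsandbytes` quantization; the exact recipe behind the ~3GB figure isn't specified in the tweet, so this 4-bit config is an assumption:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Illustrative 4-bit config; the ~3GB figure likely comes from a similar
# low-bit quantization, but the exact recipe is an assumption here.
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)

tok = AutoTokenizer.from_pretrained("cerebras/btlm-3b-8k-base")
model = AutoModelForCausalLM.from_pretrained(
    "cerebras/btlm-3b-8k-base",
    quantization_config=bnb,
    trust_remote_code=True,  # BTLM ships custom model code on the Hub
    device_map="auto",
)

inputs = tok("The key idea of BTLM is", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tok.decode(out[0], skip_special_tokens=True))
```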

Cerebras (@cerebrassystems)'s Twitter Profile Photo

The Cerebras team has had a great time sharing our work at #ICML23. Below is a summary of the posters we presented; let us know if you are interested in discussing any of them further!

Richard Kuzma (@rskuzma)'s Twitter Profile Photo

πŸ₯³ Cerebras-GPT proved to the world in March how effectively you can train LLMs on Cerebras hardware. Now BTLM surpasses 1M downloads in ~3 weeks on Hugging Face! πŸš€

Ritwik Gupta πŸ‡ΊπŸ‡¦ (@ritwik_g)'s Twitter Profile Photo

I read Leopold Aschenbrenner's essay on the future of AI research and geopolitical competition. It's well-researched, well-presented, and passionate. However, Leopold advocates for an unreasonably strict and exclusionary future for AI developmentβ€”a view that's gaining traction. (1/9)

Richard Kuzma (@rskuzma)'s Twitter Profile Photo

Crazy speed from the team at @CerebrasSystems! Unlocks lots of interesting use cases across fast agent tool calling, multi-agent systems, self-consistency, and more!
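
For a sense of what "fast agent tool calling" looks like in code, here is a minimal sketch of calling Cerebras inference, assuming its publicly documented OpenAI-compatible endpoint; the base URL and model id are assumptions and may change:

```python
import os
from openai import OpenAI

# Cerebras inference exposes an OpenAI-compatible API; base URL and model
# name below are assumptions based on public docs, not from the tweet.
client = OpenAI(
    base_url="https://api.cerebras.ai/v1",
    api_key=os.environ["CEREBRAS_API_KEY"],
)

resp = client.chat.completions.create(
    model="llama3.1-8b",  # example model id; check the current catalog
    messages=[{
        "role": "user",
        "content": "One sentence on why low latency matters for agents.",
    }],
)
print(resp.choices[0].message.content)
```

Low per-call latency compounds in agent loops: a plan-act-observe cycle or a self-consistency vote issues many sequential model calls, so each call's speedup multiplies across the whole chain.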

Ted Mabrey (@mabreyted)'s Twitter Profile Photo

Man, so excited we could finally unveil this. This is THE applied AI project. Google walked away from it. We embraced it. The world is a different place because of it. It provides so many foundational learnings that we are now applying to the commercial world via AIP. The…

Lydia Hylton (@lyd_hylton)'s Twitter Profile Photo

Thrilled to officially announce what I've been working on for the last year: Strella.io! At Strella, we believe that the customer’s needs should be a company’s North Star. Using Strella’s AI, we enable companies to make informed decisions in hours, not weeks β­οΈπŸŒŸπŸš€

Daria Soboleva (@dmsobol)'s Twitter Profile Photo

This might be the most information-dense blog I've ever written. Added a "show me the math" section to the MoE 101 part 4 episode. We believe it fully models MoE training perf on both GPU and Cerebras WSE devices. cerebras.ai/blog/moe-guide… 🧡1/n
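
For a flavor of the arithmetic involved (a rough back-of-envelope sketch, not the blog's full performance model), a common approximation charges each training token about 6 FLOPs per active parameter, and in an MoE only the routed top-k experts' parameters are active:

```python
# Rough MoE training-FLOPs estimate (illustrative only; the blog's model
# is more detailed). Standard approximation: 6 * active_params * tokens.
def moe_training_flops(non_expert_params, expert_params, num_experts,
                       top_k, tokens):
    # Only top_k of num_experts experts fire per token, so only that
    # fraction of the expert weights counts toward active parameters.
    active_params = non_expert_params + expert_params * (top_k / num_experts)
    return 6 * active_params * tokens

# Example: hypothetical 8-expert, top-2 model with 1B shared params and
# 4B total expert params, trained on 100B tokens.
print(f"{moe_training_flops(1e9, 4e9, 8, 2, 100e9):.3e} FLOPs")
```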