Lucas Beyer (bl16) (@giffmana)'s Twitter Profile
Lucas Beyer (bl16)

@giffmana

Researcher (Google DeepMind/Brain in Zürich, ex-RWTH Aachen), Gamer, Hacker, Belgian. Mostly gave up trying mastodon as [email protected]

ID: 2236047510

Link: http://lucasb.eyer.be | Joined: 08-12-2013 13:31:09

11.0K Tweets

56.1K Followers

445 Following

Lucas Beyer (bl16) (@giffmana):

So much potential🥹

1: One moment, let me state the vision
No more sliding windows, we on a mission!
udio.com/songs/mfsfmg5V…

2: No 1x1, that's just a view
Yann LeCun can't argue, our models slew
udio.com/songs/eb2bmgzT…
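
The "No 1x1, that's just a view" lyric riffs on Yann LeCun's well-known point that a 1x1 convolution is just a position-wise linear layer over the channel axis. A minimal JAX sketch of that equivalence (toy shapes, purely illustrative, not from any model discussed here):

import jax
import jax.numpy as jnp

# A 1x1 convolution over an NHWC tensor computes the same thing as a
# per-pixel matmul over the channel axis: "just a view" of the features.
key = jax.random.PRNGKey(0)
kx, kw = jax.random.split(key)
x = jax.random.normal(kx, (2, 8, 8, 16))   # (N, H, W, C_in)
w = jax.random.normal(kw, (16, 32))        # (C_in, C_out)

# 1x1 conv: kernel shaped (1, 1, C_in, C_out) in HWIO layout.
y_conv = jax.lax.conv_general_dilated(
    x, w[None, None],
    window_strides=(1, 1), padding="VALID",
    dimension_numbers=("NHWC", "HWIO", "NHWC"))

# The identical computation as a plain matmul broadcast over every pixel.
y_dense = x @ w

assert jnp.allclose(y_conv, y_dense, atol=1e-4)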

Soham De (@sohamde_):

There are bound to be some rough edges with this release. Let us know as you encounter them, and we will do our best to resolve them!

Soham De (@sohamde_):

RecurrentGemma is close to Gemma-2B while being trained on 1T fewer tokens (!), making it one of the strongest open models at 2B scale.

+ performs very well vs much larger 7B Mistral on human evals testing instruction following and safety.

Details here: storage.googleapis.com/deepmind-media…

Samuel L Smith (@SamuelMLSmith):

Announcing RecurrentGemma!
github.com/google-deepmin…

- A 2B model with open weights based on Griffin
- Replaces transformer with mix of gated linear recurrences and local attention
- Competitive with Gemma-2B on downstream evals
- Higher throughput when sampling long sequences
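
For readers curious what "gated linear recurrences and local attention" means in practice, here is a minimal JAX sketch of the Griffin-style recipe: a gated linear recurrence scanned over time, alternated with attention masked to a local causal window. Every shape, name, and the exact gating form below is an illustrative assumption, not RecurrentGemma's actual implementation:

import jax
import jax.numpy as jnp

# Gated linear recurrence: h_t = a_t * h_{t-1} + (1 - a_t) * (g_t * x_t),
# with per-step decay a_t and input gate g_t, scanned over time.
def gated_linear_recurrence(x, a, g):
    def step(h, inputs):
        x_t, a_t, g_t = inputs
        h = a_t * h + (1.0 - a_t) * (g_t * x_t)
        return h, h
    h0 = jnp.zeros_like(x[0])
    _, hs = jax.lax.scan(step, h0, (x, a, g))
    return hs                                   # (T, D)

# Local attention: each position attends only to the last `window` steps.
def local_attention(q, k, v, window=4):
    T, d = q.shape
    scores = q @ k.T / jnp.sqrt(d)              # (T, T)
    i = jnp.arange(T)[:, None]
    j = jnp.arange(T)[None, :]
    mask = (j <= i) & (j > i - window)          # causal, banded
    scores = jnp.where(mask, scores, -1e30)
    return jax.nn.softmax(scores, axis=-1) @ v

T, D = 16, 8
kx, ka, kg = jax.random.split(jax.random.PRNGKey(0), 3)
x = jax.random.normal(kx, (T, D))
a = jax.nn.sigmoid(jax.random.normal(ka, (T, D)))   # decay in (0, 1)
g = jax.nn.sigmoid(jax.random.normal(kg, (T, D)))   # input gate
h = gated_linear_recurrence(x, a, g)
y = local_attention(h, h, h)                        # local self-attention
print(y.shape)                                      # (16, 8)

The recurrent state stays a fixed-size vector no matter how long the sequence grows, while a transformer's KV cache grows with every generated token; that difference is what the "higher throughput when sampling long sequences" bullet is pointing at.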

Lucas Beyer (bl16) (@giffmana):

This is exactly what I hate with all big frameworks. TF is terrible. PyTorch used to be straightforward but turned terrible too. Torch7 was very direct. JAX/Flax still ok, but I pray every day that it doesn’t end up with the same fate over time.

Lucas Beyer (bl16) (@giffmana):

OK, now I just need to convince my wife that this is a great investment of 1.7M CHF. I'm sure that domain name will double in value over the next year!

something something bubble

Robert Dadashi (@robdadashi):

I am very happy to announce that Gemma 1.1 Instruct 2B and “7B” are out! Here are a few details about the new models:
1/11
