Lucas Beyer (bl16) (@giffmana)'s Twitter Profile
Lucas Beyer (bl16)

@giffmana

Researcher (Google DeepMind/Brain in Zürich, ex-RWTH Aachen), Gamer, Hacker, Belgian. Mostly gave up trying mastodon as [email protected]

ID: 2236047510

Link: http://lucasb.eyer.be | Joined: 08-12-2013 13:31:09

11.0K Tweets

56.1K Followers

445 Following

Lucas Beyer (bl16) (@giffmana):

So much potential🥹

1: One moment, let me state the vision
No more sliding windows, we on a mission!
udio.com/songs/mfsfmg5V…

2: No 1x1, that's just a view
Yann LeCun can't argue, our models slew
udio.com/songs/eb2bmgzT…
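
The "No 1x1, that's just a view" lyric riffs on Yann LeCun's well-known point that a 1x1 convolution is just a position-wise linear layer over the channel axis. A minimal JAX sketch of that equivalence (toy shapes, purely illustrative, not from any model discussed here):

import jax
import jax.numpy as jnp

# A 1x1 convolution over an NHWC tensor computes the same thing as a
# per-pixel matmul over the channel axis: "just a view" of the features.
key = jax.random.PRNGKey(0)
kx, kw = jax.random.split(key)
x = jax.random.normal(kx, (2, 8, 8, 16))   # (N, H, W, C_in)
w = jax.random.normal(kw, (16, 32))        # (C_in, C_out)

# 1x1 conv: kernel shaped (1, 1, C_in, C_out) in HWIO layout.
y_conv = jax.lax.conv_general_dilated(
    x, w[None, None],
    window_strides=(1, 1), padding="VALID",
    dimension_numbers=("NHWC", "HWIO", "NHWC"))

# The identical computation as a plain matmul broadcast over every pixel.
y_dense = x @ w

assert jnp.allclose(y_conv, y_dense, atol=1e-4)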

Soham De (@sohamde_):

There are bound to be some rough edges with this release. Let us know as you encounter them, and we will do our best to resolve them!

Soham De (@sohamde_):

RecurrentGemma is close to Gemma-2B while being trained on 1T fewer tokens (!), making it one of the strongest open models at 2B scale.

+ performs very well vs much larger 7B Mistral on human evals testing instruction following and safety.

Details here: storage.googleapis.com/deepmind-media…

Samuel L Smith (@SamuelMLSmith):

Announcing RecurrentGemma!
github.com/google-deepmin…

- A 2B model with open weights based on Griffin
- Replaces transformer with mix of gated linear recurrences and local attention
- Competitive with Gemma-2B on downstream evals
- Higher throughput when sampling long sequences
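
For readers curious what "gated linear recurrences and local attention" means in practice, here is a minimal JAX sketch of the Griffin-style recipe: a gated linear recurrence scanned over time, alternated with attention masked to a local causal window. Every shape, name, and the exact gating form below is an illustrative assumption, not RecurrentGemma's actual implementation:

import jax
import jax.numpy as jnp

# Gated linear recurrence: h_t = a_t * h_{t-1} + (1 - a_t) * (g_t * x_t),
# with per-step decay a_t and input gate g_t, scanned over time.
def gated_linear_recurrence(x, a, g):
    def step(h, inputs):
        x_t, a_t, g_t = inputs
        h = a_t * h + (1.0 - a_t) * (g_t * x_t)
        return h, h
    h0 = jnp.zeros_like(x[0])
    _, hs = jax.lax.scan(step, h0, (x, a, g))
    return hs                                   # (T, D)

# Local attention: each position attends only to the last `window` steps.
def local_attention(q, k, v, window=4):
    T, d = q.shape
    scores = q @ k.T / jnp.sqrt(d)              # (T, T)
    i = jnp.arange(T)[:, None]
    j = jnp.arange(T)[None, :]
    mask = (j <= i) & (j > i - window)          # causal, banded
    scores = jnp.where(mask, scores, -1e30)
    return jax.nn.softmax(scores, axis=-1) @ v

T, D = 16, 8
kx, ka, kg = jax.random.split(jax.random.PRNGKey(0), 3)
x = jax.random.normal(kx, (T, D))
a = jax.nn.sigmoid(jax.random.normal(ka, (T, D)))   # decay in (0, 1)
g = jax.nn.sigmoid(jax.random.normal(kg, (T, D)))   # input gate
h = gated_linear_recurrence(x, a, g)
y = local_attention(h, h, h)                        # local self-attention
print(y.shape)                                      # (16, 8)

The recurrent state stays a fixed-size vector no matter how long the sequence grows, while a transformer's KV cache grows with every generated token; that difference is what the "higher throughput when sampling long sequences" bullet is pointing at.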

Lucas Beyer (bl16) (@giffmana):

This is exactly what I hate with all big frameworks. TF is terrible. PyTorch used to be straightforward but turned terrible too. Torch7 was very direct. JAX/Flax still ok, but I pray every day that it doesn’t end up with the same fate over time.

Lucas Beyer (bl16) (@giffmana):

OK, now I just need to convince my wife that this is a great investment of 1.7M CHF. I'm sure that domain name will double in value over the next year!

something something bubble

Robert Dadashi (@robdadashi):

I am very happy to announce that Gemma 1.1 Instruct 2B and “7B” are out! Here are a few details about the new models:
1/11
