Sean Zhang (@seeeeaaaannnnnn)'s Twitter Profile
Sean Zhang

@seeeeaaaannnnnn

Training Neural Networks @Manifest__AI, Ex-Meta & Ex-Voleon, Longtermism

ID: 4699665505

Link: http://seanzhang.me

Joined: 03-01-2016 01:05:13

32 Tweets

492 Followers

590 Following

Victor Gevers (@0xdude)

Around 364 million online profiles and their chats & file transfers get processed daily. Then these accounts get linked to a real ID/person. The data is then distributed over police stations per city/province to separate operators databases with the same surveillance network name

Mark Dreyer (@dreyerchina)

This is amazing. Due to the backlash from Chinese fans seeing unmasked crowds in Qatar, Chinese TV is now replacing live crowd shots during games and instead cutting to close-ups of players and coaches.

Russian Market (@runews)

These women handing out water and food to migrants traveling through Mexico on their way to the United States. Imagine living out the life experience of basic survival and one day deciding to set off for the potential of something better, anything better, aboard La Bestia, also

Joseph Wang (@fedguy12)

In 2008 the banks got rich, went bust, and got bailed out. It was unfair, so we created a regulatory system to prevent it from happening.

SVB is a minor bank. We could have let the process play out and show how the system has been improved. But it seems nothing has changed

Amjad Masad (@amasad)

Convinced we are witnessing the birth of a new kind of computer.

From: Memorizing Transformers arxiv.org/pdf/2203.08913…

Sean Zhang (@seeeeaaaannnnnn)

Great to see that AI technological advances are providing value to foundational research as well.

Terence Tao on "Machine Assisted Proof": youtube.com/watch?v=AayZuu…

Sean Zhang (@seeeeaaaannnnnn)

Excited to share our latest work at Manifest AI on a new linear transformer architecture that has the potential to outperform standard softmax transformers! Check it out here: manifestai.com/articles/symme…

Sean Zhang (@seeeeaaaannnnnn)

Balance is the key. We show, in our latest work, that both quadratic attention and linear-attention-based architectures are not fit for long-context jobs because they spend way too much of their FLOPs budget on either state or weight.

Jacob Buckman (@jacobmbuckman)

The massive performance upgrades of power attention are clearly visible at the 1.6B parameter scale on 32k-length documents. The improvement is due to better in-context learning.