Yi Tay (@YiTayML) Twitter Tweets • TwiCopy

Yi Tay

@YiTayML

chief scientist / cofounder @RekaAILabs 🫠
past: research scientist @google brain 🤯
currently learning to be a dad 🍼

ID:790033937531703296

Link: http://yitay.net
Joined: 23-10-2016 03:35:43

3.0K Tweets

29.0K Followers

97 Following

Jason Wei

6 days ago

Enjoyed this paper that plots emergent abilities with pretraining loss on the x-axis, which is actually a suggestion that Oriol Vinyals also made a few years back: arxiv.org/abs/2403.15796

The paper uses intermediate checkpoints to plot a variety of pretraining losses. For some

336 likes

Yi Tay

1 week ago

instead of evaluating models, we can start to evaluate researchers instead! 😀

i've always had this floating idea of giving people transformer configs and asking them to predict configurations that work better. could be data mix, architectures, hparams, whatever. would be a fun

90 likes

Reka

1 week ago

🔥Newly updated scores for Reka Core, Flash and Edge on MMMU leaderboard: mmmu-benchmark.github.io.

76 likes

lmsys.org

2 weeks ago

Yes, check out Reka's strong Flash-21B model!

80 likes

AK

2 weeks ago

Reka Core, Flash, and Edge

A Series of Powerful Multimodal Language Models

We introduce Reka Core, Flash, and Edge, a series of powerful multimodal language models trained from scratch by Reka. Reka models are able to process and reason with text, images, video, and audio

231 likes

foam shazeer

2 weeks ago

I heard the key to Reka's success is a new algorithm called AgiHi-PPO

21 likes

Yi Tay

2 weeks ago

'To be frontier you first need to be pareto-frontier'. ~ First law of LLM training. 😃

412 likes

Yi Tay

2 weeks ago

Our Reka Tech Report / Paper is out! 🔥

Tech reports with completely no information are kinda boring so we’re revealing some interesting information on how we train our series of Reka models including tokens, architecture, data & human evaluation workflows. 😃

We tried

416 likes

Yi Tay

2 weeks ago

One year since I posted this so here's an update! Adding Donovan Ong to the list of notable Singaporean researchers/engineers doing great work in AI and LLMs. He helped train Reka's (@RekaAILabs) series of OP models (Core, Flash, Edge) so he deserves to be on this list! 🔥

26 likes

Yi Tay

2 weeks ago

It's been a wild ride. Just 20 of us, burning through thousands of H100s over the past months, we're glad to finally share this with the world! 💪

One of the goals we’ve had when starting Reka was to build cool innovative models at the frontier. Reaching GPT-4/Opus level was a

936 likes

Karim

2 weeks ago

It's inspiring to see what a small team can accomplish in such a short period of time.

Reka, an enterprise multimodal LLM company, has only had access to 90% of their compute for the past 4 months, but that hasn't stopped the brilliant team of 20 from going head-to-head in

57 likes

Yi Tay

2 weeks ago

Didn't get much chance to share this yesterday with everything else going on with the Reka Core launch, but here's the most non-cherry-picked showcase of Reka Core vs GPT-4 vs Claude Opus on multimodal chat tasks. 👇

We put together this showcase with examples our team created.

89 likes

Reka

2 weeks ago

Meet Reka Core, our best and most capable multimodal language model yet. 🔮

It’s been a busy few months training this model and we are glad to finally ship it! 💪

Core has a lot of capabilities, and one of them is understanding video --- let’s see what Core thinks of the 3 body

1.1K likes

Teortaxes▶️

2 weeks ago

Feels legit. I might prefer Reka Core's multimodal performance to 1.5 too.

18 likes

小猫遊りょう（たかにゃし・りょう）

2 weeks ago

Organizations that have managed to build a top-class language model as of now:
① OpenAI (GPT-4)
② Google (Gemini Ultra, Gemini 1.5 Pro)
③ Anthropic (Claude 3 Opus)
④ Inflection AI (Inflection 2.5)
⑤ Reka (Reka Core)
⑥ xAI (Grok-1.5)
⑦ Mistral (Mistral Large)

Meta may join the list with the upcoming LLaMA 3.

408 likes