Yi Tay (@YiTayML)'s Twitter Profile
Yi Tay

@YiTayML

chief scientist / cofounder @RekaAILabs 🫠
past: research scientist @google brain 🤯
currently learning to be a dad 🍼

ID:790033937531703296

Link: http://yitay.net
Joined: 23-10-2016 03:35:43

3.0K Tweets

29.0K Followers

97 Following

Jason Wei (@_jasonwei)

Enjoyed this paper that plots emergent abilities with pretraining loss on the x-axis, which is actually a suggestion that Oriol Vinyals also made a few years back: arxiv.org/abs/2403.15796

The paper uses intermediate checkpoints to plot a variety of pretraining losses. For some

Yi Tay (@YiTayML)

instead of evaluating models, we can start to evaluate researchers instead! 😀

i've always had this floating idea of giving people transformer configs and asking them to predict configurations that work better. could be data mix, architectures, hparams whatever. would be a fun

AK (@_akhaliq)

Reka Core, Flash, and Edge

A Series of Powerful Multimodal Language Models

We introduce Reka Core, Flash, and Edge, a series of powerful multimodal language models trained from scratch by Reka. Reka models are able to process and reason with text, images, video, and audio

Yi Tay (@YiTayML)

Our Reka Tech Report / Paper is out! 🔥

Tech reports with completely no information are kinda boring so we’re revealing some interesting information on how we train our series of Reka models including tokens, architecture, data & human evaluation workflows. 😃

We tried

Yi Tay (@YiTayML)

One year since I posted this so here's an update! Adding Donovan Ong to the list of notable Singaporean researchers/engineers doing great work in AI and LLMs. He helped train Reka's (@RekaAILabs) series of OP models (Core, Flash, Edge) so he deserves to be on this list! 🔥

Yi Tay (@YiTayML)

It's been a wild ride. Just 20 of us, burning through thousands of H100s over the past months, we're glad to finally share this with the world! 💪

One of the goals we’ve had when starting Reka was to build cool innovative models at the frontier. Reaching GPT-4/Opus level was a

Karim (@KarimBhalwani)

It's inspiring to see what a small team can accomplish in such a short period of time.

Reka, an enterprise multimodal LLM company, has only had access to 90% of their compute for the past 4 months, but that hasn't stopped the brilliant team of 20 to go head-to-head in

Yi Tay (@YiTayML)

Didn't get much chance to share this yesterday with everything else going on with the Reka Core launch but here's the most non-cherry picked showcase of Reka Core vs GPT-4 vs Claude Opus on multimodal chat tasks. 👇

We put together this showcase with examples our team created.

Reka (@RekaAILabs)

Meet Reka Core, our best and most capable multimodal language model yet. 🔮

It’s been a busy few months training this model and we are glad to finally ship it! 💪

Core has a lot of capabilities, and one of them is understanding video --- let’s see what Core thinks of the 3 body

小猫遊りょう(たかにゃし・りょう) (@jaguring1)

Organizations that have managed to build a top-tier language model as of now
① OpenAI (GPT-4)
② Google (Gemini Ultra, Gemini 1.5 Pro)
③ Anthropic (Claude 3 Opus)
④ Inflection AI (Inflection 2.5)
⑤ Reka (Reka Core)
⑥ xAI (Grok-1.5)
⑦ Mistral (Mistral Large)

Meta may join the list with its upcoming LLaMA 3
