eddy (@eddy_data3)'s Twitter Profile
eddy

@eddy_data3

Working on reasoning and synthetic data

ID: 1529848369979437059

Joined: 26-05-2022 15:34:17

478 Tweets

399 Followers

4.4K Following

Yi-01.AI (@01ai_yi)

01.AI Yi-Large-Preview ranks only 2 places below the newest OpenAI ChatGPT GPT-4o in the Alpaca Eval 2.0 verified category. Not bad, we'd say. Your thoughts? 😉 tatsu-lab.github.io/alpaca_eval ➡️ Yi-Large API beta sign-up: platform.01.ai

Xiang Yue@ICLR2025🇸🇬 (@xiangyue96)

Thanks AK for sharing our work!
Paper: arxiv.org/pdf/2405.15071

Key takeaways:
1) Transformers can learn to implicitly reason, but only through extended training far beyond overfitting, a phenomenon known as grokking.

2) Transformers exhibit different levels of
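
Takeaway 1 is easy to see in a toy experiment: keep optimizing long after the training set is fit and watch held-out accuracy for a late jump. The sketch below is only an illustration of that grokking recipe, not the paper's setup (which studies transformers on implicit reasoning tasks); the modular-addition task, small MLP, 30% train split, and heavy weight decay are all illustrative assumptions.

```python
# Minimal grokking-style experiment (illustrative assumptions, not the paper's setup):
# train a small MLP on modular addition far past the point where training accuracy
# saturates, and watch validation accuracy for a late jump.
import torch
import torch.nn as nn

P = 97                                        # modulus for (a + b) mod P
pairs = torch.cartesian_prod(torch.arange(P), torch.arange(P))
labels = (pairs[:, 0] + pairs[:, 1]) % P
perm = torch.randperm(len(pairs))
split = int(0.3 * len(pairs))                 # small train fraction (assumption)
train_idx, val_idx = perm[:split], perm[split:]

def one_hot(xy):                              # encode (a, b) as concatenated one-hots
    return torch.cat([nn.functional.one_hot(xy[:, 0], P),
                      nn.functional.one_hot(xy[:, 1], P)], dim=1).float()

model = nn.Sequential(nn.Linear(2 * P, 256), nn.ReLU(), nn.Linear(256, P))
opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1.0)
loss_fn = nn.CrossEntropyLoss()

for step in range(50_000):                    # "extended training far beyond overfitting"
    opt.zero_grad()
    out = model(one_hot(pairs[train_idx]))
    loss = loss_fn(out, labels[train_idx])
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        with torch.no_grad():
            tr = (out.argmax(-1) == labels[train_idx]).float().mean().item()
            va = (model(one_hot(pairs[val_idx])).argmax(-1)
                  == labels[val_idx]).float().mean().item()
        print(f"step {step:6d}  train_acc {tr:.3f}  val_acc {va:.3f}")
```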
eddy (@eddy_data3)

Love Yann LeCun's spicy takes 😂 I see his point though — is gradient descent “learning” if you limit it to a few iterations? The life expectancy of a human is short in the timescale of technological progress. Science has a scaling law too 🤔

Mira (@_mira___mira_)

> Obstinacy is a reflexive resistance to changing one's ideas. If you have a model M of the world which predicts X with confidence X', and you observe ~X, then either ~M (your model is wrong) or the observation of ~X is wrong. The most obstinate may reject the evidence for ~X. It's usually noisy. Maybe you made a

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) (@teortaxestex)

xjdr Thinking about emergence, it feels like we can semi-rigorously say which tasks are hard-locked by serial operation depth or by width/fundamental richness limits on LLM representations, and can only be approached asymptotically with compute in small models; and which are… just hard.

Andrej Karpathy (@karpathy)

LLM model size competition is intensifying… backwards! My bet is that we'll see models that "think" very well and reliably that are very very small. There is most likely a setting even of GPT-2 parameters for which most people will consider GPT-2 "smart". The reason current

Thomas Wolf (@thom_wolf)

It’s Sunday morning and we have some time with the coffee, so let me tell you about our recent, surprising journey in synthetic data and small language models.

This post is prompted by the coming release of an instant, in-browser model called SmolLM360 (link at the end)

The
Nous Research (@nousresearch)

What if you could use all the computing power in the world to train a shared, open source AI model?

Preliminary report: github.com/NousResearch/D…

Nous Research is proud to release a preliminary report on DisTrO (Distributed Training Over-the-Internet), a family of
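
The tweet is cut off above, but the bottleneck DisTrO targets is easy to quantify: ordinary data-parallel training synchronizes a full gradient every optimizer step, which is hopeless over consumer internet links. A back-of-the-envelope sketch follows; the parameter count, uplink speed, and 1000x compression factor are made-up illustrative numbers, not figures from the DisTrO report.

```python
# Rough cost of synchronizing gradients each optimizer step in ordinary
# data-parallel training, versus a hypothetical compressed scheme.
# All numbers below are illustrative assumptions, not DisTrO's actual figures.

def sync_seconds(n_params: int, bytes_per_grad: int, uplink_mbps: float,
                 compression: float = 1.0) -> float:
    """Seconds to upload one (possibly compressed) gradient over a given uplink."""
    payload_bits = n_params * bytes_per_grad * 8 / compression
    return payload_bits / (uplink_mbps * 1e6)

n_params = 1_200_000_000        # a ~1.2B-parameter model (assumption)
uplink = 100.0                  # 100 Mbps consumer uplink (assumption)

naive = sync_seconds(n_params, bytes_per_grad=2, uplink_mbps=uplink)      # fp16 gradients
compressed = sync_seconds(n_params, bytes_per_grad=2, uplink_mbps=uplink,
                          compression=1000.0)                             # hypothetical 1000x reduction

print(f"full fp16 gradient per step  : {naive:8.1f} s")
print(f"with 1000x less communication: {compressed:8.3f} s")
```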
Xiang Yue@ICLR2025🇸🇬 (@xiangyue96)

🚀 Introducing MMMU-Pro: A more robust version of MMMU
arxiv.org/pdf/2409.02813…
After launching MMMU, we received valuable feedback from the community:

1️⃣ Some questions were answerable without even seeing the images.
2️⃣ Models didn’t always "know" the answer but found shortcuts
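
Feedback item 1️⃣ suggests a simple robustness filter: if a question is answered correctly without the image, it is not really testing multimodal understanding. Below is a hedged sketch of such a filter; ask_text_only is a placeholder for whatever text-only LLM call you use, and the 5-trial / 60% threshold is an arbitrary choice, not MMMU-Pro's actual pipeline (see the paper for that).

```python
import random

def filter_text_only_answerable(questions, ask_text_only, n_trials: int = 5,
                                pass_rate: float = 0.6):
    """Drop questions that a text-only model answers correctly too often.

    `questions` is an iterable of dicts with "prompt", "options", "answer".
    `ask_text_only(prompt, options) -> str` is a placeholder for an LLM call
    that never sees the image. A question is removed if the text-only model
    hits the right answer in more than `pass_rate` of the trials.
    """
    kept = []
    for q in questions:
        hits = sum(ask_text_only(q["prompt"], q["options"]) == q["answer"]
                   for _ in range(n_trials))
        if hits / n_trials <= pass_rate:
            kept.append(q)
    return kept

# Toy usage: a "model" that guesses uniformly at random.
demo = [{"prompt": "Which region is highlighted?", "options": ["A", "B", "C", "D"],
         "answer": "B"} for _ in range(10)]
random_guesser = lambda prompt, options: random.choice(options)
print(len(filter_text_only_answerable(demo, random_guesser)), "of", len(demo), "kept")
```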
roon (@tszzl)

the cognitive profile of humans is not generally distributed across all cognitive skills. for example we’re unreasonably good at modeling our friends’ theory of mind and super bad at learning ten languages or difficult math (except on the margins)

Rhymes.AI (@rhymes_ai_)

🚀 Introducing Aria from Rhymes.AI: The first open-source, multimodal native MoE model! Aria uses 3.9B parameters per token, excelling in multimodal & language tasks.

It features a 64K token context window, captioning 256-frame videos in 10 seconds. Lightweight, fast, &
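
"3.9B parameters per token" is the signature of a mixture-of-experts design: a router activates only a few experts per token, so each token touches just a slice of the total parameters. A minimal top-k routing sketch follows; the shapes, expert count, and k=2 are illustrative assumptions, not Aria's actual configuration.

```python
import numpy as np

def moe_layer(x, router_w, experts, k=2):
    """Route each token to its top-k experts and mix their outputs.

    x:        (tokens, d_model) activations
    router_w: (d_model, n_experts) router weights
    experts:  list of (w_in, w_out) pairs, one tiny ReLU MLP per expert
    Only k experts run per token, so active parameters per token are roughly
    k / n_experts of the layer's total expert parameters.
    """
    logits = x @ router_w                                   # (tokens, n_experts)
    topk = np.argsort(logits, axis=-1)[:, -k:]              # indices of the k best experts
    sel = np.take_along_axis(logits, topk, axis=-1)         # softmax over selected experts only
    gate = np.exp(sel - sel.max(-1, keepdims=True))
    gate /= gate.sum(-1, keepdims=True)

    out = np.zeros_like(x)
    for t in range(x.shape[0]):                             # per-token dispatch (clarity over speed)
        for slot in range(k):
            w_in, w_out = experts[topk[t, slot]]
            h = np.maximum(x[t] @ w_in, 0.0)
            out[t] += gate[t, slot] * (h @ w_out)
    return out

rng = np.random.default_rng(0)
d, n_exp, ff = 64, 8, 256                                   # illustrative sizes
x = rng.normal(size=(5, d))
router = rng.normal(size=(d, n_exp)) * 0.02
experts = [(rng.normal(size=(d, ff)) * 0.02, rng.normal(size=(ff, d)) * 0.02)
           for _ in range(n_exp)]
y = moe_layer(x, router, experts, k=2)
print(y.shape, "with 2 of", n_exp, "experts active per token")
```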
Rhymes.AI (@rhymes_ai_)

New Model From Rhymes! ✨ We're thrilled to announce Allegro — a small and efficient open-source text-to-video model that transforms your text into stunning 6-second videos at 15 FPS and 720p! 🚀 🩷 Explore Allegro - Gallery: rhymes.ai/allegro_gallery - Hugging Face:

World Labs (@theworldlabs)

We’ve been busy building an AI system to generate 3D worlds from a single image. Check out some early results on our site, where you can interact with our scenes directly in the browser! worldlabs.ai/blog 1/n

Xiang Yue@ICLR2025🇸🇬 (@xiangyue96)

🚨 Our latest work evaluates the synthetic data generation abilities of different LLMs.

Key findings:
- Strong task-solvers ≠ strong synthetic data generators.
- Bigger isn’t always better! E.g., Llama3.1-8B can outperform a 405B model in certain settings.
- More synthetic data
Li Junnan (@lijunnan0409)

Introducing 🔥Aria-Chat🔥, our latest multimodal chat model optimized for open-ended and multi-round dialogs! It outperforms Aria by 7 points on WildVision-Bench, offering enhanced reliability and stronger multilingual support. Download the model now: huggingface.co/rhymes-ai/Aria…

Jiayi Pan (@jiayi_pirate)

We reproduced DeepSeek R1-Zero in the CountDown game, and it just works 

Through RL, the 3B base LM develops self-verification and search abilities all on its own 

You can experience the Aha moment yourself for < $30
Code: github.com/Jiayi-Pan/Tiny…

Here's what we learned 🧵
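
Part of why Countdown works so well as an R1-Zero-style testbed is that the reward is rule-verifiable: the completion must combine the given numbers into an expression that evaluates to the target. Here is a minimal sketch of that verification step; the <answer> tag convention and the use-each-number-exactly-once rule are assumptions for illustration and may differ from the reward actually used in the TinyZero repo.

```python
import ast
import operator
import re

OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def safe_eval(expr: str) -> float:
    """Evaluate a +-*/ arithmetic expression without calling eval()."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError("disallowed expression")
    return walk(ast.parse(expr, mode="eval"))

def countdown_reward(completion: str, numbers: list[int], target: int) -> float:
    """1.0 if the model's final expression uses exactly the given numbers and
    hits the target, else 0.0. Assumes the answer is wrapped in <answer> tags
    (a formatting convention chosen for this sketch)."""
    m = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
    if not m:
        return 0.0
    expr = m.group(1).strip()
    used = sorted(int(n) for n in re.findall(r"\d+", expr))
    if used != sorted(numbers):
        return 0.0                      # must use each provided number exactly once
    try:
        return 1.0 if abs(safe_eval(expr) - target) < 1e-6 else 0.0
    except (ValueError, ZeroDivisionError, SyntaxError):
        return 0.0

print(countdown_reward("... so <answer>(25 - 5) * 3 + 4</answer>", [25, 5, 3, 4], 64))  # 1.0
```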
Andrej Karpathy (@karpathy)

The race for LLM "cognitive core" - a few billion param model that maximally sacrifices encyclopedic knowledge for capability. It lives always-on and by default on every computer as the kernel of LLM personal computing. Its features are slowly crystalizing: - Natively multimodal