VinceK 🏳️🇺🇦🏳️ (@westernmishima) 's Twitter Profile
VinceK 🏳️🇺🇦🏳️

@westernmishima

Vincent - NB

ID: 1370007922084831239

Joined: 11-03-2021 13:45:35

2.2K Tweets

105 Followers

90 Following

DeepSeek (@deepseek_ai) 's Twitter Profile Photo


🌟 Meet #DeepSeekMoE: The Next Gen of Large Language Models!

Performance Highlights:
📈 DeepSeekMoE 2B matches its 2B dense counterpart with 17.5% computation.
🚀 DeepSeekMoE 16B rivals LLaMA2 7B with 40% computation.
🛠 DeepSeekMoE 145B significantly outperforms Gshard,
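The computation percentages in the highlights above line up with the basic MoE accounting: per-token FLOPs scale roughly with the number of *activated* parameters, so a model that routes each token to a few experts only pays for those experts. A minimal sketch of that arithmetic, with stand-in numbers (not DeepSeek's actual configurations):

```python
# Illustrative sketch: why an MoE model can match a dense model at a
# fraction of the compute. Per-token FLOPs in a transformer scale
# roughly with the number of *activated* parameters, so routing each
# token to a few experts only pays for those experts.
# The counts below are hypothetical stand-ins, not DeepSeek's configs.

def active_fraction(total_experts: int, activated_experts: int,
                    shared_fraction: float = 0.0) -> float:
    """Fraction of expert parameters activated per token.

    `shared_fraction` models always-on shared experts; the rest of the
    expert parameters are routed, so only activated/total of them fire.
    """
    routed = activated_experts / total_experts
    return shared_fraction + (1 - shared_fraction) * routed

# e.g. 64 experts, 6 routed per token, ~10% of expert params shared:
frac = active_fraction(64, 6, shared_fraction=0.10)
print(f"{frac:.1%} of expert parameters active per token")
```

With these toy numbers, under a fifth of the expert parameters run per token, which is the flavor of saving the tweet's "17.5% computation" and "40% computation" figures describe.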
Zephyr (@zephyr_z9) 's Twitter Profile Photo


My Third Post
Implications of the H20
It will profoundly change the dynamics of the US-China AI race, especially in the RL/inference age
VinceK 🏳️🇺🇦🏳️ (@westernmishima) 's Twitter Profile Photo

Let's be honest: the raw capabilities of the models are still quite far from the greatest human geniuses. Today we are more around 115–120 IQ (at best), with a lot of optimization still to be done on agency, plus avenues of research such as lifelong learning and more.

Stephen McAleer (@mcaleerstephen) 's Twitter Profile Photo

We've entered a new phase where progress in chatbots is starting to top out, but progress in automating AI research is steadily improving. It's a mistake to confuse the two.

Z.ai (@zai_org) 's Twitter Profile Photo


Introducing GLM-4.5V: a breakthrough in open-source visual reasoning

GLM-4.5V delivers state-of-the-art performance among open-source models in its size class, dominating across 41 benchmarks.

Built on the GLM-4.5-Air base model, GLM-4.5V inherits proven techniques from
Sheryl Hsu (@sherylhsu02) 's Twitter Profile Photo

1/n I’m thrilled to share that our OpenAI reasoning system scored high enough to achieve gold 🥇🥇 in one of the world’s top programming competitions - the 2025 International Olympiad in Informatics (IOI) - placing first among AI participants! 👨‍💻👨‍💻

Noam Brown (@polynoamial) 's Twitter Profile Photo

In my opinion, the most important takeaway from this result is that our OpenAI International Math Olympiad (IMO) gold model is also our best competitive coding model. 🧵

Aidan McLaughlin (@aidan_mclau) 's Twitter Profile Photo

you always need money. you need money for compute. you need money for hard-to-get data. you need money for researchers today and brand emissaries tomorrow. you need money for when the algorithmic advances tap breaks its laminar flow

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) (@teortaxestex) 's Twitter Profile Photo


This story is so insane, dream narrative for burgers and their pure-hearted vassals, that I might well cook up my own, also based on half-baked rumors, experience in an authoritarian society and just a little bit of sleuthing.
DeepSeek was a breakout hit, but patronage networks
Shai Shalev-Shwartz (@shai_s_shwartz) 's Twitter Profile Photo

Are frontier AI models really capable of “PhD-level” reasoning? To answer this question, we introduce FormulaOne, a new reasoning benchmark of expert-level Dynamic Programming problems. We have curated a benchmark consisting of three tiers, in increasing complexity, which we call

Elon Musk (@elonmusk) 's Twitter Profile Photo

Replying to Rohan Paul:
False, intelligence still scales logarithmically with compute. And it doesn’t make sense to call them LLMs when they’re natively multimodal. Just models – it’s cleaner.

Jenia Jitsev 🏳️‍🌈 🇺🇦 🇮🇱 (@jjitsev) 's Twitter Profile Photo

Debunking yet another of the many studies that claim benefits from "brain-inspired" mechanisms without doing proper controls, i.e., comparing to a reference transformer of the same size. Doing reference comparisons is the way to keep "brain-inspired" from sliding further toward being a red flag for scams.

Interconnects (@interconnectsai) 's Twitter Profile Photo

China's Top 19 Open Model Labs We ranked all the organizations in China releasing open models, from DeepSeek at the top down to small, newer academic labs making waves with tech reports and niche models. interconnects.ai/p/chinas-top-1…

Nick (@nickcammarata) 's Twitter Profile Photo

every day that nvidia stock is down I glance at the kaplan et al scaling law chart and laugh confidently into the future
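The chart being referenced is the compute scaling law from Kaplan et al. (2020), where test loss falls as a power law in training compute, L(C) ≈ (C_c / C)^α. A minimal sketch of the punchline (the exponent below is the paper's approximate value, used here only for illustration):

```python
# Sketch of the Kaplan et al. (2020) compute scaling law:
#   L(C) = (C_c / C) ** alpha_C
# with alpha_C ~= 0.050 (approximate exponent from the paper).
# Because it is a power law, loss shrinks by a constant factor for
# every constant multiple of compute (a straight line on a log-log
# plot), with no wall in sight -- hence the confident laughter.

ALPHA_C = 0.050  # assumed/approximate, for illustration only

def loss_ratio(compute_multiplier: float) -> float:
    """Factor by which loss shrinks when compute grows by `compute_multiplier`."""
    return compute_multiplier ** (-ALPHA_C)

print(f"10x compute  -> loss x {loss_ratio(10):.3f}")
print(f"100x compute -> loss x {loss_ratio(100):.3f}")
```

Each 10x of compute multiplies loss by roughly 0.89 under this fit, which is also why "logarithmic" is a fair colloquial description of the curve.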

Yam Peleg (@yampeleg) 's Twitter Profile Photo


Paper TLDR:
- Reasoning LLMs fail as task complexity rises (e.g. more Hanoi’s tower rings).
- The reasoning chain starts fine, then diverges and falls apart.

My Guess:
- It’s not the task complexity, it’s the reasoning chain length.
- Train to make longer chains: ceiling goes up.
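The Hanoi example is a good way to see why the two hypotheses above come apart: adding rings barely changes the rule, but the minimal solution length is 2^n − 1 moves, so the required output chain grows exponentially. A small sketch:

```python
# Tower of Hanoi: the per-step rule stays trivial as rings are added,
# but the minimal solution has 2**n - 1 moves, so the reasoning/output
# chain grows exponentially with ring count n -- consistent with the
# guess that chain *length*, not task rule complexity, is what breaks.

def hanoi_moves(n: int, src="A", dst="C", via="B") -> list[tuple[str, str]]:
    """Minimal move sequence transferring n rings from src to dst."""
    if n == 0:
        return []
    return (hanoi_moves(n - 1, src, via, dst)   # park n-1 rings on the spare peg
            + [(src, dst)]                       # move the largest ring
            + hanoi_moves(n - 1, via, dst, src)) # restack the n-1 rings on top

for n in (3, 7, 10):
    assert len(hanoi_moves(n)) == 2 ** n - 1
    print(f"{n} rings -> {2 ** n - 1} moves")
```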
Z.ai (@zai_org) 's Twitter Profile Photo


Introducing ComputerRL, a framework for autonomous desktop intelligence that enables agents to operate complex digital workspaces skillfully.
arxiv.org/abs/2508.14040

ComputerRL features the API-GUI paradigm, which unifies programmatic API calls and direct GUI interaction to
Le Grand Continent (@grand_continent) 's Twitter Profile Photo

"History will not judge us kindly." A conversation with Gabrielius Landsbergis 🇱🇹, former Lithuanian Minister of Foreign Affairs, by Maria Tadeo. A must-read, and one to discuss. legrandcontinent.eu/fr/2025/08/20/…

DeepSeek (@deepseek_ai) 's Twitter Profile Photo

Introducing DeepSeek-V3.1: our first step toward the agent era! 🚀 🧠 Hybrid inference: Think & Non-Think — one model, two modes ⚡️ Faster thinking: DeepSeek-V3.1-Think reaches answers in less time vs. DeepSeek-R1-0528 🛠️ Stronger agent skills: Post-training boosts tool use and