TensorTonic (@tensortonic)'s Twitter Profile
TensorTonic

@tensortonic

Machine Learning papers, concepts, and resources.

ID: 1917184992616321024

29-04-2025 11:51:44

16 Tweets

636 Followers

0 Following

Mistral AI (@mistralai)

Introducing Magistral Small 1.2 & Magistral Medium 1.2, minor updates to our Magistral 1.1 models!

- Multimodality: Now equipped with a vision encoder, these models handle both text and images seamlessly.
- Performance Boost: 15% improvements on math and coding benchmarks such
Alexandr Wang (@alexandr_wang)

new research from Meta FAIR: Code World Model (CWM), a 32B research model

we encourage the research community to research this open-weight model!

pass@1 evals, for the curious:

65.8 % on SWE-bench Verified
68.6 % on LiveCodeBench
96.6 % on Math-500
76.0 % on AIME 2024

🧵
DeepSeek (@deepseek_ai)

🚀 Introducing DeepSeek-V3.2-Exp, our latest experimental model!

✨ Built on V3.1-Terminus, it debuts DeepSeek Sparse Attention (DSA) for faster, more efficient training & inference on long context.
👉 Now live on App, Web, and API.
💰 API prices cut by 50%+!

1/n

Claude (@claudeai)

Introducing Claude Haiku 4.5: our latest small model.

Five months ago, Claude Sonnet 4 was state-of-the-art. Today, Haiku 4.5 matches its coding performance at one-third the cost and more than twice the speed.
Qwen (@alibaba_qwen)

Excited to announce the launch of Qwen3-VL-Flash on Alibaba Cloud Model Studio! 🚀

A powerful new vision-language model that combines reasoning and non-reasoning modes, outperforming open-source Qwen3-VL-30B-A3B and Qwen2.5-72B with faster responses, stronger capabilities, and
vLLM (@vllm_project)

Announcing the completely reimagined vLLM TPU! In collaboration with Google, we've launched a new high-performance TPU backend unifying PyTorch and JAX under a single lowering path for amazing performance and flexibility.

🚀 What's New?
- JAX + PyTorch: Run PyTorch models on