TensorTonic (@tensortonic) Twitter Tweets • TwiCopy

TensorTonic

@tensortonic

Machine Learning papers, concepts, and resources.

ID: 1917184992616321024

29-04-2025 11:51:44

16 Tweets

636 Followers

0 Following

TensorTonic

@tensortonic

7 months ago

Great video for learning CUDA programming

Likes: 1,1K

Replies: 4

Retweets: 170

TensorTonic

@tensortonic

7 months ago

Great video to understand GRPO.

Likes: 36

Replies: 0

Retweets: 4

Mistral AI

Introducing Magistral Small 1.2 & Magistral Medium 1.2, minor updates to our Magistral 1.1 models!
- Multimodality: Now equipped with a vision encoder, these models handle both text and images seamlessly.
- Performance Boost: 15% improvements on math and coding benchmarks such

Likes: 1,1K

Replies: 47

Retweets: 206

Alexandr Wang

@alexandr_wang

7 months ago

new research from Meta FAIR: Code World Model (CWM), a 32B research model

we encourage the research community to research this open-weight model!

pass@1 evals, for the curious:
65.8% on SWE-bench Verified
68.6% on LiveCodeBench
96.6% on Math-500
76.0% on AIME 2024

🧵

Likes: 1,1K

Replies: 87

Retweets: 159

DeepSeek

@deepseek_ai

6 months ago

🚀 Introducing DeepSeek-V3.2-Exp — our latest experimental model! ✨ Built on V3.1-Terminus, it debuts DeepSeek Sparse Attention(DSA) for faster, more efficient training & inference on long context. 👉 Now live on App, Web, and API. 💰 API prices cut by 50%+! 1/n

Likes: 6,6K

Replies: 227

Retweets: 893

OpenAI

@openai

6 months ago

You can now chat with apps in ChatGPT.

Likes: 16,16K

Replies: 616

Retweets: 1,1K

Claude

@claudeai

6 months ago

Introducing Claude Haiku 4.5: our latest small model. Five months ago, Claude Sonnet 4 was state-of-the-art. Today, Haiku 4.5 matches its coding performance at one-third the cost and more than twice the speed.

Likes: 7,7K

Replies: 314

Retweets: 1,1K

Qwen

@alibaba_qwen

6 months ago

Excited to announce the launch of Qwen3-VL-Flash on Alibaba Cloud Model Studio! 🚀 A powerful new vision-language model that combines reasoning and non-reasoning modes, outperforming open-source Qwen3-VL-30B-A3B and Qwen2.5-72B with faster responses, stronger capabilities, and

Likes: 639

Replies: 29

Retweets: 86

vLLM

@vllm_project

6 months ago

Announcing the completely reimagined vLLM TPU! In collaboration with Google, we've launched a new high-performance TPU backend unifying PyTorch and JAX under a single lowering path for amazing performance and flexibility.

🚀 What's New?
- JAX + PyTorch: Run PyTorch models on

Likes: 969

Replies: 17

Retweets: 122

TensorTonic

@tensortonic

6 months ago

Check it out!

Likes: 1

Replies: 0

Retweets: 0
