Cognitive Computations (@cognitivecompai)'s Twitter Profile
Cognitive Computations

@cognitivecompai

We make AI models Dolphin and Samantha
BTC 3ENBV6zdwyqieAXzZP2i3EjeZtVwEmAuo4

ID: 2854214132

Joined: 13-10-2014 10:22:51

5.5K Tweets

14.14K Followers

478 Following

bycloud (@bycloudai)'s Twitter Profile Photo

As I'm also making a video on model distillation, this is probably one of my favorite papers this week. You basically distill a transformer into a Mamba model and it can "retain" its original capabilities. This performs best on benchmarks compared to any "existing" RNN-attention hybrid. Cope?
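
The distillation being discussed can be sketched with the generic logit-distillation objective: the student is trained to match the teacher's softened next-token distribution. This is a minimal illustration of standard knowledge distillation, not the paper's actual transformer-to-Mamba training recipe:

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled, numerically stable softmax over a list of logits.
    z = [x / temperature for x in logits]
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student): pushes the student toward the teacher's
    # softened distribution over the vocabulary.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q))

teacher = [2.0, 1.0, 0.1]
print(distillation_loss(teacher, teacher))          # 0.0 (perfect match)
print(distillation_loss(teacher, [0.1, 1.0, 2.0]))  # positive (mismatch)
```

A higher temperature softens both distributions, which is what lets the student learn from the teacher's full ranking over tokens rather than just the argmax.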

Byron Hsu (@hsu_byron)'s Twitter Profile Photo

CUDA MODE 28: Writing experimental Triton kernels might be easy, but building production-grade ones is hard. youtu.be/gWble4FreV4?si… By production-grade, I mean: 1. Reliable: does not crash due to out-of-bounds or illegal memory access, etc. 2. Numerically stable: does not overflow
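
The two failure modes named here can be illustrated without a GPU. This is a hedged plain-Python sketch of the usual fixes: boundary masking on loads (the CPU analogue of a masked block load in a tile-based kernel) and max-subtraction for a softmax that cannot overflow:

```python
import math

def load_block(data, start, block_size, other=0.0):
    # Masked block load: offsets past the end of `data` yield `other`
    # instead of reading out of bounds.
    return [data[i] if i < len(data) else other
            for i in range(start, start + block_size)]

def stable_softmax(xs):
    # Subtracting the max bounds every exponent at <= 0, so exp() cannot
    # overflow even for very large logits.
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [v / s for v in e]

print(load_block([1.0, 2.0, 3.0], start=2, block_size=4))  # [3.0, 0.0, 0.0, 0.0]
print(sum(stable_softmax([10000.0, 9999.0])))  # ~1.0; naive exp(10000) overflows
```

In a real Triton kernel the same ideas show up as a mask argument on the load/store and a running-max rewrite of reductions.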

Alpin (@alpindale)'s Twitter Profile Photo

4 months of playing catch-up with the amazing team at vLLM, and 120k lines of code later, I'm finally free. Aphrodite gets its 0.6.0 release. I had to temporarily remove exl2 support, but we have plenty of alternatives now. Asymmetric TP is now supported, and support for

Stefano Fiorucci (@theanakin87)'s Twitter Profile Photo

🎯 Selective fine-tuning of Language Models with Spectrum

1/5 

I've just published this practical tutorial on Spectrum + TRL
👉 huggingface.co/blog/anakin87/…

What is Spectrum? Why use this method? 🧵
Chubby♨️ (@kimmonismus)'s Twitter Profile Photo

Without any doubt the best and most consistent AI video I've ever seen. Just imagine where we were 18 months ago (the first Will Smith spaghetti video) and where we are now. Absolutely stunning.

Cognitive Computations (@cognitivecompai)'s Twitter Profile Photo

OLMoE is cool, but comparing it with Mistral-7B and Llama-3.1-8B, I'm not sure it is preferable: it's the same size and faster, but performs worse. It seems like the same old trade-off, MoE is faster, but less capable, than an equivalently sized dense model.

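
The trade-off can be put in rough numbers. OLMoE activates only about 1B of its roughly 7B total parameters per token, so decode-time compute is closer to that of a 1B dense model even though memory footprint matches a 7B one. A back-of-envelope sketch using the common "about 2 FLOPs per active parameter per token" rule of thumb:

```python
def flops_per_token(active_params):
    # Rough decode-time estimate: ~2 FLOPs per active parameter per token.
    return 2 * active_params

dense_8b = flops_per_token(8e9)    # a Llama-3.1-8B-class dense model
moe_active = flops_per_token(1e9)  # OLMoE-style: ~1B active of ~7B total
print(dense_8b / moe_active)  # 8.0x less compute per token for the MoE
```

That compute gap is why an MoE decodes faster at equal total size, while still losing on quality to a dense model that uses all of its parameters on every token.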
Woosuk Kwon (@woosuk_k)'s Twitter Profile Photo

Developing vLLM taught me a tough lesson: to keep the GPU fully utilized, we need to pay close attention to everything happening on the CPU. Over the past month, the vLLM community conducted an in-depth study and made key optimizations, leading to significant

Together AI (@togethercompute)'s Twitter Profile Photo

We are excited to share our latest work on speculative decoding for high-throughput inference! Before this work, we thought speculative decoding was useless at large batch sizes since the GPUs would go brrrr from processing all the different inputs. Much to our surprise, we
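
The mechanism under discussion works like this: a cheap draft model proposes k tokens, the target model verifies all of them in a single pass, and the longest agreeing prefix is accepted. A hedged toy version with deterministic greedy models, ignoring the probabilistic accept/reject rule used in practice:

```python
def speculative_step(target_next, draft_next, prefix, k=4):
    # Draft proposes k tokens autoregressively.
    proposal, ctx = [], list(prefix)
    for _ in range(k):
        t = draft_next(ctx)
        proposal.append(t)
        ctx.append(t)
    # Target "verifies" every proposed position (one batched pass in a
    # real system); stop at the first mismatch and emit the correction.
    accepted, ctx = [], list(prefix)
    for t in proposal:
        expected = target_next(ctx)
        if expected != t:
            accepted.append(expected)  # replace first mismatch, stop
            break
        accepted.append(t)
        ctx.append(t)
    return accepted

# Toy models over integer tokens: target counts up; draft agrees until 3.
target = lambda ctx: ctx[-1] + 1
draft = lambda ctx: ctx[-1] + 1 if ctx[-1] < 3 else 0
print(speculative_step(target, draft, [0], k=4))  # [1, 2, 3, 4]
```

At large batch sizes the verification pass processes batch x k extra positions, which is the compute cost the tweet refers to; the surprise in this work is that it can still pay off in throughput.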

nisten - e/acc (@nisten)'s Twitter Profile Photo

GG, I think... we may just have local inference better than Sonnet now. I had a brief test of this while it was in development. GPQA is one of the hardest benchmarks, and it beats Sonnet; all you need to do is delete the <reflection> parts from the output. 🤯 Will post some tests

Alpin (@alpindale)'s Twitter Profile Photo

Makes me think. If I had to make my own list, I'd definitely include Woosuk Kwon and Zhuohan Li (et al.) for creating PagedAttention, and Tri Dao for FlashAttention. They might've saved open source.

DeepSeek (@deepseek_ai)'s Twitter Profile Photo

🚀 Exciting news! We’ve officially launched DeepSeek-V2.5 – a powerful combination of DeepSeek-V2-0628 and DeepSeek-Coder-V2-0724! Now, with enhanced writing, instruction-following, and human preference alignment, it’s available on Web and API. Enjoy seamless Function Calling,
