Avinash Sooriyarachchi (@avitwit3)'s Twitter Profile
Avinash Sooriyarachchi

@avitwit3

Building AI Systems @mistralai

ID: 1377740842455216128

Link: https://github.com/avisoori1x

Joined: 01-04-2021 21:53:15

28 Tweets

111 Followers

116 Following

Andrej Karpathy (@karpathy)

# on shortification of "learning" There are a lot of videos on YouTube/TikTok etc. that give the appearance of education, but if you look closely they are really just entertainment. This is very convenient for everyone involved: the people watching enjoy thinking they are

Avinash Sooriyarachchi (@avitwit3)

I wanted to extend the simple from-scratch MoE LM implementation I wrote with expert capacity. Given Grok-1 is open source, hope this helps understand MoEs a bit better. Again the base for this is makemore/nanoGPT from Andrej Karpathy: huggingface.co/blog/AviSoori1…

Avinash Sooriyarachchi (@avitwit3)

I took a stab at implementing a vision language model from scratch in pure PyTorch. The inspiration for this is moondream 2 from vik. I basically modified makemore from Andrej Karpathy and built everything else around it. Here’s my write-up: huggingface.co/blog/AviSoori1…

Devendra Chaplot (@dchaplot)

Excited to announce Mistral-NeMo 12B trained in collab with NVIDIA!
- Outperforms Gemma2 9B and Llama3 8B
- 128K context
- Multilingual in 100+ languages: excels in European, Asian & Indian languages
- Quant-Aware Training at FP8
- Apache 2.0

Blog: mistral.ai/news/mistral-n…

Jeremy Howard (@jeremyphoward)

Why didn't anyone tell me how amazingly-great Bootstrap has gotten in recent years? Wish I'd known sooner -- would have saved me so much time futzing around with tailwind classes. getbootstrap.com

Guillaume Lample @ NeurIPS 2024 (@guillaumelample)

Today, we release Mistral Large 2, the new version of our largest model. Mistral Large 2 is a 123B-parameter model with a 128k context window. On many benchmarks (notably in code generation and math), it is superior or on par with Llama 3.1 405B. Like Mistral NeMo, it was trained

Avinash Sooriyarachchi (@avitwit3)

I see a bunch of people look at benchmarks and think of Large 2 as ‘the other model’ and not as performant as 3.5 Sonnet and 4o. Honestly till you try it out for your particular use case, you really wouldn’t know. If it doesn’t cut it, it’s fine. But at least you know

Avinash Sooriyarachchi (@avitwit3)

I’ve seen a lot of interest from developers in reducing cost and deploying LLMs on device. With these new models from Mistral AI and our QAT stack, on-device deployment without degradation is a reality. Amazing work Pierre Stock, Sandeep Subramanian, Teven Le Scao, and team!!

Grant Sanderson (@3blue1brown)

I learned yesterday the video I made in 2017 explaining how Bitcoin works was taken down, and my channel received a copyright strike (despite it being 100% my own content). The request seems to have been issued by a company chainpatrol, on behalf of Arbitrum, whose website says

Mistral AI (@mistralai)

magnet:?xt=urn:btih:11f2d1ca613ccf5a5c60104db9f3babdfa2e6003&dn=Mistral-Small-3-Instruct&tr=udp%3A%2F%2Ftracker.opentrackr.org%3A1337%2Fannounce&tr=http%3A%2F%https://t.co/ua2yzvEYLu%3A1337%2Fannounce

Mistral AI (@mistralai)

Introducing Mistral Medium 3.1. Overall performance boost, tone improvement, smarter web searches. Try it now in Le Chat (default model) or via our API (`mistral-medium-2508`).

Avinash Sooriyarachchi (@avitwit3)

Proud to share the first public model I worked on at Mistral AI. A decoder-only LLM optimized for creative writing, narrative generation, roleplay, and character-driven dialogue. Now live via API as `labs-mistral-small-creative`: docs.mistral.ai/models/mistral…