
Harpreet Singh
@harpreetmann24
ID: 1414330830575521794
11-07-2021 21:08:48
52 Tweets
4 Followers
761 Following


New open model from Mistral AI! 🧠 Last night, Mistral released Mixtral 8x22B, a 176B MoE, via magnet link. 🔗🤯 What we know so far: 🧮 176B MoE with ~40B active parameters 📜 context length of 65k tokens 🪨 base model that can be fine-tuned 👀 ~260GB VRAM in fp16, 73GB in int4 📜 Apache 2.0 license
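
Those VRAM figures follow from simple arithmetic: weight-only memory is roughly parameter count × bytes per parameter. A minimal sketch in Python; the ~141B effective total (the eight 22B experts share their attention layers) is my assumption, and the numbers ignore activations and KV cache:

```python
def weight_gb(params_billion: float, bytes_per_param: float) -> float:
    """Weight-only memory footprint: parameter count x bytes per parameter."""
    return params_billion * 1e9 * bytes_per_param / 2**30

# Naive 8 x 22B = 176B, but the experts share attention layers, so the
# effective total is ~141B parameters (assumption, based on the MoE layout).
total = 141

print(f"fp16 (2 bytes/param):   ~{weight_gb(total, 2.0):.0f} GB")  # ~263 GB
print(f"int4 (0.5 bytes/param): ~{weight_gb(total, 0.5):.0f} GB")  # ~66 GB
# Real int4 checkpoints carry quantization scales and other overhead,
# which is how you land near the ~73 GB figure in the tweet.
```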

Easily fine-tune Meta Llama 3 70B! 🦙 I am excited to share a new guide on how to fine-tune Llama 3 70B with PyTorch FSDP, QLoRA, and Flash Attention 2 (SDPA) using Hugging Face, built for consumer-size GPUs (4x 24GB). 🚀 Blog: philschmid.de/fsdp-qlora-lla… The blog covers: 👨‍💻 …
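
For flavor, here is a minimal sketch of the QLoRA loading step such a guide walks through, using the Hugging Face transformers and peft libraries; the hyperparameters are illustrative assumptions, not necessarily the blog's choices:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "meta-llama/Meta-Llama-3-70B"  # gated repo; requires access approval

# Quantize the frozen base weights to 4-bit NF4 so the 70B model can be
# sharded by FSDP across 4x 24GB consumer GPUs.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_storage=torch.bfloat16,  # lets FSDP shard the 4-bit weights
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    attn_implementation="sdpa",  # PyTorch scaled-dot-product attention kernels
    torch_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Only the small LoRA adapters are trained; the quantized base stays frozen.
model = prepare_model_for_kbit_training(model)
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules="all-linear",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()

# Training then runs under an accelerate FSDP launch, e.g.:
#   accelerate launch --config_file fsdp_config.yaml train.py
```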

Here's an engaging intro to evals by Midnight Maniac Sri and Wil Chung. They've clearly put a lot of care and effort into it: the content is well organized, with plenty of illustrations throughout. Across 60 pages, they explain model vs. system evals, vibe checks, and property-based …
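
To make the property-based idea concrete, here is a minimal sketch; the `summarize` stand-in and the specific properties are hypothetical, purely for illustration:

```python
import re

def summarize(text: str) -> str:
    """Hypothetical model call -- a stand-in for a real LLM API."""
    return text[:80].strip()

def summary_properties(source: str, summary: str) -> list[str]:
    """Property-based eval: check invariants any acceptable summary should
    satisfy, instead of comparing against a single gold reference."""
    failures = []
    if not summary.strip():
        failures.append("summary is empty")
    if len(summary) >= len(source):
        failures.append("summary is not shorter than the source")
    # Every number in the summary should appear in the source (no invented figures).
    source_numbers = set(re.findall(r"\d+(?:\.\d+)?", source))
    for n in re.findall(r"\d+(?:\.\d+)?", summary):
        if n not in source_numbers:
            failures.append(f"number {n} not found in source")
    return failures

doc = ("Mixtral 8x22B is a sparse mixture-of-experts model with 176B total "
       "parameters, about 40B of which are active for any given token.")
report = summary_properties(doc, summarize(doc))
print(report or "all properties hold")
```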
