I guess I should post something once in a while. So here's a whole chatbot in 26 lines of Python running Mixtral 8x7B real fast on one 3090. Idk, I think it's neat. 🐈
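For flavor, here's roughly what a script like that can look like with exllamav2. This is a sketch, not the exact 26 lines: the model path is hypothetical and the generator API details vary between exllamav2 versions.

```python
# Minimal exllamav2 chatbot sketch. Assumes an EXL2 quant of Mixtral
# sits at MODEL_DIR; not the exact script from the post.
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

MODEL_DIR = "/models/mixtral-8x7b-exl2"    # hypothetical local path

config = ExLlamaV2Config()
config.model_dir = MODEL_DIR
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)                # load weights across available VRAM
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)
settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8

context = ""
while True:
    context += f"[INST] {input('You: ')} [/INST]"   # Mixtral instruct format
    output = generator.generate_simple(context, settings, 400)
    print("Bot:", output[len(context):].strip())    # print only the new reply
    context = output                                 # keep full history as context
```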
🚀 TACO: a new benchmark for code generation from BAAI, with 26,443 problems (loading sketch below the list).
• 🤖 English questions & Python solutions
• 🧠 Ideal for evaluating code generation from natural language
• 📊 Train: 25,443 samples, Test: 1,000 samples
• 📚 Diverse difficulty levels
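If it's like other BAAI releases, grabbing it from the Hugging Face Hub is a one-liner. The dataset id and field name below are assumptions; check the hub page for the exact schema.

```python
# Quick look at TACO, assuming it's published as "BAAI/TACO" with a
# loading script (hence trust_remote_code).
from datasets import load_dataset

taco = load_dataset("BAAI/TACO", trust_remote_code=True)
print(taco)                        # expect ~25,443 train / 1,000 test rows
print(taco["train"][0]["question"][:500])   # English problem statement (assumed field name)
```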
I performed a successful vocabulary transplant on Qwen2-0.5B and turned it into a useful speculative-decoding draft model for Llama-3. What a time to be alive. #hashtag
huggingface.co/turboderp/Qwam…
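For the curious, here's the rough idea of a vocabulary transplant as a toy sketch (not the actual method behind Qwama): rebuild the draft model's embedding rows so every token in the target tokenizer's vocab gets the average of the donor embeddings of its sub-pieces.

```python
# Toy vocabulary-transplant illustration. A real transplant also handles
# byte-level markers and the LM head; this only shows the core remapping.
import torch

def transplant_embeddings(donor_emb: torch.Tensor,
                          donor_tokenizer,
                          target_tokenizer) -> torch.Tensor:
    """For each token string in the target vocab, encode it with the donor
    tokenizer and average the matching donor embedding rows."""
    target_vocab = target_tokenizer.get_vocab()         # token string -> id
    hidden = donor_emb.shape[1]
    new_emb = torch.empty(len(target_vocab), hidden)
    for token, tid in target_vocab.items():
        piece_ids = donor_tokenizer.encode(token, add_special_tokens=False)
        if piece_ids:
            new_emb[tid] = donor_emb[piece_ids].mean(dim=0)
        else:
            new_emb[tid] = donor_emb.mean(dim=0)        # fallback for odd tokens
    return new_emb
```

With matching vocabularies, the small model can then propose tokens for the big one to verify, which is what makes it usable as a draft model.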
1 year ago, I made TabbyAPI with turboderp as a side project.
Now, it's my most popular side project.
I wanted to break away from the bloated nature of all-in-one (AIO) local model backends and just run #exllama.
Thanks to all the contributors and testers.
github.com/theroyallab/ta…
TabbyAPI now supports vision.
Thanks to turboderp for exllamav2's updates and DocShotgun for the initial work.
Any exl2-supported vision model works, but this release focuses on Pixtral from Mistral AI.
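Since TabbyAPI speaks the OpenAI API, a vision request looks like any other multimodal chat completion. Port, key, and model name below are assumptions; check your Tabby config.

```python
# Sending an image through TabbyAPI's OpenAI-compatible endpoint,
# assuming Tabby is listening on localhost:5000.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:5000/v1", api_key="your-tabby-key")

resp = client.chat.completions.create(
    model="pixtral",  # hypothetical served model name
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this picture?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
        ],
    }],
)
print(resp.choices[0].message.content)
```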
Supply chain alert!
Don't use the ComfyUI Impact Pack! Its dependency, ultralytics, has been compromised on PyPI.
Thanks to Shinon for letting me know on Discord.
github.com/ultralytics/ul…
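A quick way to see whether you're exposed before touching anything: check what's actually installed and compare it against the list of compromised versions in the linked issue.

```python
# Print the locally installed ultralytics version, if any. Which versions
# are bad should come from the ultralytics issue linked above.
import importlib.metadata

try:
    ver = importlib.metadata.version("ultralytics")
    print(f"ultralytics {ver} is installed; compare against the issue's list")
except importlib.metadata.PackageNotFoundError:
    print("ultralytics is not installed")
```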
Seems to still be true that larger models are less sensitive to quantization. Here is Mistral-Large 123B at 1.4 bits per weight, running on one 24 GB GPU. #AI or something
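The napkin math on why that fits: at 1.4 bits per weight, the weights alone come to about 21.5 GB, which squeezes under 24 GB before cache and activation overhead.

```python
# Back-of-the-envelope: weights only, ignoring KV cache and activations.
params = 123e9                  # Mistral-Large parameter count
bits_per_weight = 1.4
weight_bytes = params * bits_per_weight / 8
print(f"{weight_bytes / 1e9:.1f} GB of weights")    # ~21.5 GB, under 24 GB
```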
TabbyAPI now supports ExLlamaV3 with automatic backend detection! 🎉
Please note that exl3 is still under active development, so mileage may vary compared to exl2.
Thanks to turboderp and all contributors for making this a reality.
We just added Tensor Parallelism to TabbyAPI!
Huge thanks to turboderp and testers who made this possible.
Now, enjoy the clip diving into the origins of Exllama.
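For anyone wondering what tensor parallelism actually does: each weight matrix is sharded across GPUs so the matmuls run concurrently. Here's a minimal toy sketch of a column split in plain PyTorch, not TabbyAPI's implementation.

```python
# Column-parallel matmul: split the weight by columns, compute the halves
# independently (one per GPU in the real thing), then concatenate.
import torch

x = torch.randn(1, 1024)                      # one input row
w = torch.randn(1024, 2048)                   # full weight matrix
w0, w1 = w.chunk(2, dim=1)                    # column split across 2 devices

# On real hardware each shard lives on its own GPU, e.g. w0.to("cuda:0").
y0 = x @ w0                                   # computed on GPU 0
y1 = x @ w1                                   # computed on GPU 1
y = torch.cat([y0, y1], dim=1)                # gather the partial outputs

assert torch.allclose(y, x @ w, atol=1e-3)    # matches the full matmul
```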
Wanna see TabbyAPI built live? Follow me on Twitch: kingbri1st