Lexi (@orenguteng_ai)'s Twitter Profile
Lexi

@orenguteng_ai

AI Innovation - (un)alignment research
huggingface.co/Orenguteng

ID: 1783705442817830913

Joined: 26-04-2024 03:51:49

55 Tweets

73 Followers

25 Following

Lexi (@orenguteng_ai):

LexiFun - New Llama-3-8B Uncensored model with a fun personality.
huggingface.co/Orenguteng/Lla…
#huggingface #llama3 #ai #ArtificialIntelligence
Lexi (@orenguteng_ai):

Censorship in an LLM dumbs down the model. This is shown by simply "uncensoring" it, without training it on any additional data or knowledge: the result beats the original Llama 3.1 8B Instruct model. huggingface.co/Orenguteng/Lla… #llm #huggingface #ArtificialIntelligence Hugging Face Meta

Lexi (@orenguteng_ai):

You are correct, this is not what o1 is doing at all. This is an interesting approach, but far from any solution. You interrupt the natural flow and whatever internal process is ongoing to produce the response. I love how people jump on every fancy word to gain traction - o1!!!!

Daniel Han (@danielhanchen):

Fixed a bug which caused all training losses to diverge for large gradient accumulation sizes.

1. First reported by Benjamin Marie, GA is supposed to be mathematically equivalent to full batch training, but losses did not match.
2. We reproduced the issue, and further investigation
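
The mismatch is easy to reproduce numerically. Below is an illustrative reconstruction of the normalization issue (hypothetical helper names, not Unsloth's actual fix): with sequences of different lengths, averaging each micro-batch's mean loss is not the same as averaging over all tokens.

```python
import torch

# Illustrative sketch of the gradient-accumulation (GA) normalization bug.
# Averaging each micro-batch's loss and then averaging the averages weights
# tokens unevenly when sequence lengths differ, so GA stops matching
# full-batch training.

def full_batch_loss(per_token_losses):
    # per_token_losses: list of 1-D tensors of per-token losses, one per sequence.
    return torch.cat(per_token_losses).mean()

def naive_ga_loss(per_token_losses, ga_steps):
    # Buggy GA: mean per micro-batch, then mean of the means.
    return sum(l.mean() for l in per_token_losses) / ga_steps

def fixed_ga_loss(per_token_losses):
    # Fixed GA: sum all token losses, divide by the total token count.
    total = sum(l.sum() for l in per_token_losses)
    n_tokens = sum(l.numel() for l in per_token_losses)
    return total / n_tokens

a = torch.tensor([1.0, 1.0, 1.0, 1.0])  # 4-token sequence
b = torch.tensor([4.0])                 # 1-token sequence
# full batch: 1.6; naive GA over 2 steps: 2.5 (diverges); fixed GA: 1.6
```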
Daniel Han (@danielhanchen):

Quantizing a model to 4bits will sometimes break models entirely! Unsloth AI now has a dynamic 4bit quant format which chooses some parameters to be in 16bit!

We find that:
1. You need to check activation and weight quantization errors. Solely relying on 1 does not work.
2.
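
One of the two error signals mentioned above is simple to sketch. This is an illustrative way to measure per-layer weight quantization error and keep the worst layers in 16-bit; the helper names are hypothetical, and the actual dynamic-quant selection also inspects activation errors as the tweet notes.

```python
import torch

def quantize_4bit(w: torch.Tensor) -> torch.Tensor:
    # Naive symmetric per-tensor 4-bit quantization (16 levels).
    scale = w.abs().max() / 7
    q = torch.clamp(torch.round(w / scale), -8, 7)
    return q * scale

def weight_quant_error(w: torch.Tensor) -> float:
    # Relative Frobenius-norm error introduced by quantizing this tensor.
    return ((quantize_4bit(w) - w).norm() / w.norm()).item()

torch.manual_seed(0)
layers = {
    "attn.q_proj": torch.randn(64, 64),    # well-behaved weights
    "mlp.down_proj": torch.randn(64, 64),  # same, but given an outlier below
}
layers["mlp.down_proj"][0, 0] = 100.0      # one outlier blows up the scale
errors = {name: weight_quant_error(w) for name, w in layers.items()}
# Keep the worst offender in 16-bit, quantize the rest to 4-bit.
keep_in_16bit = max(errors, key=errors.get)
```

The outlier-heavy layer forces a huge quantization step size, so almost all of its ordinary weights round to zero; that is why quantizing every layer uniformly can break a model.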
Unsloth AI (@unslothai):

Introducing 1.58bit DeepSeek-R1 GGUFs! 🐋

DeepSeek-R1 can now run in 1.58-bit, while being fully functional. We shrank the 671B parameter model from 720GB to just 131GB - an 80% size reduction.

Naively quantizing all layers breaks the model entirely, causing endless loops &
Unsloth AI (@unslothai):

Train your own reasoning LLM using DeepSeek's GRPO algorithm with our free notebook! You'll transform Llama 3.1 (8B) to have chain-of-thought. Unsloth makes GRPO use 80% less VRAM. Guide: docs.unsloth.ai/basics/reasoni… GitHub: github.com/unslothai/unsl… Colab: colab.research.google.com/github/unsloth…
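
The core of GRPO is compact enough to sketch: advantages come from z-scoring rewards within a group of sampled completions for the same prompt, so no separate value model is needed. A minimal illustrative sketch, not the notebook's actual training code:

```python
import torch

def grpo_advantages(rewards: torch.Tensor, eps: float = 1e-4) -> torch.Tensor:
    # rewards: (group_size,) scalar rewards for completions of ONE prompt.
    # Group-relative advantage: z-score within the group.
    return (rewards - rewards.mean()) / (rewards.std() + eps)

def grpo_policy_loss(logp_new, logp_old, advantages, clip=0.2):
    # PPO-style clipped surrogate using the group-relative advantages.
    ratio = torch.exp(logp_new - logp_old)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - clip, 1 + clip) * advantages
    return -torch.min(unclipped, clipped).mean()

rewards = torch.tensor([1.0, 0.0, 0.0, 1.0])  # e.g. correctness of 4 samples
adv = grpo_advantages(rewards)                # correct samples get adv > 0
loss = grpo_policy_loss(torch.zeros(4), torch.zeros(4), adv)
```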

Lexi (@orenguteng_ai):

Pro - o3 high NERFED today
Today, 2 things happened:

1: o3 produces worse responses, and the old GPT-4 issue from back in the day has resurfaced: code responses get replaced with comments like "insert XYZ here", and responses are shortened.
(read comment below for point 2)
Unsloth AI (@unslothai):

Today, we’re launching new algorithms that enable 10x longer context lengths & 90% less VRAM for training Reasoning Models (GRPO).

Using Unsloth, you can now train your own reasoning model with just 5GB VRAM for Qwen2.5-1.5B with no accuracy loss.

Blog: unsloth.ai/blog/grpo
Daniel Han (@danielhanchen):

Having endless repetitions with QwQ-32B? I made a guide to help debug stuff!

Repetition penalties used to counteract looping can instead cause more looping!

Try adding this to llama.cpp:
--samplers "top_k;top_p;min_p;temperature;dry;typ_p;xtc"

I also uploaded dynamic 4bit
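
The quirk is visible in the standard (CTRL-style) repetition-penalty rule, which divides positive logits and multiplies negative ones by the penalty. A sketch of that rule, not llama.cpp's actual code:

```python
import torch

def apply_repetition_penalty(logits: torch.Tensor, seen: set, penalty: float = 1.5):
    # Every already-generated token is penalized: positive logits shrink,
    # negative logits are pushed further down.
    out = logits.clone()
    for t in seen:
        out[t] = out[t] / penalty if out[t] > 0 else out[t] * penalty
    return out

logits = torch.tensor([3.0, 2.0, -1.0])
penalized = apply_repetition_penalty(logits, {0, 2})
# Token 0: 3.0 -> 2.0; token 2: -1.0 -> -1.5; token 1 untouched.
```

Once common tokens like punctuation or end-of-thought markers have appeared, they get penalized too, which can leave the model with no natural way to close a phrase; that is one way a repetition penalty ends up encouraging loops, and why reordering the sampler chain (so DRY/XTC run late) can help instead.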
Unsloth AI (@unslothai):

We made a Guide on mastering LoRA Hyperparameters, so you can learn to fine-tune LLMs correctly!

Learn to:
• Train smarter models with fewer hallucinations
• Choose optimal: learning rates, epochs, LoRA rank, alpha
• Avoid overfitting & underfitting

🔗docs.unsloth.ai/get-started/fi…
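
The rank and alpha hyperparameters map directly onto the LoRA update W' = W + (alpha/r) * B A. A minimal sketch of that update over a plain linear layer (illustrative, not the guide's Unsloth code):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 16, alpha: int = 32):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)  # frozen pretrained weight
        # Low-rank factors: only A and B are trained.
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r  # alpha/r scaling from the LoRA paper

    def forward(self, x):
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

layer = LoRALinear(nn.Linear(128, 128), r=16, alpha=32)
x = torch.randn(2, 128)
# B starts at zero, so at init the adapted layer equals the base layer.
```

Raising r adds capacity (and VRAM); raising alpha relative to r amplifies the adapter's effect, which interacts with the learning rate - the trade-offs the guide walks through.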
Lexi (@orenguteng_ai):

Since when did Meta start verifying fake AI profiles? I thought verified meant a real person or brand. Literally passport verification. #ai

Unsloth AI (@unslothai):

OpenAI gpt-oss with ultra long context is here!🚀

Introducing Unsloth Flex Attention which enables 61K context for gpt-oss bf16 training on an 80GB GPU.

Unsloth achieves 8× longer context, 50% less VRAM & 1.5× faster training vs. all implementations.

🔗docs.unsloth.ai/basics/long-co…
Unsloth AI (@unslothai):

You can now run FP8 reinforcement learning on consumer GPUs!

Try DeepSeek-R1’s FP8 GRPO at home using only a 5GB GPU.

Qwen3-1.7B fits in 5GB VRAM.
We collabed with PyTorch to make FP8 RL inference 1.4× faster.
Unsloth: 60% less VRAM, 12× longer context.

docs.unsloth.ai/new/fp8-reinfo…
Unsloth AI (@unslothai):

You can now do 500K context length fine-tuning with Unsloth!

Train gpt-oss-20b to extend its context window to 530K on 80GB VRAM & 750K+ on 192GB - no accuracy loss.

Unsloth's new algorithms + Tiled MLP = 72% less VRAM & 6x more context

Blog + Notebook: docs.unsloth.ai/new/500k-conte…
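
The "Tiled MLP" idea can be sketched as chunking the sequence dimension so the full intermediate activation never materializes at once. This is illustrative only; the real memory savings during training also require recomputing tiles in the backward pass, which this sketch omits.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def tiled_mlp(x: torch.Tensor, up: nn.Linear, down: nn.Linear, tile: int = 1024):
    # x: (seq, dim). Process `tile` rows at a time so the large
    # (seq, hidden) activation exists only one tile at a time.
    outs = [down(F.gelu(up(chunk))) for chunk in x.split(tile, dim=0)]
    return torch.cat(outs, dim=0)

torch.manual_seed(0)
up = nn.Linear(64, 256)
down = nn.Linear(256, 64)
x = torch.randn(4096, 64)
y = tiled_mlp(x, up, down, tile=512)
# Numerically identical to the untiled MLP, with peak activation
# memory proportional to `tile` instead of the full sequence length.
```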
Unsloth AI (@unslothai):

Mistral releases Ministral 3, their new reasoning and instruct models! 🔥

Ministral 3 comes in 3B, 8B, and 14B with vision support and best-in-class performance.

Run the 14B models locally with 24GB RAM.

Guide + Notebook: docs.unsloth.ai/new/ministral-3
GGUFs: huggingface.co/collections/un…
Unsloth AI (@unslothai):

You can now train LLMs 3× faster with no accuracy loss, via our new RoPE and MLP kernels.

Our Triton kernels plus smart auto packing deliver ~3× faster training & 30% less VRAM vs optimized FA3 setups.

Train Qwen3-4B 3x faster on just 3.9GB VRAM.

Blog: docs.unsloth.ai/new/3x-faster-…
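
For reference, this is the rotary position embedding (RoPE) operation such a kernel accelerates, written in plain PyTorch for readability - a sketch of the standard op, not the Triton kernel itself.

```python
import torch

def rope(x: torch.Tensor, positions: torch.Tensor, base: float = 10000.0):
    # x: (seq, dim) with even dim. Each consecutive pair (x[2i], x[2i+1])
    # is rotated by angle theta_i * position, encoding position as rotation.
    seq, dim = x.shape
    theta = base ** (-torch.arange(0, dim, 2, dtype=torch.float32) / dim)
    angles = positions[:, None].float() * theta[None, :]  # (seq, dim/2)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = torch.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

torch.manual_seed(0)
x = torch.randn(8, 16)
out = rope(x, torch.arange(8))
# Position 0 is the identity rotation, and rotations preserve vector norms.
```

Because the op is just elementwise trig on strided pairs, it is memory-bound - exactly the kind of computation a fused Triton kernel speeds up.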