labml.ai (@labmlai)'s Twitter Profile
labml.ai

@labmlai

๐Ÿ“ Annotated paper implementations nn.labml.ai

ID: 1338393738511589377

Link: http://labml.ai
Joined: 14-12-2020 08:02:04

648 Tweets

12.12K Followers

9 Following

NOTBAD AI (@notbadai)

We've been training NVIDIA Mistral-NeMo-Minitron-8B-Base for math reasoning on the GSM8K-Aug dataset, and we have a version with a 70.2% GSM8K score, up from the 58.5% CoT score reported in the LLM Pruning and Distillation paper. 👇

NOTBAD AI (@notbadai)

📢 We are excited to announce Notbad v1.0 Mistral 24B, a new reasoning model trained in math and Python coding. This model is built upon the Mistral AI Small 24B 2501 and has been further trained with reinforcement learning on math and coding.

NOTBAD AI (@notbadai)

We're open-sourcing a math reasoning dataset with 270k samples, generated by our RL-based self-improved Mistral 24B 2501 model and used to train Notbad v1.0 Mistral 24B. Available on Hugging Face: huggingface.co/datasets/notba…
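
For readers who want to pull the release, a minimal sketch using the Hugging Face `datasets` library; the repo id below is a placeholder (the link in the tweet is truncated), and the split and field names depend on the actual dataset schema.

```python
# Hypothetical sketch: load the released math reasoning dataset with the
# Hugging Face `datasets` library. The repo id and column names below are
# placeholders, not taken from the (truncated) link in the tweet.
from datasets import load_dataset

dataset = load_dataset("notbadai/math-reasoning-270k")  # placeholder repo id

# Inspect the first sample; field names depend on the actual dataset schema.
print(dataset["train"][0])
```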

vpj (@vpj)

Uploaded the dataset of 270k math reasoning samples that we used to finetune Notbad v1.0 Mistral 24B (MATH-500 = 77.52%, GSM8K Platinum = 97.55%) to Hugging Face (link in reply). Follow NOTBAD AI for updates.

NOTBAD AI (@notbadai)

We just released a Python coding reasoning dataset with 200k samples on Hugging Face. This was generated by our RL-based self-improved Mistral 24B 2501 model and was used to train Notbad v1.0 Mistral 24B. 🤗 Links in replies 👇

NOTBAD AI (@notbadai)

We are releasing an updated reasoning model with a much better IFEval score (77.9%) than our previous model's (only 51.4%). 👇 Links to try the model and to download weights below

vpj (@vpj)

The new training also improved GPQA from 64.2% to 67.3% and MMLU Pro from 64.2% to 67.3%. This model was also trained with the same reasoning datasets we used to train the v1.0 model. We mixed more general instruction data with answers sampled from the

vpj (@vpj)

The following script stalls and times out on B200 x 8. Seems like we are having problems with NCCL. Anyone else experiencing this? PyTorch

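The script from the tweet isn't shown, but a minimal torch.distributed sanity check like the one below is one way to see whether NCCL collectives complete at all on an 8-GPU node; the launch command and timeout value are illustrative.

```python
# nccl_check.py - minimal NCCL sanity check, not the script from the tweet.
# Run with: torchrun --nproc_per_node=8 nccl_check.py
import os
from datetime import timedelta

import torch
import torch.distributed as dist

def main():
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)
    # Short timeout so a stalled collective fails fast instead of hanging.
    dist.init_process_group(backend="nccl", timeout=timedelta(seconds=120))

    x = torch.ones(1 << 20, device="cuda")
    dist.all_reduce(x)  # default op is SUM, so x[0] should equal world size
    torch.cuda.synchronize()

    if dist.get_rank() == 0:
        print("all_reduce ok:", x[0].item(), "== world size", dist.get_world_size())
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Running it with NCCL_DEBUG=INFO set in the environment usually gives more detail about where a collective hangs.
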
vpj (@vpj)

Wrote an annotated Triton implementation of Flash Attention 2. (Links in reply) This is based on the flash attention implementation by the Triton team. Changed it to support GQA and cleaned up a little bit. Check it out to read the code for forward and backward passes along

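For readers unfamiliar with GQA, here is a plain PyTorch reference of the head-grouping idea (several query heads sharing one key/value head). This is not the Triton kernel from the post; shapes and names are illustrative.

```python
# Plain PyTorch reference for grouped-query attention (GQA), where several
# query heads share one key/value head. Illustrative only; the annotated
# implementation in the post is a Triton kernel, not this.
import torch
import torch.nn.functional as F

def gqa_attention(q, k, v):
    """q: (batch, n_q_heads, seq, d); k, v: (batch, n_kv_heads, seq, d)."""
    n_q_heads, n_kv_heads = q.shape[1], k.shape[1]
    group_size = n_q_heads // n_kv_heads  # query heads per KV head

    # Expand KV heads so each group of query heads sees its shared KV head.
    k = k.repeat_interleave(group_size, dim=1)
    v = v.repeat_interleave(group_size, dim=1)

    scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)
    return F.softmax(scores, dim=-1) @ v

# Example: 8 query heads grouped over 2 KV heads.
q = torch.randn(1, 8, 16, 64)
k = torch.randn(1, 2, 16, 64)
v = torch.randn(1, 2, 16, 64)
out = gqa_attention(q, k, v)  # (1, 8, 16, 64)
```
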
vpj (@vpj)

Added the JAX transformer model to the annotated paper implementations project. x.com/vpj/status/143… Link 👇

vpj (@vpj)

GEPA appears to be an effective method for enhancing LLM performance, requiring significantly fewer rollouts than reinforcement learning (RL). It maintains a pool of system prompts and uses an LLM to improve them by reflecting on the generated answers and the scores/feedback.

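A rough sketch, based only on the description above, of what such a reflective prompt-optimization loop could look like; `llm` and `evaluate` are hypothetical stand-ins for a model call and a task evaluator, and this is not GEPA's actual API.

```python
# Rough sketch of the reflective prompt-optimization loop described above.
# `llm` and `evaluate` are hypothetical stand-ins; this is not GEPA's API.
import random

def optimize_prompts(llm, evaluate, tasks, seed_prompt, n_iterations=20):
    # Pool of candidate system prompts with their measured scores.
    pool = [(seed_prompt, evaluate(llm, seed_prompt, tasks)[0])]

    for _ in range(n_iterations):
        prompt, _ = random.choice(pool)
        score, feedback = evaluate(llm, prompt, tasks)  # rollouts + scores/feedback

        # Ask an LLM to reflect on the answers and feedback, and propose a
        # revised system prompt.
        revised = llm(
            "Here is a system prompt, sample answers it produced, and feedback:\n"
            f"PROMPT:\n{prompt}\n\nFEEDBACK:\n{feedback}\n\n"
            "Rewrite the system prompt to fix the weaknesses you see."
        )
        revised_score, _ = evaluate(llm, revised, tasks)

        # Keep the revision only if it scores at least as well.
        if revised_score >= score:
            pool.append((revised, revised_score))

    # Return the best prompt found so far.
    return max(pool, key=lambda item: item[1])[0]
```
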
NOTBAD AI (@notbadai)

We've open-sourced our internal AI coding IDE. We built this IDE to help with coding and to experiment with custom AI workflows. It's based on a flexible extension system, making it easy to develop, test, and tweak new ideas quickly. Each extension is a Python script that runs

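Purely to illustrate the idea of "an extension is a Python script", here is a generic shape such a script might take. This is not the actual extension API of the released IDE; every name below is hypothetical.

```python
# Generic illustration of "an extension as a Python script". This is NOT the
# actual extension API of the released IDE; every name here is hypothetical.

def run(context):
    """Hypothetical entry point called by the IDE with editor context."""
    selection = context.get("selection", "")
    prompt = f"Explain this code:\n\n{selection}"

    # A hypothetical hook the IDE could expose for calling the configured model.
    reply = context["chat"](prompt)

    # Return text for the IDE to display alongside the editor.
    return {"output": reply}
```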