Manu Romero (@mrm8488)'s Twitter Profile
Manu Romero

@mrm8488

CSO/Co-founder @maisaAI_. Head Contrib/ Ambassador🤗 @huggingface. Research 🌸@bigsciencew/@BigCodeProject | ex @narrativaAI

ID: 237973737

Link: https://linktr.ee/mrm8488 | Joined: 14-01-2011 02:19:04

45.45K Tweets

20.2K Followers

2.2K Following

Lewis Tunstall (@_lewtun)

This model looks really good, and the post-training recipe for SFT models combines a bunch of cool tricks that the community has developed over the past year:
- Filter a large SFT corpus for quality x difficulty (similar to Llama3; see the sketch below)
- Use the Spectrum method from Cognitive Computations
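
A minimal sketch of the quality x difficulty filter, assuming the corpus already carries judge scores; the file name, `quality_score`/`difficulty_score` columns, and thresholds are illustrative, not from the tweet:

```python
from datasets import load_dataset

# Local corpus path and score columns are assumptions for illustration.
ds = load_dataset("json", data_files="sft_corpus.jsonl", split="train")

def keep(example, q_min=4, d_min=3):
    # Keep samples that are BOTH high quality and non-trivial,
    # i.e. filter on quality x difficulty rather than quality alone.
    return example["quality_score"] >= q_min and example["difficulty_score"] >= d_min

filtered = ds.filter(keep)
print(f"kept {len(filtered)} of {len(ds)} samples")
```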

Philipp Schmid (@_philschmid)

Phi goes MoE! Microsoft just released Phi-3.5-MoE, a 42B-parameter MoE built upon datasets used for Phi-3. Phi-3.5-MoE outperforms bigger models in reasoning capability and is only behind GPT-4o-mini. 👀

TL;DR
🧮 42B parameters with 6.6B activated during generation
👨‍🏫 16 experts
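
A minimal loading sketch, assuming the checkpoint is published on the Hugging Face Hub as microsoft/Phi-3.5-MoE-instruct and that enough GPU memory is available for the 42B (6.6B active) weights:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3.5-MoE-instruct"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,  # custom MoE modeling code may ship with the repo
)

messages = [{"role": "user", "content": "Explain mixture-of-experts in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```
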
Loubna Ben Allal (@loubnabenallal1)

Small talk is more difficult than expected for chat models, so we built a dataset to fix this!

When using standard SFT datasets like Magpie or WebInstruct, models often still fail spectacularly when you just greet them.
Manu Romero (@mrm8488)

Continuous self-critique/review in agents (LLMs) is akin to a `while True` loop. It's as if they always have something to say (or improve), which can end in very weird results.
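
A toy sketch of the failure mode and the obvious guard: `generate`, `critique`, and `accept` are hypothetical stand-ins for LLM calls, not any specific agent framework.

```python
MAX_ROUNDS = 3  # hard cap: "critique until satisfied" with no cap is effectively `while True`

def refine(task, generate, critique, accept):
    """Generate once, then revise under critique at most MAX_ROUNDS times."""
    draft = generate(task, feedback=None)
    for _ in range(MAX_ROUNDS):
        feedback = critique(task, draft)
        if accept(feedback):  # the critic has nothing substantial left to say
            break
        draft = generate(task, feedback=feedback)  # revise with the critique
    return draft
```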

Sayak Paul (@risingsayak)

Service LLMs greatly reduce the barrier to entry for many applications that want to build cool things. However, they require internet access and raise privacy concerns.

We present LlamaDuo, a simple pipeline that mimics a service LLM on SPECIFIC TASKS through a small LM in crisis
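
A sketch of the general idea only, not the paper's actual pipeline: have the service LLM label task-specific prompts once, store the pairs, then fine-tune a small local model on them so the task keeps working offline. The client library, model name, and file names here are illustrative assumptions.

```python
import json
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment
task_prompts = ["Summarize: <document 1>", "Summarize: <document 2>"]  # your specific task

with open("distill_pairs.jsonl", "w") as f:
    for prompt in task_prompts:
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # illustrative service LLM
            messages=[{"role": "user", "content": prompt}],
        )
        pair = {"prompt": prompt, "completion": response.choices[0].message.content}
        f.write(json.dumps(pair) + "\n")

# The resulting JSONL can then be used to SFT a small local LM (e.g. with trl),
# so the task keeps working without internet access to the service LLM.
```
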
Manu Romero (@mrm8488)

Forgot to make it public: huggingface.co/mrm8488/multil…, a Matryoshka embeddings model fine-tuned for better performance on Spanish texts.
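
A usage sketch with sentence-transformers; the repo id below is a placeholder because the link in the tweet is truncated. `truncate_dim` is the point of Matryoshka training: keep only the first N dimensions with little quality loss.

```python
from sentence_transformers import SentenceTransformer

# Placeholder repo id: substitute the real model from the author's profile.
model = SentenceTransformer("mrm8488/<matryoshka-model>", truncate_dim=256)

sentences = ["¿Dónde está la biblioteca?", "La biblioteca está cerca de la plaza."]
embeddings = model.encode(sentences, normalize_embeddings=True)
print(embeddings.shape)  # (2, 256) instead of the model's full dimensionality
```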

Interconnects (@interconnectsai)

OpenAI’s Strawberry, LM self-talk, inference scaling laws, and spending more on inference. Whether or not scaling works, we should spend more on inference. interconnects.ai/p/openai-straw…

Manu Romero (@mrm8488)

In these times when we know data quality is vital to creating better LLMs, I've fine-tuned a set of SoTA embedding models on the WebInstruct dataset to help with this kind of task. Hugging Face collection: huggingface.co/collections/mr…
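
A rough sketch of how such a fine-tune can be set up with the sentence-transformers v3 trainer; the base model, file name, column names, and hyperparameters are assumptions, since the tweet doesn't state the author's exact recipe.

```python
from datasets import load_dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss

# Base model, file name, and column names are assumptions for illustration.
model = SentenceTransformer("BAAI/bge-base-en-v1.5")

# A WebInstruct-style (question, answer) pair file; answers to other questions
# in the batch act as negatives for MultipleNegativesRankingLoss.
dataset = load_dataset("json", data_files="webinstruct_pairs.jsonl", split="train")
dataset = dataset.select_columns(["question", "answer"]).rename_columns(
    {"question": "anchor", "answer": "positive"}
)

args = SentenceTransformerTrainingArguments(
    output_dir="bge-base-webinstruct",
    num_train_epochs=1,
    per_device_train_batch_size=32,
    learning_rate=2e-5,
    warmup_ratio=0.1,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    loss=MultipleNegativesRankingLoss(model),
)
trainer.train()
model.save("bge-base-webinstruct/final")
```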