Manu Romero (@mrm8488) Twitter Tweets • TwiCopy

Manu Romero

@mrm8488


CSO/Co-founder @maisaAI_. Head Contrib/ Ambassador🤗 @huggingface. Research 🌸@bigsciencew/@BigCodeProject | ex @narrativaAI

ID: 237973737

Link: https://linktr.ee/mrm8488 · Joined: 14-01-2011 02:19:04

45,45K Tweets

20,20K Followers

2,2K Following

Manu Romero

@mrm8488

7 months ago

Senior in Python and MLE projects? Ping me, we are hiring!

Likes: 13 · Replies: 0 · Reposts: 7

S.E. Digitalización e Inteligencia Artificial

@sediagob

7 months ago

Want to learn about practical applications of #AI in companies and public administration? 📢 Enroll in the summer course organized by S.E. Digitalización e Inteligencia Artificial at the Universidad Menéndez Pelayo (UIMP)! 💡 1.5 ECTS 📅 July 16-17-18. Limited places 👉 uimp.es/agenda-link.ht…

Likes: 11 · Replies: 0 · Reposts: 6

𝚐𝔪𝟾𝚡𝚡𝟾

@gm8xx8

7 months ago

"Overclocking LLM Reasoning: Monitoring and Controlling Thinking Path Lengths in LLMs". Introduces a method to monitor and control structured reasoning in LLMs by extracting and manipulating a “thinking progress vector” (TPV) from hidden states. 𝖯𝖠𝖯𝖤𝖱 𝖨𝖭 𝖠𝖫𝖳

Likes: 124 · Replies: 2 · Reposts: 22

Manu Romero

@mrm8488

7 months ago

LLMs Context Ops... That's the field, the topic, the science

Likes: 5 · Replies: 0 · Reposts: 1

Manu Romero

@mrm8488

7 months ago

Just saying to the LLM in the system prompt "...you must reason in <lang>" doesn't seem to work well when lang!=English. Luckily, a few RL steps (tested using GRPO) can help a lot.

Likes: 3 · Replies: 2 · Reposts: 1
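
A minimal sketch of the kind of reward signal that could back the observation above, assuming the reasoning text has already been extracted from the completion. The stopword lists and the `language_reward` name are illustrative assumptions, not the setup used in the tweet; a real run would more likely rely on a language-identification model and pass the score to a GRPO trainer as one reward among several.

```python
import re

# Tiny illustrative stopword lists (assumption: a real setup would use a
# proper language-identification model instead of these hand-picked words).
STOPWORDS = {
    "es": {"el", "la", "de", "que", "y", "en", "los", "es", "por", "un", "una", "para"},
    "en": {"the", "of", "and", "to", "in", "is", "that", "for", "a", "it", "on", "with"},
}


def language_reward(reasoning: str, target: str = "es") -> float:
    """Score in [0, 1]: share of stopword hits that belong to the target language."""
    # Split into alphabetic words (Unicode letters only, no digits or underscores).
    words = re.findall(r"[^\W\d_]+", reasoning.lower())
    target_hits = sum(w in STOPWORDS[target] for w in words)
    other_hits = sum(
        w in vocab
        for lang, vocab in STOPWORDS.items()
        if lang != target
        for w in words
    )
    total = target_hits + other_hits
    # No recognizable stopwords at all: give no reward rather than guess.
    return target_hits / total if total else 0.0
```

Because GRPO compares completions within a group, even a coarse score like this can separate traces that drift back into English from those that stay in the requested language.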

Tanishq Mathew Abraham, Ph.D.

@iscienceluvr

6 months ago

BREAD: Branched Rollouts from Expert Anchors Bridge SFT & RL for Reasoning "we introduce BREAD: a GRPO variant that unifies the SFT and RL stages via partial expert guidance and branched rollouts. When self-generated traces fail, BREAD adaptively inserts short expert

Likes: 138 · Replies: 4 · Reposts: 20

Manu Romero

@mrm8488

6 months ago

Kind reminder! We're still looking for senior Python engineers with 2–3 years of experience working with GenAI.

Likes: 42 · Replies: 5 · Reposts: 12

Manu Romero

@mrm8488

6 months ago

My passion for how Operating Systems work helped me realize that a limited context window isn't a problem—as long as you keep the necessary information in context at each step. This insight was also key in developing Maisa’s KPU

Likes: 9 · Replies: 2 · Reposts: 2
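
The operating-system analogy above can be sketched as a small pager that keeps only a bounded working set of notes in the prompt. This is an illustration under stated assumptions, not Maisa's KPU: the `ContextPager` class, the word-count budget, and the least-recently-used eviction policy are all hypothetical choices made for the example.

```python
from collections import OrderedDict


class ContextPager:
    """Keeps at most `budget` words resident in the prompt, evicting LRU notes."""

    def __init__(self, budget: int):
        self.budget = budget
        self.store = {}                # "disk": every note ever saved
        self.resident = OrderedDict()  # "RAM": notes currently in context

    def save(self, key: str, text: str) -> None:
        """Persist a note outside the context window."""
        self.store[key] = text

    def touch(self, key: str) -> None:
        """Page a note into context; evict least recently used notes if over budget."""
        if key in self.resident:
            self.resident.move_to_end(key)
        else:
            self.resident[key] = self.store[key]
        # Always keep the note just requested, even if it alone exceeds the budget.
        while self._used() > self.budget and len(self.resident) > 1:
            self.resident.popitem(last=False)

    def _used(self) -> int:
        # Word count stands in for a real tokenizer (assumption).
        return sum(len(text.split()) for text in self.resident.values())

    def prompt(self) -> str:
        """Text that would be placed in the model's context at this step."""
        return "\n".join(self.resident.values())
```

At each step the caller touches only the notes that step needs, so the prompt stays within budget while everything else remains recoverable from the store.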

Guilherme Penedo

@gui_penedo

6 months ago

We have finally released the 📝paper for 🥂FineWeb2, our large multilingual pre-training dataset. Along with general (and exhaustive) multilingual work, we introduce a concept that can also improve English performance: deduplication-based upsampling, which we call rehydration.

Likes: 316 · Replies: 7 · Reposts: 63

Manu Romero

@mrm8488

6 months ago

When using LLMs' structured outputs (JSON mode) feature, you may notice that the values in the resulting JSON are not very long. Fortunately, if you are using an open-source LLM, reinforcement learning (RL) can help you there!

Likes: 6 · Replies: 2 · Reposts: 0
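
One way the RL idea above could be expressed is a reward that favors valid JSON whose string values approach a target length. This is a hedged sketch: the `json_length_reward` name, the 200-character target, and the linear scoring are assumptions for illustration, not the author's actual reward.

```python
import json


def _string_values(node):
    """Yield every string value found anywhere in a parsed JSON structure."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from _string_values(value)
    elif isinstance(node, list):
        for value in node:
            yield from _string_values(value)


def json_length_reward(completion: str, target_chars: int = 200) -> float:
    """Score in [0, 1]: 0 for invalid JSON, else mean length ratio of string values."""
    try:
        parsed = json.loads(completion)
    except json.JSONDecodeError:
        return 0.0  # Invalid output must never be rewarded for being long.
    values = list(_string_values(parsed))
    if not values:
        return 0.0
    # Cap each value at 1.0 so the model is not pushed toward unbounded length.
    return sum(min(len(v) / target_chars, 1.0) for v in values) / len(values)
```

Capping the per-value score keeps the incentive bounded, which matters because an uncapped length reward tends to encourage padding.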

Manu Romero

@mrm8488

6 months ago

AI researchers are the new rock stars 👏

Likes: 5 · Replies: 0 · Reposts: 0