Ramon Astudillo (@ramonastudill12)'s Twitter Profile
Ramon Astudillo

@ramonastudill12

Principal RS at IBM Research AI. Speech, Formal/Natural Language Processing. Currently LLM post-training, structured SDG/RL. Opinions my own and non stationary

ID: 1117927926152990720

Link: http://ramon.astudillo.com
Joined: 15-04-2019 23:09:22

2.2K Tweets

555 Followers

409 Following

Ramon Astudillo (@ramonastudill12)

The 15th edition of the Lisbon Machine Learning School (LxMLS 2025) is looking for its monitor team. As always, alumni are especially welcome. Apply before the month ends! bgmartins.github.io/lxmls-website-…

Rulin Shao (@rulinshao)

🎉Our Spurious Rewards is available on arXiv! We added experiments on
- More prompts/steps/models/analysis...
- Spurious Prompts!
Surprisingly, we obtained 19.4% gains when replacing prompts with LaTeX placeholder text (\lipsum) 😶‍🌫️

Check out our 2nd blog: tinyurl.com/spurious-prompt

kache (@yacinemtb)

Lex Fridman you know what i learned? 1m qps is actually easier than 100k qps. at my last big tech wagie job, my last eng lead explained why the scale swallows up the errors the scale. swallows up the tails but the tails! the tails are people like me we are weird. we use the app, weird

Ramon Astudillo (@ramonastudill12)

All this "perplexity went down but the benchmark did not go up" talk, as if it were fully unexpected. It's "transfer learning", right? There should be a limit to the transfer, i.e. the objectives are not the same.

Shinji Watanabe (@shinjiw_at_cmu)

Heading to #Interspeech2025! I’ll be involved in a tutorial, regular and special sessions (22 papers) & MLC-SLM workshop — and excited to chat about our new project: ESPnet3, CHiME-9, Urgent3, LARC, and YODAS++. If you’re interested, come say hi — let’s collaborate! 🚀

Ramon Astudillo (@ramonastudill12)

If you think about when the first rumours of Q* started, it was about a year before o1-preview ... Shouldn't we have heard something about the next thing already? Not the best of signs?

Aashka Trivedi (@aashkaa_)

Granite Embedding R2 Models are here! 🔥 8k context 🏆 Top performance on BEIR, MTEB, COIR, MLDR, MT-RAG, Table IR, LongEmbed ⚡Fast and lightweight 🎯 Apache 2.0 license (trained on commercial friendly data) Try them now on Hugging Face 👉 hf.co/ibm-granite

Ramon Astudillo (@ramonastudill12)

Peer review is at risk of disappearing, mainly for reasons unrelated to the rise of bureaucrats to power in the orgs that coordinate/control it, but this is definitely making the situation far worse.

Ramon Astudillo (@ramonastudill12)

Is it just me, or has there been an update to ChatGPT's voice, and now it sounds like it has difficulty speaking and breathing at the same time?

Ramon Astudillo (@ramonastudill12)

IMO the problem with telling young people "work very hard" is that, at best, it is a necessary but not sufficient condition. So much more is needed, like knowing what you want, how to get there, what your limits are, people skills, etc. "Work hard" alone is a recipe for burnout.

Ramon Astudillo (@ramonastudill12)

Whether domain adaptation provides a moat against LLM providers is a million (billion) $ question. Watching Cursor's trajectory is probably one of the best ways to answer it.