Gal Yona (@_galyo)'s Twitter Profile
Gal Yona

@_galyo

Research scientist @googleai, previously CS PhD @weizmannscience

ID: 86293812

Website: https://galyona.github.io/
Joined: 30-10-2009 11:45:44

284 Tweets

474 Followers

486 Following

Mor Geva (@megamor2)

Excited to attend EMNLP 2025 in Miami next week 🤩 DM me if you'd like to grab a coffee and chat about interpretability, knowledge, or reasoning in LLMs! Our group/collabs will be presenting a bunch of cool works, come check them out! 🧵

Yoav Wald (@wald_yoav)

What prompt generated the image on the right? Come find out today at our tutorial on OOD generalization: Shortcuts, Spuriousness, and Stability @Maggiemakar aahlad puli Panel: Elan Rosenfeld Aditi Raghunathan Danica Sutherland

Chip Huyen (@chipro)

During the process of writing AI Engineering, I went through so many papers, case studies, blog posts, repos, tools, etc. This repo contains ~100 resources that really helped me understand various aspects of building with foundation models. github.com/chiphuyen/aie-… Here are the

François Chollet (@fchollet)

Today OpenAI announced o3, its next-gen reasoning model. We've worked with OpenAI to test it on ARC-AGI, and we believe it represents a significant breakthrough in getting AI to adapt to novel tasks. It scores 75.7% on the semi-private eval in low-compute mode (for $20 per task

Rafael Rafailov @ NeurIPS (@rm_rafailov)

We have a new position paper on "inference time compute" and what we have been working on in the last few months! We present some theory on why it is necessary, how does it work, why we need it and what does it mean for "super" intelligence.

Sasha Rush (@srush_nlp)

Simons Institute Workshop: "Future of LLMs and Transformers": 21 talks Monday - Friday next week. simons.berkeley.edu/workshops/futu…

Zorik Gekhman (@zorikgekhman)

🚨 It's often claimed that LLMs know more facts than they show in their outputs, but what does this actually mean, and how can we measure this “hidden knowledge”? In our new paper, we clearly define this concept and design controlled experiments to test it. 1/🧵

Stanford NLP Group (@stanfordnlp)

.Percy Liang & Tatsunori Hashimoto start the 2nd offering of CS336 Language Modeling from Scratch at Stanford NLP Group. The class philosophy is Understanding by Building. We need many people who understand the detailed design of modern LLMs, not just a few at “frontier” 🤭 AI companies.

Gal Yona (@_galyo)

This was a great 30-minute conceptual read. It neatly ties together classic RL, LLMs of the past few years, and where agents are headed next. Honestly, I find the future of agents interacting w the world with less human mediation ("experiencing") both exciting and terrifying

Jeffrey Emanuel (@doodlestein)

Sam Altman the single biggest thing you could do for safety/alignment is to put a massive emphasis in the RL feedback loop on basic HONESTY and never misleading, tricking, overstating, exaggerating, etc. It should be like touching a hot stove to the model. Just like how you raise kids

(((ل()(ل() 'yoav))))👾 (@yoavgo)

we write too much. more than we can read, and many small incremental things. i think there should be some mechanism to restrict paper submissions and acceptances per person per year, to force people to prioritize their best work, and invest more in it.

Josh Breiner (@joshbreiner)

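The state of the police: it used ChatGPT, which invented a new law for it, in order to win a proceeding to confiscate a cellphone at a hearing in the Hadera Magistrate's Court. The judge was stunned when this came to light: "I have been a judge for 30 years and thought I had seen everything. Apparently I was wrong."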

Gal Yona (@_galyo)

new work by Gabrielle Kaili-May Liu shows that LLMs still struggle to faithfully express their uncertainty in words, but cool to see that meta cognitive inspired prompting can go a long way. looking forward to seeing more positive results on this fundamental problem!