Yusuf Kocyigit (@mykocyigit)'s Twitter Profile
Yusuf Kocyigit

@mykocyigit

CS PhD at Boston University. NLP, Evaluation. Previously @google, @AIatMeta and @AmazonScience

ID: 1816639014696550400

Link: http://yusufkocyigit.me

26-07-2024 00:57:58

25 Tweets

72 Followers

126 Following

Aaditya Singh (@aaditya6284)

Super excited to have this out! Was great to work on this with Yusuf Kocyigit supervised by Dieuwke Hupkes and figure out the best post-hoc methods for identifying eval contamination + measure its effects on performance. A short 🧵

Ekin Akyürek (@akyurekekin)

Why do we treat train and test times so differently? Why is one “training” and the other “in-context learning”? Just take a few gradients during test-time — a simple way to increase test time compute — and get a SoTA in ARC public validation set 61%=avg. human score! ARC Prize

ahmet salih gundogdu (@asalihgundogdu)

I am looking for Machine Learning Intern for the Spring or Summer terms at the AI Institute for scaling our robot policy learning stack. Apply here and DM me! jobs.lever.co/bostondynamics…

Jurik Juraska (@jurikjuraska)

🌐 Meet MetricX-24, our SOTA machine translation evaluation metric and a successor to the successful MetricX-23. 🚀 Now open-source in PyTorch/Transformers! 🎉 Ready to take this top performer in the WMT24 Metrics Shared Task for a spin? 🔗 Code: github.com/google-researc…

Jacob Andreas (@jacobandreas)

Ekin Akyürek (@akyurekekin) builds tools for understanding & controlling algorithms that underlie reasoning in language models. You’ve likely seen his work on in-context learning; I'm just as excited about past work on linguistic generalization & future work on test-time scaling.

Najoung Kim 🫠 (@najoungkim)

Pulling this opportunity on research agent evaluation up one more time! The official title of the position will be "Senior research technician". Feel free to email either Sebastian Schuster or me directly if you have any questions. Link for more detailed info and where to apply in 🧵

Eleftheria Briakou (@ebriakou)

🗺️ Are we making our #LLMs multilingual, or anglocentric? Much work brings languages closer to English, but that comes at the cost of crucial #cultural nuance. HyoJung Han tackles this trade-off with surgical steering, adapting LLMs to cultural contexts at inference time.

Google DeepMind (@googledeepmind)

This is Gemini 3: our most intelligent model that helps you learn, build and plan anything. It comes with state-of-the-art reasoning capabilities, world-leading multimodal understanding, and enables new agentic coding experiences. 🧵

Jeff Dean (@jeffdean)

I’m really excited about our release of Gemini 3 today, the result of hard work by many, many people in the Gemini team and all across Google! 🎊 We’ve built many exciting new product experiences with it, as you’ll see today and in the coming weeks and months. You can find it

Najoung Kim 🫠 (@najoungkim)

My lab at BU is recruiting PhD students and possibly a postdoc this year! We study humans & machines, centered around topics like meaning, generalization, evaluation methods and design, and the nature of computation and representation that underlie language and cognition. 🫴🫴
