Tolga Bolukbasi (@tolgab0) 's Twitter Profile
Tolga Bolukbasi

@tolgab0

AI research/Gemini pretraining @GoogleDeepmind, PhD, opinions my own.

ID: 2886508144

Website: http://www.tolgabolukbasi.com · Joined: 21-11-2014 03:37:31

92 Tweets

296 Followers

239 Following

Tim Rocktäschel (@_rockt) 's Twitter Profile Photo

I am really excited to reveal what Google DeepMind's Open Endedness Team has been up to 🚀. We introduce Genie 🧞, a foundation world model trained exclusively from Internet videos that can generate an endless variety of action-controllable 2D worlds given image prompts.

Kelvin Guu (@kelvin_guu) 's Twitter Profile Photo

Great new work from our team and colleagues at Google DeepMind! On the Massive Text Embedding Benchmark (MTEB), Gecko is the strongest model to fit under 768-dim. Try it on Google Cloud. Use it for RAG, retrieval, vector databases, etc.
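
Roughly what "use it for RAG / retrieval" looks like in practice: embed documents and the query with the model, then rank documents by cosine similarity. A minimal sketch; `embed` here is a hypothetical placeholder rather than the Gecko API, and only the 768-dim size comes from the tweet.

```python
# Embedding-based retrieval sketch. `embed` is a stand-in for whatever
# text-embedding endpoint you actually call (e.g., Gecko on Google Cloud);
# here it returns deterministic random unit vectors just so the example runs.
import numpy as np

DIM = 768  # Gecko-sized embeddings, per the tweet

def embed(text: str) -> np.ndarray:
    rng = np.random.default_rng(abs(hash(text)) % (2**32))  # placeholder, NOT a real embedding
    v = rng.standard_normal(DIM)
    return v / np.linalg.norm(v)

docs = [
    "Gecko is a compact text embedding model.",
    "MTEB benchmarks embedding models across many tasks.",
    "Vector databases store embeddings for fast similarity search.",
]
doc_vecs = np.stack([embed(d) for d in docs])

query_vec = embed("Which model should I use for a vector database?")
scores = doc_vecs @ query_vec          # cosine similarity (vectors are unit-norm)
best = docs[int(np.argmax(scores))]    # retrieved passage to feed into the RAG prompt
print(best)
```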

Andrej Karpathy (@karpathy) 's Twitter Profile Photo

Nice new read on tokenization! You've heard about the SolidGoldMagikarp token, which breaks GPT-2 because it was present in the training set of the tokenizer, but not the LLM later. This paper digs in with a lot more depth and detail, on a lot more models, discovering a less
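
For a concrete feel of the failure mode, a hedged sketch using the Hugging Face GPT-2 tokenizer (an assumption, not the paper's tooling): if a string encodes to a single token, the tokenizer learned it from its own training data, even if the LLM later saw that token rarely or never.

```python
# Inspect how strings tokenize under GPT-2's BPE vocabulary.
# " SolidGoldMagikarp" is reported to map to a single rare token.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")

for text in [" SolidGoldMagikarp", " hello world"]:
    ids = tok.encode(text)
    pieces = tok.convert_ids_to_tokens(ids)
    print(f"{text!r} -> {len(ids)} token(s): {pieces}")
```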

Tolga Bolukbasi (@tolgab0) 's Twitter Profile Photo

It was great to work with Minsuk, and I'm excited to see this released. Looking at individual model outputs this way helps one see which examples/tasks are truly wins across model versions and which ones are just due to randomness of generation or raters.

Jeff Dean (@jeffdean) 's Twitter Profile Photo

We have an experimental updated version of Gemini 1.5 Pro that is #1 on the LMSYS Org Chatbot Arena. This model is a significant improvement over earlier versions of Gemini 1.5 Pro (it cracks into 1300+ Elo score territory). I'm really proud of the whole team of people that

Tolga Bolukbasi (@tolgab0) 's Twitter Profile Photo

I have been thinking about this since ChatGPT came out. Using RLHF never fully made sense to me given how restricted it is compared to regular RL. There should be a way simpler non-exploring method to distill RM knowledge into the main model.
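
One non-exploring recipe in that spirit (my illustration, not the tweet's proposal) is best-of-N distillation: sample candidates from the current model, score them with the reward model, and fine-tune only on the top-scoring completions. A sketch with hypothetical `sample`, `reward_model`, and `finetune` stand-ins:

```python
# Best-of-N (rejection sampling) distillation sketch: push reward-model
# preferences into the policy without on-policy exploration. `sample`,
# `reward_model`, and `finetune` are hypothetical stand-ins for your actual
# generation, RM scoring, and supervised fine-tuning code.
from typing import Callable, List, Tuple

def best_of_n_dataset(
    prompts: List[str],
    sample: Callable[[str, int], List[str]],    # prompt, n -> n candidate completions
    reward_model: Callable[[str, str], float],  # prompt, completion -> scalar reward
    n: int = 8,
) -> List[Tuple[str, str]]:
    data = []
    for p in prompts:
        candidates = sample(p, n)
        best = max(candidates, key=lambda c: reward_model(p, c))
        data.append((p, best))  # keep only the RM-preferred completion
    return data

# finetune(model, best_of_n_dataset(prompts, sample, reward_model))
# would then be ordinary supervised fine-tuning on the selected pairs.
```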

Jeff Dean (@jeffdean) 's Twitter Profile Photo

Welcome, AlphaChip! Today, we are sharing some exciting updates on our work published in Nature in 2021 on using reinforcement learning for ASIC chip floorplanning and layout. We’re also naming this work AlphaChip. Since we first published this work, our use of this approach

Andrew Ilyas (@andrew_ilyas) 's Twitter Profile Photo

Machine unlearning ("removing" training data from a trained ML model) is a hard, important problem. Datamodel Matching (DMM): a new unlearning paradigm with strong empirical performance! w/ Kristian Georgiev, Roy Rinberg, Sam Park, Shivam Garg, Aleksander Madry, Seth Neel (1/4)

Tolga Bolukbasi (@tolgab0) 's Twitter Profile Photo

I will be at the ATTRIB workshop tomorrow (attrib-workshop.cc). Stop by if you’d like to chat with me and connect with other great researchers in this area.

Mike Morton (@morteymike) 's Twitter Profile Photo

andi (twocents.money) I worked on the M series while at Apple. The main advantage that stuck out to me was actually that they were able to acquire dozens of top Intel engineers 5-10 years ago as Intel started struggling and making poor decisions. For example, Intel had a couple sites around the

Susan Zhang (@suchenzang) 's Twitter Profile Photo

arxiv.org/abs/2411.03923 "From Figure 3(a), it is apparent that many of the benchmarks we considered are substantially contaminated in the Llama 1 pre-training corpus as well as in the Pile. For 8 of the 13 datasets that we considered, on average more than 50% of the samples are
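
A rough idea of how such contamination is typically measured (a generic n-gram overlap sketch, not necessarily the method in arxiv.org/abs/2411.03923): a benchmark example is flagged if enough of its long n-grams also occur in the pretraining corpus.

```python
# Generic n-gram overlap contamination check (illustrative only).
def ngrams(text: str, n: int = 8):
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def contaminated(example: str, corpus_ngrams: set, n: int = 8, threshold: float = 0.5) -> bool:
    ex = ngrams(example, n)
    if not ex:
        return False
    overlap = len(ex & corpus_ngrams) / len(ex)  # fraction of the example's n-grams seen in training data
    return overlap >= threshold

# corpus_ngrams would be built by streaming the pretraining corpus once, e.g.:
# corpus_ngrams = set().union(*(ngrams(doc) for doc in corpus_docs))
```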

Noam Shazeer (@noamshazeer) 's Twitter Profile Photo

This model’s “thinking” capabilities are driving major gains:
🧑‍🔬 Top performance on math and science benchmarks (AIME, GPQA)
💻 Exceptional coding performance (LiveCodeBench)
📈 Impressive performance on complex prompts (Humanity’s Last Exam)
#1 on lmarena.ai (formerly lmsys.org) leaderboard 🏆

Tyler Chang (@tylerachang) 's Twitter Profile Photo

Presenting our work on training data attribution for pretraining this morning: iclr.cc/virtual/2025/p… -- come stop by in Hall 2/3 #526 if you're here at ICLR!

Sundar Pichai (@sundarpichai) 's Twitter Profile Photo

Our latest Gemini 2.5 Pro update is now in preview. It’s better at coding, reasoning, science + math, shows improved performance across key benchmarks (AIDER Polyglot, GPQA, HLE to name a few), and leads lmarena.ai with a 24pt Elo score jump since the previous version. We also
