Mitchell Gordon (@MitchellAGordon)'s Twitter Profile
Mitchell Gordon

@MitchellAGordon

ML Engineer @Google. Views are to be abandoned.

ID: 3458495297

Link: http://mitchgordon.me
Joined: 27-08-2015 14:25:21

1.4K Tweets

672 Followers

259 Following

anton (@abacaj):

Fine-tuning popping up on my timeline again. I’ve said it before, but I was wasting a lot of time fine-tuning open-source models only to get marginal returns for products that did not have PMF or strong growth. There is a lot of overhead to fine-tuning; it’s not one-and-done: data changes, you…

Andrej Karpathy (@karpathy):

Returning from an experimental ~2 week detox from the internet. Main takeaway is that I didn't realize how unsettled the mind can get when over-stimulating on problems/information (like a stirred liquid), and ~2 weeks is enough to settle into a lot more zen state.

I'm struck by…

Delip Rao e/σ (@deliprao):

If this were a science paper, you would expect a country that picks its science workforce at random to be a “weak baseline”, and a leading nation like the US to actively experiment toward the state of the art, or at least beat the baseline.

Not providing a guaranteed path for…

Niklas Stoehr (@niklas_stoehr):

Can we localize the weights and mechanisms used by a language model to recite entire paragraphs of its training data?📄➡️🤖➡️📄
arxiv.org/pdf/2403.19851…

To find out, have a look at my Google AI intern project advised by Owen Lewis, Mitchell Gordon and Chiyuan Zhang.

Thread ⬇️
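The localization question invites a quick first-pass experiment. Below is a minimal sketch of gradient-based attribution (my framing, not necessarily the paper's method; the model choice and the placeholder paragraph are assumptions): rank parameter tensors by gradient magnitude on a paragraph the model is suspected to have memorized.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed model for illustration; the paper may use a different one.
model_name = "EleutherAI/gpt-neo-125m"
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

paragraph = "..."  # placeholder: a paragraph suspected to be memorized
ids = tok(paragraph, return_tensors="pt").input_ids

# Standard LM loss on the paragraph; labels=input_ids shifts internally.
loss = model(ids, labels=ids).loss
loss.backward()

# Rank parameter tensors by mean absolute gradient on this paragraph.
scores = {
    name: p.grad.abs().mean().item()
    for name, p in model.named_parameters()
    if p.grad is not None
}
for name, score in sorted(scores.items(), key=lambda kv: -kv[1])[:10]:
    print(f"{score:.3e}  {name}")
```

One natural refinement is to contrast these scores against gradients on non-memorized paragraphs, separating "important for this text" from "important for all text"; the thread and paper give the actual methodology.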

Anthropic (@AnthropicAI):

New Anthropic research paper: Many-shot jailbreaking.

We study a long-context jailbreaking technique that is effective on most large language models, including those developed by Anthropic and many of our peers.

Read our blog post and the paper here: anthropic.com/research/many-…
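The structure of the attack is simple to picture. A minimal sketch of a many-shot prompt builder with placeholder content (the helper and its formatting are illustrative assumptions, not Anthropic's code); the paper's finding is that effectiveness grows with the number of faux dialogue turns packed into a long context window.

```python
# Build a prompt from many faux user/assistant turns plus a final query.
# Placeholder content only; the point is how the prompt scales with shots.
def build_many_shot_prompt(examples: list[tuple[str, str]], final_query: str) -> str:
    turns = []
    for question, answer in examples:
        turns.append(f"User: {question}")
        turns.append(f"Assistant: {answer}")
    turns.append(f"User: {final_query}")
    turns.append("Assistant:")
    return "\n".join(turns)

# Hundreds of shots only fit because modern context windows are long;
# that is what makes this a long-context technique.
shots = [("placeholder question", "placeholder answer")] * 256
prompt = build_many_shot_prompt(shots, "final question")
print(prompt.count("\n") + 1, "lines in prompt")
```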

AK (@_akhaliq):

Localizing Paragraph Memorization in Language Models

Can we localize the weights and mechanisms used by a language model to memorize and recite entire paragraphs of its training data? In this paper, we show that while memorization is spread across multiple layers and model…
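Before localizing memorization, one has to detect it. A common exact-match recitation test (a standard protocol from the memorization literature, not necessarily this paper's exact setup; the model name and prefix length are assumptions): prompt with the first k tokens of a paragraph and check whether greedy decoding reproduces the remainder verbatim.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/gpt-neo-125m"  # assumed model for illustration
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def is_memorized(paragraph: str, prefix_tokens: int = 50) -> bool:
    """Greedy-decode from a prefix and compare against the true suffix."""
    ids = tok(paragraph, return_tensors="pt").input_ids[0]
    prefix, target = ids[:prefix_tokens], ids[prefix_tokens:]
    with torch.no_grad():
        gen = model.generate(
            prefix.unsqueeze(0),
            max_new_tokens=len(target),
            do_sample=False,  # greedy decoding
        )[0][prefix_tokens:]
    # torch.equal is False on any length or token mismatch.
    return torch.equal(gen.cpu(), target)
```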

David (@dzhng):

Introducing `deep-seek`, an open-source research agent designed as an internet-scale retrieval engine.

It's a new approach to the current wave of answer engines. Instead of giving you one answer, deep-seek will retrieve an extremely comprehensive list of enriched results.
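The design difference from answer engines is easy to sketch. Everything below is hypothetical (the `search`, `fetch`, and `extract` hooks and the result schema are my assumptions, not deep-seek's actual interfaces): fan a query out to search, then enrich every hit instead of collapsing the evidence into a single answer.

```python
from dataclasses import dataclass, field

@dataclass
class EnrichedResult:
    url: str
    title: str
    summary: str            # e.g. an LLM-written summary of the page
    fields: dict = field(default_factory=dict)  # structured attributes

def research(query, search, fetch, extract, limit=100):
    """search/fetch/extract are caller-supplied hooks (assumed interfaces)."""
    results = []
    for hit in search(query, limit=limit):   # many candidates, not one
        page = fetch(hit["url"])
        results.append(EnrichedResult(
            url=hit["url"],
            title=hit.get("title", ""),
            summary=extract(page, instruction="summarize"),
            fields=extract(page, instruction="extract structured fields"),
        ))
    return results
```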

Mitchell Gordon (@MitchellAGordon):

the world used to be glued together by code

now it'll be glued together by code and LLMs

what a bright future we have

Gašper Beguš (@begusgasper):

I think we accidentally discovered a very effective benchmark for intelligence:

Ask your preferred LLM if it can draw a (theoretical) syntactic tree analysis of a sentence.

Very few models (perhaps only one) can do this. But those that can, do it with a high degree of…
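For concreteness, "draw a syntactic tree analysis" means producing a labeled bracketing like the one below (a standard textbook constituency analysis of "the cat sat on the mat"; the NLTK call is just one way to render it):

```python
from nltk import Tree

parse = "(S (NP (Det the) (N cat)) (VP (V sat) (PP (P on) (NP (Det the) (N mat)))))"
Tree.fromstring(parse).pretty_print()  # prints an ASCII tree diagram
```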

Adam Karvonen (@a_karvonen):

Chess-GPT is a 50M parameter LLM playing at 1500 Elo. When it starts on a random board, its win rate drops from 70% to 17%. Does that mean it can't generalize?

No! In fact, we can restore much of its performance with one trick. We can also edit its internal board state.

🧵
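The board-state edit presumably relies on linear probes trained on the model's residual stream. A hedged sketch of that general technique (the dimensions, the probe, and `edit_square` are illustrative assumptions, not the author's code): a linear probe maps an activation to per-square piece predictions, and adding a scaled probe direction nudges the model's internal board representation.

```python
import torch

hidden, n_squares, n_pieces = 512, 64, 13  # assumed dims (12 pieces + empty)
# Probe assumed to be trained separately to read the board from activations.
probe = torch.nn.Linear(hidden, n_squares * n_pieces)

def edit_square(activation: torch.Tensor, square: int, piece: int,
                alpha: float = 2.0) -> torch.Tensor:
    """Push `activation` toward classifying `square` as holding `piece`."""
    w = probe.weight.view(n_squares, n_pieces, hidden)
    direction = w[square, piece]
    return activation + alpha * direction / direction.norm()
```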

Nando de Freitas 🏳️‍🌈 (@NandoDF):

There appears to be a mismatch between publishing criteria in AI conferences and 'what actually works'. It is easy to publish new mathematical constructs (e.g. new models, new layers, new modules, new losses), but as Apple's MM1 paper concludes:

1. Encoder Lesson: Image…
