Mitchell Wortsman (@mitchnw)'s Twitter Profile
Mitchell Wortsman

@mitchnw

@AnthropicAI | prev @uwcse

ID: 387807409

Link: http://mitchellnw.github.io · Joined: 09-10-2011 18:11:40

425 Tweets

1.1K Followers

989 Following

Anthropic (@anthropicai)'s Twitter Profile Photo

New Anthropic research paper: Scaling Monosemanticity. The first ever detailed look inside a leading large language model. Read the blog post here: anthropic.com/research/mappi…

Peter J. Liu (@peterjliu)'s Twitter Profile Photo

We recently open-sourced a relatively minimal implementation example of Transformer language model training in JAX, called NanoDO. If you stick to vanilla JAX components, the code is relatively straightforward to read -- the model file is <150 lines. We found it useful as a
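NanoDO itself is a JAX codebase; purely as an illustrative sketch of the kind of computation a minimal Transformer model file contains (plain NumPy here, not NanoDO's actual code; all names are hypothetical), a single-head causal self-attention layer fits in a few lines:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, Wq, Wk, Wv):
    """Single-head causal self-attention over a (seq, dim) input."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Causal mask: position i may only attend to positions <= i.
    mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores = np.where(mask, -np.inf, scores)
    return softmax(scores) @ v

rng = np.random.default_rng(0)
seq, dim = 4, 8
x = rng.normal(size=(seq, dim))
Wq, Wk, Wv = (rng.normal(size=(dim, dim)) for _ in range(3))
out = self_attention(x, Wq, Wk, Wv)
print(out.shape)  # (4, 8)
```

Sticking to a handful of primitives like these (projections, a masked softmax, matrix multiplies) is what keeps a model file readable at under 150 lines.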

Vaishaal Shankar (@vaishaal)'s Twitter Profile Photo

I am really excited to introduce DataComp for Language Models (DCLM), our new testbed for controlled dataset experiments aimed at improving language models. 1/x

Josh Gardner (@jpgard)'s Twitter Profile Photo

Thrilled to share our paper “Large-Scale Transfer Learning for Tabular Data via Language Modeling,” introducing TabuLa-8B: a foundation model for prediction on tabular data. (with Juan C Perdomo + Ludwig Schmidt) 📖 arxiv.org/abs/2406.12031 🌐 huggingface.co/collections/ml… [long🧵]

Anthropic (@anthropicai)'s Twitter Profile Photo

Introducing Claude 3.5 Sonnet—our most intelligent model yet. This is the first release in our 3.5 model family. Sonnet now outperforms competitor models on key evaluations, at twice the speed of Claude 3 Opus and one-fifth the cost. Try it for free: claude.ai

Anthropic (@anthropicai)'s Twitter Profile Photo

We're also launching a preview of Artifacts on claude.ai. You can ask Claude to generate docs, code, mermaid diagrams, vector graphics, or even simple games. Artifacts appear next to your chat, letting you see, iterate, and build on your creations in real-time.

Tomer Porian (@tomerporian)'s Twitter Profile Photo

🧵1/8 We resolve the discrepancy between the compute optimal scaling laws of Kaplan (exponent 0.88, Figure 14, left) et al. and Hoffmann et al. (“Chinchilla”, exponent 0.5). Paper: arxiv.org/abs/2406.19146 Data + Code: github.com/formll/resolvi…

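For context on the two exponents the thread compares, a hedged restatement of the standard setup (not taken from the linked paper): with compute budget C, model size N in parameters, and dataset size D in tokens, under the common approximation C ≈ 6ND, the compute-optimal allocation is a power law:

```latex
N^{*}(C) \propto C^{a}, \qquad D^{*}(C) \propto C^{1-a}
```

Kaplan et al. report a ≈ 0.88 while Hoffmann et al. (Chinchilla) report a ≈ 0.5, so the two analyses prescribe very different model sizes at the same budget; the linked paper investigates why the two disagree.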
Vaishaal Shankar (@vaishaal)'s Twitter Profile Photo

We have released our DCLM models on huggingface! To our knowledge these are by far the best performing truly open-source models (open data, open weight models, open training code) 1/5

Katie Everett (@_katieeverett)'s Twitter Profile Photo

We've gotten some great questions about the notion of alignment in our width-scaling parameterization paper! arxiv.org/abs/2407.05872 A deep dive into the alignment metric and intuition 🧵 [1/16]

Katie Everett (@_katieeverett)'s Twitter Profile Photo

Come chat with me and Lechao Xiao at our ICML poster session 1:30-3pm CEST (Vienna time) today at Hall C 4-9 #2500 and see how our theory lets all parameterizations perform hyperparameter transfer! arxiv.org/abs/2407.05872

Ross Wightman (@wightmanr)'s Twitter Profile Photo

OpenCLIP passed 10K stars on GitHub this week. A big milestone for any open-source project. 🍻 to the many collaborators that made that possible. Coincidentally, I pushed a new release with a port of the largest multi-lingual SigLIP -- a SO400M/16 @ 256x256 that appeared on

Anthropic (@anthropicai)'s Twitter Profile Photo

Introducing an upgraded Claude 3.5 Sonnet, and a new model, Claude 3.5 Haiku. We’re also introducing a new capability in beta: computer use. Developers can now direct Claude to use computers the way people do—by looking at a screen, moving a cursor, clicking, and typing text.

Akari Asai (@akariasai)'s Twitter Profile Photo

🚨 I’m on the job market this year! 🚨 I’m completing my Allen School Ph.D. (2025), where I identify and tackle key LLM limitations like hallucinations by developing new models—Retrieval-Augmented LMs—to build more reliable real-world AI systems. Learn more in the thread! 🧵

Ofir Press (@ofirpress)'s Twitter Profile Photo

I'm on the academic job market! I develop autonomous systems for: programming, research-level question answering, finding sec vulnerabilities & other useful+challenging tasks. I do this by building frontier-pushing benchmarks and agents that do well on them. See you at NeurIPS!

Alex Li (@alexlioralexli)'s Twitter Profile Photo

I'm presenting our #NeurIPS2024 work on Attention Transfer today! Key finding: Pretrained representations aren't essential - just using attention patterns from pretrained models to guide token interactions is enough for models to learn high-quality features from scratch and

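A toy sketch of the idea described above (illustrative only; names are hypothetical and this is not the paper's code): the teacher contributes only its attention pattern, i.e. how tokens mix, while the student's own features are what get mixed.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(1)
seq, dim = 5, 4

# Teacher attention pattern (each row sums to 1), standing in for a
# pattern extracted from a pretrained model.
teacher_attn = softmax(rng.normal(size=(seq, seq)))

# Student value vectors, learned from scratch; only the mixing
# pattern is borrowed from the teacher.
student_values = rng.normal(size=(seq, dim))
mixed = teacher_attn @ student_values
print(mixed.shape)  # (5, 4)
```

In this sketch the student never sees the teacher's representations, only the (seq, seq) mixing matrix that routes information between its own tokens.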
Alex Li (@alexlioralexli)'s Twitter Profile Photo

Excited to be presenting at #ICLR2025 at 10am today on how generative classifiers are much more robust to distribution shift. Come by to chat and say hello!

Mike A. Merrill (@mike_a_merrill)'s Twitter Profile Photo

Many agents (Claude Code, Codex CLI) interact with the terminal to do valuable tasks, but do they currently work well enough to deploy en masse? We’re excited to introduce Terminal-Bench: An evaluation environment and benchmark for AI agents on real-world terminal tasks. Tl;dr

Cade Gordon (@cadegordonml)'s Twitter Profile Photo

Excited to share that I'll be joining Paul Jankura to work on pretraining science! I've chosen to defer my Stanford PhD, where I'm honored to be supported by the Hertz Fellowship. There's something special about the science, this place, and these people. Looking forward to joining

Anthropic (@anthropicai)'s Twitter Profile Photo

Introducing the next generation: Claude Opus 4 and Claude Sonnet 4. Claude Opus 4 is our most powerful model yet, and the world’s best coding model. Claude Sonnet 4 is a significant upgrade from its predecessor, delivering superior coding and reasoning.

Ludwig Schmidt (@lschmidt3)'s Twitter Profile Photo

Very excited to finally release our paper for OpenThoughts! After DataComp and DCLM, this is the third large open dataset my group has been building in collaboration with the DataComp community. This time, the focus is on post-training, specifically reasoning data.
