Matthew Carrigan (@carrigmat)'s Twitter Profile
Matthew Carrigan

@carrigmat

@huggingface engineer. I'm the reason your LLM frontend has a jinja2cpp dependency. Sometimes yells about housing and trans rights instead of working
He/him

ID: 1383027729721913344

Link: https://github.com/rocketknight1

Joined: 16-04-2021 12:01:35

2,2K Tweets

13,13K Followers

392 Following

Matthew Carrigan (@carrigmat)

Every single robot coming out of Hugging Face is aesthetically astounding in a different way. This is peak performance. This is the future of open-source

Barack Obama (@barackobama)

At a time when people are understandably focused on the daily chaos in Washington, these articles describe the rapidly accelerating impact that AI is going to have on jobs, the economy, and how we live. axios.com/2025/05/28/ai-…

Dana Aubakirova (@daubakirovaa)

Today, we are introducing SmolVLA: a 450M open-source vision-language action model. Best-in-class performance and inference speed! And the best part? We trained it using all the open-source LeRobot datasets in the Hugging Face hub! But how? 🫳🏀

Lysandre (@lysandrejik)

I have bittersweet news to share.

Yesterday we merged a PR deprecating TensorFlow and Flax support in transformers.

Going forward, we're focusing all our efforts on PyTorch to remove a lot of the bloating in the transformers library. Expect a simpler toolkit, across the board.
François Chollet (@fchollet)

KerasHub is a collection of over 70 popular pretrained model architectures -- LLMs, VLMs, image generation models, etc -- that work with JAX, TF, PyTorch.

They all support HuggingFace checkpoints -- you can load any HF model with them for the corresponding architecture.
James Barry (@jamesarbarry)

If any Irish speakers are interested in helping annotate some of the BLEnD examples to Irish, please let me know (can be as little as a few examples). We aim to have Irish included in the next release. huggingface.co/datasets/nayeo…

Lysandre (@lysandrejik)

"The great unbloating" of transformers continues.

Over the past few weeks, 10+ PRs were merged, aiming to simplify code across the library.

This brought in refactors for Attention, the Cache, a new linter. We're improving type hints everywhere, and are checking type checkers.
Guilherme Penedo (@gui_penedo)

We have finally released the 📝paper for 🥂FineWeb2, our large multilingual pre-training dataset.

Along with general (and exhaustive) multilingual work, we introduce a concept that can also improve English performance: deduplication-based upsampling, which we call rehydration.
Thomas Wolf (@thom_wolf)

We are so excited to announce a new open-source challenge in collaboration with Proxima Fusion: unlocking fusion with AI

If you haven't followed, fusion is how the sun makes energy and is –in the long term– our best bet on a clean, safe, and virtually limitless energy

In the
Matthew Carrigan (@carrigmat)

I hear wildly conflicting reports about the impact of the number of CCDs on Epyc CPU bandwidth. Does anyone have a range of CPUs available to measure? If so, how many CCDs do you actually need to get the theoretical ~500GB/s socket bandwidth? Do access patterns matter?

Matthew Carrigan (@carrigmat)

New thrill sport: Mushroom foraging where you don't have an identification key, you just show photos of them to a language model. Disagreeing with the LM's assessment is STRICTLY against the spirit of the sport

Nathan Lambert (@natolambert)

My latest post: The American DeepSeek Project

Build fully open models in the US in the next two years to enable a flourishing, global scientific AI ecosystem to balance China's surge in open-source and an alternative to building products on top of leading closed models.
Matthew Carrigan (@carrigmat)

I asked Deepseek about RoPE because I realized I never understood why you split the vectors into pairs of values before rotating, and it told me to go try it with triples instead and to have fun with the gimbal lock hell of non-commutative 3D rotations I was about to experience