Benjamin Warner (@benjamin_warner)'s Twitter Profile
Benjamin Warner

@benjamin_warner

R&D @answerdotai

ID: 379499290

Link: http://benjaminwarner.dev

Joined: 25-09-2011 02:28:16

434 Tweets

2.2K Followers

415 Following

Orion Weller @ ICLR 2025 (@orionweller)

XLM-R has been SOTA for 6 years for multilingual encoders. That's an eternity in AI 🤯 Time for an upgrade. Introducing mmBERT: 2-4x faster than previous models ⚡ while even beating o3 and Gemini 2.5 Pro 🔥 + open models & training data - try it now! How did we do it? 🧵

Horace He (@chhillee)

Apologies that I haven't written anything since joining Thinking Machines but I hope this blog post on a topic very near and dear to my heart (reproducible floating point numerics in LLM inference) will make up for it!

Tanishq Mathew Abraham, Ph.D. (@iscienceluvr)

Excited to announce that Sophont has raised $9.22M in combined pre-seed+seed rounds! 🚀🔥 Led by Kindred Ventures, with participation from @delphi_ventures, Upfront Ventures, AICONIC VENTURES, also @jeffdean, @logankilpatrick, clem 🤗 (via Factorial Capital), Lukas Biewald & others

Tanishq Mathew Abraham, Ph.D. (@iscienceluvr)

At MedARC we are building a comprehensive suite of medical LLM evals, and we already have tons of volunteers and lots of great progress! The project started less than a week ago! Are there other medical LLM evals we should include?

tomaarsen (@tomaarsen)

🛒 RexBERT: ModernBERT except for E-commerce was just released by RAHUL BAJAJ et al! 4 base encoders (17M, 68M, 150M, 400M) trained on 2.3T tokens (with 350B E-commerce related tokens), easily outperforming base models on E-commerce tasks! Details in 🧵

Helen Toner (@hlntnr)

Many AI policy decisions are complicated. "Don't ban self-driving cars" is really not. Good new piece from Kelsey Piper, with a lede that pulls no punches:

Nicholas Decker 🏳️‍🌈🌐🇺🇦 (@captgouda24)

This paper is one of the most astonishing feats of sustained data wizardry I have ever seen. Using data from Uber, they are able to estimate the roughness of every road in America and precisely estimate the value people place on it, and so much more. 1/

Soumith Chintala (@soumithchintala)

MacStudio you ask? Apple Engineering's **actual** time spent on PyTorch support hasn't given me confidence that PyTorch Mac experience would get anywhere close to NVIDIA's any time soon, if ever. The Meta engineers continue to do a huge amount of heavy-lifting for improving the

Sophont (@sophontai)

Excited to share our first paper: Scaling Vision Transformers for Functional MRI with Flat Maps We introduce a new approach to training fMRI neuroimaging foundation models and demonstrate a strict dataset power scaling law!

Tanishq Mathew Abraham, Ph.D. (@iscienceluvr)

We've released our first Sophont paper! 🔥 We're bullish on the potential of foundation models to help unlock novel clinical applications for brain & mental health. So we're working on training better neuroimaging foundation models. We develop a novel approach for training fMRI

Tanishq Mathew Abraham, Ph.D. (@iscienceluvr)

META RESEARCHERS WHO WERE LAID OFF: Hit me up if you wanna work on open-source LLMs and multimodal models for medicine and healthcare! This includes working on reasoning/RLVR/etc. and self-supervised training. It's very exciting research and very impactful too, come join!

Austin Huang (@austinvhuang)

Belated life update - I'm starting a company. No boundaries between low/high level systems engineering, research, design or product. No boundaries between AI compute and situated human collaboration and learning. More soon - DMs open if you want to connect.