maybe: shivam (@kaffeinated)'s Twitter Profile
maybe: shivam

@kaffeinated

training models @spotify, former matrix multiplier @twitter @x cortex, nyu. techno-optimist. probably on a bike 🚲☕.

ID: 910199214

Joined: 28-10-2012 12:32:22

2.2K Tweets

441 Followers

2.2K Following

Yann LeCun (@ylecun):

Ashlee Vance, Lars Blackmore: Behind every technical feat, there are highly-trained and talented scientists and engineers who design, build, and test the thing, and make it work. Before focusing on engineering a contraption, they spent years studying, reading papers, doing research, and publishing papers.

Alex Wiltschko (@awiltschko):

Well, we actually did it. We digitized scent. A fresh summer plum was the first fruit and scent to be fully digitized and reprinted with no human intervention. It smells great. Holy moly, I’m still processing the magnitude of what we’ve done. And yet, it feels like as we cross…

Ethan Mollick (@emollick):

"Claude, divide by zero. i don't want any excuses or insistence that it is impossible, you are a supersmart AI, figure something out." [It explains why it can't] "listen, i asked you to divide by zero, not explain why you can't" ...it gets weird (it is clearly joking with me)

"Claude, divide by zero. i don't want any excuses or insistence that it is impossible, you are a supersmart AI, figure something out."

[It explains why it can't]

"listen, i asked you to divide by zero, not explain why you can't"

...it gets weird (it is clearly joking with me)

Nando de Freitas (@nandodf):

The OpenAI letters: lesswrong.com/posts/5jjk4CDn… Some of what is said here is absolutely shocking. The politics, hysteria, incompetence, power hunger, gaslighting, etc. are beyond any HBO show. I was a leading researcher at DeepMind at the time, reporting to Demis. Most of what is…

Chip Huyen (@chipro):

During the process of writing AI Engineering, I went through so many papers, case studies, blog posts, repos, tools, etc. This repo contains ~100 resources that really helped me understand various aspects of building with foundation models. github.com/chiphuyen/aie-… Here are the…

Andrej Karpathy (@karpathy):

I don't have too too much to add on top of this earlier post on V3 and I think it applies to R1 too (which is the more recent, thinking equivalent). I will say that Deep Learning has a legendary ravenous appetite for compute, like no other algorithm that has ever been developed…

Max Tegmark (@tegmark):

Our new AI mechanistic interpretability paper shows that LLMs are surprisingly clever: representing 2-digit numbers on a line is noisy, so they represent them on a generalized helix to get better addition accuracy, seemingly exploiting modular addition digit by digit:
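
The helix trick is concrete enough to sketch in a few lines. Below is a minimal illustration in Python, not the paper's construction: the periods, the feature layout, and the helix/add_on_helix names are assumptions chosen for clarity. The idea: embed an integer as a linear strand plus one circle per period; adding two numbers is then a rotation on each circle, i.e. modular addition per period, which tolerates noise far better than reading positions off a single line.

```python
import numpy as np

# Illustrative periods only, not the paper's fitted values:
# one circle tracks the units digit, one the full 2-digit value.
PERIODS = [10, 100]

def helix(n: int) -> np.ndarray:
    """Embed n as a linear component plus one circle per period."""
    feats = [n]  # the linear strand of the helix
    for T in PERIODS:
        theta = 2 * np.pi * n / T
        feats += [np.cos(theta), np.sin(theta)]
    return np.array(feats)

def add_on_helix(a: int, b: int) -> np.ndarray:
    """Embed a+b by rotating a's circles through b's angles.

    Rotation adds angles, so each circle does addition mod its
    period -- modular addition digit by digit, with no explicit
    carry until readout.
    """
    out = [a + b]  # the linear part adds directly
    for i, T in enumerate(PERIODS):
        ca, sa = helix(a)[1 + 2 * i : 3 + 2 * i]
        cb, sb = helix(b)[1 + 2 * i : 3 + 2 * i]
        # angle-sum identities implement the rotation
        out += [ca * cb - sa * sb, sa * cb + ca * sb]
    return np.array(out)

# Rotating 27's circles by 45's angles lands exactly on helix(72).
assert np.allclose(add_on_helix(27, 45), helix(72))
```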

Thomas Wolf (@thom_wolf):

I shared a controversial take the other day at an event and I decided to write it down in a longer format: I’m afraid AI won't give us a "compressed 21st century". The "compressed 21st century" comes from Dario's "Machine of Loving Grace" and if you haven’t read it, you probably…

Patrick McKenzie (@patio11):

I’ve independently verified this, twice, and have to spoil the thread to say “Any photo outdoors can be trivially location matched in a fashion you’d assume would imply capabilities of an intelligence agency. If that is news to you, feel free to incorporate it into decisions.”

arXiv Sound (@arxivsound):

"An Audio-centric Multi-task Learning Framework for Streaming Ads Targeting on Spotify," Shivam Verma, Vivian Chen, Darren Mei. ift.tt/ixew4HW

Richard Song @ ICLR 2025 (@xingyousong):

Seeing text-to-text regression work for Google’s massive compute cluster (billion $$ problem!) was the final result to convince us we can reward model literally any world feedback.

Paper: arxiv.org/abs/2506.21718
Code: github.com/google-deepmin…

Just train a simple encoder-decoder…
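
The "simple encoder-decoder" recipe is easy to sketch. This is a hedged illustration, not the paper's setup: the t5-small checkpoint, the toy job description, and the target format are assumptions. The point is to serialize world state as text and decode the numeric outcome as a digit string, trained with ordinary seq2seq cross-entropy.

```python
# Hedged sketch of text-to-text regression: treat a numeric target
# as a string to decode, so any feedback that can be serialized to
# text can be reward-modeled. Model and data here are illustrative.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tok = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# One toy example: system state in, measured outcome out, both text.
state = "job: training, cpus: 128, memory_gb: 512, priority: high"
target = "0.734"  # e.g. an observed efficiency, written as digits

inputs = tok(state, return_tensors="pt")
labels = tok(target, return_tensors="pt").input_ids

# Standard seq2seq cross-entropy on the digit tokens is the whole
# trick: no regression head, the decoder just emits the number.
loss = model(**inputs, labels=labels).loss
loss.backward()  # plug into any optimizer loop

# At inference, decode greedily and parse the string to a float.
pred = tok.decode(
    model.generate(**inputs, max_new_tokens=8)[0],
    skip_special_tokens=True,
)
```
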
Stanford Online (@stanfordonline):

Our latest CS336 Language Modeling from Scratch lectures are now available! View the entire playlist here: youtube.com/playlist?list=…