mukul
@mukul0x
building @theportalcorp
ID: 1258489777231237121
https://portal.so
Joined 07-05-2020 20:12:17
236 Tweets
445 Followers
525 Following
Lazy evaluation and unified memory in MLX make it super easy to quantize or merge large models on small machines. In MLX LM you can quantize Mixtral (~100GB) in < 4 mins on an 8GB M1 (h/t Angelos Katharopoulos). Example:
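The example itself is not in this export; below is a minimal sketch of quantizing Mixtral with MLX LM's convert utility, assuming the mlx_lm.convert API — the Hugging Face repo id and output path are illustrative, and keyword names may differ by version:

```python
# Minimal sketch, assuming mlx-lm's convert utility.
from mlx_lm import convert

# Because MLX evaluates lazily and uses unified memory, weights are only
# materialized as each layer is quantized, so peak memory stays far below
# the ~100GB full model size -- which is how this fits on an 8GB M1.
convert(
    "mistralai/Mixtral-8x7B-v0.1",  # illustrative HF repo id
    mlx_path="mixtral-4bit",        # hypothetical output directory
    quantize=True,                  # quantize the weights (4-bit by default)
)
```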
Tanay Jaipuria
And it's gotten worse in the 6 months since we first measured.
Google is now: 15 scrapes for 1 visitor
OpenAI is now: 1,200 scrapes for 1 visitor