mukul (@mukul0x)'s Twitter Profile

building @theportalcorp

ID: 1258489777231237121

https://portal.so
Joined: 07-05-2020 20:12:17

236 Tweets

445 Followers

525 Following

Awni Hannun (@awnihannun)

Lazy evaluation and unified memory in MLX make it super easy to quantize or merge large models on small machines.

In MLX LM you can quantize Mixtral (~100GB) in < 4 mins on an 8GB M1 (h/t Angelos Katharopoulos).

Example:
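The example screenshot didn't survive the scrape. As a stand-in, here is a toy pure-Python sketch of why lazy evaluation helps: operations build deferred thunks, and nothing is computed until `eval()` is called, so a pipeline over a huge model can process one weight shard at a time instead of materializing everything at once. This is an illustration of the idea only, not MLX's actual API or internals.

```python
# Toy lazy evaluation: build a deferred computation, run it on demand.
class Lazy:
    def __init__(self, fn):
        self.fn = fn
        self._value = None
        self._done = False

    def eval(self):
        # Compute at most once, only when the result is actually needed.
        if not self._done:
            self._value = self.fn()
            self._done = True
        return self._value

def load_shard(name):
    # Stand-in for reading one weight shard from disk (hypothetical data).
    return Lazy(lambda: [1.0, 2.5, -3.0])

def quantize(lazy_weights, scale=0.5):
    # Deferred elementwise quantization; runs only when eval() is called.
    return Lazy(lambda: [round(w / scale) for w in lazy_weights.eval()])

# The whole graph is built up front, but each shard is only loaded and
# quantized when its turn comes.
shards = [quantize(load_shard(f"shard{i}")) for i in range(3)]
for s in shards:
    print(s.eval())
```

In real MLX the graph covers actual array ops on unified memory, which is what keeps the peak footprint small enough for an 8GB machine.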
Alex Albert (@alexalbert__)

Fun story from our internal testing on Claude 3 Opus. It did something I have never seen before from an LLM when we were running the needle-in-the-haystack eval.

For background, this tests a model’s recall ability by inserting a target sentence (the "needle") into a corpus of
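The needle-in-a-haystack setup described above can be sketched in a few lines: insert a target sentence at a chosen depth in filler text, then ask a model to recall it. The `ask_model` stub below is a naive substring matcher standing in for a real LLM call; the filler and needle text are invented for illustration.

```python
def build_haystack(filler_sentences, needle, depth):
    """Insert `needle` at fractional `depth` (0.0 = start, 1.0 = end)."""
    docs = list(filler_sentences)
    pos = int(len(docs) * depth)
    docs.insert(pos, needle)
    return " ".join(docs)

def ask_model(context, question):
    # Stand-in "model": naive substring recall. A real eval would send
    # context + question to an LLM and grade its answer.
    for sentence in context.split(". "):
        if "best thing to do in San Francisco" in sentence:
            return sentence
    return None

filler = [f"Filler sentence number {i}." for i in range(100)]
needle = "The best thing to do in San Francisco is eat a sandwich in Dolores Park."
context = build_haystack(filler, needle, depth=0.5)
answer = ask_model(context, "What is the best thing to do in San Francisco?")
print(answer)
```

Real runs sweep both context length and needle depth, then grade whether the model's answer contains the needle.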
Andrej Karpathy (@karpathy)

# Reproduce GPT-2 (124M) in llm.c in 90 minutes for $20 ✨

The GPT-2 (124M) is the smallest model in the GPT-2 series released by OpenAI in 2019, and is actually quite accessible today, even for the GPU poor. For example, with llm.c you can now reproduce this model on one 8X
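The headline numbers imply a node rental rate, which is easy to back out (assuming the $20 covers only the machine's hourly rate for the duration of the run):

```python
# Implied hourly rate behind "$20 in 90 minutes".
cost_usd = 20
minutes = 90

hourly_rate = cost_usd / (minutes / 60)
print(f"implied node rate: ${hourly_rate:.2f}/hour")
# → implied node rate: $13.33/hour
```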
Martin Kleppe (@aemkei)

Time for a new mind-bending project! #QLOCK — A JavaScript Quine Clock aem1k.com/qlock It displays the current time in a seven-segment style, embedded within its own JavaScript source code. 🕔 🕝 🕢 🕤 🕑 🕜 (321 bytes)
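QLOCK itself is 321 bytes of JavaScript that is its own display, but the seven-segment rendering idea is easy to sketch separately. The Python below (an illustration only, unrelated to the actual QLOCK source) maps each digit to three rows of segment characters:

```python
# Seven-segment ASCII digits: each character maps to 3 rows of 3 columns.
SEGMENTS = {
    "0": (" _ ", "| |", "|_|"),
    "1": ("   ", "  |", "  |"),
    "2": (" _ ", " _|", "|_ "),
    "3": (" _ ", " _|", " _|"),
    "4": ("   ", "|_|", "  |"),
    "5": (" _ ", "|_ ", " _|"),
    "6": (" _ ", "|_ ", "|_|"),
    "7": (" _ ", "  |", "  |"),
    "8": (" _ ", "|_|", "|_|"),
    "9": (" _ ", "|_|", " _|"),
    ":": ("   ", " . ", " . "),
}

def render(time_str: str) -> str:
    """Render e.g. '12:34' as three lines of seven-segment ASCII art."""
    rows = ["".join(SEGMENTS[ch][r] for ch in time_str) for r in range(3)]
    return "\n".join(rows)

print(render("12:34"))
```

The quine part is the hard bit: QLOCK additionally arranges for those segment characters to be drawn out of the program's own source text.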

Peter Wang 🦋 (@pwang)

"The model underscores how complex the human brain is: describing just this small sample — one-millionth of the total human brain and about 3 mm long — requires more than a million Gigabytes of data: 1.4 Petabytes." blog.google/technology/res…
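Extrapolating the quoted figures to the whole brain is a one-liner (assuming, roughly, that data volume scales linearly with tissue volume):

```python
# Back-of-the-envelope: scale the sample's data volume up to a whole brain.
sample_petabytes = 1.4      # data for the ~one-millionth brain sample
fraction_of_brain = 1e-6    # sample is about one-millionth of the brain

whole_brain_petabytes = sample_petabytes / fraction_of_brain
whole_brain_zettabytes = whole_brain_petabytes / 1e6  # 1 ZB = 1e6 PB

print(f"{whole_brain_petabytes:.3g} PB ≈ {whole_brain_zettabytes:.3g} ZB")
```

That is on the order of 1.4 zettabytes for a full human brain at this resolution.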

Shannon Sands (@max_paperclips)

I agree with this - and it's a problem. I've given up recommending ChatGPT or LLMs in general to non-technical people. They don't understand that PDFs that consist of JPGs aren't going to be handled the same as text. They don't understand why after so many messages it forgets

Flo Crivello (@altimor)

TIL: there are more transistors in the AirPods Pro than in the CPU of a MacBook Pro from 2010

One is a professional laptop, the other earphones running on a battery weighing about 1 gram

Moore's Law's one hell of a thing

Kevin Liu (@kliu128)

I'm very proud of the Preparedness evaluations we did on o1-{preview,mini}.

One example in particular: While testing cybersecurity challenges, we accidentally left one broken, but the model somehow still got it right.
ℏεsam (@hesamation)

“LLMs are just next token predictors”: Ilya Sutskever explains how this is deeper than it sounds. Predicting the next token requires understanding the underlying reality that led to the creation of the previous tokens.

near (@nearcyan)

can't believe how much time i wasted learning things before LLMs, spending hours in google search rabbitholes, not knowing future-me-in-only-a-few-years would never need to google again and would have a PhD who can answer anything instantly for $0.001. so grateful for this tech!

Matthew Prince 🌥 (@eastdakota)

Tanay Jaipuria And it’s gotten worse in the 6 months since we first measured.

Google is now: 15 scrapes for 1 visitor
OpenAI is now: 1,200 scrapes for 1 visitor

Riley Goodside (@goodside)

* = the answer to the next question

A task meant to require web and computational search guided by multi-hop deduction starting from a partial SHA1, answered in 2m 54s by ChatGPT o3

featuring Tom Cruise and Addison Rae
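The "computational search starting from a partial SHA1" step can be sketched as a brute-force over candidate strings until one's digest matches the known prefix. This is an illustration only; the actual puzzle, its candidate space, and its prefix are not shown in the thread.

```python
# Brute-force a short preimage whose SHA-1 hex digest starts with `prefix`.
import hashlib
import string
from itertools import product

def find_by_sha1_prefix(prefix, alphabet=string.ascii_lowercase, max_len=4):
    """Return the first candidate whose SHA-1 hex digest starts with prefix."""
    for n in range(1, max_len + 1):
        for chars in product(alphabet, repeat=n):
            candidate = "".join(chars)
            if hashlib.sha1(candidate.encode()).hexdigest().startswith(prefix):
                return candidate
    return None

# Example: recover a short word from the first 8 hex digits of its hash.
target = hashlib.sha1(b"cat").hexdigest()[:8]
found = find_by_sha1_prefix(target)
print(found)
```

The real task layered web search and multi-hop deduction on top of this kind of hash-constrained search.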
Wyatt walls (@lefthanddraft)

Turing comparing building thinking machines to biological procreation:

"we are, in either case, instruments of His will providing mansions for the souls that He creates."
Ethan Mollick (@emollick)

The Opus 4.6 system card has some extremely wild stuff that reminds you how weird a technology this is.

These paragraphs are really worth reading.