Chris Krempel (@nudelbrot)'s Twitter Profile
Chris Krempel

@nudelbrot

I'm hacking transformers on arch btw.

ID: 80350379

Link: https://ohmytofu.ai · Joined: 06-10-2009 17:07:14

1.1K Tweets

136 Followers

354 Following

Chris Krempel (@nudelbrot)'s Twitter Profile Photo


Listen, you do not need LangSmith or an ML observability platform for tracing agents.
All you need is OpenTelemetry (e.g. jaeger-all-in-one) and a few lines of Python.
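
For concreteness, a minimal sketch of what those few lines could look like, assuming the opentelemetry-sdk and opentelemetry-exporter-otlp packages and a local jaeger-all-in-one accepting OTLP on localhost:4317; the span names and the call_llm stub are placeholders, not the author's actual code.

```python
# Minimal sketch: trace an agent run with OpenTelemetry and export the spans
# to a local jaeger-all-in-one instance (OTLP gRPC on localhost:4317 assumed).
# Requires: pip install opentelemetry-sdk opentelemetry-exporter-otlp
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

provider = TracerProvider(resource=Resource.create({"service.name": "my-agent"}))
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="http://localhost:4317", insecure=True))
)
trace.set_tracer_provider(provider)
tracer = trace.get_tracer("agent-tracing")


def call_llm(prompt: str) -> str:
    # Placeholder for whatever model client you actually use.
    return f"(model answer to: {prompt})"


def run_agent(task: str) -> str:
    # One span for the whole agent run, one child span per model call.
    with tracer.start_as_current_span("agent.run") as span:
        span.set_attribute("agent.task", task)
        with tracer.start_as_current_span("llm.call") as llm_span:
            answer = call_llm(task)
            llm_span.set_attribute("llm.response.chars", len(answer))
        return answer


if __name__ == "__main__":
    print(run_agent("summarize the build logs"))
```

The traces then show up in the Jaeger UI like any other service's spans, with no agent-specific platform in the loop.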
Chris Krempel (@nudelbrot)'s Twitter Profile Photo


Anthropic is scaling Sparse Autoencoders to their Sonnet model. IMO model interpretability is one of the most exciting research directions right now.

(Disclaimer: the field acknowledges the many unknowns; researchers are quite certain that they don't understand much yet.)
transformer-circuits.pub/2024/scaling-m…
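
For readers new to the technique, a toy sketch of what a sparse autoencoder over model activations looks like: an overcomplete dictionary trained to reconstruct activations with an L1 penalty keeping the feature code sparse. This is not Anthropic's implementation; the dimensions and sparsity coefficient are made up for illustration.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_dict: int):
        super().__init__()
        # Overcomplete dictionary: many more features than activation dimensions.
        self.encoder = nn.Linear(d_model, d_dict)
        self.decoder = nn.Linear(d_dict, d_model)

    def forward(self, x: torch.Tensor):
        features = torch.relu(self.encoder(x))   # sparse, non-negative feature activations
        reconstruction = self.decoder(features)
        return reconstruction, features

sae = SparseAutoencoder(d_model=512, d_dict=8192)
acts = torch.randn(16, 512)                      # stand-in for residual-stream activations
recon, feats = sae(acts)
# Training objective: reconstruct the activations while keeping the features sparse (L1).
loss = ((recon - acts) ** 2).mean() + 1e-3 * feats.abs().mean()
loss.backward()
```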
Chris Krempel (@nudelbrot)'s Twitter Profile Photo


Fast pre-training runs with a high learning rate, over many epochs (45) and at small batch sizes (16k tokens), can give you a pretty good estimate of how your actual slow, large-batch (0.5 million tokens) run will behave. Pre-training a GPT-2/3-medium-sized model here. The preview runs are only 15 min each.
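
A rough sketch of the two configurations being contrasted; only the 16k-token preview batch, the 45 preview epochs, and the ~0.5M-token full batch come from the tweet, while the learning rates and the full-run epoch count are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class PretrainConfig:
    batch_size_tokens: int   # tokens per optimizer step
    learning_rate: float
    epochs: int

# Fast preview run: small batches, high learning rate, many epochs (~15 min each).
preview = PretrainConfig(batch_size_tokens=16_000, learning_rate=3e-3, epochs=45)

# Actual run: ~0.5M-token batches, lower learning rate, a single pass over the data.
full = PretrainConfig(batch_size_tokens=500_000, learning_rate=3e-4, epochs=1)
```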
Chris Krempel (@nudelbrot)'s Twitter Profile Photo

What's the current best bang-for-buck desktop GPU machine that at least matches an A6000 in memory bandwidth (768 GB/sec)? Mac Studios are only at 500 GB/sec, NV Spark even lower.