Eric Steinberger (@ericsteinb) Twitter Tweets • TwiCopy

Eric Steinberger

@ericsteinb

2 years ago

Nat is a great sparring partner, coach and supporter. He has consistently pushed us to be even more ambitious while remaining practical. We are incredibly fortunate to have him as our major backer and now also as a board member at Magic.

141 likes, 4 replies, 2 reposts

Eric Steinberger

@ericsteinb

2 years ago

Thanks Noam! :)

29 likes, 0 replies, 0 reposts

Eric Steinberger

@ericsteinb

2 years ago

Daniel is one of the most ambition-provoking people on Earth. Always pushes pushes pushes. That's how it should be.

84 likes, 3 replies, 3 reposts

Eric Steinberger

@ericsteinb

2 years ago

fixed a bug I've been hunting for days! Yay

56 likes, 5 replies, 2 reposts

Eric Steinberger

@ericsteinb

2 years ago

Very excited to welcome Andrej Karpathy as Magic's latest investor!

1.1K likes, 39 replies, 46 reposts

Eric Steinberger

@ericsteinb

a year ago

Excited to partner with Google Cloud as we scale up on H100s and GB200s.

295 likes, 7 replies, 10 reposts

Eric Steinberger

@ericsteinb

a year ago

Excited to partner, thanks Sequoia Capital for helping fund the scale-up :)

42 likes, 1 reply, 0 reposts

Noam Brown

@polynoamial

a year ago

This blog post by Magic does a great job highlighting the weaknesses of popular long-context evals and introduces HashHop as an alternative. Very impressive work from the Magic team and congrats on the new funding!

291 likes, 5 replies, 19 reposts

Taelin

@victortaelin

a year ago

You know something is good when it aces existing tests and has to invent its own benchmark to flex. HashHop is a step forward, and I hope it becomes the norm, replacing the stupid "Needle In A Haystack" test for benchmarking long context windows.

210 likes, 10 replies, 10 reposts

Eric Steinberger

@ericsteinb

a year ago

Excited to partner with NVIDIA!

131 likes, 3 replies, 3 reposts

Eric Steinberger

@ericsteinb

a year ago

We're growing our Applied Team to work on post-training LTM2-medium (and once done pretraining, LTM2-large) into a useful assistant and capable agent.

116 likes, 7 replies, 4 reposts

METR

@metr_evals

a year ago

How close are current AI agents to automating AI R&D? Our new ML research engineering benchmark (RE-Bench) addresses this question by directly comparing frontier models such as Claude 3.5 Sonnet and o1-preview with 50+ human experts on 7 challenging research engineering tasks.

849 likes, 15 replies, 172 reposts

Eric Steinberger

@ericsteinb

6 months ago

We're hiring for a new team aiming to train AI SWEs to robustly complete long-horizon work on a no-restrictions computer via the GUI. Today's models excel at small, Olympiad-type coding tasks but struggle in complex codebases and aren't easy to integrate into existing enterprise

280 likes, 13 replies, 16 reposts