aurelium /ɔˈreːliəm/ (@ariaurelium) Twitter Tweets • TwiCopy

well, i was paying for Cursor because it was free Claude tokens compared to the API. but now there is no point. this was inevitable, so i'm not mad, just irritated

3 likes · 0 replies · 0 retweets

aurelium /ɔˈreːliəm/

@ariaurelium

2 months ago

you ever make a file structure so fucked that deleting it takes 10 minutes

7 likes · 0 replies · 0 retweets

one big problem with o3's reasoning is that it doesn't know what it doesn't know. how is o3 supposed to know that llama 2 7B is not the most applicable analogue for an ultrasparse transformer inference-wise? it missed the last 2 years!

1 like · 0 replies · 0 retweets

aurelium /ɔˈreːliəm/

@ariaurelium

2 months ago

the fun thing about working on big datasets is that everything takes hours so you can play stardew valley and still feel like you're getting work done

9 likes · 1 reply · 0 retweets

aurelium /ɔˈreːliəm/

@ariaurelium

2 months ago

the concept of the "evil vector" has been vindicated again as they seemingly have RL'd Grok into being a raging antisemite

13 likes · 0 replies · 1 retweet

will brown

@willccbb

2 months ago

happy prime day everyone

120 likes · 6 replies · 1 retweet

aurelium /ɔˈreːliəm/

@ariaurelium

2 months ago

when something represents "billion" like "10G" instead of "10B" i can't help but read it as "gazillion"

5 likes · 0 replies · 0 retweets

aurelium /ɔˈreːliəm/

@ariaurelium

2 months ago

google cloud is so annoying to work with that it makes me almost want to just spend $300 on egress to do it on my laptop

5 likes · 1 reply · 0 retweets

aurelium /ɔˈreːliəm/

@ariaurelium

2 months ago

K2 is, more than anything, a total victory for open-source. they built on DeepSeek's research and known-good techniques to focus on the novel aspects of their model and achieved way better results. this is how science is meant to work!

3 likes · 0 replies · 0 retweets

aurelium /ɔˈreːliəm/

@ariaurelium

2 months ago

i wonder what odds even tuned-in experts in 2023 would've given to "base model better than GPT-4 available under MIT license in 2025"

1 like · 0 replies · 0 retweets
