aurelium /ɔˈreːliəm/ (@ariaurelium)'s Twitter Profile
aurelium /ɔˈreːliəm/

@ariaurelium

friend to tensors of all shapes

ID: 1807993511842603008

https://aurelium.me/ Joined 02-07-2024 04:23:46

1.1K Tweets

368 Followers

185 Following


well, i was paying for Cursor because it was free Claude tokens compared to the API. but now there is no point

this was inevitable, so i'm not mad, just irritated

one big problem with o3's reasoning is that it doesn't know what it doesn't know

how is o3 supposed to know that llama 2 7B is not the most applicable analogue for an ultrasparse transformer inference-wise? it missed the last 2 years!


the fun thing about working on big datasets is that everything takes hours so you can play stardew valley and still feel like you're getting work done


K2 is, more than anything, a total victory for open-source. they built on DeepSeek's research and known-good techniques to focus on the novel aspects of their model and achieved way better results

this is how science is meant to work!


i wonder what odds even tuned-in experts in 2023 would've given to "base model better than GPT-4 available under MIT license in 2025"