Cagri Demir (@adnancagri)'s Twitter Profile

Founder anticipai.com | ex-Research Intern @Intel, AI/ML CS Ph.D candidate @ml_labs_irl, @WeAreTUDublin | @CentralBank_TR, @kocuniversity

ID: 116555280

Link: https://twitter.com/adnancagri | Date: 22-02-2010 22:06:21

1.1K Tweets

352 Followers

2.2K Following

Casper Hansen (@casper_hansen_)

I hate to be that guy, but you can't retroactively update the license. There will now forever be an Apache 2.0 licensed version of QvQ that you can git checkout
Yuchen Jin (@yuchenj_uw)

I love Nvidia and Jensen, but their presentation of numbers bothers me:

- vague terms like "AI TOPS"
- compare FP4 on 5090 with FP8 on 4090
- show FP4 FLOPS and claim a $3,000 box runs a 200B model
- plot graph mixing FP16, BF16, FP8, FP4, as if FP1 is usable in 2 years

Why
Arnaud Bertrand (@rnaudbertrand)

This is pretty hilarious in retrospect. In India in 2023, Altman was asked if a small, smart team with a budget of $10 million could build something substantial within AI. His reply: "It’s totally hopeless to compete with us on training foundation models"

Alexander Doria (@dorialexander)

I feel this should be a much bigger story: DeepSeek has trained on Nvidia H800 but is running inference on the new home Chinese chips made by Huawei, the 910C.
shako (@shakoistslog)

Ending a Claude instance that helped you deal with some real shit in your life when the context has become too long and it's starting to go haywire.
Thomas Wolf (@thom_wolf)

After 6+ months in the making and burning over a year of GPU compute time, we're super excited to finally release the "Ultra-Scale Playbook"

Check it out here: hf.co/spaces/nanotro…

A free, open-source, book to learn everything about 5D parallelism, ZeRO, fast CUDA kernels,
Paul Graham (@paulg)

The quiet, apolitical majority of professors must be so bummed. First wokeness infected the institutions where they worked and they lost the ability to speak freely about many subjects, and now they're collateral damage in air strikes by the Republicans.

Paul Graham (@paulg)

If you didn't speak out against *both* woke censorship and Trump's attempts to suppress criticism of Israel at universities, you're not a genuine free speech advocate. You're just a partisan hack who pretends to care about free speech when it suits you.

Paul Graham (@paulg)

A founder asked my advice about combining a startup with having small children. I told him family is more important than business, and to put his kids first and cram the startup into the remaining time.

Paul Graham (@paulg)

No one ever, when they're old, feels they spent too much time with their kids. But there are plenty of people who feel they spent too little, and this must be the bitterest kind of regret.