Peb Ruswono Aryan (@pebaryan)'s Twitter Profile
Peb Ruswono Aryan

@pebaryan

wd:Q57167805

ID: 147141559

Link: http://about.me/peb

Joined: 23-05-2010 09:25:31

8.8K Tweets

573 Followers

455 Following

Peb Ruswono Aryan (@pebaryan) 's Twitter Profile Photo

Next weekend hack: running a bigger LLM (e.g., gpt-oss 20B) on mixed multi-GPU setups, both 2-way (x16+x4) and 3-way (x16+x4+x1). All budget GPUs (PCIe 3, under €50 used, except for the 1070 Ti and P4).
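
For a rough sense of how uneven those slots are, here is a back-of-envelope sketch of theoretical PCIe 3.0 bandwidth per link width. It assumes the standard 8 GT/s per lane with 128b/130b encoding; the function name is only illustrative, and real-world throughput (especially through risers) will be lower.

```python
# Theoretical one-direction PCIe 3.0 bandwidth for a given link width.
# Assumptions: 8 GT/s per lane, 128b/130b line encoding, no protocol overhead.
PCIE3_GT_PER_S = 8.0
ENCODING_EFFICIENCY = 128 / 130


def pcie3_bandwidth_gb_s(lanes: int) -> float:
    """Return the theoretical bandwidth in GB/s for a PCIe 3.0 link."""
    # Each transfer carries one bit per lane; divide by 8 to get bytes.
    return PCIE3_GT_PER_S * ENCODING_EFFICIENCY / 8 * lanes


for lanes in (16, 4, 1):
    print(f"x{lanes}: {pcie3_bandwidth_gb_s(lanes):.2f} GB/s")
```

The x1 slot gets about one sixteenth of the x16 bandwidth, which mostly matters for model loading and any traffic between cards, and less for compute that stays on a single GPU.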

Lookie here! The RX 470 8G got a companion, and together they can run gpt-oss 20B MXFP4 MoE using llama.cpp at 12-13 tps (on Windows 11; on Linux it should be 2-3 times higher). Each repurposed mining GPU was bought for the price of one pizza.

For some reason, this combo (1070 Ti, 1060, P106-100) doesn't want to POST. So I had to swap the P106-100 out for a Tesla P4.

Should I try all-AMD with an additional R9 Fury? I'm not sure the current power supply setup can handle it, though, so let's try that later.

Strange: this setup passed POST but showed a black screen during OS boot, so I plugged the HDMI cable into the motherboard. Now the only GPUs listed are the iGPU and the R9 Fury.

The R9 Fury is so power hungry and its VRAM is too tight. The RX 470 cannot even save it; only with the RTX 3060 can it run gpt-oss 20B without spilling. I wonder how bad a Tesla K80 would be for this setup.

The RTX 3060 might not be combinable with other cards in an (x16/x4/x1) config because its body covers the x1 slot.

RTX 3060 + Tesla P4 running gpt-oss 20B with Vulkan only reaches 40 tps token generation. The CUDA run crashes llama.cpp, so now rebuilding llama.cpp from the latest commit.

llama.cpp build complete. The CUDA version now runs at 58 tps! Not bad! More detailed results here: gist.github.com/pebaryan/0f501…
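
To put the two figures side by side, here is a tiny sketch of the relative gain. It assumes the 58 tps CUDA result and the earlier 40 tps Vulkan result come from the same RTX 3060 + Tesla P4 pair and the same model, which the tweets suggest but do not state outright; the helper name is only illustrative.

```python
# Relative token-generation speedup between two llama.cpp backends.
# Assumption: both figures were measured on the same hardware and model.
def speedup(new_tps: float, old_tps: float) -> float:
    """Return how many times faster new_tps is compared with old_tps."""
    return new_tps / old_tps


vulkan_tps = 40.0  # Vulkan figure quoted in the feed
cuda_tps = 58.0    # CUDA figure after rebuilding llama.cpp

ratio = speedup(cuda_tps, vulkan_tps)
print(f"CUDA vs Vulkan: {ratio:.2f}x ({(ratio - 1) * 100:.0f}% faster)")
```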