Nicholas Malaya (@nicholasmalaya)'s Twitter Profile
Nicholas Malaya

@nicholasmalaya

Computational Scientist, AMD. To Exascale, and beyond!

ID: 1895743321

Link: https://nicholasmalaya.github.io/
Joined: 23-09-2013 01:15:15

2.7K Tweets

987 Followers

922 Following

LLNL Computing (@Livermore_Comp)

Join us on the road to #exascale El Capitan! Our multi-part article series continues with installments 6 through 10. 🎉

computing.llnl.gov/livermore-comp… #HPC #supercomputer #computing

GigaIO (@giga_io)

@AMD and GigaIO invite you to shatter the 8-GPU server ceiling without changing a single line of code.

Learn how you can shorten time-to-results when training large models, while saving power and space in your #datacenter.

Join us on April 10 at 9am PT.
bit.ly/49foKei

HaxRob (@haxrob)

Andres Freund, the principal software engineer at Microsoft who discovered the xz backdoor, really does deserve a big pat on the back. 👍

The outcome could have been much, much worse.

Jeff (@science_dot)

It will be my great pleasure to present this award at ISC HPC. This is a great paper on a great topic.

Cheese (@System360Cheese)

Well... I am quite underwhelmed by GB200.
Nvidia cut FP64 massively and it's quite depressing.
A single Blackwell GPU only gets about 20-22.5 TF of FP64 Tensor (and no, I am not counting Sparsity numbers here), which means that the Vector FP64 is only 10-11 TF.

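(A quick back-of-the-envelope check of the arithmetic above -- a sketch only; the 2:1 tensor-to-vector FP64 ratio is the tweet's working assumption, not a published spec.)

# Sketch of the tweet's arithmetic. Assumption (the tweet's, not a
# published spec): tensor-core FP64 runs at ~2x the vector FP64 rate.
TENSOR_TO_VECTOR_RATIO = 2.0

def vector_fp64_tflops(tensor_fp64_tflops: float) -> float:
    """Estimate vector FP64 TFLOPS from dense tensor-core FP64 TFLOPS."""
    return tensor_fp64_tflops / TENSOR_TO_VECTOR_RATIO

for tensor_tf in (20.0, 22.5):  # dense (non-sparse) FP64 Tensor per Blackwell GPU
    print(f"{tensor_tf:.1f} TF tensor -> ~{vector_fp64_tflops(tensor_tf):.2f} TF vector")
# prints ~10.00 and ~11.25 TF, matching the tweet's 10-11 TF vector figure
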
fclc (@FelixCLC_)

Looks like FP64 was borderline culled on B100.

Using the attached table, that's 90 SPARSE Tflops of FP64 using 2 GPUs, so 'peak' of ~22.5 Tflops per device.

Lop the typical GPGPU Max/Peak ratio on and we're probably looking towards ~15 Tflops of dense, double-precision MatMul. #HPC

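(Working through the tweet's numbers -- a sketch; the 2x sparsity headline factor and the ~2/3 achieved-to-peak DGEMM ratio are the tweet's assumptions, not measured figures.)

# Reproduce the estimate above. Assumptions (the tweet's, not measurements):
# sparsity doubles the headline number, and dense FP64 MatMul typically
# sustains about 2/3 of peak on GPGPUs.
sparse_fp64_two_gpus = 90.0                         # SPARSE FP64 Tflops across 2 GPUs
peak_dense_per_gpu = sparse_fp64_two_gpus / 2 / 2   # per GPU, then strip the 2x sparsity
sustained_dense = peak_dense_per_gpu * (2 / 3)      # assumed max/peak ratio
print(f"peak dense FP64 per GPU:     ~{peak_dense_per_gpu:.1f} Tflops")  # ~22.5
print(f"sustained dense FP64 MatMul: ~{sustained_dense:.1f} Tflops")     # ~15.0
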
Ian Foster (@ianfoster)

Bruce Schneier on 'How Public AI Can Strengthen Democracy' -- 'Federally funded foundation AI models would be provided as a public service, similar to a health care public option. They would not eliminate opportunities for private foundation models, but they would offer a

Erisa Hasani (@erisahasani3)

A little update: I will be joining AMD this summer as a PhD tech intern, here in Austin. Very glad I won't be relocating, and it seems that I will be on a fantastic team!

Greg Diamos (@GregoryDiamos)

We are hiring an HPC (MPI / OpenAI Triton) Engineer at Lamini. Apply here:

jobs.lever.co/laminiai/af688…

We are inventing and building the largest AMD LLM training system in the world. Join us in strong-scaling to 1000s of GPUs and beyond.
