Can anyone explain why cuda and related libs are so huge? They take more space than my entire OS. Tensor computations as used in most deep learning networks are not very involved at first sight, so I'm a bit confused.
> The input file must be either a relocatable host object or static library (not a host executable), and the output file will be the same format
And:
> Note that this means that libcublas_static70.a will not run on any other architecture, so should only be used when you are building for a single architecture
But please don't use this for binaries distributed in the wild; it would result in a lot of people not being able to run your program...
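For what it's worth, here's a rough sketch of the pruning step the quoted docs describe. The `sm_70` target, the file paths, and the existence checks are my assumptions; the point is just that nvprune strips every GPU architecture except the one you ask for, which is where the size savings come from, since the stock static library is a fat binary carrying code for many architectures at once. As noted above, only do this when you build for a single known architecture.

```shell
# Assumption: the CUDA toolkit's nvprune is on PATH and libcublas_static.a
# is in the current directory. Guarded so the script is a no-op otherwise.
if command -v nvprune >/dev/null 2>&1 && [ -f libcublas_static.a ]; then
  # Keep only the sm_70 (Volta) device code; everything else is dropped.
  nvprune -arch sm_70 libcublas_static.a -o libcublas_static70.a
  # Compare the fat library against the pruned one.
  ls -lh libcublas_static.a libcublas_static70.a
else
  echo "nvprune or libcublas_static.a not available; nothing to prune"
fi
```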
Unfortunately it doesn't seem like they will be supporting AMD's or Intel's GPU lines much. I'm still using CUDA, and I know it's still the dominant player, but I'd like another option as a backup in case something like RDNA3 gets really successful, considering that their next consumer-grade GPUs can have up to a whopping 24GB of VRAM, and I can't wait to see that grow to 48GB or more in more serious, "industrial" cards.