Can anyone explain why cuda and related libs are so huge? They take more space than my entire OS. Tensor computations as used in most deep learning networks are not very involved at first sight, so I'm a bit confused.
> The input file must be either a relocatable host object or static library (not a host executable), and the output file will be the same format
And:
> Note that this means that libcublas_static70.a will not run on any other architecture, so should only be used when you are building for a single architecture
But please don't use this for binaries distributed in the wild; it would result in a lot of people not being able to run your program...
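For what it's worth, here's a rough sketch of the pruning step the quoted docs describe. The `sm_70` target, the file paths, and the existence checks are my assumptions; the point is just that nvprune strips every GPU architecture except the one you ask for, which is where the size savings come from, since the stock static library is a fat binary carrying code for many architectures at once. As noted above, only do this when you build for a single known architecture.

```shell
# Assumption: the CUDA toolkit's nvprune is on PATH and libcublas_static.a
# is in the current directory. Guarded so the script is a no-op otherwise.
if command -v nvprune >/dev/null 2>&1 && [ -f libcublas_static.a ]; then
  # Keep only the sm_70 (Volta) device code; everything else is dropped.
  nvprune -arch sm_70 libcublas_static.a -o libcublas_static70.a
  # Compare the fat library against the pruned one.
  ls -lh libcublas_static.a libcublas_static70.a
else
  echo "nvprune or libcublas_static.a not available; nothing to prune"
fi
```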
Unfortunately it doesn't seem like they will be supporting AMD's or Intel's GPU lines much. I'm still using CUDA, and I know it's still the dominant player, but I'd like another option as a backup in case something like RDNA3 gets really successful, considering that their next consumer-grade GPUs can have up to a whopping 24GB of VRAM, and I can't wait to see that grow to 48GB or more in more serious, "industrial" cards.