For example, if your model contains arbitrary Python code, you'd pack it using [1] and could then load it from another language using [2]. In this case, Carton transparently spins up an isolated Python interpreter under the hood to run your model (even if the rest of your application is in another language).
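As a rough sketch of that flow, assuming the cartonml Python package's pack/load/infer entry points (the exact option names are taken from the docs and may differ in the current release):

    import asyncio
    import numpy as np
    import cartonml as carton

    async def main():
        # Pack a directory containing arbitrary Python code.
        # "python" as the runner name is an assumption based on the docs.
        packed_path = await carton.pack(
            "/path/to/my_model_dir",
            runner_name="python",
        )

        # Load it back. At load time, Carton launches an isolated Python
        # interpreter in a separate process to run the model.
        model = await carton.load(packed_path)
        out = await model.infer({"x": np.ones(5, dtype=np.float32)})
        print(out)

    asyncio.run(main())

The same load/infer calls are available from the other languages Carton supports; Python is just used for both sides here to keep the sketch in one language.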
You can take it one step further if you're using certain DL frameworks. For example, you can create a TorchScript model in Python [3] and then use it from any programming language Carton supports without requiring Python at runtime (i.e. your model runs completely in native code). A sketch of that follows below.
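Here's a minimal sketch of that path, again with hedged option names (the "torchscript" runner name is an assumption from the docs):

    import asyncio
    import torch
    import cartonml as carton

    class Doubler(torch.nn.Module):
        def forward(self, x):
            return x * 2

    async def main():
        # Convert to TorchScript so the model no longer needs a Python
        # interpreter at inference time.
        scripted = torch.jit.script(Doubler())
        torch.jit.save(scripted, "doubler.pt")

        # Pack with the TorchScript runner; loaders in any supported
        # language can then run the model fully in native code.
        packed_path = await carton.pack(
            "doubler.pt",
            runner_name="torchscript",
        )
        print(packed_path)

    asyncio.run(main())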
Seems almost too good to be true, but I really hope it's not. How does it handle things like CUDA dependencies? Can it somehow make those portable too? Or is GPU acceleration not quite there yet?
It uses the NVIDIA drivers on your system, but it should be possible to make the rest of CUDA somewhat portable. I have a few thoughts on how to do this, but haven't gotten around to it yet.
The current GPU-enabled Torch runners use a version of libtorch that's statically linked against the CUDA runtime libraries. So in theory, they depend only on your GPU drivers and not on a local CUDA installation. I haven't yet tested on a machine that has just the GPU drivers installed (i.e. without the CUDA toolkit), but if it doesn't already work, it should be very possible to make it work.
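If you want to sanity-check whether a box is in that "drivers only, no toolkit" state, a quick heuristic (nvidia-smi ships with the driver, nvcc only with the toolkit; PATH lookups can miss non-standard installs, so treat this as a rough check):

    import shutil
    import subprocess

    # Present if the NVIDIA driver package is installed.
    print("nvidia-smi:", shutil.which("nvidia-smi"))

    # Present only if the CUDA toolkit is installed (and on PATH), so a
    # drivers-only machine should print None here.
    print("nvcc:", shutil.which("nvcc"))

    if shutil.which("nvidia-smi"):
        # List the GPUs the driver can see.
        result = subprocess.run(["nvidia-smi", "-L"], capture_output=True, text=True)
        print(result.stdout)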