
I'm definitely open to it if there's interest (or if someone wants to help), but I don't have plans to implement Windows support myself at the moment.

The currently supported platforms [1] were mostly driven by environments I've seen at various tech companies.

I do have active plans to support inference from WASM/WebGPU, so maybe that could be a good entry point to Windows support.

--

[1] Currently, the supported platforms are:

* `x86_64` Linux and macOS

* `aarch64` Linux (e.g. Linux on AWS Graviton)

* `aarch64` macOS (e.g. M1 and M2 Apple Silicon chips)

* WebAssembly (metadata access only for now, but WebGPU runners are coming soon)


That's a good question! There's an FAQ entry on the homepage that touches on this, but let me know if I can improve it:

> ONNX converts models while Carton wraps them. Carton uses the underlying framework (e.g. PyTorch) to actually execute a model under the hood. This is important because it makes it easy to use custom ops, TensorRT, etc without changes. For some sophisticated models, "conversion" steps (e.g. to ONNX) can be problematic and require additional validation. By removing these conversion steps, Carton enables faster experimentation, deployment, and iteration.

> With that said, we plan to support ONNX models within Carton. This lets you use ONNX if you choose and it enables some interesting use cases (like running models in-browser with WASM).

More broadly, Carton can compose with other interesting technologies in ways ONNX can't, because ONNX Runtime is an inference engine while Carton is an abstraction layer.
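
For concreteness, here's roughly what the abstraction-layer side looks like from Python. This is a loose sketch based on the loading docs, not a reference: the package name (cartonml), the URL, and the exact load/infer signatures should all be treated as assumptions.

    import asyncio
    import numpy as np
    import cartonml as carton  # assumed package name from the Carton docs

    async def main():
        # Carton loads the packed model and runs it with its original
        # framework under the hood; no conversion step happens here.
        # (The URL is illustrative, not a real model.)
        model = await carton.load("https://carton.pub/some-org/some-model")
        out = await model.infer({"input": np.zeros((1, 8), dtype=np.float32)})
        print(out)

    asyncio.run(main())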


> This lets you use ONNX if you choose and it enables some interesting use cases (like running models in-browser with WASM)

If someone already has an ONNX model, there's an in-browser-capable ONNX runtime: https://onnxruntime.ai/docs/get-started/with-javascript.html...

(It does use some parts compiled to WASM under the hood, presumably for performance.)


ONNX Runtime doesn't convert models; it runs them, and it has bindings in several languages. Most importantly, it's tiny compared to the whole Python package mess you get with TF or PyTorch.

If Carton took a TF/PyTorch model and just handled the conversion into a real runtime, somehow using custom ops for the bits that don't convert, that would be amazing though.


There's an ONNX runtime, but to use it you do need to convert your model into the ONNX format first. You can't just run a TF or PyTorch model using ONNX Runtime directly (at least as of the last time I checked). Unfortunately, this conversion process can be a pain because there needs to be an equivalent ONNX operator for each op in your TF/Torch execution graph.
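
To make that concrete, here's a minimal PyTorch-to-ONNX round trip (model and filenames invented for illustration). The export call has to find an ONNX operator for every op in the traced graph, which is exactly where custom ops cause trouble:

    import torch
    import onnxruntime as ort

    class TinyModel(torch.nn.Module):
        def forward(self, x):
            return torch.relu(x)

    model = TinyModel().eval()
    example = torch.randn(1, 8)

    # Export traces the graph; each traced op (just relu here) must have
    # an equivalent ONNX operator. A custom op without one fails here.
    torch.onnx.export(model, (example,), "tiny.onnx",
                      input_names=["x"], output_names=["y"])

    # Only after conversion can ONNX Runtime execute the model.
    sess = ort.InferenceSession("tiny.onnx", providers=["CPUExecutionProvider"])
    (y,) = sess.run(["y"], {"x": example.numpy()})
    print(y.shape)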


In addition to the benefits mentioned in the sibling comment, zip files let you seek to and access individual files in the archive without extracting everything (unlike tar files, for example).

This lets us do things like fetch model metadata [1] for a large remote model by fetching only a few tiny byte ranges instead of the whole model archive (there's a rough sketch of this below).

It also means you can include sample data (images, etc.) with your model, and it's fetched only when necessary (for example, with Stable Diffusion: https://carton.pub/stabilityai/sdxl).

[1] https://carton.run/docs/metadata
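
For anyone curious about the mechanics: zip archives keep their central directory at the end of the file, so a seekable reader backed by HTTP Range requests can list and extract individual members without downloading everything. Here's a minimal standard-library sketch of the trick (not Carton's actual implementation; the class and URL are made up):

    import io
    import urllib.request
    import zipfile

    class HttpRangeFile(io.RawIOBase):
        """Read-only, seekable 'file' backed by HTTP Range requests."""

        def __init__(self, url):
            self.url, self.pos = url, 0
            # HEAD request to learn the total archive size
            head = urllib.request.Request(url, method="HEAD")
            with urllib.request.urlopen(head) as resp:
                self.size = int(resp.headers["Content-Length"])

        def readable(self):
            return True

        def seekable(self):
            return True

        def tell(self):
            return self.pos

        def seek(self, offset, whence=io.SEEK_SET):
            base = {io.SEEK_SET: 0, io.SEEK_CUR: self.pos,
                    io.SEEK_END: self.size}[whence]
            self.pos = base + offset
            return self.pos

        def read(self, n=-1):
            if n < 0:
                n = self.size - self.pos
            if n == 0 or self.pos >= self.size:
                return b""
            end = min(self.pos + n, self.size) - 1
            req = urllib.request.Request(
                self.url, headers={"Range": f"bytes={self.pos}-{end}"})
            with urllib.request.urlopen(req) as resp:
                data = resp.read()
            self.pos += len(data)
            return data

    # zipfile seeks to the central directory near the end of the archive,
    # so listing members costs a few small range requests, not a download.
    raw = HttpRangeFile("https://example.com/model.carton")  # URL is made up
    with zipfile.ZipFile(io.BufferedReader(raw)) as zf:
        print(zf.namelist())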


Yes, that's a use case Carton supports.

For example, if your model contains arbitrary Python code, you'd pack it using [1] and then load it from another language using [2]. In this case, Carton transparently spins up an isolated Python interpreter under the hood to run your model (even if the rest of your application is in another language).

You can take it one step further if you're using certain DL frameworks. For example, you can create a TorchScript model in Python [3] and then use it from any programming language Carton supports without requiring Python at runtime (i.e. your model runs completely in native code; see the sketch after the links below).

[1] https://carton.run/docs/packing/python

[2] https://carton.run/docs/loading

[3] https://carton.run/docs/packing/torchscript
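
As a concrete example of the TorchScript path, here's a minimal sketch using plain PyTorch (the model and filename are invented; the actual packing step follows [3]):

    import torch

    class Scale(torch.nn.Module):
        def __init__(self, factor: float):
            super().__init__()
            self.factor = factor

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return x * self.factor

    # torch.jit.script compiles the module to TorchScript, a representation
    # libtorch can execute from native code with no Python at runtime.
    scripted = torch.jit.script(Scale(2.0))
    scripted.save("scale.pt")  # this file is what you'd hand to Carton's packer [3]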


Seems almost too good to be true, but I really hope it's not. How does it handle things like CUDA dependencies? Can it somehow make those portable too? Or is GPU acceleration not quite there yet?


Thanks :)

It uses the NVIDIA drivers on your system, but it should be possible to make the rest of CUDA somewhat portable. I have a few thoughts on how to do this, but haven't gotten around to it yet.

The current GPU-enabled Torch runners use a version of libtorch that's statically linked against the CUDA runtime libraries. So in theory, they depend only on your GPU drivers, not on your CUDA installation. I haven't yet tested on a machine that has just the GPU drivers installed (i.e. without CUDA), but if it doesn't already work, it should be very possible to make it work.
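
If you want to sanity-check a given build, one option is to inspect what the runner's libtorch actually links against: libcuda.so ships with the NVIDIA driver, while libcudart/libcublas come from the CUDA toolkit, so a build that's statically linked against the runtime should only show the former. A rough sketch (the library path is hypothetical):

    import subprocess

    # Path is hypothetical; point it at the runner's libtorch build.
    out = subprocess.run(["ldd", "/path/to/libtorch_cuda.so"],
                         capture_output=True, text=True).stdout

    # Driver libraries (libcuda.so) are fine on a driver-only machine;
    # toolkit libraries (libcudart, libcublas, ...) would mean a CUDA
    # installation is still required at runtime.
    for line in out.splitlines():
        if "cuda" in line.lower() or "cublas" in line.lower():
            print(line.strip())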


That’s awesome! Thanks for making this


I'm working on the second part of my Nerf Dart Missile Defense system project [0].

Building a robot that can track Nerf darts and shoot them out of the air involves a lot of interesting technical challenges, so it's a fun project :) I also get to learn a lot about the process of making videos.

The second part was almost ready a few months ago, but then I had to redo a lot of stuff and I lost steam for a bit.

Hopefully I'll have a second, better-put-together video out soon!

[0]: https://www.youtube.com/watch?v=wF-f_AdCxl0



Wow, thank you! I’m glad you enjoyed it :)


First off, thank you for taking the time to write this. I really appreciate the feedback.

I agree with some of your points and in fact I was originally going to post the video with “[part 1]” in the title, but I decided to leave it out and rename it when I post part 2. I figured this was fine since I say “this is the first part of a series” in the first 30 or 40 seconds of the video (and also included it in the HN post description).

I do agree with the idea of having a focused, cohesive theme for a video, but I didn’t find a natural spot to cut it that didn’t make the video boring or lose the motivation/context of the end goal (e.g., “firing an electronic airsoft gun from a computer” doesn’t convey what I want to convey).

The goal of this series is to show the process in detail instead of just a high-level overview. I’m hoping to post a high-level overview video at the end that’s more appealing to people who don’t necessarily want to dig through all the details.

I think future videos will be a little more tightly scoped because they don’t need to include an in-depth project overview. Kind of a focused story within the context of a larger backdrop.

I do agree with the point about audience, and I think that’s important in general. However, there are several things that seem simple and commonplace but are worth touching on briefly to make the content more accessible to people who don’t have context on a particular area I’m diving into.

Thanks again for the feedback and I’m glad you enjoyed the video!


Thank you so much!


Thank you! I don’t have any particular outcomes in mind, but we’ll see what happens!

