
Maybe I'm missing something here, but isn't this largely achieved by ONNX [0] already?

[0] https://onnx.ai




That's a good question! There's an FAQ entry on the homepage that touches on this, but let me know if I can improve it:

> ONNX converts models while Carton wraps them. Carton uses the underlying framework (e.g. PyTorch) to actually execute a model under the hood. This is important because it makes it easy to use custom ops, TensorRT, etc without changes. For some sophisticated models, "conversion" steps (e.g. to ONNX) can be problematic and require additional validation. By removing these conversion steps, Carton enables faster experimentation, deployment, and iteration.

> With that said, we plan to support ONNX models within Carton. This lets you use ONNX if you choose and it enables some interesting use cases (like running models in-browser with WASM).

More broadly, Carton can compose with other interesting technologies in ways ONNX isn't able to, because ONNX (with its runtime) is a model format and inference engine, while Carton is an abstraction layer.
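
To make the "wrap, don't convert" distinction concrete, here's a rough sketch of what a wrapping layer does conceptually. The names below are hypothetical and are not Carton's actual API; the point is that the original framework model is loaded and executed unchanged by the framework itself, so custom ops keep working:

    import torch

    class WrappedModel:
        # Hypothetical wrapper (not Carton's real API): keep the original
        # framework model intact and let the framework execute it.
        def __init__(self, model_path: str):
            # Load the TorchScript model as-is; no op-by-op conversion happens.
            self.model = torch.jit.load(model_path).eval()

        def infer(self, *inputs: torch.Tensor) -> torch.Tensor:
            with torch.no_grad():
                return self.model(*inputs)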


> This lets you use ONNX if you choose and it enables some interesting use cases (like running models in-browser with WASM)

If someone already has an ONNX model, there's already an in-browser capable ONNX runtime: https://onnxruntime.ai/docs/get-started/with-javascript.html...

(It does use some parts compiled to WASM under the hood, presumably for performance.)


ONNX Runtime doesn't convert models; it runs them, and it has bindings in several languages. Most importantly, it's tiny compared to the whole Python package mess you get with TF or PyTorch.
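
For reference, running an already-converted model with the Python onnxruntime package really is just a few lines (the model path and input shape here are placeholders):

    import numpy as np
    import onnxruntime as ort

    # Load a previously exported ONNX model (path is a placeholder).
    sess = ort.InferenceSession("model.onnx")
    input_name = sess.get_inputs()[0].name

    # Feed a dummy input with whatever shape the model expects.
    x = np.random.rand(1, 3, 224, 224).astype(np.float32)
    outputs = sess.run(None, {input_name: x})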

If carton took a TF/pytorch model and just dealt with the conversion into a real runtime, somehow using custom ops for the bits that don't convert, that would be amazing though.


There's an ONNX runtime, but to use it you need to convert your model into the ONNX format first. You can't just run a TF or PyTorch model on the ONNX runtime directly (at least last time I checked). Unfortunately, this conversion process can be a pain, and there needs to be an equivalent ONNX operator for each op in your TF/Torch execution graph.
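
For example, exporting a PyTorch model looks roughly like this (the model and opset version are placeholders); the export only succeeds if every op in the traced graph maps to an ONNX operator:

    import torch
    import torchvision

    # Any torch.nn.Module works here; resnet18 is just a stand-in.
    model = torchvision.models.resnet18(weights=None).eval()
    dummy_input = torch.randn(1, 3, 224, 224)

    # Tracing maps each op to an ONNX operator; an op with no ONNX
    # equivalent (e.g. a custom op) makes this step fail.
    torch.onnx.export(model, dummy_input, "resnet18.onnx", opset_version=17)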



