That seems like a naive take. If any of your local VMs are internet connected and are compromised, side channel attacks could be used to exfiltrate data from other VMs or the host.
When your agent performs 20 tasks, saving seconds here and there becomes a very big deal. I cannot even begin to describe how much time we've spent optimising code paths to make the overall execution fast.
Last week I was on a call with a customer. They were running OpenAI side-by-side with our solution. I was pleased that we managed to fulfil the request in under a minute while OpenAI took 4.5 minutes.
The LLM is not the biggest contributor to latency in my opinion.
Thanks! While I agree with you on the "saving seconds" and overall-latency argument, in my understanding most agentic use cases are asynchronous, and VM boot-up time may be just a tiny fraction of overall task execution time (e.g., deep research and similar long-running background tasks).
That's what the author is claiming. Practically, strong VM-level fault isolation cannot be achieved without isolation support from the hardware, aka virtualization extensions.
Hardware without something like SR-IOV is straight up going to be unshareable for the foreseeable future; things like ring buffers would need a whole bunch of coordination between kernels to share. SR-IOV (or equivalent) makes it workable, an IOMMU (or equivalent) then provides isolation.
You could have a “nanokernel” which owns the ring buffers and the other kernels act as its clients… or for a “primary kernel” which owns the ring buffers and exposes an API the other kernels could call. If different devices have different ring buffers, the “primary kernel” could be different for each one.