
Cool post. Have you looked at slicing a single GPU up for multiple VMs? Is there anything other than MIG that you have come across to partition SMs and memory bandwidth within a single GPU?


Last I checked, MIG was the only one that made hard promises, especially about memory bandwidth; as long as your memory access patterns aren't secret and you have enough trust in the other guests not to be highly unfriendly with their cache usage behavior, you should be able to get away with much less strict isolation. Think Docker vs. VMs with dedicated cores.

But I thought MIG did do the job of chopping a GPU that's too big for most individual users into something that behaves very close to a literal array of smaller GPUs stuffed into the same PCIe card form factor? Think of how a Tesla K80 was pretty much just two GK210 "GPUs" on a PLX "PCIe switch" that connects them to each other and to the host. It was obviously trivial to give one to each of two VMs (at least if the PLX didn't interfere with IOMMU separation or such); for mere performance isolation it certainly sufficed, once you blocked a heavy user from power-budget throttling its sibling.


Can you pass a MIG device into a KVM VM? The team we worked with didn't believe it was possible (they suggested we switch to VMware); the MIG system interface gives you a UUID, not a PCI BDF.
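
For context, this is roughly what the interface hands you (a minimal NVML sketch, untested here, assuming MIG mode is already enabled on GPU 0): enumerating MIG devices only ever gives you UUIDs, with no BDF in sight.

    #include <stdio.h>
    #include <nvml.h>   /* link with -lnvidia-ml */

    int main(void) {
        if (nvmlInit() != NVML_SUCCESS) return 1;

        nvmlDevice_t gpu;
        if (nvmlDeviceGetHandleByIndex(0, &gpu) != NVML_SUCCESS) return 1;

        unsigned int cur = 0, pending = 0;
        /* Check whether MIG mode is enabled on the parent GPU. */
        if (nvmlDeviceGetMigMode(gpu, &cur, &pending) == NVML_SUCCESS
            && cur == NVML_DEVICE_MIG_ENABLE) {
            unsigned int max = 0;
            nvmlDeviceGetMaxMigDeviceCount(gpu, &max);
            for (unsigned int i = 0; i < max; i++) {
                nvmlDevice_t mig;
                if (nvmlDeviceGetMigDeviceHandleByIndex(gpu, i, &mig) != NVML_SUCCESS)
                    continue;
                char uuid[NVML_DEVICE_UUID_BUFFER_SIZE];
                /* A MIG device is identified by a UUID (MIG-...), not a PCI BDF. */
                if (nvmlDeviceGetUUID(mig, uuid, sizeof uuid) == NVML_SUCCESS)
                    printf("MIG device %u: %s\n", i, uuid);
            }
        }
        nvmlShutdown();
        return 0;
    }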


KubeVirt has some examples of passing a vGPU into KVM:

https://kubevirt.io/user-guide/compute/host-devices/


Right, vGPUs are explicitly set up to generate BDF addresses that can be passed through (but require host driver support; they're essentially paravirtualized). I'm asking about MIG.


https://docs.nvidia.com/datacenter/tesla/mig-user-guide/supp... says GPU passthrough is supported on MIG...


There's a MIG vGPU mode usable for this


Have you used it? How does it work? How do you drive it? We tried a lot of different things. Is it not paravirtualized, the way vGPUs are?


It works with SR-IOV instead of mdev, AFAIK.

Still needs some host software to drive it, but it actually does static partitioning.

IIRC it's usable via the MIG-marked vGPU types.


Thanks! I haven't looked deeply into slicing up a single GPU. My understanding is that vGPU (which we briefly mention in the post) can partition memory but time-shares compute, while MIG is the only mechanism that provides partitioning of both SMs and memory bandwidth within a single GPU.


This seems very interesting. Timely, given that Yann LeCun's vision also seems to align with world models being the next frontier: https://news.ycombinator.com/item?id=45897271


An established founder claims X is the new frontier. X receives hundreds of millions in funding. Other, less established founders claim they are working on X too. VCs suffering from terminal FOMO pump billions more into X. X becomes the next frontier. The previous frontiers are promptly forgotten.


I think the terminology here is a bit confusing: this seems more graphics-focused, while I suspect that a 10-year plan as mentioned by YLC probably revolves around re-architecting AI systems to be less reliant on LLM-style nets/refinements and to better understand the world in a way that isn't as prone to hallucinations.


What’s it going to do, take away funds from the otherwise extremely prudent AI sector?


Right now it seems legal to scrape Reddit. But given their trajectory of making the API fairly expensive to use, do you think it's likely that they would also limit/prohibit scraping (assuming apps like Apollo start scraping as an alternative)?


My understanding is that scraping of public websites is generally legal, isn't it?


Legal, but probably against ToS


That ToS is meaningless if you scrape logged out.


Even if that is the case, it does not say much about their ability to evaluate what category of users (e.g., those coming from Apollo or their first-party clients) is generating more views/interactions and indirectly more "value" on their platform.


> You can't load one so/dll multiple times in some sort of container

I believe you can do that with `dlmopen` and separate link maps. Using that approach, I have worked with multiple completely isolated Python interpreters in the same process that do not share a GIL.
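
The basic shape is something like this (a minimal sketch; `libexample.so` and its `bump_counter` are hypothetical stand-ins for whatever library you want multiple copies of):

    #define _GNU_SOURCE
    #include <link.h>
    #include <dlfcn.h>
    #include <stdio.h>

    int main(void) {
        /* Load the same shared object into two fresh link-map namespaces.
           Each copy gets its own globals, relocations, and dependencies. */
        void *a = dlmopen(LM_ID_NEWLM, "libexample.so", RTLD_NOW | RTLD_LOCAL);
        void *b = dlmopen(LM_ID_NEWLM, "libexample.so", RTLD_NOW | RTLD_LOCAL);
        if (!a || !b) {
            fprintf(stderr, "dlmopen: %s\n", dlerror());
            return 1;
        }

        /* The two handles resolve to independent copies of the same symbol. */
        void (*fa)(void) = (void (*)(void))dlsym(a, "bump_counter");
        void (*fb)(void) = (void (*)(void))dlsym(b, "bump_counter");
        if (fa && fb) { fa(); fa(); fb(); /* per-copy counters diverge: 2 vs 1 */ }

        Lmid_t la, lb;
        dlinfo(a, RTLD_DI_LMID, &la);
        dlinfo(b, RTLD_DI_LMID, &lb);
        printf("namespaces: %ld vs %ld\n", (long)la, (long)lb);
        return 0;
    }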


Thank you for the hint about dlmopen! I had a problem that can be solved by loading multiple copies of a DLL, and it looks like reading manpages of the dynamic linker would have been a better approach than googling with the wrong keywords.


That's great!

There are a few cases where `dlmopen` has issues; for example, some libraries are written with the assumption that there will only be one copy of them in the process (in their use of globals, thread-local variables, etc.), which may result in conflicts across namespaces.

Specifically, `libpthread` has one such issue [1] where `pthread_key_create` will create duplicate keys in separate namespaces. These keys are later used to index into `THREAD_SELF->specific_1stblock`, which is shared between all namespaces, and that can cause all sorts of weird issues.

There is a (relatively old, unmerged) patch to glibc where you can specify some libraries to be shared across namespaces [2].

[1]: https://sourceware.org/bugzilla/show_bug.cgi?id=24776#c13

[2]: https://patchwork.ozlabs.org/project/glibc/patch/20211010163...


IIRC glibc is limited to 16 namespaces though.


Currently it is, yes. I am not sure how fundamental it is. I tried patching glibc to support more (128 in my case) and it seemed to work fine.


This is a great analysis; thanks for writing.

I have also been working on running multiple Python interpreters in the same process by isolating them in different namespaces using `dlmopen` [1]. At a high level, the objective is to receive requests for compute-intensive operations from a TCP/HTTP server and dispatch them to different workers. In this case, a thin C++ shim receives the requests and dispatches each one to a Python interpreter in its own namespace. This eliminates contention for the GIL among the interpreters and exploits parallelism by running each interpreter on a different set of cores. The data from a request does not need to be copied into the interpreter because everything is in the same address space; similarly, the output produced by the Python interpreter is passed back to the server without any copies. A simplified sketch of the loading side is below, after the link.

[1] https://www.man7.org/linux/man-pages/man3/dlmopen.3.html
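
To make that concrete, here is a stripped-down sketch of the loading side (hypothetical and simplified: the libpython soname is an assumption, real code needs error handling, and you have to watch out for the glibc namespace limit and the `libpthread` issue discussed elsewhere in this thread):

    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <pthread.h>
    #include <stdio.h>

    #define NWORKERS 4

    /* Each worker thread drives a fully separate copy of libpython,
       loaded into its own link-map namespace, so there is no shared GIL. */
    static void *worker(void *arg) {
        const char *script = arg;
        void *py = dlmopen(LM_ID_NEWLM, "libpython3.11.so.1.0",
                           RTLD_NOW | RTLD_LOCAL);
        if (!py) { fprintf(stderr, "%s\n", dlerror()); return NULL; }

        void (*init)(void) = (void (*)(void))dlsym(py, "Py_Initialize");
        int (*run)(const char *) =
            (int (*)(const char *))dlsym(py, "PyRun_SimpleString");
        void (*fini)(void) = (void (*)(void))dlsym(py, "Py_Finalize");
        if (!init || !run || !fini) return NULL;

        init();
        run(script);   /* runs under this namespace's private GIL */
        fini();
        return NULL;
    }

    int main(void) {
        pthread_t t[NWORKERS];
        for (int i = 0; i < NWORKERS; i++)
            pthread_create(&t[i], NULL, worker,
                           "print(sum(x * x for x in range(10**6)))");
        for (int i = 0; i < NWORKERS; i++)
            pthread_join(t[i], NULL);
        return 0;
    }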


Can you please share pointers to some of the podcasts etc. that you listened to? I'm looking for something similar for people who aren't bio experts.


This looks like a cool project. Is there any support (or any plan to support) I/O through kernel-bypass technologies like RDMA? For example, the client could read objects via one-sided reads from the server, given that it knows the address where each object lives. This could be really beneficial for reducing latency and CPU load.
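
To sketch what I mean with libibverbs (just a fragment under big assumptions: an already-connected RC queue pair, memory registered on both sides, and the server having handed the client the remote address and rkey up front):

    #include <infiniband/verbs.h>
    #include <stdint.h>
    #include <string.h>

    /* Post a one-sided RDMA READ: pull `len` bytes from the server's
       memory at `remote_addr` (tagged with `rkey`) into a local buffer,
       without involving the server's CPU at all. */
    static int rdma_read_object(struct ibv_qp *qp,
                                void *local_buf, uint32_t lkey,
                                uint64_t remote_addr, uint32_t rkey,
                                uint32_t len) {
        struct ibv_sge sge = {
            .addr   = (uintptr_t)local_buf,
            .length = len,
            .lkey   = lkey,
        };
        struct ibv_send_wr wr, *bad = NULL;
        memset(&wr, 0, sizeof wr);
        wr.opcode     = IBV_WR_RDMA_READ;
        wr.sg_list    = &sge;
        wr.num_sge    = 1;
        wr.send_flags = IBV_SEND_SIGNALED;   /* ask for a completion event */
        wr.wr.rdma.remote_addr = remote_addr;
        wr.wr.rdma.rkey        = rkey;
        return ibv_post_send(qp, &wr, &bad); /* completion polled via ibv_poll_cq */
    }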


Similarly, I really liked the ideas in this paper on accelerating Memcached using an eBPF cache layer in the NIC interrupt path: https://www.usenix.org/conference/nsdi21/presentation/ghigof...


I do not know much about RDMA. Our goal is to provide a memory store that is fully compatible with Redis/Memcached protocols so that all the existing frameworks could work as before. I am not sure how RDMA fits this goal.


Cool product! Any plans to support Azure Functions?


Thanks! We do but it's a bit further down the roadmap.


Here is something that has helped me a lot. It's a paper on how to read academic papers. You can extract the general idea and apply it to a lot of other technical reading material.

https://web.stanford.edu/class/ee384m/Handouts/HowtoReadPape...

