
> You also need to account for significant kernel CPU use, like hardware interrupt handlers (e.g., NIC rx) running on some cores.

For a lot of workloads it doesn't matter that much, because the application uses way more CPU than packet handling... But if it does matter, you really want to get things lined up as much as possible: each CPU core gets one NIC queue pinned and one application thread pinned, with connections mapped so the fast path never has to talk to another core. I'm not up to date on current NICs, so I don't know whether you can get 128 queues now. If you have two sockets, you also get the fun of Non-Uniform PCI-E Access...
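As a rough illustration, here's a minimal sketch of the application-thread half of that pinning on Linux (one worker per core, core numbers and the 16-thread count are assumptions, not anything specific to the setup above); the matching NIC queue's IRQ would be steered to the same core separately, e.g. by writing the core number to /proc/irq/<irq>/smp_affinity_list:

    /* Sketch: pin worker thread i to CPU core i (Linux, glibc).
     * Assumes the NIC queue i's IRQ is steered to core i separately. */
    #define _GNU_SOURCE
    #include <pthread.h>
    #include <sched.h>
    #include <stdio.h>
    #include <string.h>

    static void *worker(void *arg) {
        int core = (int)(long)arg;
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(core, &set);
        /* Restrict this thread to one core so it stays next to the
         * NIC queue (and its softirq work) pinned to the same core. */
        int rc = pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
        if (rc != 0)
            fprintf(stderr, "affinity core %d: %s\n", core, strerror(rc));
        /* ... handle connections that RSS hashes to this queue ... */
        return NULL;
    }

    int main(void) {
        enum { NTHREADS = 16 };   /* assumption: one thread per NIC queue */
        pthread_t tids[NTHREADS];
        for (long i = 0; i < NTHREADS; i++)
            pthread_create(&tids[i], NULL, worker, (void *)i);
        for (int i = 0; i < NTHREADS; i++)
            pthread_join(tids[i], NULL);
        return 0;
    }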

When I was working on this (for a TCP-mode HAProxy install), the NICs supported 16 queues and I had dual 12- or 14-core CPUs, so socket 0 got all of its cores busy while socket 1 just got a couple of worker threads and was mostly idle. A single socket with a power-of-two core count is a lot easier to optimize for, but I had existing hardware to reuse.
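If you want to check which socket a NIC actually hangs off of before laying threads out like that, Linux exposes it through sysfs; a small sketch (interface name is just an example, and the file reports -1 when the node is unknown or the box is single-node):

    /* Sketch: report which NUMA node (socket) a PCI NIC is attached to,
     * so queues and worker threads can be kept on that socket. */
    #include <stdio.h>

    int main(int argc, char **argv) {
        const char *ifname = argc > 1 ? argv[1] : "eth0";   /* assumed name */
        char path[256];
        snprintf(path, sizeof(path),
                 "/sys/class/net/%s/device/numa_node", ifname);

        FILE *f = fopen(path, "r");
        if (!f) { perror(path); return 1; }

        int node = -1;
        if (fscanf(f, "%d", &node) == 1)
            printf("%s is local to NUMA node %d\n", ifname, node);
        fclose(f);
        return 0;
    }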


