Presuming that this is a server that has One (public) Job, couldn't you:
1. dedicate a NIC to the application;
2. and have the userland app open a packet socket against the NIC, to drink from its firehose through an mmap'd ring buffer shared with the kernel's own NIC receive path;
...all without involving the kernel TCP/IP (or in this case, UDP/IP) stack, and any of the accounting logic squirreled away in there?
(You can also throw in a BPF filter here, to drop everything except UDP packets with the expected ip:port — but if you're already doing more packet validation at the app level, you may as well just take the whole firehose of packets and validate them for being targeted at the app at the same time that they're validated for their L7 structure.)
I think DPDK does something like this. The NIC is programmed to aim the packets in question at a specific hardware receive queue, and that queue is entirely owned by a userspace program.
A lot of high-end NICs support moderately complex receive-queue selection rules.
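On Linux you can often program those rules from userspace with ethtool's ntuple flow steering — a sketch, assuming a hypothetical device `eth0`, placeholder port 4000, and a NIC/driver that actually supports ntuple filters:

```shell
# Enable hardware flow steering (ntuple filtering) on the NIC
ethtool -K eth0 ntuple on

# Steer IPv4 UDP packets destined for port 4000 to hardware RX queue 3
ethtool -N eth0 flow-type udp4 dst-port 4000 action 3

# Inspect the installed rules / the NIC's queue layout
ethtool -n eth0
ethtool -l eth0
```

DPDK-style frameworks then take ownership of that queue entirely, bypassing the kernel's receive path for it.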
I mean, under the scheme I outlined, the kernel is still going to do that by default. It's not that the NIC's driver is overridden or anything; the kernel would still be reading the receive buffer from this NIC and triggering per-packet handling — and thus triggering default kernel response-handling where applicable (and so responding to e.g. ICMP and ARP messages correctly.)
The only thing that's different here is that there are no active TCP or UDP listening sockets bound to the NIC — so when the kernel is scanning the receive buffer to decide what to do with packets, and it sees a TCP or UDP packet, it's going to look at its connection-state table for that protocol+interface, realize it's empty, and drop the packet for lack of a consumer, rather than applying any further logic to it. (It'll bump the "dropped packets" counter, I suppose, but that's it.)
But since there is a packet socket open against the NIC, then before the kernel does anything else with a packet, it's going to copy every packet it receives into that packet socket's (userspace-shared) receive-buffer mmap region.