I wonder if many of those string copies, and the associated use of various fixed-length buffers, could be removed completely with a change in code structure; an OS kernel doesn't seem to me like a piece of software where string manipulation is a significant part of its job (besides pathname handling).
If we're going to put large amounts of complexity into the kernel, then why not use a pool allocator to manage string allocations? Pool allocators are low-level enough that you can understand what's happening when they go wrong (unlike complex GCs), but they make string-handling code very simple.
When the pool is freed, all strings in it get freed. Generally you arrange pools to be per-"request" (in a web server), but it might be per-syscall in a kernel. At the end of the syscall, all strings get freed in one go.
Pools can also have parent/child relations, so when a parent pool is freed, all child pools are freed.
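To make the idea concrete, here is a minimal userspace sketch of such a pool (arena) allocator with parent/child pools and a "free everything at once" destroy. All names (`pool_create`, `pool_strdup`, `pool_destroy`) are hypothetical, and for simplicity it does one heap allocation per string rather than bump-allocating from large blocks; a real kernel implementation would look quite different.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct block {
    struct block *next;
    char data[];              /* flexible array member holding the bytes */
};

struct pool {
    struct block *blocks;     /* allocations owned by this pool */
    struct pool  *children;   /* first child pool */
    struct pool  *sibling;    /* next child in the parent's list */
};

static struct pool *pool_create(struct pool *parent)
{
    struct pool *p = calloc(1, sizeof *p);
    if (p && parent) {        /* link into the parent's child list */
        p->sibling = parent->children;
        parent->children = p;
    }
    return p;
}

static void *pool_alloc(struct pool *p, size_t n)
{
    struct block *b = malloc(sizeof *b + n);
    if (!b)
        return NULL;
    b->next = p->blocks;      /* remembered so pool_destroy can free it */
    p->blocks = b;
    return b->data;
}

static char *pool_strdup(struct pool *p, const char *s)
{
    size_t n = strlen(s) + 1;
    char *d = pool_alloc(p, n);
    return d ? memcpy(d, s, n) : NULL;
}

/* Destroy child pools first, then free every allocation in this pool. */
static void pool_destroy(struct pool *p)
{
    while (p->children) {
        struct pool *c = p->children;
        p->children = c->sibling;
        pool_destroy(c);
    }
    while (p->blocks) {
        struct block *b = p->blocks;
        p->blocks = b->next;
        free(b);
    }
    free(p);
}

int main(void)
{
    struct pool *request = pool_create(NULL);    /* one pool per "request"/syscall */
    char *name = pool_strdup(request, "259:3");  /* no individual free() needed */
    printf("%s\n", name);
    pool_destroy(request);                       /* everything freed in one go */
    return 0;
}
```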
This is the kind of stuff you want running in a service outside of a microkernel, but since we are stuck with systemd/GNU/Linux systems, this is what we have...
I could go further into the rant; I'm... stunned by parsing logic crippled into type-specific functions.
Why does `strscpy_truncate(dst,src,count)` exist at all? It seems like a generic buffer/array `take(dst,src,count)`. And how problematic would it be to use a compiled regex to parse "major:minor", or proper logic that seeks for ':' and reads what comes before and after that point?
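For what it's worth, the "seek for ':'" approach is only a few lines. Here is a plain userspace sketch (the function name `parse_major_minor` is made up for illustration, and it uses `strchr`/`strtoul` rather than whatever helpers the kernel would actually use):

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static int parse_major_minor(const char *s, unsigned long *major, unsigned long *minor)
{
    const char *colon = strchr(s, ':');      /* find the separator */
    char *end;

    if (!colon || colon == s)
        return -1;                           /* no colon, or empty major part */

    errno = 0;
    *major = strtoul(s, &end, 10);           /* digits before the colon */
    if (errno || end != colon)
        return -1;

    *minor = strtoul(colon + 1, &end, 10);   /* digits after the colon */
    if (errno || end == colon + 1 || *end != '\0')
        return -1;

    return 0;
}

int main(void)
{
    unsigned long major, minor;
    if (parse_major_minor("259:3", &major, &minor) == 0)
        printf("major=%lu minor=%lu\n", major, minor);
    return 0;
}
```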