Turn on jumbo frames. I ran iSCSI for ZFS served out via NFS, on separate cards so there was less contention between the disk fetches and the NFS serving. It worked "ok", but that was (partly) FreeBSD, and a 9000 MTU definitely made a difference. Right-sizing the MTU to be bigger than the blocksize may be the real tuning distinction, but 9k jumbo frames definitely improved things.
Why send 4 packets when one will do? Same volume of data, less switch burden to latch it through.
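To put rough numbers on that, here's a back-of-the-envelope sketch in Python. The header sizes are assumptions (20-byte IPv4 + 20-byte TCP per packet, plus 38 bytes of Ethernet framing, preamble, and inter-frame gap per frame), and it just counts frames needed to move one 128 KiB ZFS record:

    # Frames and wire overhead for one 128 KiB record at standard vs. jumbo MTU.
    # Header sizes are assumptions: IPv4 (20) + TCP (20) per packet, plus 38 bytes
    # of Ethernet framing/preamble/inter-frame gap per frame.
    RECORD = 128 * 1024
    L3L4_HEADERS = 40
    ETH_OVERHEAD = 38

    for mtu in (1500, 9000):
        payload_per_pkt = mtu - L3L4_HEADERS
        pkts = -(-RECORD // payload_per_pkt)              # ceiling division
        wire = RECORD + pkts * (L3L4_HEADERS + ETH_OVERHEAD)
        print(f"MTU {mtu}: {pkts} packets, {wire / RECORD:.3f}x bytes on the wire")

At 1500 that record is roughly 90 packets; at 9000 it's about 15, which is where the "why send several when one will do" intuition comes from.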
After years in enterprise storage with endless performance testing: there's almost no point. Modern CPUs and NICs barely benefit. In 2005 when we had dual-core CPUs that were constantly buried and NICs with 0 offload, it made a ton of sense - heck we had dedicated iSCSI HBAs (QLA4010 represent!).
That being said, if you've got three servers and two VLANs with no worries about the jumbos ever escaping: I guess? But if you see even a 5% performance increase, I'll be shocked. On the flip side you're one misconfiguration away from endless troubleshooting if those jumbos escape.
It'd be great to have an MTU of say 64 KiB or greater.
Although I guess you'd also need a longer-than-32-bit CRC to detect all possible 3-bit errors past roughly an 11 kB frame size. A 40-bit CRC would be sufficient, at least up to a frame size of 188 kB or so.
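As a quick spot check, here's a toy Python sketch assuming zlib.crc32 (same polynomial as the Ethernet FCS). Random sampling like this can't prove the distance bound, but at 9000 bytes every sampled 3-bit error should come back detected:

    # Flip random 3-bit patterns in a jumbo-sized buffer and see whether
    # CRC-32 notices. A sample, not a proof of the distance bound above.
    import os, random, zlib

    def flip_bits(data, positions):
        buf = bytearray(data)
        for bit in positions:
            buf[bit // 8] ^= 1 << (bit % 8)
        return bytes(buf)

    frame = os.urandom(9000)
    good = zlib.crc32(frame)
    misses = sum(
        zlib.crc32(flip_bits(frame, random.sample(range(len(frame) * 8), 3))) == good
        for _ in range(10_000)
    )
    print(f"undetected 3-bit errors in sample: {misses}")   # expect 0 below ~11 kB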
If we were redoing Ethernet I wouldn't mind removing the CRC completely. If you want end-to-end reliability, you should do it in the layer above Ethernet. If you want per-link packet validation, we're already layering advanced FEC algorithms at the physical layer for high-speed Ethernet. The advantage of the latter is that it's both optional and replaceable, without requiring even more dynamic bits or redundant functionality in the layer-2 frame. As for MTU, make it a 32-bit field instead of a 16-bit field, in case anyone wants to build hardware that supports more than 64k in the future.
Some people say that for TCP, smaller packets give better acknowledgement pacing. iSCSI is mostly over local, single-switch links for me, but for general-purpose streamed TCP data it may well be that "smaller is better" for rate estimates and window management.
MSS (maximum segment size) is the term at the TCP layer. Each end of a connection can (and usually does) declare its MSS in a TCP option in the first packet it sends.
Advertised MSS, interface MTU, and route MTU can all constrain packet sizing.
Using large-MTU routes for internal destinations can work well.
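If you want to check what a given connection actually ended up with, here's a small Linux-specific sketch (TCP_MAXSEG reflects the connection's current MSS once it's established; example.com stands in for whatever host you care about):

    # Read back the MSS a TCP connection actually negotiated (Linux).
    import socket

    def effective_mss(host, port=443):
        with socket.create_connection((host, port), timeout=5) as s:
            return s.getsockopt(socket.IPPROTO_TCP, socket.TCP_MAXSEG)

    # An internal host on a 9000-MTU VLAN should report something near 8960;
    # anything across the public internet will sit around 1460 or lower.
    print(effective_mss("example.com"))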
Low enough that we had viable mounts before, but the retransmit counts were big. I don't have the host any more; I moved to an iXsystems TrueNAS box. I probably should have looked harder on the provider side.
One can then use a local disk as a ZIL to improve IOPS.
When "Hybrid Storage Pools" storage pools were first introduced in 2008, when flash was still really expensive, this was a clever way of balancing speed and bulk storage with budget constraints:
Nowadays flash is cheaper, so all-flash storage is much more popular, with many storage products able to do tiered storage where (c)older data is shuffled from fast, expensive flash to slower, cheaper spinning rust.
I hope the author really does mean "home lab" and not "home production". Having run my own personal disk array for over two decades, this is like the opposite of what I've come to want. The simpler and more straightforward things can be, the better. Otherwise when things fail (and they will fail, despite that redundancy (or even perhaps because of it)), you'll end up with circular dependencies that make diagnosing and fixing things quite painful.