The more services you have, the higher the latency, to the point where it becomes unbearable. So you know what the engineers do? They have to go back and make some services monolithic again.
This also applies to the microkernel model, where almost everything is a set of processes communicating either locally or remotely, but their performance is so bad that they are either only academically significant or simply abandoned.
There are also critical services that bottleneck the entire service plane. When one of these critical services (such as the authentication service, which brought down Google recently) dies, you still get the cascading failures that monolithic services usually have.
And sometimes it is impossible to build proper microservices when data consistency is highly important. These services (such as databases and transaction-based services) are critical by nature and almost impossible to scale. They either have to use leader election, which is, in a nutshell, a distributed lock where the losers are merely standbys wasting resources checking whether they can obtain the lock again, and which is highly susceptible to deadlock unless a reliable transactional store is involved (such as the etcd leader lock in Kubernetes), or they have to settle for replication only. They are also what force microservices to degenerate into their monolithic equivalent, or to pay a great toll to become scalable enough to survive in the microservice world.
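For concreteness, here is a minimal Go sketch of the etcd-backed leader election pattern mentioned above, using etcd's clientv3 concurrency package; the endpoint, TTL, and the /my-service/leader key are placeholder assumptions, not anything prescribed. The relevant detail is that leadership is tied to a lease, so a crashed leader releases the "lock" automatically instead of deadlocking everyone else:

```go
package main

import (
	"context"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
	"go.etcd.io/etcd/client/v3/concurrency"
)

func main() {
	// Connect to etcd (endpoint is an assumption for this sketch).
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"localhost:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	// A session keeps a lease alive; if this process dies, the lease expires
	// and leadership is released, which is what avoids a permanent deadlock.
	sess, err := concurrency.NewSession(cli, concurrency.WithTTL(10))
	if err != nil {
		log.Fatal(err)
	}
	defer sess.Close()

	// All candidates campaign on the same key prefix; the losers block here
	// as standbys until the current leader's lease is gone.
	election := concurrency.NewElection(sess, "/my-service/leader")
	if err := election.Campaign(context.Background(), "instance-1"); err != nil {
		log.Fatal(err)
	}

	log.Println("became leader; doing leader-only work")
	// ... leader-only work, then Resign or simply exit ...
}
```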
Well, to be honest, microservices in practice are just multiple monolithic services strapped together; we need to redesign all our current infrastructure to remove these critical bottlenecks. That's how you do microservices right.
Microkernels are not a good example. The assertion that they have worse performance in a way that can be meaningfully measured is a rumor, especially now that we have Fuchsia, not to mention other microkernels like seL4 and QNX.
This isn't a fair comparison. Microkernels can still be performant if done correctly. Syscalls have become slower over time as Spectre/Meltdown mitigations have been added.
They affect syscall speed because in "conventional" OSes that's the barrier between privileges that's crossed, at which point memory permissions need to be readjusted. If you replace syscalls with calls to other processes, the same potentially applies. Also, doesn't IPC often involve calling into kernel space to execute the IPC?
> The question of whether or not a userspace attacker can attack another userspace thread running on seL4 is an unequivocal yes.
> We are currently in the middle of deploying the branch-prediction barriers (x86) and BTB flushing mechanisms (ARMv6, ARMv7) that are necessary to prevent attackers from attacking other userspace threads on seL4.
> [....]
> seL4 will allow userspace processes to specify whether or not they want to take the performance hit that is incurred by effectively flushing the branch predictor when switching between processes on x86 (which is currently the only way to mitigate this variant of Spectre). The performance penalty of flushing the branch predictor on x86 processors is, unfortunately, very high.
And later, regarding performance, although I'm not 100% sure which parts this applies to exactly:
> Initial tentative estimations of the impact are that they will likely be higher than those experienced by a monolithic kernel for the simple reason that microkernels switch address spaces more due to their essential function as an IPC engine, and the SKIM window patch alone increases the number of address space switches that the kernel must do.
> They affect syscall speed because in "conventional" OSes that's the barrier between privileges that's crossed
That's not the only case where privilege barriers can be crossed. On x86, that is the most common way of calling into kernel code, but it is not the only way. See io_uring.
> Also, doesn't IPC often involve calling into kernel space to execute the IPC?
Consensus-based systems aren't wasting resources if data replication is desired, and at least with Raft and similar protocols the replicas aren't continuously trying to obtain the lock.
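To illustrate that point, here is a toy Go sketch (not any real Raft library) of a follower's election-timeout loop: the replica just sits there receiving heartbeats and replicated data, and only spends effort on leadership when the heartbeats stop, rather than polling a lock:

```go
package main

import (
	"log"
	"math/rand"
	"time"
)

// followerLoop is an illustrative sketch of Raft-style follower behavior:
// the replica waits passively and only becomes a candidate when the leader's
// heartbeats stop arriving before its randomized election timeout fires.
func followerLoop(heartbeats <-chan struct{}) {
	for {
		// Randomized election timeout, e.g. 150-300 ms as in the Raft paper.
		timeout := time.Duration(150+rand.Intn(150)) * time.Millisecond
		select {
		case <-heartbeats:
			// Leader is alive: keep following; no lock polling happens here.
		case <-time.After(timeout):
			// No heartbeat in time: only now does this replica campaign.
			log.Println("heartbeat timeout, becoming candidate and requesting votes")
			return
		}
	}
}

func main() {
	heartbeats := make(chan struct{})
	go followerLoop(heartbeats)

	// Simulate a leader that heartbeats for a while and then crashes.
	for i := 0; i < 5; i++ {
		heartbeats <- struct{}{}
		time.Sleep(50 * time.Millisecond)
	}
	time.Sleep(time.Second) // leader is gone; follower times out and campaigns
}
```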