Microservices Architecture on Google Cloud

westoque · on Nov 22, 2021

> It is extremely difficult to change a monolith’s technology or language or framework because all components are tightly coupled and dependent on each other. As a result, even relatively small changes can require lengthy development and deployment times.

I disagree with this so much. I have personally worked with Rails application monoliths and Node.js microservices and I can tell you that making changes on the monolith is muliple times easier mostly depending on the code structure. I would take a properly structured monolith any day. This not only includes code/features but also deployments. Adding more services introduces more complexity in the deployment architecture as well.

A good example of this is just by looking at the GitLab codebase https://gitlab.com/gitlab-org/gitlab, it's a monolith but has good abstractions/structure vs say the Google Microservices Demo app https://github.com/GoogleCloudPlatform/microservices-demo which is not tightly coupled but introduces more complexity from implementation to deployment.

qaq · on Nov 22, 2021

It's obvious BS driven by cloud marketing because in monolith case you pay them for way fewer services. In Micro Services case a single client request will generate a number of downstream requests to multiple services driving the cloud provider's profits up. You also have a higher chance of using their highest margin tracing/telemetry offerings.

da39a3ee · on Nov 22, 2021

Not sure whether you just meant that as humorous cynicism, but I think it has more to do with pain experienced by teams working on monoliths and them thinking “there has to be a better way”.

qaq · on Nov 22, 2021

It's orders of magnitude easier to architect modular monolith than to build, test, deploy and manage a distributed system (e.g. microservices).

ehnto · on Nov 22, 2021

Well made modularity gets you many of the same benefits as microservices, without any of the deployment complexity. The articles argument about changing architecture and language being difficult seems like a red herring, because you shouldn't want to be changing architecture or language particularly frequently.

I'd really wager that you should be building your code to last the full life of the product from the start, with adaptability to changing scope and requirements being delivered by that modularity.

You of course don't always know the full scope of the project from day one, so you don't typically build software like you would build a bridge, with a fully holistic plan. But you can totally account for the future when you initial build a monolith.

KronisLV · on Nov 22, 2021

> Adding more services introduces more complexity in the deployment architecture as well.

This is true, but having multiple services or even instances that can horizontally scale gives you more leeway as far as resiliency against errors goes. For example, the issues in the monolith at my current day job could have a way lower priority/impact, if they didn't risk bringing down the entire application should the application server/instance fail - instead, if there were N instances, the load could just be handled by the other instances.

Actually, one of my first blog posts talks about how scalable and modular monoliths might be a way to leverage the best of both worlds: https://blog.kronis.dev/articles/modulith-because-we-need-to...

Of course, if you actually have the operational capacity to support microservices, that may be a good investment in some particular systems - if a part of the whole system would start to misbehave, its impact could be far more limited, for example, due to resource constraints or rate limits that should be in place. Monitoring could also become somewhat easier and you could figure out where any problem lies, except for the most complex Byzantine failures.

> I have personally worked with Rails application monoliths and Node.js microservices and I can tell you that making changes on the monolith is muliple times easier mostly depending on the code structure.

However, where monoliths fail in my eyes, is the strong coupling. For example, i currently need to migrate the very same monolith to far newer frameworks and runtimes, which is essentially impossible to ship, because a large number of edge cases and specific functionality all break when this is done.

So instead of being able to ship the parts that'd work (say, the RESTful API and the front end), i'm blocked by the things that don't work (report functionality, PDF functionality, file handling functionality, database migration functionality), so for approx. the past month it has probably looked like i'm not generating much business value, due to struggling with all of this. I wrote more about the problems with this here on HN: https://news.ycombinator.com/item?id=29204841

Youden · on Nov 22, 2021

I think you missed the "technology or language or framework" part.

Their argument is that if you have a monolith and want to say move from Ruby to TypeScript, you basically have to rewrite the whole thing and migrate in one go, which is a massive pain. If you have microservices and want to do the same, you can move one service at a time.

necovek · on Nov 22, 2021

But that ain't true either.

You can even move from one monolith to the other. Eg. a single DB store, and you just move a set of APIs (this set of APIs live in Ruby monolith, and it can talk to TypeScript monolith). Just put a reverse proxy in front (which you're likely to already run), and route requests to the proper service. Sure, some things will be trickier, but they'd be tricky anyway (eg. if you want to move user management APIs which tie into privilege handling, it's going to be finicky no matter how you slice it).

The main driver for SOA (and microservices by extension) is to allow independent iteration of components by independent teams. On small teams, it's usually not a win at all because you introduce a lot of overhead and you need to really be on point regarding backwards compatibility (or statelessness) and API specs.

Sure, monoliths do allow unrestricted use of (a particular set of) suboptimal patterns, but I've seen one too many "microservices" which have "custom" protocols that are only ever changed in sync on both sides of the microservice (server and clients).

blackoil · on Nov 22, 2021

It is possible but not as simple as you suggest. Take a email/notification service being part of monolith, changing that to a new language / arch. is lot more complex then a similar micro-service connected via Kafka/GRPC.

necovek · on Nov 22, 2021

Depends on how it's structured in the monolith. If it's well encapsulated — eg. one API call to trigger a notification that goes to a job-runner which sends the actual notification, which is the standard way of doing these in the monoliths I've seen — it's going to be similarly simple/hard.

But that was exactly my point: how hard it is depends on the code structure that has no relationship to whether you are in a SOA or a monolith architecture.

Sure, some (bad) patterns are hard in SOA which are trivial in a monolith, but they are bad patterns regardless. What matters is that you replace bad patterns with good patterns.

thrashh · on Nov 22, 2021

IMO the only reason why people run into trouble is because they code as if everything completes instantly. For example, they’ll write to disk assuming the operation is quick, because that’s what they’re used to. You see this all the time in all sorts of software when it freezes up and errors cascade.

People just code like everything works all the time. Then when they want to integrate an external service, they have to rewrite everything.

If you understand that network calls have to made and electrical signals have to pass down a bus when you write either a monolith or a microservice, going back and swapping services is super easy.

tomnipotent · on Nov 22, 2021

> Take a email/notification service being part of monolith

The only extra work is updating the monolith to call the extracted service, which isn't that hard.

mrighele · on Nov 22, 2021

> technology or language or framework

Note what they are talking about. They are right: of course if you want to change your Rails monolith to a Spring Boot java project it will be extremely difficult. The point is, how often do you want to do that ? Maybe in google it happens often, but for most smaller companies that is "very rarely"

stevefan1999 · on Nov 22, 2021

The more services you have, the higher the latency, to a point it is unbearable. So you know what the engineers do? They have to go back and make some services monolithic again.

This is also applicable to microkernel model, where everything is almost processes intercommunicating either locally or remotely, but their performance is so bad they are either academically significant only or are simply abandoned.

There are also critical services that bottlenecks the entire service plane. When one of them critical services (such as authentication service, which brought down Google recently) dies, it still have the cascading failures that monolithic services usually have.

And sometimes it is impossible to properly make microservices when data consistency is highly important. These services (such as database and transaction based services) are critical by nature and is almost impossible to scale, and have to either using leader-election, which in a nutshell, distributed locks where the losers are merely standbys wasting resources on whether they can obtain the lock again, and will be highly susceptible to deadlock unless a reliable transactional memory source is involved (such as etcd leader lock in Kubernetes). Or replicate only. They are also what bugger up microservices to have degenerated into its monolithic equivalent or have to pay a great toll to get it scalable to survive in the microservice world.

Well, to be honest, microservices in practice are just multiple monolithic services strapped together; We need to redesign all our current infrastructure to remove these critical bottlenecks. That's how you do microservices right.

rackjack · on Nov 22, 2021

Microkernels are not a good example. The assertion that they have worse performance in a way that can be meaningfully measured is a rumor, especially since we now have Fuchsia, not even mentioning other microkernels like seL4 and QNX.

aliswe · on Nov 22, 2021

Yes Fuchsia, also why the downvotes? strange!

junon · on Nov 22, 2021

> This is also applicable to microkernel model

This isn't a fair comparison. Microkernels can still be performant if done correctly. Syscalls have become slower over time as Spectre/Meltdown mitigations are added.

detaro · on Nov 22, 2021

Don't you also need Spectre/Meltdown mitigation on microkernel IPC if an untrusted process is involved?

junon · on Nov 22, 2021

Those mitigations affect syscall speed. Syscalls are not required for IPC.

detaro · on Nov 22, 2021

They affect syscall speed because in "conventional" OSes that's the barrier between privileges that's crossed, at which point memory permissions need to be readjusted. If you replace syscalls with calls to other processes, the same potentially applies. Also, IPC does often involve calling to kernel space to execute the IPC?

And from a quick google, sel4 did indeed add such mitigations: https://research.csiro.au/tsblog/crisis-security-vs-performa...

> The question of whether or not a userspace attacker can attack another userspace thread running on seL4 is an unequivocal yes.

> We are currently in the middle of deploying the branch-prediction barriers (x86) and BTB flushing mechanisms (ARMv6, ARMv7) that are necessary to prevent attackers from attacking other userspace threads on seL4.

> [....]

> seL4 will allow userspace processes to specify whether or not they want to take the performance hit that is incurred by effectively flushing the branch predictor when switching between processes on x86 (which is currently the only way to mitigate this variant of Spectre). The performance penalty of flushing the branch predictor on x86 processors is, unfortunately, very high.

And later regarding performance, although I'm not 100% sure on which parts this exactly applies:

> Initial tentative estimations of the impact are that they will likely be higher than those experienced by a monolithic kernel for the simple reason that microkernels switch address spaces more due to their essential function as an IPC engine, and the SKIM window patch alone increases the number of address space switches that the kernel must do.

junon · on Nov 22, 2021

> They affect syscall speed because in "conventional" OSes that's the barrier between privileges that's crossed

That's not the only case where privilege barriers can be crossed. In x86, that is the most common way for calling into kernel code, but it is not the only way. See io_uring.

> Also, IPC does often involve calling to kernel space to execute the IPC?

Often. Not always.

dilyevsky · on Nov 22, 2021

Consensus based systems aren’t wasting resources if data replication is desired and at least with raft and similar the replicas aren’t continuously trying to obtain the lock.

isubasinghe · on Nov 22, 2021

Isn't sel4 super fast?

recursivedoubts · on Nov 22, 2021

pet theory: microservices is a psyop by big tech to make deploying and maintaining software so insanely difficult that future potential competitors are too tied up trying to keep the cloud equivalent of "hello world" afloat to present any real threat

analyst74 · on Nov 22, 2021

Some companies adopt micro-services to turn impossible problems into merely hard problems; some companies adopt micro-services to turn simple problems into hard problems.

dboreham · on Nov 22, 2021

There's a simpler explanation : Conway's Law.

adamors · on Nov 22, 2021

Yeah, I don't understand why HN commenters refuse to understand this.

The real advantages of microservices (I would say the only advantage but YMMV) is that it greatly simplifies development work, CI etc. when your organisation reaches a certain point.

Even at 200 devs, working on just one monolith has a lot of friction. Open a pull request for something trivial and you'll have to wait hours for the CI to finish.

Aqueous · on Nov 22, 2021

ever tried to scale monolithic development beyond 3 teams? the point of micro services is simple: to make technical infrastructure resemble the human organization it is designed to support. that’s it.

destitude · on Nov 22, 2021

And I've been at a company where they did exactly this and turned out to be a complete mess and nearly bankrupted the company (it was even a unicorn).. Without the proper efficient layout and oversight of how a microservice architecture should work for your system you are better off reducing the engineers working on the monolith before telling everyone to switch it over to microservices. Otherwise if you can guarantee the above criteria then yes, I'd agree.

dikei · on Nov 22, 2021

I think many developers like microservices partly because they get to do green-field projects constantly: just start a fresh service when you need a new functionality, etc.

Of course, in time, this might spiral into an uncontrollable mess without discipline, but it feels good at the beginning.

bigus · on Nov 22, 2021

The comparison between microservices vs monolith was too simplistic and should have mentioned how much harder it is to deal with inconsistencies and distribution when working microservices. You want to be a certain company size before going all in on microservices.

throwaway984393 · on Nov 22, 2021

I don't believe that company size has anything to do with whether microservices work. That's the line people are sold, but companies large and small both suck at implementing microservices, and companies large and small can both implement monoliths successfully. I think it really has to come down to the individual case of a given product/project. One product may work way better as one architecture versus another, regardless of how many parts it may involve, due to the context, customers, use cases, organizational model, support model, regulations, etc.

As a general rule, we tend to use the org chart to define our architectures (Conway's Law), but many org charts are just fucked up. We should absolutely fight an architectural design that is mimicking a poor organizational model.

danjac · on Nov 22, 2021

Another problem with Conway's Law is that even if you create a perfect microservices architecture based on your org chart, all it takes is a couple reorgs and/or acquisitions before that architecture is no longer based on reality.

brickbrd · on Nov 22, 2021

Can I find a better reading somewhere else on this?

bigus · on Nov 22, 2021

Martin Fowler has some great comparisons about the two styles: https://www.martinfowler.com/articles/microservice-trade-off...

throwaway984393 · on Nov 22, 2021

That's the microservice take; they should also read Monolith First: https://www.martinfowler.com/bliki/MonolithFirst.html

throwaway984393 · on Nov 22, 2021

A lot of the products I've seen built that attempt to be made of microservices are actually just a really inefficient monolith. A lot of different "services" that are all completely inter-dependent, or all use one database, or use libraries of the other "services", or use the exact same patterns/tools/templates/build systems/frameworks, or hard-code configuration values from other "services", or are actually stateful. My favorite thing is when one team can't deploy a change because they're waiting on another team to deploy a change.

dustintrex · on Nov 22, 2021

The "shared database microservices" antipattern is particularly pernicious. I've seen this a couple of times at enterprises where a mandate comes down from up high to microservice all the things, so they just split the frontend side of monolith into a couple of notionally separate apps but keep the old backend for all state. Ta-dah, microservices!

newprint · on Nov 22, 2021

Microsoft published FAR better free book - .NET Microservices: Architecture for Containerized .NET Applications https://docs.microsoft.com/en-us/dotnet/architecture/microse... It is excellent reading for designing microservices in general, despite the fact that it targets .NET

latch · on Nov 22, 2021

I don't think you can have a serious conversation about microservices without talking about poison messages and ordering guarantees (both with respect to each other and not). What does Pub/Sub provider to deal with those?

(If you're interviewing somewhere and they say they have microservices, ask how they deal with poison messages. The answer can be telling).

thatsamonad · on Nov 22, 2021

I’m not sure if this fully addresses your comment but, similar to AWS SQS, Google Cloud Pub/Sub supports both message ordering[0] and dead letter topics[1].

[0](https://cloud.google.com/pubsub/docs/ordering#receiving_mess...)

[1](https://cloud.google.com/pubsub/docs/handling-failures#setti...)

aliswe · on Nov 22, 2021

Whats a good answer? I want more information ...

latch · on Nov 22, 2021

Tongue-in-cheek answer (but really what I believe):

Don't use microservices.

Short answer:

It's extremely difficult and, there isn't always a good solution. Different approaches are used on a _per consumer_ basis. That's the gist of what you want to hear.

Long answer:

In some cases ordering and lost messages don't matter so much, so any failed messages can just be thrown out.

For some queues lost messages matters, but not ordering. So you can put the message at the back of the queue (possibly delayed if that's support), so you don't block other messages.

Partitioning helps isolated poison messages to a vertical (e.g., customer_id).

Always adding id and timestamp to each message can let consumers behave more smartly (e.g. throwing away out of order or already processed messages).

But when none of that helps and ordering and delivery matter (which I think will make up most cases for any non-trivial system)...I'd love to know too. Good alerts to wake up developers?

Then there's how to scale this and making it reliable What's the ordering guarantee across N consumers? What if a consumer nack's a message. What if there's prefetching?

qaq · on Nov 22, 2021

Paying MS overhead is basically justifiable if you will need to scale your product development team for a single product to some ungodly size so not really applicable to 99.99% of the projects. The original hard constraint of not having a scalable enough RDBMS no longer exist you can use Spanner, CockroachDB etc.

gravypod · on Nov 22, 2021

I highly recommend this topic: https://www.youtube.com/watch?v=-UKEPd2ipEk

jillesvangurp · on Nov 22, 2021

Google does make dealing with dockerized services pretty easy. I'm CTO for a small startup. I actually know how to use Kubernetes, Terraform and have used a lot of configuration languages like puppet, ansible, and chef in the past fifteen years as well.

However, what these have in common is that using them properly can more or less become a full time job for a few months. It's never simple or easy. I've been on multiple teams where somebody was doing that stuff full time for months on end. I don't have that kind of spare time. Either I do product development or I do devops but I can't do both and I certainly can't take two months out of my schedule for this stuff. So, I tend to look for solutions that minimize the amount of devops I need to do to the absolute bare minimum. Additionally, we are a bootstrapped setup, which means I also look for cost savings.

Google cloud run is awesome for this. Last year I wanted to setup CI/CD for a simple service and have it run in Google cloud so we could point our web app at it. This took around fifteen minutes from start to finish. My starting point was a git repository that already had a Docker file in it. It took me a few mouse clicks to get cloud run to generate a cloud build and deploy the first version of that in an autoscaling cloud run environment. Awesome. Love it. The best part is that this setup cost us close to 0$/month while we were developing for the next six months. It had so few requests that we stayed below the freemium tier. It even does websockets now. We actually applied to get access to the beta of that.

Later we ran into some limitations with background threads with cloud run. It aggressively throttles running containers when they are not serving a request. So, I decided to spin up a vm with the same docker container. Unlike AWS, spinning up a vm that runs a docker container is stupidly easy. I just grabbed the same docker container that we used for cloud run and passed that to the vm configuration (in the UI) and it launched in one go with a container optimized OS and our container came up after a few seconds. Every UI screen in Google cloud has a "copy this as a gcloud command line" as a well; so that became the basis for our CD (Github actions). We simply defined a github action that updates the vm. Prototype what you want in a UI and then copy and adapt the command for automation. Great stuff.

Fast forward a year and I had a need to finally make this a bit more proper production environment. So, I bought a wildcard certificate, created a load balancer with it, defined an instance group with an instance template similar to the vm I prototyped earlier and I ended up with a nice auto-scaling service that we can deploy with zero down time. The hardest part was a bit of trial and error to figure out what I needed to do exactly. Took me about a day to get this right. Again, we have a github action that updates this; so full CI/CD.

If you run monoliths, this stuff goes a long way. There's more to our setup of course but mostly I manage to not spend most of my days on devops topics. We will at some point out grow our current setup and hire a full time devops person to scale our setup a bit more responsibly. But actually, this setup already ticks most of my boxes. It's simple enough that I don't actually care about automating how it is created. It's flexible enough that it is easy to tweak. And it has things like logging, health checks, monitoring, alerting, etc. that is relatively easy to manage as well although maybe a little bare-bones.

If/when we move to kubernetes, we'll end up roughly quadrupling our cost. Right now that would not really solve a problem I have. But it's probably a valid next step to take. Until then, monoliths and KISS are what we do.

ZeroCool2u · on Nov 22, 2021

Google Cloud Run is so underrated. The only thing you absolutely need to know to deploy is to use the port assigned via the environmental variable Run provides your container and just be aware that your container is stateless. Everything else is up to you. Like you point out, it's completely portable in a way that rarely anything on AWS is.