It is possible to build balls of mud either with microservices or larger-sized s...

jrockway · on March 6, 2019

> when "easier testing" (from OP) is said to be an advantage of microservices specifically...I am not sure what to say.

I think the service boundary ends up being a very good place to inject "fakes", because that boundary is not artificial like it is when you fake out parts of a monolith. The RPC service has two methods and that is the only way _anything_ can interact with the real service, so faking out those two RPC calls let you write focused tests easily.

Obviously you can have service boundaries in monolithic applications, but they are easy to ignore "just this once". By having an API boundary enforced in production, you avoid these problems (or the workarounds become more creative, but that's easier to say no to).

The average monolithic app sitting in production is quite difficult to test because no thought is given to internal APIs. Things rendering HTML make literal database queries, and so the only way to test things it so just run the whole monolith against a local database. That ends up being slow and flaky.

Basically, microservices forces code to do less. When code does less, it's easier to test. When any HTML page can write to your database, you have a mess on your hands. That is totally orthogonal to microservices, but microservices enforce your API contract in production, which I find valuable.

philwelch · on March 7, 2019

> The RPC service has two methods and that is the only way _anything_ can interact with the real service, so faking out those two RPC calls let you write focused tests easily.

Except now you have to cover cases like “the service is down”, “the service is too latent”, “the actual outputs from the service differ from the documentation”, “the service is behind a reverse proxy that mutates headers in a surprising way”, “you are behind a reverse proxy that mutates headers in a surprising way”, etc.

jrockway · on March 7, 2019

Yes, this is a set of problems that you need to care about. This is why there are so many Kubernetes-types working on "the service mesh". They all do a little bit too much for my taste (linkerd wants to provide its own Grafana dashboard and touch every TCP connection so as to measure it, Istio wants to molest _everything_ in your cluster and make even basic network connections be a full-fledged Kubernetes API object), but there are more reasonable solutions available. Envoy, specifically, is very good. You can use it for outgoing or incoming requests, and it can be configured to, say, retry retryable gRPC requests (solving your "the service is down" issue, at least if only one replica is down). It also scales cleanly to the amount of complexity you want; you can define your clusters in a file... or write a program that speaks a certain gRPC API so it can discover them for you automatically.

Latency is always going to be a problem and moving things to another computer certainly doesn't decrease it. Everything these days has pretty good support for observability; opentracing to inspect slow requests, prometheus to see how things are doing in general. You can get a handle on it and it doesn't cost much. My team is moving from a PHP monolith that has so much framework code that an empty HTTP response takes 100ms minimum to generate. None of our microservices are that slow, even when 3 or 4 backends and gRPC-web <-> http translation are involved. But it does set an upper bound and that's a reasonable concern to have.

Monolithic apps are not freed from latency; they read from disk, they talk to a database server, etc. So application developers already have this under control (or have filed it away in the "don't care" bucket); for example, every function in Go that does I/O probably takes a context. The context times out and it cancels your local operation just as easily as it cancels a remote operation. So I don't think this is a new concern, or one that people should be too afraid of, other than getting that last bit of performance out of the system.

As for proxies inside your cluster intercepting traffic to other pods and mutating headers in surprising ways... I recommend not running one of those. (Yes, those magical service meshes are some of those. If you don't know why you need one, I recommend living without one until you know you need one. It may be never!)

If you don't control your network, you won't have good luck talking to services on the network. It is orthogonal to microservices; you will be using the network more, so one that's bad will hurt you more. But in general, if you control your infrastructure and the nodes on it, you won't run into a "reverse proxy that mutates headers in a surprising way”. If there is one of those, I recommend killing it rather than not splitting up logical services into separate jobs.

williamallthing · on March 7, 2019

FYI Linkerd's Grafana dashboards are purely additive. If you don't like them, ignore them.

We also work hard to make Linkerd incremental. It doesn't "touch every TCP connection" so much as "touch every TCP connection that you explicitly tell it to".

jrockway · on March 9, 2019

Not sure I agree. Running things in production is not free. There is resource usage, attack surface, learning curve, and the chance for something to go wrong. Especially prometheus; it is a high resource usage service using 2G of RAM on nodes that only have 3. I am happy to have one of those in my cluster, but an extra one "just because" actually costs something.

Pods are not free either, at least not on Amazon. You get 18 pods per node (at least on t3.medium, which is what I use), and daemonsets quickly eat into that and make every additional node less useful as you increase cluster capacity. In a world where you're already running aws-node, kube-proxy, jaeger-agent, prometheus-node-exporter, and fluentd, you have to be judicious about the value of additional per-node services. I see the benefit in linkerd, but not all the extra stuff it comes with. Having an envoy cluster do gRPC load-balancing between services is enough; yes, you can't tcpdump the streams, it doesn't transparently add TLS, it doesn't configure itself through Kubernetes objects, and it doesn't quite insert the observability that linkerd does... but it does work well and comes with less tooling and resource cost.

theptip · on March 7, 2019

I like this way of thinking about things. I do wonder if this is one of those areas where the utility/reward is very unevenly distributed across team skill levels; if your team is very disciplined, and has strong alignment on architecture and internal APIs, then there are some significant benefits to testing a monolithic system (e.g. I can mock time and still test the full monolithic system end-to-end).

However if the team is undisciplined (or the org is just too big to achieve tight alignment), then having some enforced architectural boundaries (bounded contexts) inside which the complexity is capped will at least limit the scope of poor architecture inside a specific service, and generally puts a floor on unintended coupling.

There's another dimension too though; I think that microservices require a much higher level of devops / CI/CD maturity to do well. So the maximum value might come from poorly aligned but very ops-savvy orgs, whereas the minimal value would come from highly aligned and disciplined orgs that don't have a strong devops/automation skillset.

Not sure which dimension is more important though.

JamesBarney · on March 7, 2019

I think the article nails it on the head in the first paragraph

>Change their database schema.

>Release their code to production quickly and often.

>Use development tools like programming languages or data stores of their choice.

>Make their own trade-offs between computing resources and developer productivity.

>Have a preference for maintenance/monitoring of their functionality.

Every single one of those is an organizational problem. Microservices are fantastic for solving the problem of "very large organization trying to manage multiple releases from multiple teams with lots of interconnecting dependencies". But they solve that problem with a giant flaming chainsaw with greased up handle. It works but don't use it unless you have to.

dagss · on March 7, 2019

You have an assumption that the initial design/division of responsibility into microservices was perfect.

If not, then no, the unintended coupling is not limited to within a service, it will find its way to the API between the services.

dagss · on March 6, 2019

IMO if you are writing backend code where it is difficult to insert fakes at any point where you need to test, then all bets are off anyway. Doesn't matter if you use monoliths or microservices. If you can't abstract out some dependency and insert a fake without actually hitting the network layer then you shouldn't be writing backend code.

It is more about test-driven development than anything else.

You need a policy to do micro-services in the first place. You can also have policies (enforced by peer review) to have well-tested code and internal API boundaries. It doesn't come by itself, but neither does microservices.

At the end of the day you need development processes driven primarily by automated tests and good coders. Bad coders will make a total spaghetti mess of microservices too, in fact the consequences of microservices can be catastrophic and crippling if the developers don't properly understand how to write distributed systems etc. (trust me I've seen it happen).

Difference is, when you have that micro-service spaghetti mess, refactoring it to something with clean nice boundaries can be harder than with a monolith -- since the position of the boundaries tend to be less malleable. (This varies depending on how the microservices are tested & deployed, and also, how "malleable" and how "hard" a boundary is is really orthogonal.)

If micro-services does anything, it is to raise the bar for what developers can succeed in working with it. Put those same developers at work on a monolith though, and I think the results are the same. It's a filter for good developers, more than the things you say, IMO.

jrockway · on March 6, 2019

Like I said, the value of microservices is that your design is enforced at run time, not just at policy-setting or code review time. There is a lot of value in that; coupling issues can't just "sneak" in.

dagss · on March 7, 2019

But behold the coupled mess you get if your up front design (the boundaries between the microservices) wasn't perfectly chosen or isn't robust against spec changes.

Can easily happen for instance when architects that are too removed from the actual code are choosing the boundaries. Then developers have to work around it.

There will still be coupling issues. It just goes over the network, and is a lot harder to refactor.

https://xkcd.com/2044/

I think this is something like a No True Scotsman fallacy. Somehow people see a tightly coupled mess distributed over the network "not real microservices". Ok, but the people who made it intended to make real microservices, and that is what is important.

danmaz74 · on March 6, 2019

> It is entirely orthogonal. Of course a well-designed microservice is easier to test than a monolithic ball of mud, but a well designed monolith is also easier to test than a poorly designed spaghetti of microservices.

I would also add that a well-designed monolith is inherently easier to test than a functionally equivalent well-designed microservices architecture.

colonelpopcorn · on March 6, 2019

It just goes to show how few folks truly test in isolation.

quickthrower2 · on March 6, 2019

I don’t always, but I’ve never had the luxury of working on a greenfield project. With legacy code it’s a mammoth task to pull apart the spaghetti (more like a drawer of tangled headphones) to get the ideal test.

jimmy1 · on March 6, 2019

A better way of stating it, "It goes to show how few folks truly value testing in isolation" that is, having good tests in isolation is seen as a nice ideal to pursue, but delivering working code first trumps all.

"How do you know it works?" is a common reply to this line of reasoning, obviously we all have an idea of what "working" looks like, it's just whether or not we have gone the extra step of formalizing it in the form of tests.

stingraycharles · on March 6, 2019

Could you elaborate on what you mean with this? It seems like you are hinting at something which I apparently did not pick up.

danmaz74 · on March 6, 2019

Testing components in isolation is useful to ensure that your components work as expected. But it will never ensure that your whole system works as expected, unless the interaction between your components is so simple as to be obviously correct (which, in my experience, is almost never the case).

mlthoughts2018 · on March 6, 2019

> “It is as if some people think "good code" and "microservices" are synonyms. No. They are orthogonal.”

I disagree very strongly, and it is also part of why I believe monorepos are generally a mistake.

Microservices are a natural extension of things like decoupling and Single Responsibility Principle.

Just because you superficially could achieve similar effects with gargantuan amounts of tooling and imposed conventions in a monolith class or something is absolutely no type of refutation of the fact that modularity and separated boundaries between encapsulations of units of behavior represent a better way to organize and structure the design.

It is no different and there is no leakiness to the same abstraction when you move to discuss services instead of classes or source code units, or polyrepos v monorepos. The abstraction definitely can become leaky if taken too far in other domains, it just happens that the abstraction is depicting precisely the same organizational complexity properties in the case of source code -> service boundaries -> repository boundaries.

dagss · on March 6, 2019

Well, but some services composed of microservices turn into a distributed spaghetti monolith though.

They only "naturally decouple" if you draw the lines between units correctly in the first place. And if you are able to do that well, that's the most important step in making any code turn out well regardless of the size of the services and how much of the boundaries are in the same class/repo/process/service. It also correlates with the "tendency to cheat" within a single service.

There ARE real advantages to micro-services, sure. But you trade them against an ability to quickly refactor if it turns out you drew the lines completely wrong initially. Or perhaps you end up with something that becomes very complex that could in fact have been short-circuited and replaced by 5% as much code by looking at the problem from an entirely different angle -- which you never do because of the pattern that has settled in how the micro-services were divided.

(At the end of the day, the code that is simplest to maintain is the one you don't need to run..)

So I maintain that it's a trade-off.

This seems relevant: https://xkcd.com/2044/

mlthoughts2018 · on March 6, 2019

What you propose ends up sounding like this from my point of view:

Salad is generally better for your health than red meat. However, some people eat so much salad, with so much dressing, that it ends up being worse than red meat. Meanwhile, with great care about meal planning and moderation, some people stay pretty healthy eating red meat.

Therefore red meat is actually healthier than salad.

arkh · on March 6, 2019

Currently trying to debug some microservice based system: I read your text with red meat being the microservices.

Things are always perfect when reading a simple blog post presenting the happy path. How to check what services are down, how to react to the fact, how to come back? Nah. The fact your µs RAM access just became ms network access? Don't care. Someone just decided to change the interface of their microservice so 2 others are not compatible anymore? That's just "microservice done bad". Being able to see the flow of things and add breakpoint where you need it? Nope.

It is funny to see this kind of problem when you've already experienced them in the embedded world with software components in cars or just in distributed computing.

Most application will never see the kind of scale where adding the kind of code and tools overhead have any RoI. So you end with products released too late or products so brittle you may has well not have launched.

mlthoughts2018 · on March 7, 2019

> “Currently trying to debug some microservice based system: I read your text with red meat being the microservices.”

This doesn’t make much sense, unless you’re debugging normal-to-high quality code microservices, and still find the code to be worse than average case monolith services.

dagss · on March 8, 2019

He is explaining it further down! For instance, he cannot set a breakpoint and follow execution flow (because suddenly the flow resumes in another microservice)

It seems from your comment that you assume one can always work with only one service, and not need to consider the whole system of services acting together. That is naive.

mlthoughts2018 · on March 8, 2019

I do not assume you can work only with one service. But what you brought up from the parent comment about breakpoints makes no sense.

It’s like complaining that someone mocked out a complex submodule in a unit test, so your breakpoint descends into a mock instead of the real thing. You’re mistakenly wanting the wrong thing.

Testing that spans service boundaries is a known entity. Most of the time you want to be testing one service in isolation and mocking out any dependency calls it makes to other services.

But in cases when you want to do integration or acceptance testing involving multiple live services, that’s fine too. You could for instance run the suite via something like docker-compose.

But if you want the debugger to step through the internals of some effectively third party dependency, that’s just a poor approach to debugging. You need to mock that away and isolate whether the third party entity (whether it’s an installed package, separate service, whatever) is really to blame before descending to debugging in that entity.

Imagine if someone is debugging a data processing pipeline task. It makes a service call to a remote database. You really think your debugger should follow the service call and step through the database’s code? That’s a terrible way to debug. That example extends perfectly well no matter what the service call is into, whether it’s local or remote, whether it’s in the same language or runtime or not...

dagss · on March 6, 2019

Well in that context, red meat is obviously healthier if you know up front that the people in question are going to pile on dressing...

Context is everything.

I am mainly advocating that it depends on the kind of coders on the teams, how many teams, how sure you are about the up front design / boundaries between services (I have seen such boundaries drawn VERY wrong, so wrong that nothing else ever mattered), how sure you are about the spec, etc

Start with monolith and refactor smaller services gradually as the design solidifies...

balfirevic · on March 6, 2019

> Well in that context, red meat is obviously healthier if you know up front that the people in question are going to pile on dressing...

I think you're being too charitable accepting this analogy at all - somehow microservices are presented as something obviously and inherently better (salad) versus non-microservice approach (red meat).

If we're going down the route of silly analogies which are terrible way to argue anything, how about this:

Non-microservice architectures are normal diet of meat, fish, vegetables, grains and sugar which you can keep under control if you have any idea of what you're doing. Microservices are gluten-free diet - very popular for no good reason, it makes everything harder and you should only pursue it if you have very good reason to and you understand the cons.

mlthoughts2018 · on March 7, 2019

> “somehow microservices are presented as something obviously and inherently better (salad) versus non-microservice approach (red meat).”

Yes, this is called the Single Responsibility Principle, in this case applied to service architecture. More generally it is a property of modularity and decoupling.

All else equal then satisfying these properties is better than not satisfying them.

The all else equal assumption clearly holds in practice, where people write equally awful code in both cases and so microservices introduces no additional tech debt yet it does introduce SRP and modularity benefits.

Could you find specific examples of monolith services with small enough tech debt that they outperform some specific other example using microservices? Of course.

Does this matter for reasoning more generally about which pattern is better ceteris paribus? Very little, probably not at all.

dagss · on March 8, 2019

I am not convinced that microservices always causes less coupling as you claim.

Sometimes the coupling just jumps into the network/API layer. (Why would it not?) This can happen unless your initial divison into services was perfect (and if you indeed have that much foresight, there would be no reason why a monolith would accrue tech debt either, there would be no temptation to add debt).

The main difference is that when you discover that the initial divison into "Responsibilities" were wrong, it is easier to change and come up with another set of "Responsibilities" in a monolith and deploy the refactored service as a unit.

You talk as if you can just initially define the Single Responsibility then things will be fine. But where I have seen real failure is in identifying those initial responsibilities and choosing the wrong way to look at the total system.

My experience is with monoliths having less coupling and I suspect that the cause is that monoliths are easier to refactor as the requirements change; refactoring the very structure of the service mesh while keeping things running is such a big task that one is more tempted to start adding hack in the API layer.

Yes, one is then violating the Single Responsibility Principle. But if an organization sits there and needs to change the requirements within some deadline -- it is not going to spend 3x the cost and time because a hack violates some principle -- and the alternative is the wrong service taking on some extra work.

If you want to retort "but then they are doing microservices wrong" then I say No True Scotsman. And one could say exactly the same about monolith tech debt too..

mlthoughts2018 · on March 8, 2019

> “The main difference is that when you discover that the initial divison into "Responsibilities" were wrong, it is easier to change and come up with another set of "Responsibilities" in a monolith and deploy the refactored service as a unit.”

This is generally not true in my experience, because the degree of implementation-sharing and reliance on common leaked abstractions is so high in monolith codebases.

Through great concerted effort, some highly disciplined teams might not fall into that ubiquitous problem of monoliths and for those exceedingly rare teams your way of thinking could work. But this is so rare it is inapplicable when considering which approach to use in general cases.

I’ll also say that I’ve worked on several monolith services and several microservices stored in dozens to hundreds of separate repos. The tooling cost to make either pattern work at scale was the same, but refactoring was so much easier with polyrepos that each isolated services. Just spin up a new repo and redraw the service boundaries.

Finally, many times services become associated with a fixed, versioned API, and must support backward compatibility for long periods of time. In these cases, redrawing service boundaries is usually not desirable regardless of initial mistakes, until you hit a point when you can release a new major version of the services. In the polyservice / polyrepo case, this is very easy, and the repos and separated code for v2 need not have anything to do at all with v1, and can be developed entirely in parallel, with mocked out assumptions of service boundaries or reliance on legacy v1 stuff.

dagss · on March 8, 2019

Actually I wonder if your reasoning is circular..

If you saw a coupled mess of microservices with a lot of technical debt you would probably say that it is not "Microservices" because it is violating the Single Responsibility Principle all over the place. They just tried to do microservices -- but didn't manage to -- so do you then count it as a failure of microservices thinking?

If not then propose a new architectural alternative: The Debt-Free SRP Monolith!!

Sadly, organizations cannot choose to either make a SRP Microservice system or a Debt-Free SRP Monolith. They can only attempt. And I am yet to be convinced to attempting Microservices is that correlated with achieving SRP.

mlthoughts2018 · on March 8, 2019

I don’t think it’s circular at all. I’m saying that if the level of tech debt is held equal between both a microservice design and an analogous monolith design, then the fact that the microservice design has greater properties of decoupling, isolation and modularity make it de facto better.

Obviously if the baseline levels of tech debt or poor implementation are not equal, all bets are off.