Great article. Main takeaway: microservices are not the only option when managing big codebases. In a parallel universe I imagine that the coolest trend in software development right now is a tool for monoliths: all code in a single repo, independently deployable components, contracts at the boundaries and mockable dependent components where needed. As opposed to our universe, in which building microservices is the unofficial way to go.
What you describe is microservices developed in a monorepo, and a lot of companies (including the one I currently work at) have gone this route.
Some people might disagree, but imo the cult of microservices does not require 1 repo per microservice.
The tools you describe are build-graph management tools (Bazel, Pants, Buck, etc.) and RPC tools (gRPC + protobufs, Cap'n Proto), and they are indeed pretty cool, albeit to a niche crowd.
I think the key difference here is that there is no network in between components in a componentized monolith: each component runs the entire “monorepo”.
Whether there’s actually network between components is something a platform team can handle based on their best judgement. Having collections of containers that always run together is a common pattern.
This negative-space definition of "monolith" is unhelpful to the point of irrelevance. It's unreasonable, in the sense that adopting it gives us nothing to reason about, as with the comment above. By such a standard the last monolithic in-service system was a Burroughs mainframe ca. 1975. I've got statically linked binaries that would fail this definition.
Even the plainest Rails application depends on network traffic, including to communicate with parts of itself. It cannot function without an operating system, which is also talking to parts of itself via network protocols, and this runs on a server whose internal buses are themselves a distributed system.
It's networks, all the way down, and a heads-in-the-sand attitude doesn't help us reason about performance, reliability, scalability, maintainability et cetera.
Put this in a "Falsehoods programmers believe ...": calling a stateless function in a stack-based language to compute an immutable result won't lead to a network call.
Monolithic applications are defined by something they are, not something they don't do, and what they are is a single unit of code for development and deployment purposes that includes everything necessary to fulfil an entire system's purpose. The issue of intentionally crossing a network boundary, and when, and why, is an independent topic in comparative systems architecture, but it's analytically orthogonal.
* Perform larger refactorings or renamings without considering deployment staggering or API versioning
* Testing locally is far easier
* Debugging in production is far easier
* Useful error stack traces are included for free
* Avoid (probable in my experience, at least in larger security software organizations) dependency on SecOps to make network changes to support a refactoring or introducing new components
If an organization is pursuing or will pursue a FedRAMP certification, as I understand it, that organization must propose and receive approval every time data may hit a network. Avoiding the network in that case may be the difference between a 50-line MR that's merged before lunch and a multi-week process involving multiple people.
FWIW, I think that gRPC/protobufs have pretty compelling answers to each of the historically-valid complaints you've listed here.
- cpu cycle overhead: this is valid if the overhead is very high or very important. otherwise, most companies would love to trade off cpu cycles for dev productivity.
- refactorings/renamings without deployment staggering: protobufs were specifically designed with this in mind, insofar as they support deprecating fields and whatnot. However, writing a deprecatable API is a skill, even with protos. If you have many clients and want to redo everything from scratch, you will have problems.
- "testing locally" (which I take to mean integration testing locally) is the only one that requires some imagination to solve, assuming all your traffic is guarded by short-term-lease certs issued by vault or something similar. But even this is quite achievable.
- error stack traces included for free: may I introduce you to context.abort(). It's not a stack trace by default, but you can actually wrap the stack trace into the message if you so care to. OpenTracing isn't quite free, in a performance sense, but in a required-eng-time-to-setup-and-maintain sense, it is pretty cheap.
- dependency on secops to make network changes: I've never encountered this, but I bet you that a good platform team can provide a system where application teams effectively don't need to worry about this. It's impossible to overcome this challenge in an existing company that's used to doing things this way, though.
The original poster's point was CPU and network overhead. A local procedure/function call or message-send takes on the order of one or up to a few nanoseconds. Depending on how you organize things, an IPC is going to be in the microsecond or even millisecond range. That's a lot of orders of magnitude. It's also latency that you just aren't going to get back, no matter what extra resources you throw at it. [1][2]
In the early noughties, a very SOA/microservice-y BBC backend system that I re-architected as a monolith became around 1000x faster. [3]
In addition, in-process calls are essentially 100% reliable. Network calls, and the various processes attached to them, not so much (see [1], again). The BBC system not only became a lot faster, it also became roughly 100 times more reliable, and that's probably low-balling it a bit. It essentially didn't fail for internal reasons after we learned about Java VM parameters. And it was less code, did more, and was easier to develop for.
Ah gotcha, thank you for locking in on the issue. You're absolutely right that network hops introduce overhead (I was intending to wrap i/o blocking on network calls under the banner of cpu cycles, adjacent to serialization).
Like any other design decision, there's a trade-off here. (see my other comments in this tree, about how many 9's in reliability/latency you're targeting).
If you're working in an environment where sub-5ms latency to the 4th or 5th 9 is critical, inter-machine communication is not for your application, period.
Reliability, as an orthogonal concern, is one that has improved incredibly since the early aughts. The "transport" and error-handling layer of open-source RPC frameworks has improved by orders of magnitude. I'd recommend taking a long look at the experiences of companies built on gRPC.
It's much easier to build a reliable SOA-esque system today than it was even 5 years ago. It's been an area of rapid progress.
However, I find the way you framed these trade-offs decidedly...odd, in terms of "who needs that kind of super-high performance and reliability????", as if achieving that were only possible through herculean effort that just isn't worth it for most applications.
The fact of the matter is that a local message-send is also a helluva lot easier than any kind of IPC. Also easier to deploy, as it comes in the same binary so is already there and easier to monitor (no monitoring needed).
So the trade-off is more appropriately framed as follows: why on earth would you want to spend significant extra effort in coding, deployment and monitoring, for the dubious "benefit" of frittering away 3-6 orders of magnitude of performance and perfect reliability?
Of course there can be benefits that outweigh all these negatives in effort and performance/reliability, but those benefits have to be pretty amazing to be worth it.
> as if achieving that were only possible through herculean effort
I encourage you to reread my comments, I'm not suggesting anywhere that high-performance requires exceptional effort.
In fact, I'm actively admitting that for applications where high-performance is required, IPCs/RPCs are not an option.
> just isn't worth it for most applications
Performance is valuable, but it's one dimension of value.
My premise is that, given the maturity of RPC frameworks and network tooling in 2020, most already-networked applications can afford to trade the performance hit of additional hops on the backend.
Whether what you get in exchange for that performance hit is valuable?
That is mostly a function of the quality of your eng platform.
> a local message-send is also a helluva lot easier [on the programmer?] than any kind of IPC
This strongly depends on your engineering org, although it seems like this is the point that's hardest to imagine for some people.
If you're on a team that depends on the availability of data maintained by N other teams, it is much easier (given the maturity of RPC frameworks and network tooling in 2020, again) to apply SLOs and SLAs to an interface that's gated by an RPC service.
> spend significant extra effort in coding, deployment and monitoring
The extra effort here is made completely negligible by the existence of a decent platform team.
FWIW, I wouldn't be able to imagine it if I hadn't experienced it myself.
> benefits have to be pretty amazing to be worth it
I still think you're overestimating some of the costs (see above). FWIW, I've worked in an RPC-oriented environment for years now, and reliability has never been a concern. Our platform team is pretty good, but we are not a Google-esque company (200 engineers, including eng managers)
The performance trade-off has been demonstrably worthwhile, because we've used it to purchase a degree of team independence that would not have been otherwise possible.
>In fact, I'm actively admitting that for applications where high-performance is required, IPCs/RPCs are not an option.
But you're framing it as "...for applications where high-performance is required", as if taking the performance, expressiveness and reliability hits should obviously be the default, unless you have very special circumstances.
My point is, and continues to be, that it's the other way around: you should go for simplicity, reliability and performance unless you have, and can demonstrate you have, very special requirements.
Thrift or protobuf is a huge step up from the alternatives, but you still have a lot of overhead. Generics are limited and you're essentially forced to "defunctionalise the continuation" everywhere: any time you want to pass a callback around you have to turn it into a command object instead.
> Can you share some examples of the generics problem and "defunctionalizing the continuation"?
Well, the generics problem is that you don't have generics. So you just can't define a lot of general-purpose functions in gRPC, and have to make a specific version of them instead. Even something like "query for objects like this and then apply this transform to the results" just can't be done, because there's no way to pass the transformation over the wire, so you have to come up with a datastructure to represent all the transformations that you want to do instead. "Defunctionalizing the continuation" is the technique for doing that, https://www.cis.upenn.edu/~plclub/blog/2020-05-15-Defunction... is an example, but it's a manual process that requires creative effort each time.
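A minimal sketch of the defunctionalization idea described above, in plain Ruby rather than gRPC. Everything here is an assumption made for illustration: the command names, the `[name, argument]` wire format, and the `apply_commands` helper are hypothetical. The point is only that a lambda cannot cross the wire, so each supported transformation has to be named and represented as data that the receiving side interprets.

```ruby
# In-process, a transformation can simply be a lambda.
DOUBLE_THEN_INC = ->(x) { x * 2 + 1 }

# A lambda cannot be serialized for an RPC call, so every transformation
# the service supports is given a name (hypothetical command set).
COMMANDS = {
  "multiply" => ->(x, arg) { x * arg },
  "add"      => ->(x, arg) { x + arg },
}.freeze

# "Server" side: interpret a list of [name, argument] pairs in order.
def apply_commands(value, commands)
  commands.reduce(value) do |acc, (name, arg)|
    handler = COMMANDS.fetch(name) { raise ArgumentError, "unknown command: #{name}" }
    handler.call(acc, arg)
  end
end

# "Client" side: plain data that could be sent as JSON or a protobuf message.
WIRE_REQUEST = [["multiply", 2], ["add", 1]].freeze

p [1, 2, 3].map { |x| apply_commands(x, WIRE_REQUEST) }
```

The cost the comment points at is visible here: any transformation that is not already in `COMMANDS` needs a new entry and a new deployment of the receiving side, whereas the in-process lambda needs neither.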
> Does google's `any` package help with the generics problem you describe? (Acknowledging that it's obviously clunky)
Not really, because you don't have the type information at compile time. Erased generics are fine in a well-typed language, but just using an any type you can't even do something like: a function that takes two values of the same type.
If you call a piece of functionality from your own single deployable that you are refactoring, it’s much more like refactoring a function call than if it were an independent micro-service across a network.
I think there used to be, before "off-the-shelf" RPC frameworks, service discovery, and the like were mature. There still are, for very small companies.
In 2020, if you have an eng count of >50: you use gRPC, some sort of service discovery solution (consul, envoy, whatever), and you basically never have to think about the costs of network hops. Opentracing is also pretty mature these days, although in my experience it's never been necessary when I can read the source of the services my code depends on.
Network boundaries are really useful for enforcing interface boundaries, because we should trust N>50 programmers to correctly implement bounded contexts as much as we trust PG&E to maintain the power grid.
That being said, if you have a small, crack team, bounded contexts will take you all the way there and you don't need network boundaries to enforce them.
It depends on your speed requirements and whether calls are being sent async or not. Also keep in mind that even with internal apis, an api call is usually multiple network boundaries (service1 --> service2 (potential DNS lookup) --> WAF/security proxy --> Firewall --> Load balancer --> SSL handshake --> server/container firewall --> server/container). Then you get into whether the service you're calling calls other apis etc. You can quickly burn 50ms or more with multiple hops. If you're trying to return responses within 200ms you now have very little margin.
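The budget arithmetic in the comment above can be sketched as follows. The per-hop numbers are made-up placeholders, not measurements, and the helper names are hypothetical; real overheads depend heavily on connection reuse, caching and the percentile being measured.

```ruby
# Hypothetical per-hop overheads in microseconds; illustrative, not measured.
HOP_OVERHEAD_US = {
  dns_lookup:         1_000,
  waf_proxy:          2_000,
  firewall:             500,
  load_balancer:      1_000,
  tls_handshake:      5_000,
  container_firewall:   500,
}.freeze

# Overhead of one internal API call: the sum of every hop it crosses.
def call_overhead_us(hops = HOP_OVERHEAD_US)
  hops.values.sum
end

# Milliseconds left for real work after `sequential_calls` back-to-back calls.
def remaining_budget_ms(budget_ms, sequential_calls, hops = HOP_OVERHEAD_US)
  budget_ms - (sequential_calls * call_overhead_us(hops)) / 1_000.0
end

puts call_overhead_us             # overhead of a single call, in microseconds
puts remaining_budget_ms(200, 5)  # budget left after five sequential calls
```

With these assumed numbers, five sequential calls consume a quarter of a 200ms budget before any application code runs. Swapping in measured hop costs changes the result but not the shape of the calculation.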
Acknowledging that there are indeed many hops, I think it might be a bit disingenuous to say 50ms is easy to burn, depending on which latency percentile we're talking about.
IIRC, a typical round trip service call at my current place of work (gRPC with protobufs, vault/ssl for verification, consul for dns, etc) carries a p99 minimum latency (i.e. returning a constant) of around 2ms.
A cold roundtrip obviously takes longer (because DNS, ssl, etc).
It depends on how many 9's you want within 10ms, but there are various simple tricks (transparent to the application developer) that a platform team can apply to get you there.
As a sidenote on calling other APIs, my anecdata suggests that most companies' microservice call graphs are at most 3-4 services deep, with the vast majority being 1-2 services deep.
This doesn't show the call graph, but it does demonstrate how many companies end up building a handful of services that gatekeep the core data models, and the rest simply compose over those services: https://twitter.com/adrianco/status/441883572618948608/photo...
Ignoring some of the deployment- and dependency-related aspects of microservices vs. monolith, one aspect that has me convinced against my own ideals is that a microservice has a "strong" boundary in that it is actually difficult and effortful for a developer to cross it.
This in turn has a positive effect in maintaining proper boundaries and putting the right amount of thought into the interfaces and responsibility of each component.
It's all tradeoffs. You get a stronger boundary, but you also get a distributed system.
Also, the first try of drawing boundaries will always be varying degrees of wrong. If you have very strong boundaries at this stage, iterating on them, moving responsibilities around, can be harder.
Also, with the right tooling it's definitely possible to harden monolith internal boundaries to a comparable level.
I can see though how many smaller companies would not be in a position to build that tooling.
Anyway... there is no either / or here, as I've explained in another comment. What if you have components within a monolith, but each component has its own database, for example? What if test suites are completely isolated, so that tests for component A can not access code in component B?
You can get pretty strong boundaries with a few comparably simple tricks.
A decent module system can achieve the same thing without all of the network overhead. Even just something like maven multi-module projects makes it hard enough to accidentally cross the boundaries.
The lack of compile time in the ruby world really makes it difficult to do a lot of work there. :P
There's a nice Ruby trick btw where you put significant precalculations in constants. Since the value of a constant gets computed during program startup, it still allows you to do work "up front" instead of during a web request.
I just want to caveat this as it is not a Ruby construct, it's part of Ruby web servers. Because they are long-living Ruby processes they are only loading files once (I suppose it's similar to compiling). This means it runs all globally-scoped code, which class definitions and constants (generally) are. That's actually what Ruby bootloaders like Spring and Zeus are doing on your dev machine to speed up the load time when you use Rails commands. They cache all that globally run stuff in their own process. It's also why they run into a bunch of issues when you have logic in your constant definitions.
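A minimal sketch of the load-time constant idea discussed above. The `Lookup` module, its table of squares and the `builds` counter are hypothetical stand-ins for a genuinely expensive precalculation; the counter exists only to show that the work happens once, when the file is loaded, and not on each call.

```ruby
module Lookup
  @builds = 0

  class << self
    attr_reader :builds # how many times the table has been built
  end

  # Stand-in for an expensive computation.
  def self.build_table
    @builds += 1
    (1..1_000).to_h { |n| [n, n * n] }
  end

  # Evaluated once, while this file is being loaded by the process.
  SQUARES = build_table.freeze

  # Per-request code only reads the precomputed constant.
  def self.square(n)
    SQUARES.fetch(n)
  end
end

3.times { Lookup.square(12) }
puts Lookup.builds
```

In a long-lived server process this means the cost is paid at boot. As the comment above notes, the same property is what makes logic inside constant definitions awkward for preloaders that cache loaded code.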
Yep, that's a good trick. At a previous PHP shop we had a large amount of static XML configuration (well it was generated, but not that often). Converting it all to PHP arrays and including it was significantly faster than parsing the XML on each request, and then PHP would cache that result too. Re-running the XML->PHP tool just caused it to re-include/cache these giant arrays of static config. It worked great. I mean, arguments about whether that was a good design or not aside...
(edit to reply since I can't reply to a reply to a reply)
Yep, it is very common in lisp/smalltalk environments to dump the state of the world to disk and re-load it later. This is one of those tricks that gets relearned every generation. :)
For bonus credit apply this analogy to docker images. :)
I did this in PHP once as well. I had to code a coupon lookup site where people entered coupon codes and they were verified against a database. I forget how many coupons there were... pretty sure it was less than 100,000.
Anyway, I coded it up in my local dev environment. Unfortunately, it turned out that I'd been misled and the actual deployment environment didn't have a database server available.
In desperation and facing a deadline, I dumped all the lookup values into an array in a PHP file. As you said, it was really quite performant. The first request after starting the server was a bit slow (but not too bad... still < 10 seconds I think) and after that things were golden.
I felt a bit dirty, but things worked and we got paid.
What? Don't global variables exist in pretty much every language on the planet? This isn't a "trick", it is a bad practice which should be avoided in any language.
Imagine describing globals in C, Python, or JavaScript, or static fields in Java or C# as a "neat trick"...
Where they are not code? I think you meant that a Ruby class is defined at runtime with sequential imperative or functional code. Ha! You can build a class from Ruby code in a string too. A lot of choice.
Reminds me of OSGi. Great idea, poor adoption, mostly due to the complexity of the problem domain. Microservices are a lot worse in that regard, yet remain a whole lot more popular, sadly.
That's what tools like https://github.com/nrwl/nx try to facilitate. Mainly facilitating separating the code in boundaries like backend, frontend, and common libraries.
I haven't really used nx myself :-p. But I would love to use something similar for other frameworks or languages.
A number of years ago, I worked on a team (~20 engineers in total) that successfully carved off two relatively independent portions of a large Rails app using engines. I'm happy to see that Shopify is also using that strategy.
I'm curious to know more what sorts of challenges they have around managing dependencies across engines — I think what we were doing was fairly vanilla Rails, and we didn't have the opportunity to run into those sorts of issues.
The answer to that question could probably fill another blog post :D
Long story short, Rails and dependency inversion equals lots of friction. The whole framework is built on the assumption that it's OK to access everything from everywhere, and over the years we've built lots of tooling on top of those assumptions.
We also have a GraphQL implementation that is pretty closely coupled to the active record layer and _really_ wants to reach into all the components directly.
All of those problems can be overcome, but this is definitely an area where we have to work against "established" Rails culture, and our own assumptions from the past.
Do you envision any extension points to the way engines are implemented that could better enforce boundaries? In our engines, there was nothing that referenced another engine's resources, leaving the main application to handle route mapping and ActiveRecord associations between app models and engine's models.
I feel like the use-case for engines has long been around supporting framework-like functionality (Devise, Spree, etc), but I wonder if there are changes to be made that better support modularization for large apps.
- same database
- same runtime
- same deployment
- same repository
That said, I don't think this is an either/or. It's a spectrum. You can have components within the same runtime and repository that have separate databases, or components that are using the same database but live in separate repos, etc.
From one monolithic app towards fully separated microservices is a spectrum, and I think developers should be enabled to move freely around that spectrum.
I think components are the better option, because they allow for separation of concerns without introducing deployment complexity or, worse, political complexity.
I worked on a large Rails monolith a few years back with a similarly-sized team and we took the "components+engines" approach too.... and it was a bit of a nightmare, honestly. It sort of felt like the worst of both worlds, relative to monoliths or microservices.
I strongly suspect, but cannot prove, that we would have been better off simply transitioning to "macroservices" -- breaking the monolith up into several (as opposed to dozens) of reasonably sized pieces.
• We were encouraged to componentize everything. When I left, we were up to a few dozen components, and the number was climbing rapidly. I'm not sure if the approach itself was the problem, or if the flux during the transition period was the real pain point.
• We had no real enforcement of interfaces between components. It was so easy to break things in other peoples' components.
• Theoretically that breakage would be caught by tests. But to catch that breakage, you needed to run the complete test suite (30-60 minutes) rather than simply testing your own component
• Essentially, it felt like we were suffering all the disadvantages of microservices, with the exception of coordinating deployments; from a devops perspective it was still just a single monolithic deployment
• We still had many of the problems associated with monoliths, such as slow deployments, long test suite times, and extremely high per-instance RAM usage
• Various small tooling and debugging issues related to using Rails but going too far "off the Rails"
I'm looking forward to digging into the linked article and learning how Shopify solved those issues. They seem to have quite a bit of engineering firepower at their disposal. Our management did not allow us to dedicate a lot of resources to internal engineering concerns like this.
(We essentially had one guy figuring it all out himself, and due to internal politics he was forbidden from considering a microservices or "macro services / multiple monoliths" approach. He was talented and did the best he could, considering)
> When I left, we were up to a few dozen components, and the number was climbing rapidly.
I should have included this in the blog post: The number of components _needs_ to be kept small. Shopify's main monolith is 2.8 million lines of code in 37 components, and I'd actually like to get that number _down_.
I like to compare this to the main navigation that we present to our merchants. It's useful if it has 8 entries. It's not useful if it has 400.
In a way, components are the main navigation to our code base. A developer should be able to look at what's in our "components" folder and get a general impression of what the system's capabilities are.
That's an excellent (and hard-earned, I'm sure!) insight. Thank you.
> I like to compare this to the main navigation that we present to our merchants. It's useful if it has 8 entries. It's not useful if it has 400.
Yeah, we essentially wound up with a "junk drawer" of components. I could see a lot of companies, like ours, making that mistake -- turning all the things into components.
As you said in the article, one of the benefits of components for you was that it truly forced you to think about a proper separation of concerns. In hindsight, that's an area where we really missed the mark for a variety of reasons, some methodology-related.
We practiced a rather strict version of Scrum. Management paid a lot of attention to our velocity from week to week: we needed to rack up those story points.
But, outside of the tiny team dedicated to the component effort, there were no story points to be had for supporting that effort. Therefore we were in fact incentivized not to support it. I remember one sprint where I did some refactoring work in order to achieve a better separation of concerns. It negatively affected our velocity for the week and that was noticed.
So, we were receiving a schizophrenic message from management. We were all to support the component effort.... but on our own time, apparently?
He might be referring to Shopify GMV (Gross Merchandise Volume —the value of commerce facilitated by the platform), which is probably approaching $100B per year.
No, Shopify's market cap is $100bn, which is much higher than I expected. So I looked up their revenue, which is $1.6bn, and they have an income deficit. So...
I think "100 billion business" means that the business (Shopify) is valued at $100 billion. (I'm not sure if that's true, that's just how I interpreted it)
Yes, you're right. If only they sprinkled some Beans over everything all their scale problems would just disappear. Thousands of happy developers would have all worked on one big happy Spring application with no problems whatsoever.
Reading all those blogs from Shopify shows that they spend a lot of time fighting a slow language.
It reminds me of Facebook and their Hack stuff. It's pretty much the same situation Shopify is getting into: they have something slow and really big and no way to get out of it, so they just pour money into making it fast, even if it means only the syntax resembles the original language.
Some companies faced the same problem: quickly release something to iterate fast (Rails / Python), but then when it gets too big you're in real trouble and stuck with it. Twitter, YouTube, Facebook all had that problem.
Everyone knows that only the best enterprise programmers apply to java positions, and when they somehow manage to pry themselves through the screening, they will help you bicker indecisively for hours when adding even the smallest functionality because they love bringing everything to the table at once instead of even considering to produce value in order to prove themselves as a valuable and knowledgeable part of the team and not the work they do. So it's a win from the start. Especially if you have 400 people in your team with 54 well-documented gatherings under their belts.
we are actually doing precisely the same thing at instacart (breaking our 1+ million lines of code monolith into discrete components, which we call "domains"), and typing the boundaries and as much of the internals of these domains as possible with sorbet types.
this has the benefit of ruby dynamism (fast development within domains, you can use all the nice railsy tooling, activerecord, and all the libraries we've built over the years), with type safety at the boundaries (we've also put in timeouts, thread separation, and error handling at the boundaries).
the additional benefit for using sorbet is that it makes making typed RPC calls (over twirp or graphql) much easier as you can introspect the boundaries trivially.
really cool to see other companies evolving similarly given the same starting conditions!
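One way the boundary timeouts and error handling mentioned above might look, as a rough sketch only. `BoundaryResult` and `call_across_boundary` are hypothetical names, and this is not a description of Instacart's implementation. It uses Ruby's standard `Timeout` module and covers neither the thread separation nor the Sorbet signatures the comment describes.

```ruby
require "timeout"

# Value handed back across the boundary instead of raising into the caller.
BoundaryResult = Struct.new(:ok, :value, :error)

# Hypothetical wrapper: run a call into another domain under a time limit
# and convert any failure into a BoundaryResult.
def call_across_boundary(limit_seconds)
  value = Timeout.timeout(limit_seconds) { yield }
  BoundaryResult.new(true, value, nil)
rescue Timeout::Error
  BoundaryResult.new(false, nil, "timed out after #{limit_seconds}s")
rescue StandardError => e
  BoundaryResult.new(false, nil, "#{e.class}: #{e.message}")
end

result = call_across_boundary(1) { 21 * 2 }
puts result.ok ? result.value : result.error
```

`Timeout.timeout` interrupts the block from another thread, which can leave resources in an inconsistent state, so production code would more likely rely on timeouts built into the underlying client (database, HTTP) where those exist.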
It's certainly related. In very general terms, I would say splitting a Rails app into multiple engines is the same pattern as umbrella applications.
However, there are more interesting specifics here about things like all engines sharing a database, but having exclusive ownership of tables, as well as splitting HTTP routing over multiple engines etc.
I think you'll also find a lot of conceptual overlap with Phoenix Contexts; they'll generally all start as part of the same monolith/app but are sufficiently discrete that you can separate them out more easily than the Rails situation in TFA.
Am I the only one who has a distaste for this phrase "component based development"? It just seems like a fancy way of saying object oriented programming without an overarching design pattern.
that's correct, at least for us a domain encapsulates many response types and dozens of different APIs that wrap various datastores, business logic, etc.
we've actually taken the pattern of making the classes relatively stateless, and explicitly passing around typed state through these explicit apis. it's not really the same design pattern and imo conceptually different.
hah, it's not hell but it's not entirely pleasant either. a _lot_ of that is tests, which is essentially how contracts and safety are enforced in ruby (at least prior to types).
Ours kind of organically grew over time, but as I've been keeping it alive for the last few years I have a pretty good idea of how I would start it fresh.
You probably have some people in the company who either know much more about architecture than others, or are working on projects that are more interesting in terms of architecture. Find one of them, convince them to give a 15 min talk.
Announce the talk widely within the company, tell people to come to the new "architecture guild" slack channel you created to get the details / invites.
Schedule an hour to give plenty of time for discussions after the talk.
We're not using zoom, but google meet - but yes, these happen completely online now.
I find that people that are doing interesting stuff often _want_ to talk about it. However, a big part of Shopify culture is "do things, tell people" - it is definitely encouraged to spend time spreading context.
It's not directly part of any rewards framework, but one metric that goes into promotions is the area of impact. By giving a talk to the guild, you can have impact on a group that's larger than your team, potentially the whole organization. It counts.
But another reward is the positive feedback, interesting discussions and new connections that you make through this.
Have y'all seen any issues around autoloading of classes/modules in development? I've been working on a rails app composed of a handful of engines and I've noticed that every so often classes aren't loaded. 6 seems to be a lot better about it than 5 was.
Interesting article. We use a similar approach for our mobile apps to allow multiple teams to develop their own modules independently.
Can anyone speak to what the advantages and disadvantages to such an approach are as opposed to going full Kubernetes / Microservices? Is it that deploys are riskier and you can't scale separate pieces independently?
The application is a Rack application reusing some of the components of Rails, but it is not a conventional Rails application given it doesn't need most of the framework.
There’s so much truth in this. It’s full of lessons I tell clients at the outset of similar endeavors yet they often do not heed until they experience the pain first hand.
Interesting read. I've seen a component based rails architecture work wonders for cleaning up a codebase and allowing for the benefits of a SOA encapsulation while still keeping everything under a monolithic architecture (and avoiding the networking nasties). Not such a fan of sorbet though, but hopefully something better comes along.
I can imagine that this has been a huge effort, and kudos to the team, but this is a solved problem; there are ample methodologies to resolve the big ball of mud.
IMHO the Shopify team could have saved a lot of time by getting some schooling in strategic DDD, and by consulting one or more DDD experts to draw a first version of their context map.
I think it's kind of bad that we have this trend to use "walls" to enforce modularity. This whole thing about using "walls" to enforce "developer behavior" is, in my humble opinion, the wrong direction.
If you think about it, almost all lack of modularity comes from shared mutable variables. Segregate mutability away from the core logic of your system and the smallest function in your architecture will become as modular as a microservice.
Really, any function that is stateless can be moved anywhere at any time and used anywhere without fear of it creating a permanent foothold in the architectural complexity of the system. So if the code is getting so structured that you become afraid of moving things, do this rather than build classes and walls around all your subroutines.
Remember, as long as that add function doesn't mutate shared state, you know it has zero impact on any part of the system other than its output. You can replace it, copy it, or use it anywhere. This is really all you need to do to improve the modularity of your system.
>Again and again we pondered: How should components call each other?
I think this is what's tripping most people up. They think DI, IoC and OOP patterns are how you improve modularity. They're not. Immutable functions are what improve the modularity of your program. The more immutable functions you have, and the smaller they are, the more modular your program will be. Segregate IO and mutations into tiny auxiliary functions, away from your core logic, which is composed of pure immutable functions.
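To make the "segregate IO and mutations away from your core logic" idea concrete, here is a minimal sketch. The names (`LineItem`, `order_total_cents`, `print_receipt`) are hypothetical and not taken from the article; the point is only that the calculation is a pure function and the single side effect lives in a thin wrapper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: instances cannot be mutated after creation
class LineItem:
    name: str
    unit_price_cents: int
    quantity: int


def order_total_cents(items: tuple[LineItem, ...], discount_percent: int = 0) -> int:
    """Pure core: the result depends only on the arguments."""
    subtotal = sum(item.unit_price_cents * item.quantity for item in items)
    return subtotal * (100 - discount_percent) // 100


def print_receipt(items: tuple[LineItem, ...], discount_percent: int = 0) -> None:
    """Thin shell: the only function here that performs IO."""
    total = order_total_cents(items, discount_percent)
    print(f"Total: {total / 100:.2f}")
```

Because `order_total_cents` touches no shared state, it can be moved, copied, or tested without setting anything up first; only `print_receipt` needs care.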
>Circular dependencies are situations where for example component A depends on component B but component B also depends on component A.
I've never seen circular dependencies happen with pure functions; they're rare in practice. I think they occur with objects because when you want one method of an object, you have to instantiate that object, which has a bunch of other methods and, in turn, dependencies that could circle back to the object you're calling it from. In essence, when you call a method you're actually calling on a group of methods and state within a class, plus all of their dependencies, which increases the chances of a circular dependency.
Still, I've seen this issue occur with namespacing when you import files. Walls aren't going to prevent this from happening. You need to structure your dependencies as a tree.
> Really, any function that is stateless can be moved anywhere at any time and used anywhere without fear of it creating a permanent foothold in the architectural complexity of the system.
That's not really true. A pure function can still be coupled to a particular internal data representation. It can still assume particular invariants that you may not want to maintain. Namespacing functions together with the data structures they operate on is still a good idea, and helps with keeping a coherent model at each level - e.g. if your business logic is calling a function that's about the specific mechanics of encoding data for Redis, you're probably using the wrong abstraction.
Pushing mutability to the edges is good and useful but it's not the be-all and end-all of decoupling. Enforced walls are a much better idea than spending your discipline budget on maintaining decoupling by hand. A lot of the time a pure function can actually be decoupled completely from the datatypes it's operating on by using parametricity (and maybe a standard typeclass that the datatype it operates on conforms to), but you may not notice that unless you've got some module boundaries that nudge you to think about that kind of thing.
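A small sketch of the difference between a pure function that is still coupled to a data representation and one that is decoupled through parametricity. Everything here is illustrative: the `"region"` key and both function names are assumptions, and Python's type variables only approximate what a typeclass-based language would enforce.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")
K = TypeVar("K")


def orders_by_region_coupled(orders: list[dict]) -> dict[str, list[dict]]:
    """Pure, but coupled: assumes every order is a dict with a "region" key."""
    groups: dict[str, list[dict]] = {}
    for order in orders:
        groups.setdefault(order["region"], []).append(order)
    return groups


def group_by(items: Iterable[T], key: Callable[[T], K]) -> dict[K, list[T]]:
    """Pure and parametric: only looks at items through the `key` function."""
    groups: dict[K, list[T]] = {}  # local mutation only; nothing escapes but the result
    for item in items:
        groups.setdefault(key(item), []).append(item)
    return groups
```

Both functions are stateless, yet only `group_by` survives a change to how orders are represented, which is the kind of thing a module boundary tends to nudge you toward noticing.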
I think it's kind of bad that we have this trend to use hardware to enforce modularity. If it's a performance issue, sure, break it up into more hardware. If it's just code modularity, then by shifting to microservices you are adding the additional complexity of maintaining multiple services on top of modularizing the system. In short, it's overkill. This whole thing about using hardware to enforce "developer behavior" is stupid. You can use software to enforce developer behavior. Your operating system and your programming language are already "enforcing" developer behavior.
Additionally, your microservices are hard lines of modularization. It is very hard to change a module once it's been materialized because it's hardware.
If you think about it, almost all lack of modularity comes from shared mutable variables. Segregate mutability away from the core logic of your system and the smallest function in your architecture will become as modular as a microservice.
Really, any function that is stateless can be moved anywhere at any time and used anywhere without fear of it creating a permanent foothold in the architectural complexity of the system. So if the code is getting so structured that you become afraid of moving things, do this rather than go to microservices.
>We can more easily onboard new developers to just the parts immediately relevant to them, instead of the whole monolith.
Correct me if I'm wrong but don't folders and files and repos do this? Does this make sense to you that it has to be broken down into hardware?
>Instead of running the test suite on the whole application, we can run it on the smaller subset of components affected by a change, making the test suite faster and more stable.
Right because software could never do this in the first place. In order to test a quarter of my program in an isolated environment I have to move that quarter of my program onto a whole new computer. Makes sense.
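For what it's worth, selecting "the smaller subset of components affected by a change" can be done in-process from a dependency map alone. The following is a rough sketch under assumed inputs: the component names and the shape of the `dependencies` mapping are made up, and nothing here reflects how the article's tooling actually works.

```python
from collections import deque


def affected_components(dependencies: dict[str, set[str]], changed: set[str]) -> set[str]:
    """Return the changed components plus everything that depends on them,
    directly or transitively. `dependencies` maps component -> what it depends on."""
    # Invert the map: component -> components that depend on it.
    dependents: dict[str, set[str]] = {}
    for component, deps in dependencies.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(component)

    # Breadth-first walk over the inverted edges.
    affected = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected
```

A test runner could then run only the suites belonging to the returned set.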
>Instead of worrying about the impact on parts of the system we know less well, we can change a component freely as long as we’re keeping its existing contracts intact, cutting down on feature implementation time.
Makes sense because software contracts only exist as http json/graphql/grpc apis. The below code isn't a software contract it's only how old people do things:
int add(x: int, y: int)
Remember, as long as that add function doesn't mutate shared state, you know it has zero impact on any part of the system other than its output. You can replace it, copy it, or use it anywhere. This is really all you need to do to improve the modularity of your system.
Editing it on the other hand could have some issues. There are other ways to deal with this and simply copying the function, renaming and editing it is still a good solution. But for some reason people think the only way to deal with these problems is to put an entire computer around it as a wall. So whenever I need some utility function that's located on another system I have to basically copy it over (along with a million other dependencies) onto my system and rename it... wait a minute can't I do that anyway (without copying dependencies) if it was located in the same system?
>Again and again we pondered: How should components call each other?
I think this is what's tripping most people up. They think DI, IoC and OOP patterns are how you improve modularity. They're not. Immutable functions are what improve the modularity of your program. The more immutable functions you have, and the smaller they are, the more modular your program will be. Segregate IO and mutations into tiny auxiliary functions, away from your core logic, which is composed of pure immutable functions. That's really the only pattern you need to follow, and some languages can enforce this pattern without the need for "hardware."
>Circular dependencies are situations where for example component A depends on component B but component B also depends on component A.
I've never seen circular dependencies happen with pure functions; they're rare in practice. I think they occur with objects because when you want one method of an object, you have to instantiate that object, which has a bunch of other methods and, in turn, dependencies that could circle back to the object you're calling it from. In essence, this kind of thing tends to happen exclusively with objects. Don't group one function with the instantiation of other functions and you'll be fine.
Still, I've seen this issue occur with namespacing when you import files. Hardware isn't going to prevent this from happening. You need to structure your dependencies as a tree.
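Checking that dependencies are free of cycles is something software can do directly. A minimal sketch using the standard library's `graphlib` (Python 3.9+); the component names are hypothetical, and strictly speaking this verifies a directed acyclic graph, which is the property that matters here.

```python
from graphlib import CycleError, TopologicalSorter


def load_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return components ordered so each comes after everything it depends on.
    Raises CycleError if the dependencies contain a cycle."""
    return list(TopologicalSorter(dependencies).static_order())


def has_cycle(dependencies: dict[str, set[str]]) -> bool:
    """True when some component ends up depending on itself, directly or not."""
    try:
        load_order(dependencies)
    except CycleError:
        return True
    return False
```

Running `has_cycle` in CI on the import or component graph would flag the "A depends on B, B depends on A" case without any physical separation.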
They use the terms "breaking down a monolith" and "architecture," so from that you can infer that it's literally about using an entire VM or computer to enforce boundaries.
Folders and files are used in "monoliths" anyway. Nothing new to talk about that here. Are you implying that their monolith is just one big file and they're beginning the process of breaking that thing down into multiple files and different folders?
I don't know about you but that doesn't make any sense to me.
All right. I'm wrong. Didn't know this. Thanks for linking. Still can't exactly fault me on that. It's not easy to find the contextual blog post if this post doesn't easily say it's part of a series.
Still, my point stands: those are some hard lines that could easily be gotten rid of if your functions were immutable and not part of a class.
Any internal private function is safe to use anywhere in the system as long as it's not attached to a class and it doesn't modify shared state. If your systems were modelled this way there would be no need to really think about modularization as your subroutines are already modular.
For example:
class A:
    def constructor:
        // does a bunch of random shit
    def someMethodThatMutatesSomething() -> output

class B:
    def someOtherFunctionThatNeedsClassA:
        // cannot call someMethodThatMutatesSomething without doing "a bunch of random shit" or even possibly modifying or breaking something else. Modularity is harder to achieve with this pattern.

def somePureFunctionWithNoSideEffects(input) -> output
somePureFunctionWithNoSideEffects above does not need any hard lines of protection. There is zero need to use the antics of "deconstructing a monolith" if you structured things this way. Functions like this can be exposed publicly for use by anyone with literally zero issues.
Shared mutable state and side effects are really the key things that break modularity. Everyone misses this and comes up with strange ways to improve modularity by using "walls" everywhere. It's like cutting my car in half from left to right with a wall and calling it "modularization." When you find out that the engine in front actually needs the gas tank in back, you'll realize that the wall only produces more problems.
I think what's really unfortunate here is you started pretty pointed in what you were saying, and you've stayed pointed. It reads as confrontational.
It's unfortunate because you make a good point. Pure functions do not get the attention they deserve. However, no one will read that because you just sound like you're attacking for no real reason.
I'm only saying this because if you're this way here there is a solid chance you're like that in other areas of your life. What you have to say is important, but if you approach your conversations this way people won't listen.
Why did I take the time to write this? Because sometimes those closest to us won't give us the feedback we need.
Thanks. But this is the internet. I use a bit of aggression experimentally at times. Overall though, it sounds confrontational but I'm actually pretty factual and I never attacked anyone personally, it's all about the topic and idea. I actually admit when I'm wrong (see above, and who does that in life and on the internet?).
What's going on is I'm spending zero energy in attempting to massage the explanation with fake attempts to be nice. I'm just telling it like it is. Very few opportunities to do this in real life except on the internet.
In the company I work for, do I spend time telling my coworkers that pure functions are the key to modularity when classes and design patterns are ingrained in the culture? Do I tell them that their entire effort to move to microservices is motivated by hype and is really a horizontal objective with no actual benefit? No. I don't. People tend to dismiss things they don't agree with unless it's aggressively shoved in their face. They especially don't agree with ideas that go against the philosophies and practices they've been following for years and years.
Thus if I'm nice about it, I'm ignored, if I'm vocal and aggressive about it, I'm heard but it will also hurt my reputation. It's HN feel free to experiment just don't try it at work.
Yeah, my attitude isn't the best, but honestly, if I was nice about it, fewer people would read this or think about it. By doing this on the internet I can raise a point while not ruining my rep. (And I'm not actually aggressive, as there are no personal attacks unless someone said something personal about me.)
Tell me, in your opinion, how would you get such a point across in a culture where the opposite is pretty ingrained? I'm down to try this, I can repost my original post with the errors corrected and a nicer tone to see the response.
I appreciate the point you're trying to make, but the truth is that you can make factual arguments without being so aggressive. Whether the aggression is targeted at a person doesn't really matter. It's unnecessary, disrespectful, and just feeds into the general toxicity that plagues our culture.
> Thus if I'm nice about it, I'm ignored, if I'm vocal and aggressive about it, I'm heard but it will also hurt my reputation.
I think the fact we are talking about your tone and not your points about functional programming speaks to this by itself. You weren't heard. You were felt, though.
> I'm not actually aggressive as there are no personal attacks
Aggression without a target is still aggression. If I aggressively take the recycling out, that aggression is still experienced by people around me. Probably my partner, who will inevitably have a little talk with me about it, lol.
> Tell me, in your opinion, how would you get such a point across in a culture where the opposite is pretty ingrained?
Engage in an intellectual conversation based on mutual respect. You will never change someone's mind on the spot; intellectual people will often mull things over for a while. In the process you may learn a few things yourself. I've worked in places that excelled at this, where respectful discourse was promoted. Conversations revolved around facts, but respect was maintained.
Sidebar: Shopify doesn't really have microservices. They have a few services, but they are entire services which serve an entire business unit. They are the exception. When I worked there I worked on one such service. What I'd tell people is if you couldn't start a whole new company with the service you were building, don't build it as a service.
I think you missed my point. I'm saying when you aren't aggressive people tend not to want to intellectually engage with you. People are emotional creatures and what doesn't excite them emotionally they don't engage. I'm saying I used the aggression on purpose for my own ends, but I caveated by saying that no actual attack occurred.
I think you need to think deeper than the traditional "mutual respect" attitude and generally being nice. Not all great leaders acted this way either. How to get people to change or listen is very nuanced and complicated. The internet is an opportunity to try things out rather than take the safe, uncomplicated "nice" way that we usually try in the workplace.
>Engage in an intellectual conversation based on mutual respect. You will never change someone's mind on the spot; intellectual people will often mull things over for a while. In the process you may learn a few things yourself. I've worked in places that excelled at this, where respectful discourse was promoted. Conversations revolved around facts, but respect was maintained.
Right except this is exceedingly rare. Most people do not act this way. Respect was maintained but the point is instantly forgotten and dismissed. Likely the respect covers up actual misunderstanding or disagreement. I find actual intense arguments open people up to say what they mean rather than cover up everything in gift wrapping.
Think about it this way. The reason why Trump won the election is not because he was nice. The complexities of human relationships go deeper than just "mutual respect." There are other ways to make things move. The internet is often an opportunity for you to try the alternative methods without much risk.
>I think the fact we are talking about your tone and not your points about functional programming speaks to this by itself. You weren't heard. You were felt, though.
The world moves through feelings. Not for all cases but oftentimes to get heard you need to get "felt" first.
> >I think the fact we are talking about your tone and not your points about functional programming speaks to this by itself. You weren't heard. You were felt, though.
> The world moves through feelings. Not for all cases but oftentimes to get heard you need to get "felt" first.
This is true, but you have options in terms of what feeling you're aiming for.
There is a world of difference in the response you're likely to get from "When Z you should do X because Y" vs. "We had a Z problem, it turns out that Y was the issue, so we did X."
The former will probably get you an "uh-oh" and the latter an "a-ha" or "hmm". Big difference.
Just because a function is pure doesn't mean there is zero-risk in exposing it publicly. You're conflating complexity in managing state with complexity in managing domain boundaries.
A tangled web of function calls can be very confusing to work with, regardless of purity.
From a purely structural standpoint there is no risk. But you are talking about something different. You use the word "confusion."
Confusion is an organizational issue that can be handled with social solutions like names, namespaces and things like that. You can compose functions to form higher order functions with proper naming to make sense of things. So for example if you have 30 primitive functions you can compose smaller components into 10 bigger functions in a higher layer and expose that as an api. This is more of a semantical thing as you can still use the lower level primitives as a library and chain those lower level functions to achieve the same goal as using the higher level api, the higher level functions just make it easier to reason about the complexity.
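As a small illustration of composing primitives into a named higher-level function, here is a sketch. The helper names and the `slugify` example are invented for this purpose; the primitives stay usable on their own, and the composed function only gives the chain a name.

```python
from functools import reduce
from typing import Callable


def compose(*steps: Callable[[str], str]) -> Callable[[str], str]:
    """Chain single-argument functions, applying them left to right."""
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)


# Low-level primitives: each one is pure and usable by itself.
def drop_punctuation(text: str) -> str:
    return "".join(ch for ch in text if ch.isalnum() or ch.isspace())


def collapse_whitespace(text: str) -> str:
    return " ".join(text.split())


def hyphenate(text: str) -> str:
    return text.replace(" ", "-")


# Higher-level API: a readable name for one particular chain of primitives.
slugify = compose(drop_punctuation, collapse_whitespace, str.lower, hyphenate)
```

Calling `slugify(title)` and chaining the four primitives by hand give the same result; the higher layer only makes the intent easier to read.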
Confusion, semantics and organization are in a sense social issues that are solved by social solutions like proper naming, grouping and composing. I'm not dismissing these issues (they are important), but I'm saying they are in a different category.
Overall though the problem I am addressing is structural. There are real structural issues that occur if your functions are not pure. When 4 methods operate on shared state in a class all four methods become glued together. You cannot decompose or recompose these functions ever. They cannot be reused without instantiating all the baggage that comes with the class.
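Here is a sketch of what "glued together" looks like in practice, next to a pure equivalent. The class and function names are hypothetical. In the class version the methods communicate through `self`, so the result depends on the order in which they are called; the pure versions can be reused one at a time.

```python
class ReportBuilder:
    """Methods share mutable state, so they only work as a group, in order."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.total = 0

    def drop_negative(self):
        self.rows = [row for row in self.rows if row >= 0]

    def compute_total(self):
        self.total = sum(self.rows)

    def render(self):
        # Silently wrong if compute_total() has not been called yet.
        return f"{len(self.rows)} rows, total {self.total}"


# Pure equivalents: each takes its inputs explicitly and returns a new value.
def drop_negative(rows: tuple[int, ...]) -> tuple[int, ...]:
    return tuple(row for row in rows if row >= 0)


def render_report(rows: tuple[int, ...]) -> str:
    return f"{len(rows)} rows, total {sum(rows)}"
```

With the pure versions, `render_report(drop_negative(rows))` is one valid composition among many, and neither function needs the other to exist.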
You can't talk about modularity without touching on shared mutable state. Shared mutable state is the fundamental primitive that eliminates modularity. Get rid of it and your entire program is now modular.
None of the writing really gets deep into this so I assume the author doesn't know.
It's not "mansplaining" you social justice warrior. I don't even know the sex of the author and I don't care. Don't turn this into some sex based conflict. It's called explaining, and that's all it is.
I'm assuming you don't know about it either so I suggest you read my "explanation" as well.