exterm · 2026-01-14T22:25:18 1768429518

exterm · on Jan 1, 2024

I've encountered a lot of confusion on this topic and hope people find this article helpful.

Also if you find any mistakes I'd be happy to correct them!

exterm · on Sept 17, 2020

> When I left, we were up to a few dozen components, and the number was climbing rapidly.

I should have included this in the blog post: The number of components _needs_ to be kept small. Shopify's main monolith is 2.8 million lines of code in 37 components, and I'd actually like to get that number _down_.

I like to compare this to the main navigation that we present to our merchants. It's useful if it has 8 entries. It's not useful if it has 400.

In a way, components are the main navigation to our code base. A developer should be able to look at what's in our "components" folder and get a general impression of what the system's capabilities are.

JohnBooty · on Sept 18, 2020

That's an excellent (and hard-earned, I'm sure!) insight. Thank you.

    I like to compare this to the main navigation that we 
    present to our merchants. It's useful if it has 8 
    entries. It's not useful if it has 400.

Yeah, we essentially wound up with a "junk drawer" of components. I could see a lot of companies, like ours, making that mistake -- turning all the things into components.

As you said in the article, one of the benefits of components for you was that it truly forced you to think about a proper separation of concerns. In hindsight, that's an area where we really missed the mark for a variety of reasons, some methodology-related.

We practiced a rather strict version of Scrum. Management paid a lot of attention to our velocity from week to week: we needed to rack up those story points.

But, outside of the tiny team dedicated to the component effort, there were no story points to be had for supporting that effort. Therefore we were in fact incentivized not to support it. I remember one sprint where I did some refactoring work in order to achieve a better separation of concerns. It negatively affected our velocity for the week and that was noticed.

So, we were receiving a schizophrenic message from management. We were all to support the component effort.... but on our own time, apparently?

exterm · on Sept 17, 2020

You should read the first post in the series if you want to read about microservices: https://engineering.shopify.com/blogs/engineering/deconstruc...

exterm · on Sept 17, 2020

It's all tradeoffs. You get a stronger boundary, but you also get a distributed system.

Also, the first try of drawing boundaries will always be varying degrees of wrong. If you have very strong boundaries at this stage, iterating on them, moving responsibilities around, can be harder.

Also, with the right tooling it's definitely possible to harden monolith internal boundaries to a comparable level.

I can see though how many smaller companies would not be in a position to build that tooling.

Anyway... there is no either / or here, as I've explained in another comment. What if you have components within a monolith, but each component has its own database, for example? What if test suites are completely isolated, so that tests for component A can not access code in component B?

You can get pretty strong boundaries with a few comparably simple tricks.
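One example of such a simple trick, as a minimal sketch in plain Ruby: hide a component's internals with `private_constant` so that only its public entry point is reachable from outside. The `Billing` and `Gateway` names are hypothetical and this is not necessarily how Shopify does it; it only shows that the language itself offers some boundary enforcement inside one process.

```ruby
# Hypothetical component; names are illustrative only.
module Billing
  # Public entry point: the only thing other components are meant to call.
  def self.charge(amount_cents)
    Gateway.new.submit(amount_cents)
  end

  # Internal detail of the component.
  class Gateway
    def submit(amount_cents)
      { status: :ok, amount_cents: amount_cents }
    end
  end

  # Hide the constant: referencing Billing::Gateway from outside the
  # module now raises NameError, while code inside Billing can still use it.
  private_constant :Gateway
end

receipt = Billing.charge(1_500)
```

Code outside `Billing` can call `Billing.charge`, but a direct reference to `Billing::Gateway` fails at runtime, so accidental coupling to the internals shows up quickly.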

exterm · on Sept 17, 2020

you don't work with Ruby eh? :D

WJW · on Sept 17, 2020

The lack of compile time in the ruby world really makes it difficult to do a lot of work there. :P

There's a nice Ruby trick, by the way: put significant precalculations in constants. Since the value of a constant gets computed when the file is loaded at program startup, it still allows you to do work "up front" instead of during a web request.
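A minimal sketch of that trick, assuming a long-lived process that loads the file once. The `Squares` module and its table are hypothetical stand-ins for any expensive precalculation.

```ruby
# Hypothetical example: a lookup table built once, when the file is loaded.
module Squares
  # This expression is evaluated a single time at load; the frozen result
  # is then reused for the lifetime of the process.
  TABLE = (0..1_000).map { |n| [n, n * n] }.to_h.freeze

  # Work done during a request is now just a hash lookup.
  def self.of(n)
    TABLE.fetch(n)
  end
end
```

Calling `Squares.of(12)` only reads from the precomputed hash; the cost of building it was paid at load time.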

mhoad · on Sept 17, 2020

I never knew that but that IS a cool trick.

skipants · on Sept 17, 2020

I just want to caveat this as it is not a Ruby construct, it's part of Ruby web servers. Because they are long-living Ruby processes they are only loading files once (I suppose it's similar to compiling). This means it runs all globally-scoped code, which class definitions and constants (generally) are. That's actually what Ruby bootloaders like Spring and Zeus are doing on your dev machine to speed up the load time when you use Rails commands. They cache all that globally run stuff in their own process. It's also why they run into a bunch of issues when you have logic in your constant definitions.

owyn · on Sept 17, 2020

Yep, that's a good trick. At a previous PHP shop we had a large amount of static XML configuration (well it was generated, but not that often). Converting it all to PHP arrays and including it was significantly faster than parsing the XML on each request, and then PHP would cache that result too. Re-running the XML->PHP tool just caused it to re-include/cache these giant arrays of static config. It worked great. I mean, arguments about whether that was a good design or not aside...

(edit to reply since I can't reply to a reply to a reply)

Yep, it is very common in lisp/smalltalk environments to dump the state of the world to disk and re-load it later. This is one of those tricks that gets relearned every generation. :)

For bonus credit apply this analogy to docker images. :)

JohnBooty · on Sept 17, 2020

I did this in PHP once as well. I had to code a coupon lookup site where people entered coupon codes and they were verified against a database. I forget how many coupons there were... pretty sure it was less than 100,000.

Anyway, I coded it up in my local dev environment. Unfortunately, it turned out that I'd been misled and the actual deployment environment didn't have a database server available.

In desperation and facing a deadline, I dumped all the lookup values into an array in a PHP file. As you said, it was really quite performant. The first request after starting the server was a bit slow (but not too bad... still < 10 seconds I think) and after that things were golden.

I felt a bit dirty, but things worked and we got paid.

im3w1l · on Sept 17, 2020

I heard emacs did that but went one step further, they dump the memory of the interpreter post-init* and just load it into memory when starting.

* Some early step in the init process. Many things are still interpreted at init.

andreareina · on Sept 18, 2020

"emacs unexec" is the search term if you're looking for it.

https://news.ycombinator.com/item?id=21394916

https://lwn.net/Articles/707615/ "The Emacs dumper dispute"

https://lwn.net/Articles/673724/ "Removing support for Emacs unexec from Glibc"

shawnz · on Sept 18, 2020

What? Don't global variables exist in pretty much every language on the planet? This isn't a "trick", it is a bad practice which should be avoided in any language.

Imagine describing globals in C, Python, or JavaScript, or static fields in Java or C# as a "neat trick"...

aantix · on Sept 17, 2020

In Ruby, the class definition is code as well.

imhoguy · on Sept 18, 2020

Where are they not code? I think you meant that a Ruby class is defined at runtime by sequential imperative or functional code. Ha! You can build a class from Ruby code in a string, too. Lots of choices.
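Both points can be sketched in a few lines of plain Ruby. All names here (`LOAD_LOG`, `Greeter`, `FromString`) are hypothetical; the example only illustrates that a class body is executed code and that a class can be defined from a string.

```ruby
# Hypothetical names throughout.
LOAD_LOG = []

class Greeter
  # Ordinary statements run while the class is being defined.
  LOAD_LOG << "Greeter body executed"

  # Methods can be generated in a loop at definition time.
  %w[hello goodbye].each do |word|
    define_method(word) { |name| "#{word}, #{name}" }
  end
end

# A class can also be defined by evaluating a string of Ruby source.
Object.class_eval(<<~RUBY)
  class FromString
    def answer
      42
    end
  end
RUBY
```

After this runs, `Greeter.new.hello("Ada")` and `FromString.new.answer` both work, even though neither method was written as a conventional `def` at the top level of a class file.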

exterm · on Sept 17, 2020

Hey Leafboi - I recommend reading the first post in the series for some background https://engineering.shopify.com/blogs/engineering/deconstruc...

We don't use "hardware" or "VMs" to facilitate modularity.

leafboi · on Sept 17, 2020

All right. I'm wrong. Didn't know this. Thanks for linking. Still can't exactly fault me on that. It's not easy to find the contextual blog post if this post doesn't easily say it's part of a series.

Still, my explanation remains relevant: those are hard lines that could easily be gotten rid of if your functions were pure and not part of a class.

Any internal private function is safe to use anywhere in the system as long as it's not attached to a class and it doesn't modify shared state. If your systems were modelled this way there would be no need to really think about modularization as your subroutines are already modular.

For example:

    class A:
        def constructor(self):
            # does a bunch of random shit
            ...

        def someMethodThatMutatesSomething(self) -> output:
            # mutates state owned by the instance
            ...


    class B:
        def someOtherFunctionThatNeedsClassA(self):
            # Cannot call someMethodThatMutatesSomething without first doing
            # "a bunch of random shit" in A's constructor, or even possibly
            # modifying or breaking something else.
            # Modularity is harder to achieve with this pattern.
            ...

versus:

   def somePureFunctionWithNoSideEffects(input) -> output

somePureFunctionWithNoSideEffects above does not need any hard lines of protection. There is zero need to use the antics of "deconstructing a monolith" if you structured things this way. Functions like this can be exposed publicly for use by anyone with literally zero issues.

Shared mutable state and side effects are really the key things that break modularity. Everyone misses it and comes up with strange ways to improve modularity by using "walls" everywhere. It's like cutting my car in half from left to right with a wall and calling it "modularization." When you find out that the engine in front actually needs the gas tank in back then you'll realize that the wall only produces more problems.
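For readers who think in Ruby, here is a sketch of the contrast this comment draws. The `Cart` class and `Tax` module are hypothetical, and the example only illustrates the structural difference being claimed, not a verdict on it.

```ruby
# Stateful version: total_with_tax only makes sense after add has
# mutated the instance, so the two methods are tied together.
class Cart
  def initialize
    @prices_cents = []
    @total_cents = 0
  end

  def add(price_cents)
    @prices_cents << price_cents
    @total_cents += price_cents
    self
  end

  def total_with_tax(tax_percent)
    @total_cents + @total_cents * tax_percent / 100
  end
end

# Pure version: the result depends only on the arguments, so the
# function can be called from anywhere without setting anything up.
module Tax
  def self.total_with_tax(prices_cents, tax_percent)
    subtotal = prices_cents.sum
    subtotal + subtotal * tax_percent / 100
  end
end
```

Both versions compute the same number; the difference is that `Tax.total_with_tax` needs no object to be constructed or mutated first.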

richardlblair · on Sept 17, 2020

I think what's really unfortunate here is you started pretty pointed in what you were saying, and you've stayed pointed. It reads as confrontational.

It's unfortunate because you make a good point. Pure functions do not get the attention they deserve. However, no one will read that because you just sound like you're attacking for no real reason.

I'm only saying this because if you're this way here there is a solid chance you're like that in other areas of your life. What you have to say is important, but if you approach your conversations this way people won't listen.

Why did I take the time to write this? Because sometimes those closest to us won't give us the feedback we need.

leafboi · on Sept 17, 2020

Thanks. But this is the internet. I use a bit of aggression experimentally at times. Overall though, it sounds confrontational but I'm actually pretty factual and I never attacked anyone personally, it's all about the topic and idea. I actually admit when I'm wrong (see above, and who does that in life and on the internet?).

What's going on is I'm spending zero energy in attempting to massage the explanation with fake attempts to be nice. I'm just telling it like it is. Very few opportunities to do this in real life except on the internet.

In the company I work for do I spend time to tell my coworkers that pure functions are the key to modularity when classes and design patterns are ingrained in the culture? Do I tell them that their entire effort to move to microservices is motivated by hype and is really a horizontal objective with no actual benefit? No. I don't. People tend to dismiss things they don't agree with unless it's aggressively shoved in their face. They especially don't agree with ideas that go against the philosophies and practices they've been following for years and years.

Thus if I'm nice about it, I'm ignored; if I'm vocal and aggressive about it, I'm heard, but it will also hurt my reputation. It's HN: feel free to experiment, just don't try it at work.

Yeah, my attitude isn't the best, but honestly, if I was nice about it, fewer people would read this or think about it. By doing this on the internet I can raise a point while not ruining my rep. (And I'm not actually aggressive, as there are no personal attacks unless someone said something personal about me.)

Tell me, in your opinion, how would you get such a point across in a culture where the opposite is pretty ingrained? I'm down to try this, I can repost my original post with the errors corrected and a nicer tone to see the response.

richardlblair · on Sept 17, 2020

I appreciate the point you're trying to make, but the truth is that you can make factual arguments without being so aggressive. Whether the aggression is targeted at a person doesn't really matter. It's unnecessary, disrespectful, and just feeds into the general toxicity that plagues our culture.

> Thus if I'm nice about it, I'm ignored, if I'm vocal and aggressive about it, I'm heard but it will also hurt my reputation.

I think the fact we are talking about your tone and not your points about functional programming speaks to this by itself. You weren't heard. You were felt, though.

> I'm not actually aggressive as there are no personal attacks

Aggression without a target is still aggression. If I aggressively take the recycling out, that aggression is still experienced by people around me. Probably my partner, who will inevitably have a little talk with me about it, lol.

> Tell me, in your opinion, how would you get such a point across in a culture where the opposite is pretty ingrained?

Engage in an intellectual conversation based on mutual respect. You will never change someone's mind on the spot; intellectual people will often mull things over for a while. In the process you may learn a few things yourself. I've worked in places that excelled at this, where respectful discourse was promoted. Conversations revolved around facts, but respect was maintained.

Sidebar: Shopify doesn't really have microservices. They have a few services, but they are entire services which serve an entire business unit. They are the exception. When I worked there I worked on one such service. What I'd tell people is if you couldn't start a whole new company with the service you were building, don't build it as a service.

leafboi · on Sept 17, 2020

I think you missed my point. I'm saying when you aren't aggressive people tend not to want to intellectually engage with you. People are emotional creatures and what doesn't excite them emotionally they don't engage. I'm saying I used the aggression on purpose for my own ends, but I caveated by saying that no actual attack occurred.

I think you need to think deeper than the traditional "mutual respect" attitude and generally being nice. Not all great leaders acted this way either. It's very nuanced and complicated how to get people to change or listen. The internet is an opportunity to try things out rather than take the safe uncomplicated "nice" way that we usually try in the workplace.

>Engage in an intellectual conversation based on mutual respect. You will never change someone's mind on the spot; intellectual people will often mull things over for a while. In the process you may learn a few things yourself. I've worked in places that excelled at this, where respectful discourse was promoted. Conversations revolved around facts, but respect was maintained.

Right except this is exceedingly rare. Most people do not act this way. Respect was maintained but the point is instantly forgotten and dismissed. Likely the respect covers up actual misunderstanding or disagreement. I find actual intense arguments open people up to say what they mean rather than cover up everything in gift wrapping.

Think about it this way. The reason why Trump won the election is not because he was nice. The complexities of human relationships go deeper than just "mutual respect." There are other ways to make things move. The internet is often an opportunity for you to try the alternative methods without much risk.

>I think the fact we are talking about your tone and not your points about functional programming speaks to this by itself. You weren't heard. You were felt, though.

The world moves through feelings. Not for all cases but oftentimes to get heard you need to get "felt" first.

webmaven · on Sept 18, 2020

> >I think the fact we are talking about your tone and not your points about functional programming speaks to this by itself. You weren't heard. You were felt, though.

> The world moves through feelings. Not for all cases but oftentimes to get heard you need to get "felt" first.

This is true, but you have options in terms of what feeling you're aiming for.

There is a world of difference in the response you're likely to get from "When Z you should do X because Y" vs. "We had a Z problem, it turns out that Y was the issue, so we did X."

The former will probably get you an "uh-oh" and the latter an "a-ha" or "hmm". Big difference.

modal-soul · on Sept 17, 2020

Just because a function is pure doesn't mean there is zero-risk in exposing it publicly. You're conflating complexity in managing state with complexity in managing domain boundaries.

A tangled web of function calls can be very confusing to work with, regardless of purity.

leafboi · on Sept 17, 2020

From a purely structural standpoint there is no risk. But you are talking about something different. You use the word "confusion."

Confusion is an organizational issue that can be handled with social solutions like names, namespaces and things like that. You can compose functions to form higher order functions with proper naming to make sense of things. So for example if you have 30 primitive functions you can compose smaller components into 10 bigger functions in a higher layer and expose that as an api. This is more of a semantical thing as you can still use the lower level primitives as a library and chain those lower level functions to achieve the same goal as using the higher level api, the higher level functions just make it easier to reason about the complexity.

Confusion, Semantics and organization is in a sense a social issue that is solved by social solutions like proper naming, grouping and composing. I'm not dismissing these issues (they are important) but I'm saying they are in a different category.

Overall though the problem I am addressing is structural. There are real structural issues that occur if your functions are not pure. When 4 methods operate on shared state in a class all four methods become glued together. You cannot decompose or recompose these functions ever. They cannot be reused without instantiating all the baggage that comes with the class.
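The layering described in this comment (small pure primitives composed into a higher-level function that is exposed as the API) could look roughly like this in Ruby. The pricing rules and constant names are made up for illustration; `Proc#>>` requires Ruby 2.6 or later.

```ruby
# Hypothetical low-level primitives: each is pure and usable on its own.
SUBTOTAL = ->(items) { items.sum { |i| i[:price_cents] * i[:quantity] } }
APPLY_DISCOUNT = ->(cents) { cents >= 10_000 ? cents * 90 / 100 : cents }
ADD_TAX = ->(cents) { cents + cents * 13 / 100 }

# Higher-level function built purely by composition (left to right).
CHECKOUT_TOTAL = SUBTOTAL >> APPLY_DISCOUNT >> ADD_TAX

items = [
  { price_cents: 5_000, quantity: 2 },
  { price_cents: 2_500, quantity: 1 }
]
total = CHECKOUT_TOTAL.call(items)
```

Callers who want the convenient API use `CHECKOUT_TOTAL`; callers with unusual needs can still chain the primitives themselves, which is the property the comment says is lost once methods share instance state.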

mperham · on Sept 17, 2020

I don't think you need to mansplain architecture to the blog post author.

leafboi · on Sept 17, 2020

You can't talk about modularity without touching on shared mutable state. Shared mutable state is the fundamental primitive that eliminates modularity. You get rid of this, your entire program is now modular.

None of the writing really gets deep into this so I assume the author doesn't know.

It's not "mansplaining" you social justice warrior. I don't even know the sex of the author and I don't care. Don't turn this into some sex based conflict. It's called explaining, and that's all it is.

I'm assuming you don't know about it either so I suggest you read my "explanation" as well.

exterm · on Sept 17, 2020

As the author, I would know :)

Thank you for the praise.

Ours kind of organically grew over time, but as I've been keeping it alive for the last few years I have a pretty good idea of how I would start it fresh.

You probably have some people in the company who either know much more about architecture than others, or are working on projects that are more interesting in terms of architecture. Find one of them, convince them to give a 15 min talk.

Announce the talk widely within the company, tell people to come to the new "architecture guild" slack channel you created to get the details / invites.

Schedule an hour to give plenty of time for discussions after the talk.

Repeat biweekly.

sandGorgon · on Sept 17, 2020

Thanks for replying.

How would you do it in a remote-first world? A Zoom talk?

How does this go beyond that one talk? Would you incorporate aspects of this into official rewards/recognition?

Or is gratification good enough? Getting a Zoom audience is gonna be hard.

exterm · on Sept 18, 2020

Shopify has been a fully remote company for a few months now. https://financialpost.com/technology/shopify-is-joining-twit...

We're not using Zoom, but Google Meet. But yes, these happen completely online now.

I find that people that are doing interesting stuff often _want_ to talk about it. However, a big part of Shopify culture is "do things, tell people" - it is definitely encouraged to spend time spreading context.

It's not directly part of any rewards framework, but one metric that goes into promotions is the area of impact. By giving a talk to the guild, you can have impact on a group that's larger than your team, potentially the whole organization. It counts.

But another reward is the positive feedback, interesting discussions and new connections that you make through this.

exterm · on Sept 17, 2020

It's certainly related. In very general terms, I would say splitting a Rails app into multiple engines is the same pattern as umbrella applications.

However, there are more interesting specifics here about things like all engines sharing a database, but having exclusive ownership of tables, as well as splitting HTTP routing over multiple engines etc.
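To make "exclusive ownership of tables" concrete, here is a deliberately simplified sketch. The `TableOwnership` class, the component names, and the table names are all hypothetical; this is not how Shopify enforces ownership, only one way the idea could be checked in plain Ruby.

```ruby
# Hypothetical registry: every table has exactly one owning component.
class TableOwnership
  class Violation < StandardError; end

  def initialize(owners)
    @owners = owners.dup.freeze
  end

  # Returns true when the component owns the table, raises otherwise.
  # An unknown table raises KeyError via Hash#fetch.
  def check!(component, table)
    owner = @owners.fetch(table)
    return true if owner == component

    raise Violation, "#{component} may not access #{table} (owned by #{owner})"
  end
end

OWNERSHIP = TableOwnership.new("orders" => :checkout, "shipments" => :delivery)
```

A check like `OWNERSHIP.check!(:delivery, "orders")` raises, while `OWNERSHIP.check!(:checkout, "orders")` passes; in a real application such a check would have to be wired into the data access layer or a test suite.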

exterm · on Sept 17, 2020

The answer to that question could probably fill another blog post :D

Long story short, Rails and dependency inversion equals lots of friction. The whole framework is built on the assumption that it's OK to access everything from everywhere, and over the years we've built lots of tooling on top of those assumptions.

E.g. we heavily use https://github.com/Shopify/identity_cache with active record associations that cross component boundaries.

We also have a GraphQL implementation that is pretty closely coupled to the active record layer and _really_ wants to reach into all the components directly.

All of those problems can be overcome, but this is definitely an area where we have to work against "established" Rails culture, and our own assumptions from the past.
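For readers unfamiliar with the term, dependency inversion in plain Ruby (outside of Rails and its conventions) can be sketched like this. `Checkout::OrderTotal` and `Taxes::FlatRate` are hypothetical names; the point is only that the checkout code depends on an interface (`#tax_for`) rather than reaching into another component directly.

```ruby
module Checkout
  class OrderTotal
    # Depends on any object that responds to #tax_for(cents); it does not
    # reference the Taxes component by name.
    def initialize(tax_calculator)
      @tax_calculator = tax_calculator
    end

    def call(subtotal_cents)
      subtotal_cents + @tax_calculator.tax_for(subtotal_cents)
    end
  end
end

module Taxes
  # One possible implementation of the #tax_for interface.
  class FlatRate
    def initialize(percent)
      @percent = percent
    end

    def tax_for(cents)
      cents * @percent / 100
    end
  end
end

# The wiring happens at the edge of the application.
total = Checkout::OrderTotal.new(Taxes::FlatRate.new(13)).call(10_000)
```

The friction mentioned above comes from the fact that typical Rails code does the opposite: models, associations, and caches refer to each other directly by constant name.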

straws · on Sept 17, 2020

I hope to hear more in the future!

Do you envision any extension points to the way engines are implemented that could better enforce boundaries? In our engines, there was nothing that referenced another engine's resources, leaving the main application to handle route mapping and ActiveRecord associations between app models and engine's models.

I feel like the use-case for engines has long been around supporting framework like functionality (Devise, Spree, etc), but I wonder if there are changes to be made that better support modularization for large apps.

exterm · on Sept 17, 2020

> extension points to the way engines are implemented that could better enforce boundaries

Can you expand on that? I'm not sure I follow.

sandGorgon · on Sept 17, 2020

What's the difference between "componentization+engines" and microservices?

From a deployment perspective are your engines deployed and scaled independently?

exterm · on Sept 17, 2020

Components are:

- same database
- same runtime
- same deployment
- same repository

That said, I don't think this is an either/or. It's a spectrum. You can have components within the same runtime and repository that have separate databases, or components that are using the same database but live in separate repos, etc.

From one monolithic app towards fully separated microservices is a spectrum, and I think developers should be enabled to move freely around that spectrum.

sandGorgon · on Sept 17, 2020

I think components are the better option, because they allow for separation of concerns without introducing deployment complexity or, worse, political complexity.

I call them Micro-SDKs.