Upgrading GitHub to Rails 3 with Zero Downtime

ics · on Sept 15, 2014

    For those of you keeping score:
    
    - Yes, Rails 3 was released four years ago
    - Yes, the current stable version is Rails 4.1, which left us
      two major versions behind
    
    We had work to do in order to live in the modern world again.

Okay, so why aren't they transitioning to Rails 4? I'm not clued in to much more than the version numbers, so I suppose there are reasons that go a little deeper than 'the lowest version that works with the gems we want'. They've been working on the transition for six months according to the post, making it recent enough that 4 would be the 'obvious' choice unless there were fears that it wouldn't be stable (IIRC Rails 4 has only been so for a few months).

donw · on Sept 16, 2014

An important lesson I learned while working as a sysadmin was that when you're making operational changes, you do one thing at a time.

Let's say you need to set up a replicated database with failover and monitoring.

First you set up the database (and make sure it works).

Then you set up monitoring (and make sure it works, too).

Followed by replication (which should also work).

And finally, automatic failover (there's a pattern here...)

This sounds obvious, but I've worked with developers that, when given this task, tried to set up the entire thing in one shot. It took them over a month to wrap up, and they didn't have time to properly test it, because they bit off too much in one go.

The same goes for migrations. You work along an incremental plan to move from Point A to Point B, where at every step of the way, the set of things you need to change is small enough to manage, and you have the option of rolling back.

I can't imagine that Github has a tiny codebase, so I could totally believe six months for a zero-downtime Rails 2.x -> 3.0 migration. The next big step (3.0 -> 3.2) will be easier from what they learned in the first step, as will the step after that (my guess is 3.2 -> 4.0).

bduerst · on Sept 16, 2014

That rule applies to more than just programming languages.

The U.S. Healthcare system has tried to switch to ICD10 standards for a while now, but they keep bumping back the deadline. I think it's EOY 2015 right now.

ICD10 was released in 1992...

danudey · on Sept 16, 2014

My first task at my current job was to build a deployment system where people could deploy code using a website. After two and a half years, version 1.0 was essentially feature-complete (as far as minimal necessary components).

Those two and a half years were spread over dozens of tools, a svn->git migration, two rewrites of the core functionality, but it all stayed on track because I was building small pieces which did limited things, and chained them together (the unix-y philosophy). I could build some functionality, test it, integrate it into the tool, and use it for a while. Then I could add support for it to our distributed RPC-ish system, and then if that worked, I could add it to our automated scripts.

If I had to build the whole thing at once I probably would have ended up with a much nicer, cleaner design, a much worse product, and a far longer deadline, and we wouldn't have had useful automation tools for the two intervening years.

tcopeland · on Sept 16, 2014

Yup. Another advantage of doing things incrementally is you get used to doing things incrementally. So you get good at it, and the business folks get used to hearing "we're upgrading from x to y", and tech folks get used to thinking about the next piece that needs to be upgraded, etc.

jeremycw · on Sept 16, 2014

2.3 to 3.0 is not that bad. 3.0 -> 3.2 is where the trouble will begin.

3.0 deprecates a ton of stuff in 2.3. 3.2 actually removes it all and adds in the asset pipeline.

Best to not bite off more than you can chew.

jrochkind1 · on Sept 16, 2014

In my experience, Rails 2->3 was, most decidedly, _that bad_. It was the most difficult and frustrating migration I've ever done, and none of the subsequent Rails migrations have been close to as terrible.

(The fact that 2->3 was, like for most, also ruby 1.8 to 1.9 probably contributed)

dasil003 · on Sept 16, 2014

For me the 3.0 -> 3.1 transition was definitely the more painful one, because it forced me to also do 1.8 -> 1.9 at the same time due to a performance regression that made the test suite jump from 20 minutes to 2.5 hours under 1.8.

sams99 · on Sept 16, 2014

GitHub are running 2.1.2 GitHub edition Ruby so that is a non-issue for them.

2 -> 3 migration is VERY VERY hard. 3 -> 4 should be relatively easy in comparison.

thedaniel · on Sept 16, 2014

We had already backported a whole bunch of stuff to 2.3 so hopefully the 3->3.2 transition won't be so terrible.

elektronaut · on Sept 16, 2014

Upgrading to 3.0 is a nice milestone, especially if you're porting over a large app.

ActiveRecord 3.x still supports the query interface used in 2.3, I'm guessing this is how they're able to run both versions from the same codebase.

I'm not sure which Ruby version GitHub is on, but Rails 4 requires Ruby 1.9+. Running Rails 2.3 on Ruby 1.9+ is doable but tricky.

alttab · on Sept 16, 2014

Confirmed with 2.3 and 1.9+. We had a ton of encoding issues with that combination. We had to ensure every single thing was encoded properly.
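
A minimal sketch of the kind of encoding check described above, assuming Ruby 1.9+ string semantics. The helper name `to_utf8` and the ISO-8859-1 fallback are illustrative assumptions, not something the thread specifies:

```ruby
# Hypothetical helper: return a valid UTF-8 copy of `raw`.
# The bytes are first interpreted as UTF-8. If that is invalid, we ASSUME
# the input was ISO-8859-1 and transcode it; a real app would need to know
# the actual source encoding.
def to_utf8(raw)
  text = raw.dup.force_encoding(Encoding::UTF_8)
  return text if text.valid_encoding?

  raw.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end

latin1_bytes = "caf\xE9".b          # "café" as raw ISO-8859-1 bytes
puts to_utf8(latin1_bytes)          # prints "café"
puts to_utf8(latin1_bytes).encoding # prints "UTF-8"
```

Under Ruby 1.8 strings were plain byte arrays, so none of these checks existed; under 1.9+ every string carries an encoding, which is why each input boundary has to be audited.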

jack_jennings · on Sept 16, 2014

2 to 3 was a more difficult update. If I recall correctly, a feature of 4 was that the upgrade path was more reasonable.

Intrepidd · on Sept 16, 2014

I actually went nuclear on a large Rails project and migrated it from a custom Rails 2 fork to Rails 4. It was a mess, but we're glad we made the whole deal once instead of migrating slowly to 3.0, then 3.2, then 4. It took ~ 6 months as well.

reedlaw · on Sept 16, 2014

It's usually wise to handle large transitions incrementally. Rails 3 before Rails 4 is probably a wise choice.

technoweenie · on Sept 16, 2014

Work has started on the move to 3.2: https://twitter.com/charliesome/status/511517393038753794

pothibo · on Sept 16, 2014

Rails 4 supports Ruby >= 1.9.3 whereas Rails 3 supports 1.8.x. I believe this could be the main reason why they didn't move to Rails 4.

Ruby 1.8.x -> Ruby 1.9.x requires quite some work by itself.

thedaniel · on Sept 16, 2014

We moved to Ruby 1.9+ quite a while ago with our patched-up Rails 2.3.

alttab · on Sept 16, 2014

Mainly due to string encoding. For something like Github, mistakes there could be disastrous. With all of the special characters and user input there are bound to be wild edge cases.

hobarrera · on Sept 16, 2014

> Okay, so why aren't they transitioning to Rails 4?

If I had to guess, I'd say it's easier to hit Rails 3 as an intermediate milestone before moving onto Rails 4.

hemancuso · on Sept 16, 2014

Am I the only one who read this as: tl;dr - don't get too far behind Rails, it's fucking painful and expensive if you do.

brandonmenc · on Sept 16, 2014

Being two major versions behind is probably painful with any framework.

davidw · on Sept 16, 2014

I'm a big fan of Rails, but I think that Rails does not really give a lot of weight to backwards compatibility. The Ruby community in general often feels like it's always charging forward, very often with tons of cool new things, but occasionally leaving broken, incompatible software in its wake.

yazaddaruvala · on Sept 16, 2014

You seem to imply there is something wrong with that.. As long as you bump the version number and support the old version for bug fixes, it ends up being the best way to move the world forward.

ufmace · on Sept 16, 2014

I am a fan of Rails and am doing some work on it, but it's pretty clear that while Rails is bumping version numbers cleanly, they aren't putting all that much work into supporting older versions. Github has been doing their own work backporting security patches to Rails 2.3 for years.

davidw · on Sept 16, 2014

I tried to keep my comment kind of neutral, actually. There are advantages and disadvantages, and the Ruby community does pretty well, by and large.

tonyedgecombe · on Sept 16, 2014

Why would it be any more difficult to migrate from 2.3 to 3 now instead of four years ago?

raus22 · on Sept 16, 2014

time == more code that can break

reedlaw · on Sept 16, 2014

I'm curious how they handled the differences between the Rails 2 and Rails 3 applications once they enabled dual boot. Surely not all of the changes were compatible. In the Gemfile example, there is some conditional logic that loads different gems depending on whether or not they used RAILS3=true. Was the entire codebase similarly littered with conditionals? That seems like it would be quite a mess.

holman · on Sept 16, 2014

Lots of conditionals, yeah. This isn't really a bad thing, to be honest; it's far better than maintaining two separate branches. We already use similar conditionals when dealing with feature flags, so it's not really something that feels out of the ordinary in our app at all.

What's more, we found that if there was a file that had a ton of complicated, interwoven logic between the two versions of Rails, it was a clear sign we should see if we could instead backport a Rails 3 or 4 component so that we could nix the entire Rails 2.x-style functionality.
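
A rough sketch of what such version conditionals might look like, assuming the `RAILS3=true` environment flag mentioned above. The helper names (`rails3?`, `rails_requirement`, `for_rails`) and the version constraints are hypothetical, not GitHub's actual code:

```ruby
# Hypothetical helper: true when the process was started with RAILS3=true.
def rails3?(env = ENV)
  env["RAILS3"] == "true"
end

# Illustrative version constraints only; not GitHub's real Gemfile.
def rails_requirement(env = ENV)
  rails3?(env) ? "~> 3.0.0" : "~> 2.3.0"
end

# Run exactly one of two code paths, so version-specific branches are
# easy to grep for and delete once the old version is retired.
def for_rails(rails2:, rails3:, env: ENV)
  rails3?(env) ? rails3.call : rails2.call
end

finder = for_rails(
  rails2: -> { "User.find(:all, :conditions => { :active => true })" },
  rails3: -> { "User.where(:active => true)" },
  env: { "RAILS3" => "true" }
)
puts finder # prints "User.where(:active => true)"
```

Funnelling every branch through one helper is a design choice for this sketch: it keeps the conditionals greppable, which matters when the goal is to delete them all later.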

why-el · on Sept 16, 2014

Exactly. I handle API changes the same way at my place (it's an internal API exposed to colleagues only).

alttab · on Sept 16, 2014

Only temporary and easy to remove. It would be tricky if entire classes/objects/abstractions were missing entirely. This would require essentially reimplementing that part of the code.

For other Rails version changes, for instance, the removal of RJS as a default component in rails 3.1 could require that you re-architect certain user flows or pages in your app.

chippy · on Sept 16, 2014

The article describes how they compared them for performance but didn't say which was better, and showed a graph which indicated that Rails 3 was worse for longer times in garbage collection for requests. I'd imagine that Rails 2 in GitHub would have been heavily optimized, but....

Is Rails 3 worse performing than Rails 2? Would some performance loss be okay if they had a better codebase?

jrochkind1 · on Sept 16, 2014

I think Rails 3 is _already_ security-fix only.

But I understand why they did it. And I sympathize. The Rails treadmill is a harsh regime.

I wonder if they're considering what the heck they are going to do when Rails 5 comes out (target: spring/summer of 2015, less than 12 months away) and Rails 3.x stops even receiving security updates. I mean, clearly they have the resources to backport security updates themselves, so that's not a problem; it's just that they're still not quite in 'the modern world', they've just kept from falling even further behind.

holman · on Sept 16, 2014

We already have the app booting on 3.2; the goal is to try to get to 4.0 and track master fairly quickly. No one's eager to have to go through this whole process again in a year. ;)

InAnEmergency · on Sept 16, 2014

Rails 3 is already only receiving security updates for "severe" issues, see http://rubyonrails.org/security/

jrochkind1 · on Sept 16, 2014

thanks for that link!

While it says `/security`, it's actually the only link I know of with an updated list of how the maintenance policy applies to current versions.

I hadn't been able to find such a list before, only dated news/blog announcements, which get out of date really quickly given how fast Rails moves.

dazonic · on Sept 16, 2014

In the multi-version, how would they handle the different DSLs in things like route mapping?

map.resources :users

vs.

resources :users

haileys · on Sept 16, 2014

Rails 3.0 has a legacy routing mapper that lets us continue using the Rails 2.3 routing syntax. The legacy routing mapper is also available as a gem [1] for Rails 3.1 and 3.2.

We plan on gradually upgrading our routes file to use the new syntax now that we're on Rails 3, but as you can imagine it's something that we're going to need to be slow and careful about.

[1]: https://github.com/pixeltrix/rails_legacy_mapper
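
To illustrate the idea of a legacy mapper, here is a toy sketch with no Rails dependency. `ToyRouteSet` and `LegacyMapper` are made-up classes for illustration; this is not the implementation used by Rails or by the rails_legacy_mapper gem:

```ruby
# Hypothetical adapter: forwards old-style `map.resources` calls
# to the new-style method on the route set.
class LegacyMapper
  def initialize(route_set)
    @route_set = route_set
  end

  def resources(*names)
    @route_set.resources(*names)
  end
end

# Toy stand-in for a router; it only records declared resource names.
class ToyRouteSet
  attr_reader :declared

  def initialize
    @declared = []
  end

  def resources(*names)
    @declared.concat(names)
  end

  # A block that takes one argument is treated as Rails 2.3 style
  # (`|map|`); a block with no arguments is evaluated in the route
  # set's own context, Rails 3 style.
  def draw(&block)
    if block.arity == 1
      block.call(LegacyMapper.new(self))
    else
      instance_eval(&block)
    end
    self
  end
end

old_style = ToyRouteSet.new.draw { |map| map.resources :users }
new_style = ToyRouteSet.new.draw { resources :users }
puts old_style.declared == new_style.declared # prints "true"
```

Because both syntaxes end up calling the same `resources` method, a routes file can be converted one line at a time rather than all at once.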

jjuliano · on Sept 16, 2014

Been There, Done That..

yep, I've upgraded a hundreds of thousands of Rails 2 codebase to Rails 3 point or so and it is a real pain. (Not to mention Ruby 1.8.7 to Ruby 1.9.3 conversion, oh boy!)

The good thing about the experience is that I have mastered upgrading Rails 2's codebases to Rails 3 or so and Ruby 1.8's codebases to Ruby 1.9's or so.

_fx6v · on Sept 16, 2014

Hundreds of thousands?

ben336 · on Sept 16, 2014

I think the phrase "lines of code" is missing in there somewhere :)

flowerpot · on Sept 16, 2014

Very interesting! Just wondering, does GitHub actually have so few dependencies as described?

shayfrendt · on Sept 16, 2014

I wish! Nah I just took a snapshot of the Gemfile for brevity.

rurounijones · on Sept 16, 2014

It would be really interesting to see which (public) gems the Github front-end app itself uses. Would it be possible to post that?

yuhong · on Sept 16, 2014

So are they using Rails LTS or forking Rails 3.0 themselves?

masklinn · on Sept 16, 2014

They're definitely not forking Rails 3, they're not staying on it either: https://news.ycombinator.com/item?id=8322826

> We already have the app booting on 3.2; the goal is to try to get to 4.0 and track master fairly quickly. No one's eager to have to go through this whole process again in a year. ;)

kevinsf90 · on Sept 16, 2014

For a large codebase, these upgrades will be a pain, especially on ruby/rails. To scale in the long run, it'd probably be wise to modularize & split the codebase into microservices, and at the same time, port to, say, a scala or java based framework (like Play).

lucaspiller · on Sept 16, 2014

I'd be interested in hearing a bit more about how GitHub structure their app. From the sounds of it, they have one big monolithic app. Running the tests can't be pretty on that...

holman · on Sept 16, 2014

Yup; one monolithic app. Tests run in a tad over two minutes.

nathan_f77 · on Sept 16, 2014

That is insanely fast, for what must be an enormous codebase. Do you mean it runs in two minutes on your Mac, or on distributed CI servers?

haileys · on Sept 16, 2014

We use the test-queue gem (https://github.com/tmm1/test-queue) to run our test suite in parallel across 10x 8 core machines.
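
The general idea of a shared work queue can be sketched as follows. This is a simplified illustration using threads in a single process; it is not how test-queue itself is implemented, and `run_in_parallel` is a hypothetical name:

```ruby
# Run each test on one of `workers` threads that pull from a shared queue.
# `tests` maps a test name to a callable returning truthy on success.
def run_in_parallel(tests, workers: 4)
  pending = Queue.new
  tests.each { |name, body| pending << [name, body] }
  finished = Queue.new

  threads = Array.new(workers) do
    Thread.new do
      loop do
        begin
          name, body = pending.pop(true) # non-blocking; raises when empty
        rescue ThreadError
          break # queue drained, this worker is done
        end
        passed = begin
          body.call ? true : false
        rescue StandardError
          false # an exception counts as a failure
        end
        finished << [name, passed]
      end
    end
  end
  threads.each(&:join)

  Array.new(finished.size) { finished.pop }.to_h
end

suite = {
  "adds"     => -> { 1 + 1 == 2 },
  "compares" => -> { "a" < "b" },
  "fails"    => -> { raise "boom" }
}
results = run_in_parallel(suite, workers: 3)
puts results["fails"] # prints "false"
```

Pulling from a queue, rather than splitting the suite into fixed chunks up front, means a worker that finishes early simply takes the next test, so slow tests do not leave other workers idle.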

nathan_f77 · on Sept 16, 2014

That's awesome! We were using parallel_tests, but we just bought a second test server. I was looking into Kochiku from Square, but that gem looks perfect. Thanks!

VeejayRampay · on Sept 16, 2014

Two minutes is actually pretty fast for such a large app. Well done. And congratulations on the migration.

imjoshholloway · on Sept 16, 2014

What's the coverage on that?

matthewmacleod · on Sept 16, 2014

Why do you think that Scala or Java will help to reduce upgrade cost?

tootie · on Sept 16, 2014

I feel like the lesson here is not to put mission-critical systems on leading edge software. I'd be porting to Java at this point. GitHub is a big boy company.

wging · on Sept 16, 2014

In 2014, Rails is not leading-edge software.

imjoshholloway · on Sept 16, 2014

It's `web scale` and everything

tootie · on Sept 16, 2014

It was when they started.

Tenhundfeld · on Sept 16, 2014

Leading edge is subjective. Rails was already 3-4 years old and already being shipped standard on OSX when GitHub was launched. IMO, that's not excessively leading edge for a small startup, especially if the founders already know the framework.

iagooar · on Sept 16, 2014

Party hard like it's 1999.