
The problem with extrapolating like you just did is that Kubernetes has not been in use at Google for a decade, and it is not related to the container infrastructure at Google. When you are evaluating Kubernetes it's important to know that no major project at Google has used Kubernetes for anything, ever. Kubernetes is a weird project where Google is trying to give the public a large-scale container infrastructure, but not _their_ large-scale container infrastructure.


As cited above, we simply CAN'T opensource Borg. It's enormous, and it is deeply, DEEPLY entangled with millions of LoC of Google code. Nobody could untangle that. And even if we DID untangle it, it's alien technology. It does not make any attempt to meet people where they are. It does not focus on open standards or simple solutions to simple problems. On top of that, it's got 10+ years of semi-organic growth in it. There are a lot of mistakes that have been made that we simply have to live with internally. Also, it's C++, for which there is approximately ZERO opensource community.

We made a very strategic decision to rebuild it. It embodies many lessons from Borg and Omega (both things we got right and things we botched). It is implemented in an easier-to-approach language (Go) which has an active OSS community. It specifically focuses on "legacy apps" (everything written up to and including today) and open standards (HTTP, JSON, REST).
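To make the open-standards point concrete, here's a rough sketch (mine, not project code): with kubectl proxy running locally, listing pods is just an HTTP GET that returns JSON, so any stock client works:

    package main

    import (
        "fmt"
        "io"
        "net/http"
    )

    func main() {
        // Sketch only: assumes `kubectl proxy` is serving the API on its
        // default local address, so no auth headers are needed here.
        resp, err := http.Get("http://127.0.0.1:8001/api/v1/namespaces/default/pods")
        if err != nil {
            panic(err)
        }
        defer resp.Body.Close()
        body, _ := io.ReadAll(resp.Body)
        fmt.Println(string(body)) // plain JSON PodList - no special tooling
    }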

I've never been shy about my opinion that I do hope to supplant Borg one day, but that day is necessarily years away. Of course no major project has ever used it; the whole thing didn't exist just 2 years ago.


Whilst I appreciate that you're effectively competing with Borg internally, having been a heavy user of Borg for many years I'm not sure why you think it doesn't focus on simple solutions to simple problems, or that it doesn't meet people where they are. Borg always impressed me as one of the best thought out pieces of Google infrastructure: bringing up simple jobs was in fact quite simple, or so it seemed to me, but it also had sufficient power to do far more complex tasks as well.

Much of the organic growth, as you put it, can also be described another way: as an accumulation of useful features and optimisations.

The language issue was addressed by another response. To claim there's no open source C++ community which is why it's written in Go is just bizarre. There's absolutely a thriving open source C++ community, but if having the biggest open source community was the driving factor in picking the implementation language then I guess you should have picked Java.


How would you run Apache + PHP + MySQL on Borg? Hint: you can't. Not without HUGE difficulty, anyway. Nobody does it. Part of Kubernetes' "meet people where they are" mindset is that we simply can not ask people to rewrite their apps.

Truth is, a LOT of people don't write code. They write content and use pre-built code (think WordPress). Borg simply can not accommodate that very well. It's simple as long as you control things from soup to nuts.

Yes, some "organic growth" was useful features. But a lot of it was useless features, or features that are now obsolete but can't be removed because someone somewhere is using them, and probably doesn't have enough time to re-test without the feature (true story).


I can't reply to your last comment directly, but: "port 80 request denied". And where do you store your MySQL data? The point being that nothing is impossible, just prohibitively hard.


There are any number of places you can store your MySQL data in a container world.

The first and best is to make all your MySQL IO go to an external cluster filesystem or other remote IO system. Because MySQL supports pluggable storage, you could write a Hadoop FS storage manager. This has the advantage that if a single MySQL instance is blown away, all the committed data is available for a new replica to start reading. I don't know if Docker or other container systems support automagically turning local IO calls into remote IO calls (or whether that really makes sense in a MySQL environment), but that's a similar approach. Condor supported this through its remote libc interface.
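To sketch what "pluggable storage" buys you (names invented here; MySQL's real storage-engine API is C, not Go): the engine talks to an interface, and the same interface can be backed by local disk or by a remote cluster-filesystem client:

    package store

    import "os"

    // PageStore is a made-up interface standing in for a pluggable
    // storage backend.
    type PageStore interface {
        ReadPage(off int64, buf []byte) error
        WritePage(off int64, buf []byte) error
    }

    // LocalStore keeps pages on the container's own disk.
    type LocalStore struct{ f *os.File }

    func (s *LocalStore) ReadPage(off int64, buf []byte) error {
        _, err := s.f.ReadAt(buf, off)
        return err
    }

    func (s *LocalStore) WritePage(off int64, buf []byte) error {
        _, err := s.f.WriteAt(buf, off)
        return err
    }

    // A RemoteStore with the same interface could forward every call to a
    // cluster filesystem, so a fresh replica can read all committed data
    // after the original instance is blown away.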

The second is to use some sort of per-task persistent local storage. In the Docker world, this would be a mounted volume: the Docker host would manage the storage, and new containers would remount that storage. You could have a process that restored the local storage from a backup, and then use replication from a master to catch up.
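A minimal sketch of that flow (image tag, paths, and password are examples of mine, not prescriptions): the host directory outlives any one container, and a replacement container simply remounts it.

    package main

    import (
        "log"
        "os/exec"
    )

    func main() {
        // /srv/mysql-data lives on the Docker host; a new container remounts
        // it at MySQL's data directory and picks up where the old one left off.
        cmd := exec.Command("docker", "run", "-d",
            "--name", "mysql",
            "-e", "MYSQL_ROOT_PASSWORD=example", // required by the official image
            "-v", "/srv/mysql-data:/var/lib/mysql",
            "mysql:5.6")
        if out, err := cmd.CombinedOutput(); err != nil {
            log.Fatalf("docker run failed: %v\n%s", err, out)
        }
    }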

The third would be to have some sort of per-container persistent storage (the Borg paper calls this an "alloc").

For the server, most people wouldn't have Apache bind port 80 inside the container. You'd bind another port and use some other mechanism, such as load balancing, to expose the web server on a standard port.
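As a toy sketch of that split (my own illustration; in practice a real load balancer or service proxy plays this role), Apache could bind 8080 inside the container while a tiny front proxy owns port 80:

    package main

    import (
        "log"
        "net/http"
        "net/http/httputil"
        "net/url"
    )

    func main() {
        // The container's web server listens on a high port; only this
        // front proxy needs the privileged standard port.
        backend, err := url.Parse("http://127.0.0.1:8080")
        if err != nil {
            log.Fatal(err)
        }
        proxy := httputil.NewSingleHostReverseProxy(backend)
        log.Fatal(http.ListenAndServe(":80", proxy))
    }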


The question had an implied "... in Borg" suffix. The point was to demonstrate that Borg does not have "legacy" affordances like durable storage (well, not in the same way as MySQL would need).


We're both Google employees. I used to be a MySQL SRE, with experience in this.


You're still making my point. Using something like MySQL on Borg is not trivial.


Put together the packages, request a fixed port, disable the health checks (unless I had a file in the Apache root with the right name), start it up?

I don't think Borg imposes all that many requirements on jobs, really, and the few it does can be disabled. Or at least could.

But I guess we're probably wandering out of the area covered by the papers now.


>> Also, it's C++, for which there is approximately ZERO opensource community.

Rubbish, and saying so undermines the rest of your points.


Wonderful rebuttal. Proof? There are some successful projects, but that is not a community. There are some libs, but that is not a community.

The Go community is vibrant and growing. Go is an easy language to learn (and I say that as someone who LIKES the power of C++) and it is not a total joke to ask people who report bugs to jump in and try to fix them. C++ is simply NOT approachable by mere mortals, and would have made for a very different community and a much slower pace.

And I say that as someone who detests many facets of Go - but it's just better at Getting Things Done than C++.


>> Also, it's C++, for which there is approximately ZERO opensource community.

> Rubbish, and saying so undermines the rest of your points.

Hardly rubbish. Although there are open-source projects which use C++, I and many others avoid them like the plague.

I think he meant 'approximately ZERO' in the Spolsky sense: 'sure, there are some, but in the grand scheme of things they're indistinguishable from ε.'


I don't strongly disagree with any of that. I was only pointing out that it's wrong to attribute a decade of history to something like cAdvisor, which is brand new and does not draw on anything more than lessons learned from Google production.

I also don't blame people for being confused about Google's container infrastructure. Google has issued blog posts in the past that were misleading (at best) about the relationship between Omega, Borg, and Kubernetes.


Hmm, what was misleading? That certainly was never the intent.

A decade of history led to the knowledge that a particular style of monitoring was needed. That knowledge led to cAdvisor. Is it perfect? Of course not, but it fills a need and is directly derived from Google's experience. I fail to see how that is misattribution, personally.


Disclaimer: I work at Google on Kubernetes.

Tim knows more about this than almost anyone, but I will add one point - we have used Kubernetes for significant internal projects, and plan to continue expanding its usage over time.

To Tim's point, though, it'll take time. The thought that you could move literally millions of lines of code and applications over to a new platform in just 10 months (the amount of time that Kubernetes has been GA) is... optimistic.


When did the internal projects start using Kubernetes? I heard that its uptake within Google has been very anemic.


Some internal projects started evaluating Kubernetes well before 1.0. We don't talk about them much because they are, well, internal.

This is NOT in competition with Borg, though. Not yet.


I tend to agree. It's another case of: here is an API to access our products so you can play around with it. The scalable, powerful scheduler, however, is not included. There are no other batteries that fit it, apart from GCE. Kind of like Blaze and TensorFlow - nice and shiny, but hollow (missing the good distributed filling).


FWIW, you can plug Mesos in as a Kubernetes scheduler. From the OSS world, that is about as heavy-duty as you're going to get, and it's proven on quite large (although not Google-large) 10k+ node clusters.


The internal google scheduling system is simply not appropriate for anyone other than Google to use. A huge number of machines, all alike, with a huge number of trusted binaries that can be multiplexed onto these machines without fear that they're going to break the jail and cause havoc (since there is a solid trail from source to running artifact). It's just not the reality that other companies exist in.


The scheduler is actually one relatively simple piece of the whole picture. If scheduling were the real pain, Kubernetes would have just addressed it. The fact that Kubernetes did not lead with the scheduler means it is actually not a big problem, at least not the biggest one.


That is not an argument against releasing the code. Why would Google assign itself as gatekeeper? I personally could use this code on supercomputers now. I don't work for a supercomputing company; my use case is academic work and computational science. I absolutely have thousands of huge machines that I multiplex trusted binaries to -- and scheduling is not a trivial problem.

So what's the real reason nobody gets to see this code?


The code is literally millions of LoC, all of which needs to be audited for stuff we can't release for whatever reasons. All that code is built upon layer after layer of Google-internal stuff. Open-sourcing Borg means open-sourcing Chubby, an older internal equivalent of gRPC, hundreds of libs, monitoring infrastructure, etc. Net result is O(50M) LoC. And when someone sends us a patch - then what? The cost of doing it is simply prohibitive. I'd love to do it; it's just not practical and has no RoI.


That's a much more sensible reason! If the code is truly Google-specific, then I agree. It sounded to me like the code was not released because nobody else has a lot of computers, which I found odd.

Thank you for the details!


Borg jobs are not trusted! The system sandboxes them, prevents them from spying on other jobs' data files, and assumes they might abuse system resources in arbitrary ways. The days when all Google machines trusted all Google employees are long in the past.


As for "no other batteries that fit it" - I am confused. We do run on AWS, OpenStack, and other cloud providers, as well as on bare metal. It's not like nobody is using this thing. In fact, just a finger in the wind, I'd guess the number of people using it outside of Google Cloud is several times more than people using it on Google Cloud.


"Powerful scheduling" is such a tiny piece of what Kubernetes does, it's funny. Yeah, Borg's scheduler is faster and more scalable and has more features. It also has 12 years of optimizations under its belt. I have 100% confidence that, should Kubernetes be around 10 years from now,this will be a non-issue.


Sure, even though having multiple-scheduler support (for different types of workloads) would be great for increasing cluster utilization, which is one selling point of such systems. I understand that Kubernetes is developed in the open and with the community, but the heavy marketing as the "solution you can use now, directly from the creators of Omega" makes some people think it's ready, perfect, and will fix all of their problems, and that's simply not true.


We have multi-scheduler support in v1.2 :)
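Roughly, a pod opts in by naming its scheduler in the spec (in the v1.2 alpha this was a pod annotation; the schedulerName field came later). A sketch with a made-up scheduler name:

    package example

    import (
        corev1 "k8s.io/api/core/v1"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    )

    // podForCustomScheduler builds a pod that the default scheduler will
    // ignore; only a scheduler running under the (made-up) name
    // "my-scheduler" will bind it to a node.
    func podForCustomScheduler() *corev1.Pod {
        return &corev1.Pod{
            ObjectMeta: metav1.ObjectMeta{Name: "demo"},
            Spec: corev1.PodSpec{
                SchedulerName: "my-scheduler",
                Containers: []corev1.Container{
                    {Name: "app", Image: "nginx"},
                },
            },
        }
    }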


Great! The docs are a bit thin on that, so I'll have a look at the code :-)



