So the war for enterprise on-prem Kubernetes just intensified :) Red Hat vs Google? :] I wonder how Rancher is doing? I guess life will get harder for them too?
Timing seems to be important, just before OpenShift finishes absorbing Tectonic (or maybe Tectonic is absorbing OpenShift) with their installer, UI and billing services. I guess there's still time for Google to land some big deals :)
OpenShift is the most mature Enterprise Kubernetes that can be deployed and managed in any cloud environment (public or private). It's great to see further validation that customers want to run applications in both existing data center environments and public cloud environments.
Will Container-Optimized OS be used as the operating system for on-prem? If so, any plans to spin this off as a more general purpose OS now that it needs to support on-prem use cases?
Hi Weston, I've been exploring and using Istio over the last few months. What is the advantage of the "Managed Istio (with commercial-level support)" over the community version? As far as I know, the community is still actively optimizing performance; will Managed Istio perform better?
> With GKE On-Prem, you get the Google Kubernetes Engine (GKE) experience directly in your data center. A quick and simple install and upgrade experience that’s validated and tested by Google. GKE On-Prem also registers your cluster with Google Cloud Console in order to have a single-pane-of-glass view for managing all your clusters.
The latter is an interesting way to mentally merge your local DC and Google Cloud.
It will get even more interesting if you get decent federation support between your local GKE clusters and GCP GKE clusters, allowing you to scale out to the cloud when you run out of capacity in your local cluster.
Exactly. This is really the beginning of multi-cluster scenarios that work well across different environments. Failover from on-prem -> GKE is something we're working on.
We're going to give a breakout session that goes into more depth on Wednesday @ 4:35pm. IO244.
Some quick details: It's a bit of a split between what GKE runs and what the customer runs. Alpha runs on vSphere 6.5 and we're packaging up a Google-hardened OS in much the same way we package GKE for GCP. A lot of the integrations for things like networking and storage will be coming from partners. We'll also have remote mgmt capabilities so we can manage the cluster's control plane in much the same way our SREs do for GKE.
Will this be something like COS or even CoreOS? Also, I'm curious to hear more about this part:
> GKE On-Prem has a fully integrated stack of hardened components, including OS, container runtime, Kubernetes, and the cloud to which it connects.
Which runtime are you shipping? CRI-O? What type of outgoing cloud connection is that? I have so many questions. I'm actually at the conference this week if you're willing to grab coffee.
Exactly. As described, it sounds a bit like magic. Of course, it's not. Thus, one wonders about what happens in the lower layers of the stack, namely who gets to run them (GKE or the HW owners):
* DNS (not the Kubernetes in-cluster one like kube-dns)
* DHCP
* LDAP or equivalent
* SSH and its keys
* on-prem security of the cloud identities (does it require TPM? SGX?)
Not to mention probably the hardest bit: how does it do persistent storage? Running k8s on the various cloud providers tends to use the storage engines of those providers (EBS, etc.)...
Does it ship with ceph out of the box? Some in-house block store? What happens when it breaks? Persistent volumes are IMO the very hardest thing to get right, and for me it's the big reason why I'd rather put my trust in a hosted solution in the first place.
If they solve this, and make it as seamless and easy as using cloud storage offerings, they've completely changed the game. Somehow I think they have a ways to go.
I can attest that persistent storage is the hard part! Full disclosure, I work for a company[1] who makes a persistent storage solution for containers/Kubernetes. We are absolutely seeing that our large customers (folks like GE, Verizon, Dreamworks, Comcast, etc) are running "cloud native" applications on-prem as well as in the public cloud so this is a really smart move for Google.
I assumed they would punt persistent storage for the first release, kinda like they launched Cloud Filestore with only NFSv3 support and without snapshots. As you say, it's a tough problem to tackle. If they do ship with something, I'd expect them to go for Ceph. As far as I can tell, if you squint hard enough, its lowest layers are the ones that most resemble Google's (Colossus/D).
Do you have a source for this? Is there documentation anywhere that says GKE On-Prem is using rook? Or are you just saying "people who use kubernetes on-prem often use rook.io"?
Jesse here. I'm the eng manager for GKE and GKE On-Prem Storage lifecycle over at Google. For basic persistent volumes, we have working vSphere (block) support, which allows us to do a lot (including persistent/stateful services).
We also have great storage abstraction layers built into K8S - CSI, FlexVolumes, and a large suite of in-tree plugins - so adding additional ones is pretty easy.
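To make that concrete, here's roughly what consuming that block support looks like from the workload side, assuming the in-tree vSphere volume plugin is what backs it (the class name and diskformat parameter are just illustrative):

    # Illustrative StorageClass backed by the in-tree vSphere volume plugin
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: vsphere-thin            # example name
    provisioner: kubernetes.io/vsphere-volume
    parameters:
      diskformat: thin              # thin-provisioned VMDKs
    ---
    # A claim that a stateful workload mounts like any other volume
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: vsphere-thin
      resources:
        requests:
          storage: 20Gi

Whether the platform wires this up through the in-tree plugin or a CSI driver is an implementation detail; from the PVC side it looks the same.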
I think you're talking about something else however - specifically scaled/distributed storage services.
We're investigating options here, though keep in mind you should have no problems running containerized storage services in GKE On-prem. They would be on top of the existing block support I mentioned above. I saw a couple comments in other threads from some vendors that sell solutions that do just that.
For storage systems that don't containerize, that is a different discussion. Happy to talk more.
Is this ever going to be a bare-metal thing? Like probably many others, I'm not really interested in doing on-prem virtualization... kubernetes is interesting to me because containers are a better abstraction than virtual machines in the first place. Why add a virtualization layer if you don't have to?
(I get that it makes your life easier as the developer of this product, but having to run a virtualization IaaS between your metal and your orchestration makes the whole thing rather uninteresting IMO.)
I feel like I have to say this, even if people get it already.
Walk before you run.
Bare metal is a LOT harder to manage because, well, hardware fails. We hear the demand, for sure, but vSphere represents walking (and has a lot of customers, too :)
But I don't buy the hardware failure argument, because the same is true of running a vsphere installation in the first place.
vsphere migrates VMs to other machines when the hardware fails, but, analogously, the kube scheduler moves pods to other machines when they fail as well. You have to worry about disk failures in both cases. You have to worry about keeping your vsphere's database up and in a high-availability mode (postgres in my experience), just as you have to worry about keeping k8s's etcd cluster up and in a high-availability mode.
For any problem k8s has on bare metal due to hardware unreliability, vsphere has an analogous problem, it's just pushed down one layer.
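To make the etcd point concrete, this is a rough sketch of what one member of a three-node HA etcd cluster looks like (names and addresses made up); whether that runs on VMs or bare metal, you own this layer either way:

    # etcd config file for member etcd-1 of a 3-node cluster (illustrative values)
    name: etcd-1
    data-dir: /var/lib/etcd
    listen-peer-urls: https://10.0.0.1:2380
    listen-client-urls: https://10.0.0.1:2379
    initial-advertise-peer-urls: https://10.0.0.1:2380
    advertise-client-urls: https://10.0.0.1:2379
    initial-cluster: etcd-1=https://10.0.0.1:2380,etcd-2=https://10.0.0.2:2380,etcd-3=https://10.0.0.3:2380
    initial-cluster-state: new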
IMO the real reason this is a pragmatic decision is that people already have lots of experience running vsphere and understand where the risks and challenges are, and vsphere has lots of tools for things like automating the installation of the hypervisor OS, base-level network setup, expectations around NFS for VM storage, etc.
Vsphere represents a decent, known set of tools for getting infrastructure up and running on bare metal, which is a prerequisite for getting Kubernetes running. What would be exciting to me is a rethinking of those infrastructure components in a purely open-source, industry-standard fashion, in a no-frills way that exists only to get a basic k8s control plane up.
So what is the model for bare metal with GKE On-Prem? The reason we ask is that vSphere 6.5 adds cost (per-core VMware licensing plus a vCenter license), which we want to avoid by running bare metal only.
I previously worked on a similar (non-Google) product: an "On-Premise Cloud" [0], where the cloud provider managed and owned all the hardware and software, the customer created workloads on it, and the physical hardware was scaled up/down based on demand.
The product worked well, but I think there was an uphill battle in explaining the mechanics of the arrangement to customers.
Masters will run on-prem. We have a connection agent that lets us securely talk to the Kube API Server from GCP. We wanted to ensure that the cluster is fully functional even if the connection goes down.
Great to see GKE coming to enterprise datacenters! IBM has been very successful with IBM Cloud Private (https://github.com/IBM/deploy-ibm-cloud-private) bringing an enterprise Kube distribution for VMWare/OpenStack/Bare Metal in enterprise datacenters since last year. I love to see the momentum of another Kubernetes distribution helping create the de-facto next generation of apps for all kinds of use cases.
We have an in-house cloud running Kubernetes and we also use Google Cloud with Kubernetes. It is scary to move to GKE On-Prem if we do not know the pricing.
While the cluster can operate in a disconnected state, much of the functionality is provided by the connection to GCP. Things like UI integration, policy syncing, Stackdriver, etc. Our early focus is on datacenters that have a connection to the internet. However, we're starting to look a lot more at airgapped environments.
GKE On-Prem is packaged with upstream K8s. So for your team that currently uses `kubectl` to deploy or manage workloads, there won't be any differences.
GKE On-Prem is a Google provided, validated and supported distribution of Kubernetes and extensions that offer a GKE-like experience in your on-premise datacenter. It makes it easy to install and upgrade Kubernetes and provides access to GCP services such as monitoring, logging, metrics, security and auditing for your on-premise installation. It is the foundational component of the Cloud Services Platform, and is how Google "brings the cloud to you".
CSP combines Kubernetes both in your on-premise datacenter (GKE On-Prem) and Google-managed Kubernetes in GCP (GKE) with Istio and other CI/CD (Cloud Build) and serverless (Knative) products. You can leverage this suite of products to both modernize your existing on-premise applications and build new applications in the cloud.
Additionally, Google will be offering phone and email support similar to the existing GCP support packages.
What is the benefit, if you get the same output from `kubectl` as with regular GKE, or with any other distro?
Basically, this is yet another paid, packaged Kubernetes distribution, with the explicit goal of doing "hybrid clustering" so that it is easier to lure the customer back to GKE. Do I get that right?
What we have found is that most on-prem customers are eager to move to the cloud. Practically, it's not easy to just lift-and-shift. So think of this as a ramp to the cloud.
Now, the benefit of upstream K8s is that your dev team can build apps and containers without proprietary APIs; so when you are ready to move to the cloud you are not locked-in.
Thanks. I agree that lift and shift never happens easily in real life.
That being said, why would I not use the actual free upstream Kubernetes for my on-prem distribution?
(with the help of one of the thousands of installers out there, like kubeadm, kubespray, etc.)
What I have seen, working with Kubernetes for quite a while, is that the lowest common denominator is the YAML definitions for your workloads (what you want to run on your Kubernetes cluster). Those should be portable across any Kubernetes distribution, on-prem or in the cloud. As far as I can tell, today this is already the case; see the example below.
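For example, a plain Deployment like this (an arbitrary nginx workload, nothing GKE-specific) should apply unchanged to an on-prem cluster or to GKE:

    # No provider-specific fields, so this ports across distributions
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: web
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: web
      template:
        metadata:
          labels:
            app: web
        spec:
          containers:
          - name: web
            image: nginx:1.15     # example image/tag
            ports:
            - containerPort: 80

`kubectl apply -f web.yaml` against either cluster and the scheduler does the rest; in my experience the lock-in creeps in through annotations, storage classes, and load-balancer integrations rather than the workload spec itself.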
Is the benefit in this case that you can use the Google ecosystem for logs, etc.?
> That being said, why would I not use the actual free upstream Kubernetes for my on-prem distribution? (with the help of one of the thousands of installers out there, like kubeadm, kubespray, etc.)
None of them actually provision your infra for you (VMs, LB rules etc). GKE On-Prem will.
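A concrete example of that gap: on a managed cloud platform, a Service like the one below causes a real load balancer to be provisioned for you, whereas on a DIY upstream on-prem cluster you have to supply that piece yourself (e.g. something like MetalLB or an appliance integration). A packaged platform is expected to wire this up:

    # On GKE this provisions a cloud load balancer; on a bare on-prem
    # cluster the external IP stays pending unless something implements it.
    apiVersion: v1
    kind: Service
    metadata:
      name: web
    spec:
      type: LoadBalancer
      selector:
        app: web
      ports:
      - port: 80
        targetPort: 80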
OK, thanks, I get it: you are bundling everything into a VMware image that boots ready to use.
Is it fair to say that this is similar to Canonical Ubuntu MAAS + Juju Kubernetes? I'm sure Red Hat OpenShift must have something similar as well to install directly on a pool of managed bare-metal machines.
The entire cluster is on-prem. At the moment, you can optionally leverage a secured tether to manage your cluster in GCP with the same management features you've come to expect with GKE proper. If the connection is lost, your cluster is still fully functional, so there is no requirement for permanent access to your intranet. The access, when it exists, is also secured to only permit specific access between Google's network and your cluster.
Yes and no. The connection is needed to show your workloads and other cluster information in GCP Console (similar to what was shown in the keynote). However, if the connection goes down, your cluster will not stop working.
Is vSphere Essentials Plus supported? We have a good use case. Are there any specific additional requirements, e.g. vSAN, or does it all run on the base ESXi product?
There are literally 60+ startup distributions trying to sell a "distribution as a service" for upstream code that is free.
I think most if not all of them will fail, and as usual, the big 3 or 4 will win the market (if I had to bet: Google, Red Hat, Canonical, and maybe the folks at Heptio, who are really cool and have the right attitude).
Given the massive shambles EKS turned out to be, that would be great.