AWS Advent 2014 – CoreOS and Kubernetes on AWS

Our second AWS Advent Post comes to us from Tim Dysinger. He walks us through exploring CoreOS and Kubernetes.

There’s a copy of the source and example code from this post on GitHub.

What’s a CoreOS?

CoreOS is a fork of CrOS, the operating system that powers Google Chrome laptops. CrOS is a highly customized flavor of Gentoo that can be built entirely in one shot on a host Linux machine. CoreOS is a minimal Linux/systemd operating system with no package manager. It is intended for servers that will be hosting virtual machines.

CoreOS has “Fast Patch” and Google’s Omaha updating system as well as CoreUpdate from the CoreOS folks. The A/B upgrade system from CrOS means updated OS images are downloaded to the non-active partition. If the upgrade works, great! If not, we roll back to the partition that still exists with the old version. CoreUpdate also has a web interface to allow you to control what gets updated on your cluster & when that action happens.

While not being tied specifically to LXC, CoreOS comes with Docker “batteries included”. Docker runs out of the box with ease. The team may add support for an array of other virtualization technologies on Linux but today CoreOS is known for its Docker integration.

CoreOS also includes Etcd, a useful Raft-based key/value store. You can use this to store cluster-wide configuration & to provide look-up data to all your nodes.

Fleet is another CoreOS built-in service that can optionally be enabled. Fleet takes systemd and stretches it so that it is multi-machine aware. You can define services or groups of services in systemd unit syntax and deploy them to your cluster.

CoreOS has alpha, beta & stable streams of their OS images and the alpha channel gets updates often. The CoreOS project publishes images in many formats, including AWS images in all regions. They additionally share a ready-to-go basic AWS CloudFormation template from their download page.

Prerequisites

Today we are going to show how you can launch Google’s Kubernetes on Amazon using CoreOS. In order to play along you need the following checklist completed:

  • AWS account acquired
  • AWS_ACCESS_KEY_ID environment variable exported
  • AWS_SECRET_ACCESS_KEY environment variable exported
  • AWS_DEFAULT_REGION environment variable exported
  • Amazon awscli tools http://aws.amazon.com/cli installed
  • JQ CLI JSON tool http://stedolan.github.io/jq/ installed

You should be able to execute the following, to print a list of your EC2 Key-Pairs, before continuing:
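A minimal sketch of that check, assuming the awscli tools are on your PATH and the three environment variables above are exported. The helper name `list_key_pairs` is made up for this example.

```shell
# Hypothetical helper: print the names of the EC2 Key-Pairs in the region
# selected by AWS_DEFAULT_REGION. An empty result is fine; an error usually
# means the credentials or region are not set up yet.
list_key_pairs() {
  aws ec2 describe-key-pairs --query 'KeyPairs[].KeyName' --output text
}
```

Calling `list_key_pairs` should print zero or more key names on one line.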

CoreOS on Amazon EC2

Let’s launch a single instance of CoreOS just so we can see it work by itself. Here we create a small YAML file for AWS ‘userdata’. In it we tell CoreOS that we don’t want an automatic reboot after an update (we may prefer to manage it manually in our prod cluster). If you like automatic reboots, don’t specify anything & you’ll get the default.

Our super-basic cloud-config.yml file looks like so:
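A minimal sketch of such a file, written from the shell. The `reboot-strategy` key is the CoreOS cloud-config setting for update reboots; the helper name `write_cloud_config` is an assumption for this example.

```shell
# Hypothetical helper: write a cloud-config that turns off automatic
# reboots after an update. The output path defaults to cloud-config.yml.
write_cloud_config() {
  cat > "${1:-cloud-config.yml}" <<'EOF'
#cloud-config
coreos:
  update:
    reboot-strategy: "off"
EOF
}
```

Run `write_cloud_config` in your working directory to produce `cloud-config.yml`. The value is quoted so YAML parsers don't read `off` as a boolean.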

Here we use ‘awscli’ to create a new Key-Pair:
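Something along these lines should work. The helper name `create_key_pair`, the key name `coreos` and the `.pem` file name are assumptions for this example.

```shell
# Hypothetical helper: create a Key-Pair and save the private key locally.
# Usage: create_key_pair NAME [PEM_FILE]
create_key_pair() {
  name="$1"
  pem="${2:-$name.pem}"
  # KeyMaterial is the private key; it is only returned once, at creation.
  aws ec2 create-key-pair --key-name "$name" \
    --query 'KeyMaterial' --output text > "$pem" &&
    chmod 600 "$pem"
}
```

For example, `create_key_pair coreos` would leave the private key in `coreos.pem` with permissions that ssh accepts.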

We’ll also need a security group for CoreOS instances:
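A sketch, assuming EC2-Classic or a default VPC (where groups can be addressed by name). The helper name, the group name `coreos` and the description text are assumptions.

```shell
# Hypothetical helper: create a named security group.
# Usage: create_security_group NAME [DESCRIPTION]
create_security_group() {
  aws ec2 create-security-group --group-name "$1" \
    --description "${2:-CoreOS instances}"
}
```

For example: `create_security_group coreos`.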

Let’s allow traffic from our laptop/desktop to SSH:
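A sketch of the ingress rule, again assuming a group addressed by name. The helper name `allow_ssh_from` is made up, and the caller supplies the public IP address.

```shell
# Hypothetical helper: allow inbound SSH (TCP port 22) from one IP address.
# Usage: allow_ssh_from GROUP_NAME IP_ADDRESS
allow_ssh_from() {
  aws ec2 authorize-security-group-ingress --group-name "$1" \
    --protocol tcp --port 22 --cidr "$2/32"
}
```

One way to find your public address is `curl -s https://checkip.amazonaws.com`, so a call could look like `allow_ssh_from coreos "$(curl -s https://checkip.amazonaws.com)"`.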

Now let’s launch a single CoreOS Amazon Instance:
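A sketch of the launch, under a few assumptions: the Key-Pair and security group are both named `coreos`, the userdata lives in `cloud-config.yml` in the current directory, and the instance type defaults to `m3.medium`. The AMI ID is deliberately left as an argument; look up the current one for your region on the CoreOS download page.

```shell
# Hypothetical helper: launch CoreOS instances with our userdata.
# Usage: launch_coreos AMI_ID [COUNT] [INSTANCE_TYPE]
launch_coreos() {
  aws ec2 run-instances --image-id "$1" --count "${2:-1}" \
    --instance-type "${3:-m3.medium}" \
    --key-name coreos --security-groups coreos \
    --user-data file://cloud-config.yml
}
```

Passing a count (for example `launch_coreos <ami-id> 3`) launches several identical nodes in one call.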

Running a Docker Instance The Old Fashioned Way

Login to our newly launched CoreOS EC2 node:
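CoreOS images ship with a `core` user for SSH. A small sketch, assuming the private key was saved as `coreos.pem` (the helper name is made up):

```shell
# Hypothetical helper: SSH to a CoreOS node as the default "core" user.
# Usage: ssh_core PUBLIC_IP [PEM_FILE]
ssh_core() {
  ssh -i "${2:-coreos.pem}" "core@$1"
}
```

The public IP is shown in the AWS Console or in the output of `aws ec2 describe-instances`.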

Start a Docker instance interactively in the foreground:
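For instance, on the CoreOS node itself, something like the following drops you into a shell inside a container. The `busybox` image and the helper name are assumptions; any small image would do.

```shell
# Hypothetical helper: run an interactive shell in a throwaway container.
# -i keeps stdin open, -t allocates a terminal, --rm removes the
# container when the shell exits.
busybox_shell() {
  docker run -i -t --rm busybox /bin/sh
}
```

Type `exit` in the container shell to stop it and return to the host.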

OK. Now terminate that machine (AWS Console or CLI). We need more than just plain ol’ docker. To run a cluster of containers we need something to schedule & monitor the containers across all our nodes.

Starting Etcd When CoreOS Launches

The next thing we’ll need is to have etcd started with our node. Etcd will help our nodes with cluster configuration & discovery. It’s also needed by Fleet.

Here is a (partial) Cloud Config userdata file showing etcd being configured & started:
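A sketch of that fragment, using the `coreos.etcd` keys from the CoreOS cloud-config format of this era (etcd 0.4). The helper name is made up, and the discovery URL is passed in rather than hard-coded. `$private_ipv4` is left literal on purpose: CoreOS substitutes it on each node at boot.

```shell
# Hypothetical helper: write a cloud-config that configures & starts etcd.
# Usage: write_etcd_cloud_config DISCOVERY_URL [OUTPUT_FILE]
write_etcd_cloud_config() {
  discovery_url="$1"
  cat > "${2:-cloud-config.yml}" <<EOF
#cloud-config
coreos:
  etcd:
    discovery: $discovery_url
    addr: \$private_ipv4:4001
    peer-addr: \$private_ipv4:7001
  units:
    - name: etcd.service
      command: start
EOF
}
```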

You need to use a different discovery URL (above) for every cluster launch. This is noted in the etcd documentation. Etcd uses the discovery URL to hint to nodes about peers for a given cluster. You can (and probably should if you get serious) run your own internal etcd cluster just for discovery. Here’s the project page for more information on etcd.
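With the public discovery service, a fresh URL can be requested per cluster launch. A small sketch (the helper name is made up):

```shell
# Hypothetical helper: ask the public etcd discovery service for a new
# cluster URL. Call it once per cluster launch, never once per node.
new_discovery_url() {
  curl -s https://discovery.etcd.io/new
}
```

The URL it prints is what goes into the `discovery:` setting of the cloud-config for every node of that one cluster.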

Starting Fleetd When CoreOS Launches

Once we have etcd running on every node we can start up Fleet, our low-level cluster-aware systemd coordinator.
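In cloud-config terms that is one more unit entry. A sketch, assuming your cloud-config file already ends with a `coreos:` / `units:` list whose items are indented four spaces (the helper name is made up):

```shell
# Hypothetical helper: append a fleet unit entry to the units list at the
# end of an existing cloud-config. The indentation must match that list.
append_fleet_unit() {
  cat >> "${1:-cloud-config.yml}" <<'EOF'
    - name: fleet.service
      command: start
EOF
}
```

If your units list is not the last thing in the file, add the two lines by hand instead.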

We need to open internal traffic between nodes so that etcd & fleet can talk to peers:
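One way to do that is a rule whose source is the security group itself, so only members of the group can reach each other. A sketch, assuming a group addressed by name and opening all TCP ports (you could narrow this to the etcd ports if you prefer). The helper name is made up.

```shell
# Hypothetical helper: allow all TCP traffic between members of one group.
# Usage: allow_internal_traffic GROUP_NAME
allow_internal_traffic() {
  aws ec2 authorize-security-group-ingress --group-name "$1" \
    --source-group "$1" --protocol tcp --port 0-65535
}
```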

Let’s launch a small cluster of 3 coreos-with-fleet instances:

Using Fleet With CoreOS to Launch a Container

Starting A Docker Instance Via Fleet

Login to one of the nodes in our new 3-node cluster:
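To find an address to log in to, something like this lists the public IPs of the running instances launched with a given Key-Pair. The helper name and the use of the key name as a filter are assumptions for this example.

```shell
# Hypothetical helper: print public IPs of running instances that were
# launched with the given Key-Pair name.
# Usage: list_public_ips KEY_NAME
list_public_ips() {
  aws ec2 describe-instances \
    --filters "Name=instance-state-name,Values=running" \
              "Name=key-name,Values=$1" \
    --query 'Reservations[].Instances[].PublicIpAddress' --output text
}
```

Pick any one of the printed addresses and SSH to it as the `core` user.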

Now use fleetctl to start your service on the cluster:
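A sketch of what that can look like: a plain systemd unit that runs a container, handed to fleet. The unit name `hello.service`, the `busybox` image and the helper name are assumptions for this example.

```shell
# Hypothetical helper: write a systemd unit that runs a busybox container.
# Lines starting with "-" after ExecStartPre= are allowed to fail.
write_hello_unit() {
  cat > "${1:-hello.service}" <<'EOF'
[Unit]
Description=Hello World in a busybox container
After=docker.service
Requires=docker.service

[Service]
TimeoutStartSec=0
ExecStartPre=-/usr/bin/docker kill hello
ExecStartPre=-/usr/bin/docker rm hello
ExecStartPre=/usr/bin/docker pull busybox
ExecStart=/usr/bin/docker run --name hello busybox /bin/sh -c "while true; do echo Hello World; sleep 1; done"
ExecStop=/usr/bin/docker stop hello
EOF
}
```

On a cluster node, `write_hello_unit` followed by `fleetctl start hello.service` schedules the unit somewhere in the cluster; `fleetctl list-units` shows where it landed and `fleetctl journal hello.service` shows its output.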

NOTE: There’s a way to use the FLEETCTL_TUNNEL environment variable in order to use fleetctl locally on your laptop/desktop. I’ll leave this as an exercise for the reader.

Fleet is capable of tracking containers that fail (via systemd signals). It will reschedule a container for another node if needed. Read more about HA services with fleet here.

Registry/Discovery feels a little clunky to me (no offense CoreOS folks). I don’t like having to manage separate “sidekick” or “ambassador” containers just so I can discover & monitor containers. You can read more about Fleet discovery patterns here.

There’s no “volume” abstraction with Fleet. There’s not really a cohesive “pod” definition. Well there is a way to make a “pod” but the config would be spread out in many separate systemd unit files. There’s no A/B upgrade/rollback for containers (that I know of) with Fleet.

For these reasons, we need to keep on looking. Next up: Kubernetes.

What’s Kubernetes?

Kubernetes is a higher-level platform-as-a-service than CoreOS currently offers out of the box. It was born out of the experience of running GCE at Google. It is still in its early stages but I believe it will become a stable, useful tool, like CoreOS, very quickly.

Kubernetes has an easy-to-configure “Pods” abstraction where all containers that work together are defined in one YAML file. Go get some more information here. Pods can be given Labels in their configuration. Labels can be used in filters & actions in a way similar to tags in AWS.

Kubernetes has an abstraction for volumes. These volumes can be shared to Pods & containers from the host machine. Find out more about volumes here.

To coordinate replicas (for scaling) of Pods, Kubernetes has the Replication Controller, which maintains N Pods in place on the running cluster. All of the information needed for the Pod & replication is maintained in the configuration for replication controllers. Going from 8 replicas to 11 is just a matter of incrementing a number. It’s the equivalent of AWS AutoScale groups but for Docker Pods. Additionally there are features that allow for rolling upgrades to a new version of a Pod (and the ability to roll back an unhealthy upgrade). More information is found here.

Kubernetes Services are used to load-balance across all the active replicas of a pod. Find more information here.

A Virtual Network for Kubernetes With CoreOS Flannel

By default a local private network interface (docker0) is configured for Docker guest instances when Docker is started. This network routes traffic to & from the host machine & all Docker guest instances. It doesn’t route traffic to other host machines or to other host machines’ Docker containers though.

To really have pods communicating easily across machines, we need a route-able sub-net for our docker instances across the entire cluster of our Docker hosts. This way every docker container in the cluster can route traffic to/from every other container. This also means registry & discovery can contain IP addresses that work & no fancy proxy hacks are needed to get from point A to point B.

Kubernetes expects this route-able internal network. Thankfully the people at CoreOS came up with a solution (currently in Beta). It’s called “Flannel” (formerly known as “Rudder”).

To enable a Flannel private network just download & install it on CoreOS before starting Docker. Also you must tell Docker to use the private network created by flannel in place of the default.

Below is a (partial) cloud-config file showing flanneld being downloaded & started. It also shows a custom Docker config added (to override the default systemd configuration for Docker). This is needed to use the Flannel network for Docker.
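A rough sketch of the two units, with several assumptions called out: the download step is left out and the flanneld binary is assumed to already sit at `/opt/bin/flanneld`; the network range `10.100.0.0/16` is an example value; and `docker -d` is the daemon flag for Docker of this era. Flannel writes the subnet & MTU it picked to `/run/flannel/subnet.env`, which the Docker unit reads. The helper name is made up.

```shell
# Hypothetical helper: write a cloud-config fragment that starts flanneld
# and then starts Docker on the flannel subnet instead of the default.
write_flannel_cloud_config() {
  cat > "${1:-cloud-config.yml}" <<'EOF'
#cloud-config
coreos:
  units:
    - name: flannel.service
      command: start
      content: |
        [Unit]
        Requires=etcd.service
        After=etcd.service

        [Service]
        # Assumption: example network range; "mk" fails harmlessly if set.
        ExecStartPre=-/usr/bin/etcdctl mk /coreos.com/network/config '{"Network":"10.100.0.0/16"}'
        # Assumption: flanneld was already downloaded to /opt/bin.
        ExecStart=/opt/bin/flanneld
    - name: docker.service
      command: start
      content: |
        [Unit]
        Requires=flannel.service
        After=flannel.service

        [Service]
        EnvironmentFile=/run/flannel/subnet.env
        ExecStart=/usr/bin/docker -d --bip=${FLANNEL_SUBNET} --mtu=${FLANNEL_MTU}
EOF
}
```

The heredoc is quoted so `${FLANNEL_SUBNET}` and `${FLANNEL_MTU}` reach systemd untouched.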

Flannel can be configured to use a number of virtual networking strategies. Read more about flannel here.

Adding Kubernetes To CoreOS

Now that we have a private network that can route traffic for our docker containers easily across the cluster, we can add Kubernetes to CoreOS. We’ll want to follow the same pattern for cloud-config of downloading the binaries that didn’t come with CoreOS & adding systemd configuration for their services.

The download part (seen 1st below) is common enough to reuse across Master & Minion nodes (the 2 main roles in a Kubernetes cluster). From there the Master does most of the work while the Minion just runs kube-kubelet & kube-proxy & does what it’s told.

Download Kubernetes (Partial) Cloud Config (both Master & Minion):

Master-Specific (Partial) Cloud Config:

Minion-Specific (Partial) Cloud Config:

Kube-Register

Kube-Register bridges discovery of nodes from CoreOS Fleet into Kubernetes. This gives us no-hassle discovery of other Minion nodes in a Kubernetes cluster. We only need this service on the Master node. The Kube-Register project can be found here. (Thanks, Kelsey Hightower!)

Master Node (Partial) Cloud Config:

All Together in an AWS CFN Template with AutoScale

Use the CloudFormation template below. It’s the culmination of our progression of launch configurations from above.

In the CloudFormation template we add some things. We add 3 security groups: 1 Common to all Kubernetes nodes, 1 for Master & 1 for Minion. We also configure 2 AutoScale groups: 1 for Master & 1 for Minion. This is so we can have different assertions over each node type. We only need 1 Master node for a small cluster but we could grow our Minions to, say, 64 without a problem.

I used YAML here for two reasons: 1. You can add comments at will (unlike JSON). 2. It converts to JSON in the blink of an eye.

Converting To JSON Before Launch

If you have another tool you prefer to convert YAML to JSON, then use that. I have Ruby & Python usually installed on my machines from other DevOps activities. Either one could be used.
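As one option, here is a sketch using Ruby, whose standard library includes both a YAML parser & a JSON generator. The helper name and the file names in the usage line are assumptions.

```shell
# Hypothetical helper: print a YAML file as pretty-printed JSON.
# Needs only Ruby's standard library (yaml + json).
# Usage: yaml_to_json TEMPLATE.yml > TEMPLATE.json
yaml_to_json() {
  ruby -ryaml -rjson \
    -e 'puts JSON.pretty_generate(YAML.load(ARGF.read))' "$1"
}
```

For example: `yaml_to_json kubernetes.yml > kubernetes.json`. Comments in the YAML are dropped along the way, which is fine because CloudFormation only ever sees the JSON.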

Launching with AWS CloudFormation
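A sketch of the launch with awscli, assuming the template has already been converted to a local JSON file. The helper name, stack name and file name are assumptions; if your template declares parameters (a Key-Pair name, for instance) they would be passed with `--parameters` as well.

```shell
# Hypothetical helper: create a CloudFormation stack from a local template.
# Usage: launch_stack STACK_NAME TEMPLATE_JSON_FILE
launch_stack() {
  aws cloudformation create-stack --stack-name "$1" \
    --template-body "file://$2"
}
```

For example: `launch_stack kubernetes kubernetes.json`. Progress can be followed in the AWS Console or with `aws cloudformation describe-stacks --stack-name kubernetes`.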

SSH into the master node on the cluster:

We can still use Fleet if we want:

But now we can use Kubernetes also:

Looks something like this: [image]

Here’s the Kubernetes 101 documentation as a next step. Happy deploying!

Cluster Architecture

Just like organizations of people, these clusters change as they scale. For now it works to have every node run etcd. For now it works to have a top-of-cluster master that can die & get replaced inside 5 minutes. These allowances work at small scale.

In the larger scale, we may need a dedicated etcd cluster. We may need more up-time from our Kubernetes Master nodes. The nice thing about our using containers is that re-configuring things feels a bit like moving chess pieces on a board (not repainting the scene by hand).

Personal Plug

I’m looking for contract work to fill the gaps next year. You might need help with Amazon (I’ve been using AWS full-time since 2007), Virtualization or DevOps. I also like programming & new start-ups. I prefer to program in Haskell & Purescript. I’m actively using Purescript with Amazon’s JS SDK (& soon with AWS Lambda). If you need the help, let’s work it out. I’m @dysinger on Twitter, dysinger on IRC or send e-mail to tim on the domain dysinger.net

P.S. You should really learn Haskell. 🙂