Glen Pitt-Pladdy :: Blog

Kubernetes to learn Part 2

So far I've built a basic Kubernetes cluster with one master and two worker (minion) nodes. This time I'm going to be looking at building on that to give us HA masters.

As part of this I'm adding another two worker nodes and will use the first three nodes (original cluster) to run the HA master.

There are many different approaches to this, but for simplicity here I'm using a static configuration to make etcd redundant.

etcd - adding nodes

Getting this working has proved problematic, and once again I suspect subtle differences between versions/distros and the way they do things.

Since we've already got data in etcd on the first node, I'm going to hot-add nodes so far as possible. This isn't completely possible if you've followed the very basic config in Part 1, but if you are doing this for real from the start then the process can be made much cleaner with a little planning.

It's worth noting that since all these services communicate over the network, there's no reason why etcd has to be on specific nodes - we could theoretically create an etcd cluster separate from other master services and that might make a lot of sense on a very large scale.

On the other etcd nodes I'm just installing:

# yum install etcd

Then some basic configuration in /etc/etcd/etcd.conf to prepare:
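The configuration snippet here didn't survive, but a minimal sketch of the preparation in /etc/etcd/etcd.conf might look like this (the node name and 10.146.0.x addresses are assumed examples, not the original values):

```
ETCD_NAME=node1
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="http://10.146.0.11:2380"
ETCD_LISTEN_CLIENT_URLS="http://10.146.0.11:2379,http://127.0.0.1:2379"
```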


Note that now that we're putting more nodes in, having unique node names (rather than "default") becomes important.

Since we'll be adding this node to an existing cluster, we set up the other nodes it needs to be aware of, and tell it to join an existing cluster rather than the default of forming a new one. Note that we only add one extra node at this stage. This is because the cluster needs 3 nodes before it is redundant, but more about that later:
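The joining configuration has also been lost; assuming the same /etc/etcd/etcd.conf format and example addressing, the relevant lines might look like:

```
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://10.146.0.11:2380"
ETCD_INITIAL_CLUSTER="node0=http://10.146.0.10:2380,node1=http://10.146.0.11:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"
```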


Our existing etcd node was configured (by default) to only listen on localhost, so we need to change that in its /etc/etcd/etcd.conf and restart it:
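The change itself is missing; a sketch of the listen settings on the existing node, assuming the same example addressing, would be something like:

```
ETCD_LISTEN_PEER_URLS="http://10.146.0.10:2380"
ETCD_LISTEN_CLIENT_URLS="http://10.146.0.10:2379,http://127.0.0.1:2379"
```

Followed by a restart with systemctl, as for any change to this file.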


Since our existing etcd node has already picked up an address for peers based on the main (default route) interface, we have to change that to our private network. You won't have to do this if everything is in a flat network.

# etcdctl member list
3afddc9561d18a43: name=node0 peerURLs= clientURLs= isLeader=true
# etcdctl member update 3afddc9561d18a43
Updated member with ID 3afddc9561d18a43 in cluster
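The peer URL argument appears to have been stripped from the command above; the full form of the update command takes the new peer URL, for example (address assumed):

```
# etcdctl member update 3afddc9561d18a43 http://10.146.0.10:2380
```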

Now this is the critical step that doesn't seem to be mentioned consistently - we need to explicitly tell our existing etcd node about the other cluster members before we can add them, otherwise you will get errors along the lines of "error validating peerURLs ..........: member count is unequal". This is done by adding the name and URL of each new member on the existing etcd node with:

# etcdctl member list
3afddc9561d18a43: name=node0 peerURLs= clientURLs= isLeader=true
# etcdctl member add node1
Added member named node1 with ID ecc3b604762ee9c2 to cluster
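Again the URL argument has been lost above; etcdctl member add takes the member name and its peer URL, for example (address assumed):

```
# etcdctl member add node1 http://10.146.0.11:2380
```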


Since etcd works on the basis that for a cluster to be working it must have N/2+1 nodes active (I won't go into the computer science of why this is necessary, other than to say we need a majority vote), adding a second node that is not yet running immediately stops the cluster. This is important to remember, since tolerating the failure of 1 node requires a minimum of 3.
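The quorum arithmetic can be illustrated with a quick shell loop (this is just arithmetic, not anything etcd-specific):

```shell
# quorum for an etcd cluster of N members is floor(N/2)+1;
# the cluster survives N-quorum simultaneous member failures
for n in 1 2 3 4 5; do
    quorum=$(( n / 2 + 1 ))
    echo "members=$n quorum=$quorum tolerates=$(( n - quorum ))"
done
```

Note that going from 1 to 2 members gains no fault tolerance (quorum of 2 means either member failing stops the cluster), which is exactly why adding the second member while it was still offline took the cluster down.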

At this point we can start the second node:

# systemctl enable etcd.service
# systemctl start etcd.service

And now we should see both nodes irrespective of which node we run this on:

# etcdctl member list
3afddc9561d18a43: name=node0 peerURLs= clientURLs= isLeader=true
ecc3b604762ee9c2: name=node1 peerURLs= clientURLs= isLeader=false

Since we need redundancy, we configure our 3rd node in the same manner as above, except with its own IP address and all 3 nodes set in the configuration:
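The configuration for the 3rd node is missing; following the same assumed example addressing, /etc/etcd/etcd.conf on it might contain:

```
ETCD_NAME=node2
ETCD_LISTEN_PEER_URLS="http://10.146.0.12:2380"
ETCD_LISTEN_CLIENT_URLS="http://10.146.0.12:2379,http://127.0.0.1:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://10.146.0.12:2380"
ETCD_INITIAL_CLUSTER="node0=http://10.146.0.10:2380,node1=http://10.146.0.11:2380,node2=http://10.146.0.12:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"
```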


And then add this to the cluster. This time the cluster will not go down, because we still have 2 of the 3 nodes, and that allows us to list the cluster again before we start the 3rd node:

# etcdctl member list
3afddc9561d18a43: name=node0 peerURLs= clientURLs= isLeader=true
ecc3b604762ee9c2: name=node1 peerURLs= clientURLs= isLeader=false
# etcdctl member add node2
Added member named node2 with ID 7996dfc4c2e60935 to cluster

# etcdctl member list
3afddc9561d18a43: name=node0 peerURLs= clientURLs= isLeader=true
7996dfc4c2e60935[unstarted]: peerURLs=
ecc3b604762ee9c2: name=node1 peerURLs= clientURLs= isLeader=false

Then on the 3rd node start etcd:

# systemctl enable etcd.service
# systemctl start etcd.service

This should leave us with all 3 nodes running:

# etcdctl member list
3afddc9561d18a43: name=node0 peerURLs= clientURLs= isLeader=true
7996dfc4c2e60935: name=node2 peerURLs= clientURLs= isLeader=false
ecc3b604762ee9c2: name=node1 peerURLs= clientURLs= isLeader=false

At this point we have a 3 node etcd cluster which can tolerate the failure of 1 node. That's far from our multi-master setup being complete, but a good step forward.

It is probably a good idea to align the configuration across all 3 nodes for consistency, even though it's not strictly necessary with the way etcd works. This is more so that anyone looking at the configuration of one node can see what is going on across the lot.

If things go wrong

The problem you might hit here is a failure of your 2nd node while you only have 2 nodes in the cluster, or breaking 2 out of the 3 nodes: either will take the cluster down. While etcd tries to hold elections between the remaining nodes, if you need to re-create nodes while experimenting you can get into a situation where you have to strip the cluster back to a single node.

Beware that the process here needs to be treated cautiously, since we are deliberately stripping the cluster back to one node, deleting data on the other nodes and re-adding them. This could result in data loss, so ensure you have taken precautions to be able to recover.

This can be done by ensuring that all nodes are down, and temporarily setting this line in /etc/etcd/etcd.conf to tell it to start as if it's the only node again:
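The line itself has been lost, but etcd provides a flag for exactly this; in /etc/etcd/etcd.conf it would be:

```
ETCD_FORCE_NEW_CLUSTER="true"
```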


Then start up this node. You should see in the member list that there is only this node. Immediately remove this config so it doesn't get used again.

If the single node is working as expected (eg. test with "etcdctl ls" to see if there's still data in it), you can remove the data on the next node we are going to add back to the cluster:

# rm -rf /var/lib/etcd/default.etcd/

And then go through the process of re-adding. You can repeat this for further nodes until the cluster is fully running again.

Using the etcd cluster

While we now have a cluster, all services are still pointing at the single original etcd node. We need to change their configuration to point at multiple nodes.

In /etc/kubernetes/apiserver set all nodes and restart the service:
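The setting is missing above; on these CentOS-style packages /etc/kubernetes/apiserver carries the etcd endpoints in KUBE_ETCD_SERVERS, so with the assumed example addresses it would look something like:

```
KUBE_ETCD_SERVERS="--etcd-servers=http://10.146.0.10:2379,http://10.146.0.11:2379,http://10.146.0.12:2379"
```

Followed by restarting kube-apiserver with systemctl.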


For all worker nodes we need to set flannel to use all the etcd endpoints in /etc/sysconfig/flanneld in a similar way, and restart the service:
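The flannel setting is also missing; depending on the package version the variable in /etc/sysconfig/flanneld is FLANNEL_ETCD_ENDPOINTS (newer) or FLANNEL_ETCD (older), so with the assumed example addresses:

```
FLANNEL_ETCD_ENDPOINTS="http://10.146.0.10:2379,http://10.146.0.11:2379,http://10.146.0.12:2379"
```

Followed by restarting flanneld with systemctl on each worker.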



Backup

While HA is good for keeping things up, the last safety-net when everything goes bad is to have a backup. With etcd we can explicitly tell it to make a backup, specifying the data path with:

# etcdctl backup --data-dir /var/lib/etcd/default.etcd/

This will create a backup in the current directory, else you can use the --backup-dir option to specify another location.

This should work by creating a replica snapshot of the data directory, however watch out for permissions: the files will be created owned by the user who ran etcdctl (eg. root), while the real files have ownership etcd:etcd, which obviously is important.
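Putting that together, a sketch of taking a dated backup, and on restore copying it into place and fixing ownership (paths and dates are assumed examples; etcd should be stopped before restoring):

```
# etcdctl backup --data-dir /var/lib/etcd/default.etcd/ --backup-dir /var/backups/etcd-$(date +%Y%m%d)
# cp -a /var/backups/etcd-20170323/. /var/lib/etcd/default.etcd/
# chown -R etcd:etcd /var/lib/etcd/default.etcd/
```

The first command is the backup itself; the last two are what a restore onto a rebuilt node might look like, before starting it up as a new single-node cluster.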

All going to plan, these files could be used to replace the data on a single node, which is then started up as described under "If things go wrong" above to bring up the first node of a cluster. This is something that should be tested routinely to ensure your backups can be fully restored.

There is further information on this in the CoreOS Admin Guide.


At this point our etcd should be capable of tolerating a failure of a single node. Other master services (eg. apiserver) are not yet redundant so that will be the next thing we look at.

While there's lots of information available, sometimes finding the particular information that's relevant is not so easy. That's made getting this set up a lot harder work than it strictly needed to be.

