2017-03-28 13:07
Kubernetes to learn Part 4

Glen Pitt-Pladdy :: Blog

Kubernetes to learn Part 4

From our previous parts (1,2,3) we have a basic Kubernetes cluster with HA Master (3 nodes) and 4 worker (minion) nodes.

The main gap at this stage is that we can only reach the services running on the platform via each individual node (NodePort). This isn't ideal: with this setup things could change dynamically, and we would want load balancing across all nodes.

At this stage Ingress Controllers don't seem to be mature enough, and load balancers like Vulcan, which are ideal for use with Kubernetes (they store their config in etcd), are not shipped with CentOS. The aim of this exercise is something low maintenance that could potentially be deployed as a production cluster in an enterprise environment. Continually compiling code and rolling your own components is far from ideal: the more that is maintained (especially for security) by the OS vendor, the lower the maintenance overhead and the risk of complications.

This leaves us with HA Proxy as the load balancer of choice, which we already configured in Part 3 for load balancing the API Server. While good practice would be to separate the components used for management and for access, for this exercise it doesn't make a lot of difference, so I'll be re-using the same HA Proxy instance and adding additional services.

Kubernetes to HA Proxy

While there are some neat tools like confd for tracking changes, this again isn't shipped with CentOS so I'll pass on that. This leaves little choice other than rolling my own glue script.

For this I'm using Python and simply calling kubectl to collect the data. To get the worker nodes (which we need to load balance the NodePort) we use:

# kubectl get nodes -o json
{
    "apiVersion": "v1",
    "items": [
        {
            "apiVersion": "v1",
....

And to get the services we want we use:

# kubectl get services -o json
{
    "apiVersion": "v1",
    "items": [
        {
            "apiVersion": "v1",
....

Both of these output JSON, which we can easily ingest with Python. From that we pick out the services that have a NodePort configured and try to give each one its desired service port on the load balancer, balancing to the NodePort on all the worker nodes. This gives us a lump of HA Proxy config like:

frontend guestbook_frontend
    bind *:80
    default_backend guestbook_frontend
backend guestbook_frontend
    balance roundrobin
    server 10_146_47_110 10.146.47.110:30197 check
    server 10_146_47_120 10.146.47.120:30197 check
    server 10_146_47_130 10.146.47.130:30197 check
    server 10_146_47_140 10.146.47.140:30197 check
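
The steps above could be sketched roughly as follows. This is not the script from this article, only a minimal illustration in Python 3 of the same idea. The function names and the frontend/backend naming scheme (service name plus port) are my own assumptions, as is the choice of the InternalIP address type to identify each node.

```python
import json
import re
import subprocess


def kubectl_json(*args):
    """Run kubectl with JSON output and return the parsed document."""
    out = subprocess.check_output(("kubectl",) + args + ("-o", "json"))
    return json.loads(out.decode("utf-8"))


def node_addresses(nodes):
    """Return one address per node from a 'kubectl get nodes' document.

    Assumption: the InternalIP entry is the address to balance to.
    """
    found = []
    for node in nodes.get("items", []):
        for addr in node.get("status", {}).get("addresses", []):
            if addr.get("type") == "InternalIP":
                found.append(addr["address"])
                break
    return found


def nodeport_services(services):
    """Return (name, port, nodePort) for each port of each NodePort service."""
    found = []
    for svc in services.get("items", []):
        spec = svc.get("spec", {})
        if spec.get("type") != "NodePort":
            continue
        for port in spec.get("ports", []):
            if "nodePort" in port:
                found.append((svc["metadata"]["name"], port["port"], port["nodePort"]))
    return found


def render_haproxy(services, addresses):
    """Render one frontend/backend pair per NodePort service port."""
    lines = []
    for name, port, node_port in services:
        # Hypothetical naming scheme: sanitised service name plus port
        label = re.sub(r"[^A-Za-z0-9]", "_", name) + "_" + str(port)
        lines.append("frontend " + label)
        lines.append("    bind *:%d" % port)
        lines.append("    default_backend " + label)
        lines.append("backend " + label)
        lines.append("    balance roundrobin")
        for address in addresses:
            server = address.replace(".", "_")
            lines.append("    server %s %s:%d check" % (server, address, node_port))
    return "\n".join(lines) + "\n"


# Usage on a machine with a working kubectl:
#   nodes = kubectl_json("get", "nodes")
#   services = kubectl_json("get", "services")
#   print(render_haproxy(nodeport_services(services), node_addresses(nodes)))
```

Keeping the parsing and rendering separate from the kubectl call means the same functions would still work if the JSON came from somewhere else.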

At this point the code is very crude and relies on executing kubectl, which is not ideal if your load balancer nodes are separate from the master infrastructure (which is a good idea in production).

Important: before starting the script, copy /etc/haproxy/haproxy.cfg to /etc/haproxy/haproxy.cfg.TEMPLATE as the config file will get overwritten based on the template contents.

The script reads /etc/haproxy/haproxy.cfg.TEMPLATE and writes that with new config appended to /etc/haproxy/haproxy.cfg.TMP, then renames that to /etc/haproxy/haproxy.cfg to ensure that partial configurations are unlikely to be left in place. It polls every 5 seconds and updates the configuration when it changes, reloading HA Proxy after putting the new config in place.
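
As a rough illustration of that write-then-rename approach (again a sketch in Python 3, not the actual script), something like the following would do. The function names, the comparison against the file already on disk, and the `systemctl reload haproxy` command are assumptions on my part.

```python
import os
import subprocess
import time

CONFIG = "/etc/haproxy/haproxy.cfg"


def update_config(generated, config=CONFIG):
    """Write template + generated config via a temporary file.

    Returns True if the config on disk changed, False if it was already current.
    """
    with open(config + ".TEMPLATE") as f:
        wanted = f.read() + "\n" + generated
    try:
        with open(config) as f:
            if f.read() == wanted:
                return False
    except FileNotFoundError:
        pass
    tmp = config + ".TMP"
    with open(tmp, "w") as f:
        f.write(wanted)
    # Rename over the live file so a partial config is never left in place
    os.replace(tmp, config)
    return True


def poll(generate, interval=5, reload_cmd=("systemctl", "reload", "haproxy")):
    """Regenerate the config every 'interval' seconds, reloading on change.

    'generate' is any callable returning the generated config text.
    Assumption: HA Proxy is reloaded with 'systemctl reload haproxy'.
    """
    while True:
        if update_config(generate()):
            subprocess.check_call(reload_cmd)
        time.sleep(interval)
```

The rename only replaces the file in one step when the temporary file is on the same filesystem as the live config, which is why it is written alongside it.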

Things that could be easily adapted / changed / improved with this script:

  • Access the API directly rather than via kubectl to allow easy use of separate load balancer nodes
  • Control an external Load Balancer (e.g. F5 or others)
  • Be more intelligent about frontend port allocations, handle SSL/TLS, more application awareness
  • Use virtual hosts (host based requests) to put multiple applications on the same port (if you want that from a security/separation PoV)
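
For the first of these, a minimal sketch of fetching the same JSON straight from the API Server using only the Python 3 standard library might look like this. The base URL, the bearer token and the CA file are placeholders for whatever your cluster uses, and the function names are hypothetical. The `/api/v1/nodes` and `/api/v1/services` paths return the same kind of documents as the kubectl commands above.

```python
import json
import ssl
import urllib.request


def api_request(base_url, path, token=None):
    """Build a GET request for an API path such as /api/v1/services."""
    req = urllib.request.Request(base_url.rstrip("/") + path)
    req.add_header("Accept", "application/json")
    if token:
        # Assumption: the cluster accepts bearer token authentication
        req.add_header("Authorization", "Bearer " + token)
    return req


def api_get(base_url, path, token=None, cafile=None, timeout=10):
    """Fetch and parse a JSON document from the API Server."""
    context = ssl.create_default_context(cafile=cafile)
    req = api_request(base_url, path, token)
    with urllib.request.urlopen(req, timeout=timeout, context=context) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage (placeholder address for the API Server VIP):
#   nodes = api_get("https://apiserver.example:6443", "/api/v1/nodes")
#   services = api_get("https://apiserver.example:6443", "/api/v1/services")
```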

The Kubernetes HA Proxy script is available on GitHub.

Once this is running on all the HA Proxy nodes, you should be able to access the application (using the Guestbook example from Part 1) on the load balancer VIP:

[Image: Kubernetes NodePort with HA Proxy]

This will need you to edit the service to change ClusterIP to NodePort as we did in Part 1. After that you should see the script update HA Proxy and the service should become available.

Thoughts

This gives us a reasonably practical approach to take advantage of Kubernetes without excessive operational effort, and without any deep tinkering that would increase the effort required to migrate to new versions in future.

There are still a number of things that are not ideal with this. At this stage security is neglected and we have no authentication across nodes, although that is relatively easily solved with some extra configuration.

Potentially a bigger challenge with security and compliance is the fact that this (using Flannel) all uses a flat network, which would be outlawed in any strict security/compliance environment. While some filtering could be put in place, all container networks route to all others within the cluster. This is a limitation of Flannel at this time, and there is multi-network support on its way which will help. There are also alternative networking layers, but again, the more and deeper you tinker with standard configurations, the more effort (cost) it's going to require in the long term to maintain. The one exception that looks like it might become the favoured solution is Calico, which attempts to simplify things while still maintaining compatibility.

As with any rapidly changing technology, there is always risk of approaches falling out of favour or being displaced in a relatively short timeframe, and if you are making a large commitment to an approach then you could be left with extremely disruptive (expensive) and risky changes being necessary to your setup.
