Category Archives: Docker

Service discovery sauce with Interlock and Nginx

Post information: If you want to know more about service discovery, what options are available, and how things change in Docker 1.12, check out my service discovery post.

This article applies to standalone Docker Swarm, as opposed to the Docker 1.12 SwarmKit-based swarm mode that is integrated into the Docker engine.

Note: From this point forward, whenever I refer to swarm in this article, I mean standalone swarm and not the SwarmKit one.

I started this work in the Docker 1.11 era, when Swarm was a separate tool from the Docker engine and you needed to launch a couple of swarm containers to set up the cluster.

IMHO, Interlock + Nginx is the poor man's option for service discovery. There are better options available, but for me it all started with the Swarm at Scale example on the Docker site, which shows how to use Interlock with Nginx for load balancing and service discovery. Not knowing much about service discovery, having a working example to follow was good enough for me to engage with Interlock.

Interlock is a swarm event listener: it listens to swarm events and performs a corresponding action on an extension. As of this writing (Dec 2016), it supports only two extensions, Nginx and HAProxy, both of which act as service discovery and load balancer. Another extension called Beacon was planned, intended for monitoring and perhaps autoscaling, but it now seems to be abandoned, thanks to Docker 1.12 SwarmKit.

In simple terms, there are three actors in the play: the Swarm manager, Interlock, and Nginx acting as a load balancer. The best part is that all three run as Docker containers, which means no installation or configuration on the host VM and an easy setup.

Interlock listens to the swarm manager for container start or stop events. When it hears one, it updates the Nginx config. The animated diagram below explains it better.

[Figure: Interlock in action]

Now that we know the “what” part of Interlock, let's move on to the “how”. Unfortunately, there isn't much documentation on Interlock available on the net. The Interlock quickstart guide provides some clues, but it misses the enterprise part: it doesn't offer much guidance if you are using docker-compose in a multi-host environment.

For the latter, you can draw some inspiration from the Docker at Scale example, and there is an older lab article which shows how to use Interlock with docker-compose, but its Interlock commands are obsolete for the 1.13 version (the latest as of Oct 2016).

I am not planning to write an entire how-to article for Interlock, but I do intend to give some hints that are useful when running a multi-host Docker cluster with Interlock.

The first piece is the Docker swarm cluster itself. Articles like Docker at Scale and the Codeship one use docker-machine to create an entire cluster on your desktop/laptop. I was privileged to have an R&D cloud account and used the Azure cloud to create my Docker cluster. You can use tools like UCP, docker-machine, ACS, Docker Cloud and many others on the market, or just create the cluster manually by hand. It doesn't matter where you run your cluster or how you created it; as long as you have a working swarm cluster, you are good to play with Interlock and Nginx.

Another piece of advice: while setting up the swarm cluster, it is not mandatory but good practice to have a dedicated host for the Interlock and Nginx containers. The Swarm at Scale article, for example, uses Docker engine labels to tag a particular host for this purpose.

If you are using docker-machine, you can assign Docker engine labels similar to the example below.

[Figure: Docker engine labels]
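The engine-label screenshot did not survive here, so below is a rough sketch of the kind of docker-machine command it likely showed; the driver, host name and discovery token are hypothetical placeholders.

# create a swarm node dedicated to the load-balancing tier,
# tagging its engine with com.function=interlock
docker-machine create -d virtualbox \
  --swarm --swarm-discovery token://<cluster-token> \
  --engine-label com.function=interlock \
  lb-host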

And in the docker-compose.yml file, you would specify the constraint that the container should be scheduled on a host labelled com.function=interlock.

[Figure: Interlock in docker-compose.yml file]
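Since the screenshot above is missing, here is a rough sketch of what the Interlock and Nginx services could look like in a docker-compose.yml for standalone swarm (compose v1 format). The image tag, paths and ports are illustrative, and the label and command syntax follows the Interlock examples, so treat this as a sketch rather than a reference.

interlock:
  image: ehazlett/interlock:1.3.0
  command: -D run -c /etc/interlock/config.toml
  ports:
    - 8080
  environment:
    - "constraint:com.function==interlock"   # schedule on the tagged host
  volumes:
    - /var/lib/interlock/config.toml:/etc/interlock/config.toml
    - /var/lib/interlock/certs:/certs

nginx:
  image: nginx:latest
  entrypoint: nginx
  command: -g "daemon off;" -c /etc/nginx/nginx.conf
  ports:
    - 80:80
  environment:
    - "constraint:com.function==interlock"   # keep Nginx on the same host
  labels:
    - "interlock.ext.name=nginx"             # tells Interlock which Nginx to manage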

Now, in order to prepare the Interlock sauce, you need the following ingredients to get it right:

  1. Interlock Config (config.toml)
  2. Interlock data labels
  3. (Optional) SSL certificates

Interlock Configuration File

Interlock uses a configuration store to configure its options and extensions. There are three places where this configuration can be kept:

1) a file, 2) an environment variable, or 3) a key-value store.

I find it most convenient to store it in a file, which by Interlock convention is named config.toml.

Content: This file contains key-value options used by Interlock and its extensions. For more information, see https://github.com/ehazlett/interlock/blob/master/docs/configuration.md
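To make that concrete, here is a minimal config.toml sketch for the Nginx extension, loosely based on the configuration document linked above; the addresses and paths are placeholders, and option names can vary between Interlock versions, so treat it as a starting point rather than a reference.

ListenAddr = ":8080"
DockerURL = "tcp://swarm-manager:3376"

[[Extensions]]
Name = "nginx"
ConfigPath = "/etc/nginx/nginx.conf"
PidPath = "/var/run/nginx.pid"
MaxConn = 1024
Port = 80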

Location: If you are running swarm across multiple hosts, this file needs to be present on the VM that will host the Interlock container. You can then mount it into the container by volume mapping; see the docker-compose file above for more info.

TLS settings: If you are running swarm on TLS, you need to set the TLSCACert, TLSCert and TLSKey variables in the toml file. For more info, read up on setting up Docker on TLS and Swarm on TLS.

TLSCACert = "/certs/ca.pem"
TLSCert = "/certs/cert.pem"
TLSKey = "/certs/key.pem"

Plus, these certificates need to be present on the VM that will host the Interlock container. You can then mount them via a volume mount in the compose file; see the docker-compose file above for an example.

PollInterval: If Interlock is not able to stay connected to the Docker swarm, try setting PollInterval to 3 seconds. In some environments the event stream can be interrupted, and Interlock then needs to rely on a polling mechanism.
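In config.toml that would look something like the line below (the duration syntax here is my assumption; double-check it against the configuration document for your Interlock version).

# fall back to polling the swarm manager every 3 seconds
PollInterval = "3s"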

Interlock Data Labels

So far, we have only set up Interlock with Nginx. If you observe the config.toml file carefully, nowhere have we specified which containers need to be load balanced. So where does Interlock get this information from?

This brings us to the Interlock data labels: a set of labels you pass to a container which, when Interlock inspects it, tell Interlock which containers it needs to load balance.

The example below shows how to pass Interlock labels along with other container labels.

[Figure: Interlock data labels in docker-compose.yml]
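With the screenshot gone, here is a small sketch of what such labels look like in a compose service definition; the image, hostname and domain are made-up examples, and the full set of supported labels is listed in the interlock_data document linked below.

web:
  image: myorg/webapp:latest          # hypothetical application image
  ports:
    - 8080
  labels:
    - "interlock.hostname=web"        # Nginx will proxy web.mydomain.com
    - "interlock.domain=mydomain.com"
    - "interlock.port=8080"           # container port to load balance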

You can get more information about the data labels at https://github.com/ehazlett/interlock/blob/master/docs/interlock_data.md

There is another example in the Interlock repo itself, which shows how to launch Interlock with Nginx as a load balancer in a Docker swarm via compose.

https://github.com/ehazlett/interlock/blob/master/docs/examples/nginx-swarm/docker-compose.yml

(Optional) SSL certificates

As seen in the Interlock labels above, there are a lot of Interlock variables related to SSL.

To understand them better, let's enumerate the different SSL combinations in which we can set up the load balancer.

I) NO SSL

You can have a flow something like this:

[Figure: HTTP only]

Here, we are not using SSL at all.

II) SSL Termination at the Load Balancer 

Or, if you are planning to use Nginx as the frontend, internet-facing load balancer, you could do something like this:

[Figure: SSL termination at the load balancer]

III) Both legs with SSL

In my case, there was a compliance requirement that all traffic, internal or external, must be SSL/TLS. So I needed to do something like this:

[Figure: HTTPS-only traffic]

For cases II and III, you need to set the Interlock SSL-related data labels. Let me give a quick explanation of the important ones.

interlock.ssl_only: If you want your load balancer to listen to HTTPS traffic only, set this to true. If false, Interlock configures Nginx to listen on both HTTP and HTTPS. If true, it adds a redirect rule so that HTTP requests are redirected to HTTPS.

interlock.ssl_cert: The path to the X509 certificate which the load balancer will use to serve frontend traffic. The certificate's Common Name should equal the load balancer's name. Also, in a multi-host environment, this certificate needs to be present on the machine that launches the Nginx container; you can then mount it into the container by volume mapping. See the docker-compose file above for more info.

interlock.ssl_cert_key: The private key for the X509 certificate. The same goes for the key: it needs to be on the VM that will run the Nginx container.
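Putting cases II and III together, a service definition with the SSL labels could look roughly like this (the image, domain and certificate paths are placeholders); remember that /certs must be volume-mounted into the Nginx container on its host.

web:
  image: myorg/webapp:latest
  ports:
    - 8080
  labels:
    - "interlock.hostname=web"
    - "interlock.domain=mydomain.com"
    - "interlock.ssl_only=true"                            # HTTP requests get redirected to HTTPS
    - "interlock.ssl_cert=/certs/web.mydomain.com.pem"     # CN must match the load balancer name
    - "interlock.ssl_cert_key=/certs/web.mydomain.com.key"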

If your backend requires client certificate authentication, as it did in my case, Interlock has no support for it. There is a hack to SSL-proxy the certificates, but that is for another post.

I hope the information I shared with you was useful. If you want any help, do write in the comments below.

What the heck is Service Discovery?

If you are working with container technologies like Docker Swarm, Kubernetes or Mesos, sooner or later you will stumble upon service discovery.

What the heck is service discovery? In layman's terms, it is the way one container knows where another container is. It is better explained with an example: when a web container needs to connect to a DB container, it needs to know the address of the DB container.

In the container world, containers are like a scurry of squirrels. They keep jumping from one host (call it a VM) to another and moving around the cluster, all for the sake of high availability and scalability.

Amid this commotion, how does one container connect with another? This is where service discovery comes in: something that can tell you the current address of a container or containers. In layman's terms, imagine it as a librarian who can tell you whether a given book (container) is with Joe, Brandan or anybody else. Service discovery is like a directory of addresses (hostname, IP, port) for all the containers running in a cluster.

When a container is created or dies, service discovery updates its directory to add or delete the corresponding entry.

Note that I have explained service discovery in terms of its core function, but discovery (the directory) is not the only function it performs; tools that implement service discovery provide additional functions like load balancing, health checks, NATing, bundling, etc. But if a tool provides all the additional functions and not discovery, it shouldn't be called service discovery.

Now that we know the “what” of service discovery, comes the next part: the “how”. How do we achieve service discovery in a Docker cluster? There is no single answer to this; the answer is “it depends”.

It depends on which stack you are using for Docker clustering: Mesos, Kubernetes, Nomad or Docker Swarm.

A stack comes with a related set of tools for scheduling, storage, scaling, batch execution, monitoring, etc.; “stack” is a collective term for these tools. Some stacks have them all, some have only a few. Service discovery would be part of the set of tools your chosen stack provides.

Mesos and Kubernetes have DNS-based service discovery. Nomad uses Consul (a service registry). Docker Swarm has two stories: if you are using Docker prior to 1.12 with swarm, you would use Consul, etcd or another pick from the large open source community for service discovery; if you are using Docker 1.12 or later, it comes with integrated swarm discovery based on a DNS server embedded in the swarm.
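As a quick taste of the embedded-DNS flavour in Docker 1.12+ swarm mode (the service and image names below are just examples):

docker swarm init
docker network create --driver overlay appnet
docker service create --name db --network appnet mongo:3.2
docker service create --name web --network appnet myorg/webapp
# inside any "web" task, the embedded DNS server resolves the name "db"
# to the db service's virtual IP, so the app can simply connect to db:27017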

If you are not sure what I am talking about with all these stacks, I would suggest taking a step back, making Google your best friend and researching what these stacks are, what sets of tools and capabilities they offer, and how they differ from each other. Some are simple and easy to learn; others have many features but also a steep learning curve.

If you are new and inclined towards the Docker Swarm stack, go with the latest Docker with integrated swarm, eyes closed. While I was writing this post in Dec 2016, Docker 1.13 was in beta; by the time you read this, do your research and find out what the latest Docker version is.

 

When I was working on this, it was the era of Docker 1.11, and I did service discovery using the poor man's tools: Interlock with Nginx. Why? I found them the easiest to work with when you are working with Swarm and Consul, plus they provide some cool features like load balancing, health checks and SSL offloading (which are nothing but features of a load balancer).

I will be writing another post sharing my experience with Interlock and Nginx.

 

 

What is Docker?

The headline from the docker website is a pretty good summary:

Build, Ship and Run Any App, Anywhere.

Docker – An open platform for distributed applications for developers and sysadmins.

On the “What is Docker” page, is a bit more information, but still in marketing speak:

Docker is an open platform for developers and sysadmins to build, ship, and run distributed applications. Consisting of Docker Engine, a portable, lightweight runtime and packaging tool, and Docker Hub, a cloud service for sharing applications and automating workflows, Docker enables apps to be quickly assembled from components and eliminates the friction between development, QA, and production environments. As a result, IT can ship faster and run the same app, unchanged, on laptops, data center VMs, and any cloud.

So, creating a Docker image is similar to creating a virtual machine, but it's more lightweight: it runs within a Linux environment and lets that environment handle things like device drivers. Docker does this by combining a lightweight container virtualization platform with workflows and tooling that help manage and deploy applications.

At its core, Docker provides a way to run almost any application securely isolated in a container. The isolation and security allow you to run many containers simultaneously on the host. The lightweight nature of containers, which run without the extra load of a hypervisor, means you can get more out of your hardware.

It's also like a version control system for any changes made to the Docker image whilst it's being built. This allows efficient management of Docker images. If we have a Docker image that just has a Java installation on it, and we next build two new images, one with Tomcat and the other with Apache HTTPD, then they can both reference the same base Java Docker image, and the new images will only contain the changes layered above the base Java image.
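As an illustration of that layering (the image names and packages below are hypothetical), the two child images share every layer of the common Java base and only add their own layers on top:

# java-base/Dockerfile
FROM ubuntu:14.04
RUN apt-get update && apt-get install -y openjdk-7-jre-headless

# tomcat/Dockerfile -- adds only a Tomcat layer on top of myorg/java-base
FROM myorg/java-base
RUN apt-get update && apt-get install -y tomcat7

# httpd/Dockerfile -- adds only an Apache HTTPD layer on top of the same base
FROM myorg/java-base
RUN apt-get update && apt-get install -y apache2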

A Docker container consists of an operating system, user-added files, and metadata. Each container is built from an image. That image tells Docker what the container holds, what process to run when the container is launched, and a variety of other configuration data. The Docker image is read-only. When Docker runs a container from an image, it adds a read-write layer on top of the image (using a union file system) in which the application can then run. Unless those changes are saved outside the container (like shared drives for virtual machines), or the container is “committed” to a new image once it's finished, those changes are lost when the container is removed.
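For example, the read-write layer of a running container can be preserved by committing the container to a new image (the container and image names here are illustrative):

docker run -it --name mycontainer ubuntu:14.04 bash
# ...make changes inside the container, then exit...
docker commit mycontainer myorg/snapshot:1.0   # saves the rw layer as a new image
docker rm mycontainer                          # without the commit, those changes are gone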

What is our Docker strategy?

So how will we use Docker within our project? Initial efforts are focused on producing repeatable testing environments for QA, using one Docker container, running one Docker image, per component. This is an accepted standard for Docker containers, so that's one container for Tomcat running the public-facing web interface, one for Tomcat running the ESB, one for the caching services, one for MongoDB, and, well, you get the picture.

As an aside, since the Docker runtime provides a REST interface, we can do clever things in combination with Git hooks. For example, if we commit a new feature branch to the central repository, we'd need somewhere for QAs to test the new features, separate from the main development branch QA environment. Using Git hooks to let us know when a new branch has been created, we can create a new set of Docker containers running the whole of the application, so that QA can test the new features. Once the branch has been merged back into the main development branch, these new containers can be archived and torn down.

Various Docker containers can be linked together using Docker Compose. Compose is a nice tool with an intuitive way of defining your multi-container application, with all of its dependencies, in a single file. The entire application can then be spun up on the Docker host from the command line.
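A minimal docker-compose.yml of that era (compose v1 format; the image and service names are illustrative) might look like the sketch below, with the whole stack brought up by a single docker-compose up -d:

web:
  image: myorg/webapp:latest   # hypothetical public-facing Tomcat image
  ports:
    - "80:8080"
  links:
    - mongo                    # "mongo" becomes resolvable from the web container
mongo:
  image: mongo:3.0
  volumes:
    - ./data/mongo:/data/db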

This leads naturally on to deploying into Linux environments and into cloud-based environments such as Amazon Elastic Compute Cloud (EC2) and Azure, both of which support Docker out of the box.

Currently, we aren't doing our development using Docker, due to the vexing process of setting up the development environment. It's a bit clunky to get Docker running on Windows at the moment: it involves VirtualBox running TinyLinux, which in turn runs the Docker runtime and hence the containers. There is only one share available between the host Windows environment and VirtualBox that is accessible from within the Docker runtime, and it sits on a fixed path. Plus, remote debugging needs to be set up between the machines.

One hack would be to set up the dev environment in a single container, with our favourite IDE, code and symbols all in one box, but this won't go a long way towards the next stage of deployment.

Vapourware!

There is a lot of hype around Docker, and every man and their dog aspires to include Docker in their product descriptions, whether or not it makes sense to do so, and whether or not they have an actual product rather than a set of loose documentation and a variety of aims.

Microsoft is a case in point. They have stated that they will support Docker in the future, but what isn’t widely reported is that they will only support Windows Docker images – it won’t run the tens of thousands of linux based docker images. For anyone old enough, Windows NT used to have a POSIX subsystem, so somewhere that code must still exist….

Another area that promises much, but where delivery can vary, is value-add on top of Docker, in areas such as clustering, distribution and monitoring of Docker containers. The two enterprise-level solutions for distributing services and applications, Chef and Puppet, were quick to deliver solutions around Docker.

On a smaller scale there are products such as ShipYard for monitoring Docker containers, but a lot of the functionality is still on the horizon. The Docker team have several interesting side projects that are in differing stages of completion:

Compose: Compose is a tool for defining and running complex applications with Docker. With Compose, multi-container applications are defined in a single file, then the application spun up in a single command which does everything that needs to be done to get it running.

Swarm: Docker Swarm is native clustering for Docker. It turns a pool of Docker hosts into a single, virtual host. Swarm serves the standard Docker API, so any tool which already communicates with a Docker daemon can use Swarm to transparently scale to multiple hosts. It ships with a simple scheduling backend out of the box, and as initial development settles, an API will develop to enable pluggable backends. The goal is to provide a smooth out-of-box experience for simple use cases, and allow swapping in more powerful backends, like Mesos, for large scale production deployments. Currently in beta, not ready for production.
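To illustrate the “standard Docker API” point, here is a rough sketch of standing up a toy standalone swarm with the hosted discovery token backend and then pointing the ordinary docker client at the manager (the addresses and token are placeholders):

# on any host with Docker: create a cluster id
docker run --rm swarm create                      # prints <cluster-token>

# on each node: join the cluster
docker run -d swarm join --addr=<node-ip>:2375 token://<cluster-token>

# on the manager host: start the swarm manager
docker run -d -p 3375:2375 swarm manage token://<cluster-token>

# the regular docker CLI now schedules across the whole cluster via the manager
docker -H tcp://<manager-ip>:3375 info
docker -H tcp://<manager-ip>:3375 run -d nginx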

Further Reading

The main Docker website can be found at https://www.docker.com/. For further information on Compose hit https://docs.docker.com/compose/, and for Swarm hit https://docs.docker.com/swarm/.

For running Docker locally on Windows (and Mac), boot2docker is a must; it can be found at http://boot2docker.io/.

There are a couple of interesting videos on YouTube which introduce Docker and give a few real world examples: Introduction to Docker and A 3 hour introduction and real world examples (3 hours!!)