Category Archives: ACS

Marathon LB service discovery

Solving mystery of the service discovery with Azure ACS DCOS – Part 2

Warning: this is a Level 300 deep-dive topic, and novices may not be able to follow it. It is meant to help wandering souls like me, given the scarcity of documentation explaining service discovery with Apache Mesos.

In the last post, I wrote about the service discovery options with Mesos and covered Mesos DNS. In this post, I will solve the mystery of how service discovery happens in the sample app deployed in the Azure ACS load balance tutorial.

Mystery solving part

This post is about solving the mystery of the sample load balancing app launched in the Azure ACS load balance tutorial.

Following the tutorial, we launch the app via Marathon. Note that this is an important aspect of a DCOS cluster: you can launch Docker containers using Marathon or Aurora.
Web App Configuration
{
  "id": "web",
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "yeasy/simple-web",
      "network": "BRIDGE",
      "portMappings": [
        { "hostPort": 0, "containerPort": 80, "servicePort": 10000 }
      ]
    }
  },
  "instances": 3,
  "cpus": 0.1,
  "mem": 65,
  "healthChecks": [{
      "protocol": "HTTP",
      "path": "/",
      "portIndex": 0,
      "timeoutSeconds": 10,
      "gracePeriodSeconds": 10,
      "intervalSeconds": 2,
      "maxConsecutiveFailures": 10
  }]
}
Here, observe servicePort: 10000. The servicePort is the port that exposes this service on marathon-lb. By default, ports 10000 through 10100 are reserved for marathon-lb services, so you should begin numbering your service ports from 10000 (that’s 101 service ports, if you’re counting). You may want to use a spreadsheet to keep track of which ports have been allocated to which services for each set of LBs.
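If a spreadsheet feels heavy, the same bookkeeping can be sketched in a few lines of Python. This is only an illustration: the ServicePortRegistry class and its method names are hypothetical (they are not part of Marathon or marathon-lb), and the 10000 to 10100 range is simply the default described above.

```python
class ServicePortRegistry:
    """Hand out marathon-lb service ports from a fixed range (hypothetical helper)."""

    def __init__(self, first=10000, last=10100):
        self.first = first
        self.last = last
        self.by_app = {}  # app id -> assigned service port

    def allocate(self, app_id):
        """Return the port already assigned to app_id, or assign the lowest free one."""
        if app_id in self.by_app:
            return self.by_app[app_id]
        used = set(self.by_app.values())
        for port in range(self.first, self.last + 1):
            if port not in used:
                self.by_app[app_id] = port
                return port
        raise RuntimeError("no free service port left in the range")


registry = ServicePortRegistry()
web_port = registry.allocate("web")  # first app gets the first port in the range
db_port = registry.allocate("db")    # next app gets the next free port
total_ports = registry.last - registry.first + 1
```

Asking for the same app id twice returns the same port, so the registry can be re-run safely whenever an app definition is regenerated.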
Here, hostPort is 0, which means that Marathon will arbitrarily allocate an available port on that host. This is an important aspect of Docker hosting. If I hard-code, say, port 80 in the configuration, the ability to scale containers is limited.
Take an example: I have 2 agents and I want to launch 3 containers of the web app. The first container goes to agent1, and port 80 on agent1 is mapped to port 80 on the container. The second container goes to agent2, as port 80 on agent1 is taken. The third container fails to start because port 80 is no longer available on either agent.
Carrying the same example forward with host port 0, Marathon assigns a dynamic port on the host, say 5252. The second container could get 5253 and the third 5254, based on the availability of ports on that host.
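The two-agent example can be replayed with a toy placement function. This is a deliberately simplified model, not how Mesos or Marathon actually schedule tasks: the place_containers function, the agent names, and the starting dynamic port 5252 are all assumptions made for illustration.

```python
def place_containers(agents, count, host_port, first_dynamic_port=5252):
    """Try to place `count` containers; host_port 0 means 'pick any free port'.

    Returns the (agent, port) pairs for the containers that could start.
    """
    used = {agent: set() for agent in agents}
    placements = []
    for _ in range(count):
        for agent in agents:
            if host_port == 0:
                # Dynamic: take the lowest port not yet used on this agent.
                port = first_dynamic_port
                while port in used[agent]:
                    port += 1
            elif host_port in used[agent]:
                continue  # fixed port already taken here, try the next agent
            else:
                port = host_port
            used[agent].add(port)
            placements.append((agent, port))
            break  # container placed, move on to the next one
    return placements


fixed = place_containers(["agent1", "agent2"], count=3, host_port=80)
dynamic = place_containers(["agent1", "agent2"], count=3, host_port=0)
```

With a fixed port 80 only two of the three containers find a home, while with host port 0 all three start, which is the scaling limit described above.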
But the next problem is: how will other containers call these 3 containers?
This is the job of the marathon-lb service, which acts as the load balancer and balances web requests across these containers.
What load balancer address would you use to reach them?
To answer this, we need to understand how the Mesos DNS namespace works.
When this application is launched, it gets a DNS name of the form <task>.<service>.mesos; in our case, the [web] app that we launched using the JSON translates to web.marathon.mesos.
If this were a single instance of the [web] app with a hard-coded port 80, Mesos DNS would register web.marathon.mesos to the IP of the agent VM where it has been deployed, and accessing http://web.marathon.mesos/ from the master VM would land on the UI of the web application.
But now we have three instances of the app sitting on different agent VMs, listening on arbitrary ports. This is where the Marathon load balancer comes into the picture. The service port declared in the app configuration above is used by Marathon-lb to provide the endpoint on which the web service is reachable.
In the load-balanced scenario, marathon-lb provides the load-balanced endpoint at <marathon-lb-name>.<framework-name>.mesos:<service port number>. In our case, this translates to marathonlb-default.marathon.mesos:10000, where 10000 is the service port configured on marathon-lb.
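The naming rules above are easy to capture as two small helpers. The function names are my own and purely illustrative; the marathonlb-default name and the marathon framework name are the ones used in this post.

```python
def mesos_dns_name(task, framework="marathon", domain="mesos"):
    """Build the <task>.<framework>.<domain> name that Mesos DNS serves."""
    return f"{task}.{framework}.{domain}"


def marathon_lb_endpoint(service_port, lb_name="marathonlb-default", framework="marathon"):
    """Build the <marathon-lb-name>.<framework-name>.mesos:<service port> endpoint."""
    return f"{mesos_dns_name(lb_name, framework)}:{service_port}"


web_name = mesos_dns_name("web")            # name of the app itself
web_endpoint = marathon_lb_endpoint(10000)  # load-balanced endpoint for the app
```

The first name points at an agent running the task, while the second points at marathon-lb, which is the one other containers should use once there are several instances on arbitrary ports.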
Complete communication from the browser to the container:
  1. The browser hits the Azure load balancer on port 80
  2. The Azure load balancer forwards the request to the VMSS in the public subnet
  3. In our case, the public VMSS has just one VM, running Marathon-LB
  4. Marathon-LB is based on HAProxy, which is configured to listen on port 80 and forward to marathonlb-default.marathon.mesos:10000
  5. The marathonlb-default endpoint is created from the service port definition, and it in turn load-balances requests to the containers running in the private subnet of the cluster
The diagram below gives an overview.
[Diagram: Marathon LB service discovery]
Based on the Marathon app definition, the web app is reachable on service port 10000.
If a database server needs to hit a REST endpoint on the web service, it needs to point to marathonlb-default.marathon.mesos:10000.
The database server can register itself on marathon-lb with port 10100.
The Marathon-LB endpoint for the DB would then be marathonlb-default.marathon.mesos:10100.
The web app can simply access marathonlb-default.marathon.mesos:10100, and marathon-lb will route the traffic to one of the container instances running in the cluster.
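What "route traffic to one of the instances" means can be sketched with a tiny round-robin picker. This assumes a round-robin policy and uses made-up agent IPs and ports; the RoundRobinBalancer class is hypothetical and only mimics the idea, it is not marathon-lb code.

```python
import itertools


class RoundRobinBalancer:
    """Cycle through a fixed list of (host, port) backends in order."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(list(backends))

    def pick(self):
        """Return the backend that should receive the next request."""
        return next(self._cycle)


# Made-up agent addresses and dynamically assigned host ports.
web_backends = [("10.32.0.4", 5252), ("10.32.0.5", 5253), ("10.32.0.6", 5254)]
balancer = RoundRobinBalancer(web_backends)
picks = [balancer.pick() for _ in range(4)]  # the fourth request wraps around
```

The caller only ever sees the single marathon-lb address; which of the three containers answers is decided behind that address.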
Checking the HAProxy stats
Before that, in ACS-Mesos:
  1. Open port 9090 in the public Network Security Group
  2. Add port 9090 to the load balancer rules
Now, access haproxy stats of
To access haproxy config in LUA language
There are more, you can reference in below link
This, in a nutshell, is ACS-Mesos service discovery using Mesos.

Solving mystery of the service discovery with Azure ACS DCOS – Part 1

I have been working with Azure Container Service, using Mesosphere DCOS, Marathon and Mesos to design an insanely scalable architecture of 10000 nodes.

There is a nice tutorial on the Azure website that shows how to deploy an app on a DCOS cluster with the Marathon load balancer.

If you are new to DCOS, Marathon and Mesos, I recommend reading my previous post, which gives you a peek into the Docker cluster world.

This post is Level 300, a deep dive for people who need to understand how service discovery works in the Mesosphere DCOS ecosystem.

What is Service Discovery?

Service discovery allows network communication between services. In the Mesos space, containers are known as services, so service discovery means knowing the well-known address of other containers running in the cluster.

There is another post that I wrote explaining service discovery in layman’s terms. Check it out.

In DCOS Mesos, this happens in two ways: Mesos DNS and Marathon-LB.


Mesos DNS

Mesos-DNS is a stateless DNS server for Mesos. Contributed to open source by Mesosphere, it provides service discovery in datacenters or cloud environments managed by Mesos.

What does Mesos DNS offer?

Mesos-DNS offers a service discovery system purposely built for Mesos. It allows applications and services running on Mesos to find each other with DNS, similarly to how services discover each other throughout the Internet. Applications launched by Marathon or Aurora are assigned names like search.marathon.mesos or log-aggregator.aurora.mesos. Mesos-DNS translates these names to the IP address and port on the machine currently running each application. To connect to an application in the Mesos datacenter, all you need to know is its name. Every time a connection is initiated, the DNS translation will point to the right machine in the datacenter.


How does it work?


Mesos-DNS periodically queries the Mesos master and retrieves the state of all running applications for all frameworks. It uses the latest state to generate DNS records that associate application names to machine IP addresses and ports. Mesos-DNS operates as the primary DNS server for the datacenter. It receives DNS requests from all machines, translates the names for Mesos applications, and forwards requests for external names, such as, to other DNS servers. The configuration of Mesos-DNS is minimal. You simply point it to the Mesos masters at launch. Frameworks do not need to communicate with Mesos-DNS at all. As the state of applications is updated by the Mesos master, the corresponding DNS records are automatically updated as well.
Mesos-DNS is simple and stateless. Unlike Consul and SkyDNS, it does not require consensus mechanisms, persistent storage, or a replicated log. This is possible because Mesos-DNS does not implement heartbeats, health monitoring, or lifetime management for applications. This functionality is already available by the Mesos master, slaves, and frameworks. Mesos-DNS builds on it by periodically retrieving the datacenter state from the master. Mesos-DNS can be made fault-tolerant by launching with a framework like Marathon, that can monitor application health and re-launch it on failures.
Mesos-DNS defines the DNS top-level domain .mesos for Mesos tasks that are running on DC/OS. Tasks and services are discovered by looking up A and, optionally, SRV records within this Mesos domain. To enumerate all the DNS records that Mesos-DNS will respond to, take a look at the DNS naming documentation
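To make the "state in, DNS records out" idea concrete, here is a minimal sketch of the record generation step. The input is a simplified, hypothetical task list (not the real Mesos master state schema), the build_records function is my own, and the SRV targets are reduced to plain (ip, port) pairs; only the A name shape <task>.<framework>.mesos and the SRV name shape _<task>._<protocol>.<framework>.mesos follow the Mesos-DNS naming convention.

```python
def build_records(tasks, domain="mesos"):
    """Turn a simplified list of running tasks into A and SRV style records."""
    a_records = {}    # name -> list of agent IPs
    srv_records = {}  # name -> list of (ip, port) pairs
    for task in tasks:
        base = f"{task['framework']}.{domain}"
        a_name = f"{task['name']}.{base}"
        srv_name = f"_{task['name']}._tcp.{base}"
        ips = a_records.setdefault(a_name, [])
        if task["ip"] not in ips:  # one A entry per agent, even with several tasks
            ips.append(task["ip"])
        srv_records.setdefault(srv_name, []).append((task["ip"], task["port"]))
    return a_records, srv_records


# Hypothetical snapshot of what the master reports as running.
running_tasks = [
    {"name": "web", "framework": "marathon", "ip": "10.32.0.4", "port": 5252},
    {"name": "web", "framework": "marathon", "ip": "10.32.0.4", "port": 5253},
    {"name": "web", "framework": "marathon", "ip": "10.32.0.5", "port": 5252},
]
a_records, srv_records = build_records(running_tasks)
```

Because the records are rebuilt from scratch on every poll, nothing has to be stored between runs, which is what keeps the real Mesos-DNS stateless.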

What is Marathon-LB?

Marathon-LB is a tool provided for containers launched via a Marathon app in Mesos. LB stands for Load Balancer; it dynamically adds containers running on various Mesos slaves to the load balancer and removes them from it.

How does Marathon-LB work?

Marathon-lb is based on HAProxy, a rapid proxy and load balancer. The real magic happens when Marathon-lb subscribes to Marathon’s event bus and updates the HAProxy configuration in real time.

This means that when a new container is instantiated, Marathon-lb adds it to the load balancer pool automatically within a fraction of a second and restarts HAProxy with almost zero downtime to route traffic to the new container. The same happens when a container dies.
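The event-driven update loop can be sketched roughly as follows. This is not marathon-lb's actual code or template: the event dictionaries are a simplified, hypothetical shape, the function names are mine, and the generated text is just a minimal HAProxy frontend/backend pair for one app.

```python
def apply_event(backends, event):
    """Return the backend list after one simplified task event (hypothetical shape)."""
    endpoint = (event["host"], event["port"])
    if event["status"] == "TASK_RUNNING":
        return backends if endpoint in backends else backends + [endpoint]
    # Any other status is treated as "the task is gone" in this sketch.
    return [b for b in backends if b != endpoint]


def render_haproxy(app_id, service_port, backends):
    """Render a minimal HAProxy frontend/backend section for one app."""
    name = f"{app_id}_{service_port}"
    lines = [
        f"frontend {name}",
        f"  bind *:{service_port}",
        "  mode http",
        f"  default_backend {name}",
        "",
        f"backend {name}",
        "  balance roundrobin",
        "  mode http",
    ]
    for host, port in backends:
        server = f"{host.replace('.', '_')}_{port}"
        lines.append(f"  server {server} {host}:{port} check")
    return "\n".join(lines) + "\n"


backends = []
events = [
    {"status": "TASK_RUNNING", "host": "10.32.0.4", "port": 5252},
    {"status": "TASK_RUNNING", "host": "10.32.0.5", "port": 5253},
    {"status": "TASK_KILLED", "host": "10.32.0.4", "port": 5252},
]
for event in events:
    backends = apply_event(backends, event)
config = render_haproxy("web", 10000, backends)
```

In the real tool the rendered configuration would then be handed to HAProxy; here the result is only kept in the config string so the effect of each event is easy to inspect.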

Below is the architecture for Marathon-LB

Marathon Load Balancer


You can read the nice documentation of Marathon-LB on the Mesosphere blog.

Below is how Marathon-LB looks in the Marathon Web UI.

The next post will be more interesting: solving the mystery of the Azure ACS load balance app.


Working with Azure Container Service

You may, however, have wondered about the lesser-known Microsoft Azure Container Service (ACS) and what it actually is.

Here is ACS in a nutshell

  • Makes it simpler to create, configure and manage a cluster of machines that are preconfigured to run containerized applications
  • Uses an optimized configuration of scheduling and orchestration tools
    ACS leverages the Docker container format
  • Simplifies the creation of the cluster by providing a ‘Wizard’ style experience
    rather than setting up and configuring the set of coordinated machines and software on top of the Azure platform
  • Supports two platforms
      • Docker Swarm, to scale to hundreds or thousands of containers
      • Marathon and DC/OS
  • Built on top of Azure VM Scale Sets

What, Why and How


What: Azure Container Service is a container hosting service built on open source container orchestration tools.

Why: It is used to deploy and manage Dockerized containers within the data center.

How: It does orchestration by using either Docker Swarm or Apache Mesos.

Where it helps


There are two ways to work in ACS: Docker Swarm and Mesos (DC/OS).


Docker Swarm and Mesos are the orchestrators, which in plain English tell the host systems which container to run where.

You can view the animated slides at

I will be posting more information on this in the posts to come.