Category Archives: Azure

Monitoring Azure VMs with Power BI

I recently found myself in a situation where we were running performance tests on Azure VMs and needed to monitor their CPU usage.

The Azure portal (as of Feb 2017) provides a nice metrics blade to monitor most resources in Azure, like Web Apps, Storage Accounts, SQL Azure, etc. I needed to monitor IaaS VMs.

I was running Linux VMs, and there is a nice article in the Azure docs which shows how to enable Linux diagnostics on a VM and collect the metrics.

This worked for me, but it had two issues. First, the Azure metrics blade gives just three time-duration filters: past hour, past day, and past week. There is no provision for a custom duration, say, CPU for yesterday between 1 pm and 3 pm.


Second, I had created a custom dashboard on the Azure portal for monitoring all my VMs' CPU. But I couldn't share it with non-Azure users. Our performance testers, managers, etc. didn't have Azure subscriptions or any Azure knowledge. Creating read-only users with access to the particular dashboard's resource group, and putting them through the learning curve just to view CPU, was a big ask.

I knew that the monitoring readings from Linux diagnostics are saved in Azure Storage Tables,


Linux diagnostic tables in Azure storage as seen in Visual Studio 2015 Cloud explorer

and I can use Power BI to connect to Azure Storage Tables. A Power BI report can be published to the web and viewed without needing an Azure subscription account. So most of my needs were answered.

I created a Power BI report where I connected to the Azure tables. Here, before you click Load, you need to filter the records; otherwise Power BI would download the entire table, which could run into gigabytes.
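Power Query generates an equivalent filter behind the scenes; here is a minimal Python sketch of the kind of OData filter string involved (the function name is mine, for illustration only):

```python
from datetime import datetime, timedelta

def table_filter_since(days_back, now=None):
    """Build an OData filter string limiting an Azure Table query to rows
    whose Timestamp falls within the past `days_back` days."""
    now = now or datetime.utcnow()
    cutoff = now - timedelta(days=days_back)
    # Azure Table storage expects ISO 8601 datetimes wrapped in datetime'...'
    return "Timestamp ge datetime'{0}'".format(cutoff.strftime("%Y-%m-%dT%H:%M:%SZ"))

print(table_filter_since(7, now=datetime(2017, 2, 15)))
# Timestamp ge datetime'2017-02-08T00:00:00Z'
```

Pushing a filter like this down to the storage service is what keeps the download small.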

I edited the query to filter for the past week's data and then loaded the filtered data into the Power BI data model (yes, Power BI stores its own data). Once the data was loaded into the data model, I added a few calculated columns, which I then used to define my new time hierarchy.

By default, Power BI provides a time hierarchy of Year, Quarter, Month, and Date. But for this data I wanted Month, Date, Hour, and Minute.
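Those calculated columns, and the max-CPU roll-up the report performs, can be sketched in plain Python (the real report uses DAX calculated columns; the sample readings below are invented):

```python
from collections import defaultdict
from datetime import datetime

# Invented sample of diagnostics readings: (timestamp, CPU %)
readings = [
    (datetime(2017, 2, 14, 13, 5), 37.0),
    (datetime(2017, 2, 14, 13, 59), 100.0),
    (datetime(2017, 2, 14, 14, 10), 52.0),
]

# Derive the Month / Date / Hour levels the hierarchy needs,
# then roll max CPU up to the hour level, as the column chart does.
hourly_max = defaultdict(float)
for ts, cpu in readings:
    key = (ts.month, ts.day, ts.hour)  # Month > Date > Hour
    hourly_max[key] = max(hourly_max[key], cpu)

print(dict(hourly_max))
# {(2, 14, 13): 100.0, (2, 14, 14): 52.0}
```

Even though the CPU hit 100% for only one sample in the 13:00 hour, that whole hour rolls up as 100%; the same logic carries up to day and month.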

I created my new time hierarchy and built the report from it. It is a basic report using a column chart that shows max CPU by minute, hour, day, and month. That is, even if the CPU hit 100% for only 5 minutes in an hour, the cumulative hour shows as 100%. The same logic rolls up to day and month. It helps us in DevOps to see VM usage in a larger perspective.

I am aware there are better tools in the market, like Azure OMS, and third-party offerings like Datadog and Sysdig. But this is more of a DIY project than using those tools.

A word of caution when using Power BI with Azure Table storage: every time Power BI hits the storage table to fetch data, there are egress charges on Azure Storage. You can use something called Power BI Embedded and host the Power BI report in the same region as your storage account to avoid these charges.

The whole process is captured in this YouTube video, which covers:

  1. Connecting to the Azure table
  2. Filtering the data from the Azure table
  3. Adding columns in the data model for the new time hierarchy
  4. Creating the report with the new data model
  5. Exploring drill-down in Power BI

If you have suggestions or comments, do let me know.

Marathon LB service discovery

Solving the mystery of service discovery with Azure ACS DCOS – Part 2

Warning: this is a Level 300 deep-dive topic, and novices may struggle with it. It is meant to help wandering souls like me, given the scarcity of documentation explaining service discovery with Apache Mesos.

In the last post, I wrote about the service discovery options with Mesos and blabbered about Mesos DNS. In this post, I will solve the mystery of how service discovery happens in the sample app deployed in the Azure ACS load-balancing tutorial.

Mystery solving part

This post is about solving the mystery of the sample load-balancing app launched in the Azure ACS load-balancing tutorial.

Now, following the tutorial, we have launched the app via Marathon. Note, this is an important aspect of a DCOS cluster: you can launch Docker containers using Marathon or Aurora.
Web App Configuration
{
  "id": "web",
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "yeasy/simple-web",
      "network": "BRIDGE",
      "portMappings": [
        { "hostPort": 0, "containerPort": 80, "servicePort": 10000 }
      ]
    }
  },
  "instances": 3,
  "cpus": 0.1,
  "mem": 65,
  "healthChecks": [{
    "protocol": "HTTP",
    "path": "/",
    "portIndex": 0,
    "timeoutSeconds": 10,
    "gracePeriodSeconds": 10,
    "intervalSeconds": 2,
    "maxConsecutiveFailures": 10
  }]
}
Here, observe servicePort: 10000. The servicePort is the port that exposes this service on marathon-lb. By default, ports 10000 through 10100 are reserved for marathon-lb services, so you should begin numbering your service ports from 10000 (that's 101 service ports, if you're counting). You may want to use a spreadsheet to keep track of which ports have been allocated to which services for each set of LBs.
Here, hostPort is 0, which means Marathon will arbitrarily allocate an available port on that host. This is an important aspect of Docker hosting: if I hard-code, say, port 80 in the configuration, the ability to scale the container is limited.
Take an example: I have 2 agents, and I want to launch 3 containers of the web app. The first container goes to agent1, and port 80 on agent1 is mapped to port 80 in the container. The second container goes to agent2, as port 80 on agent1 is taken. The third container fails to start because there is no free port 80 on either agent.
Carrying the same example forward with hostPort 0, Marathon assigns a dynamic port on the host, say 5252. The second container could get 5253 and the third 5254, based on the availability of ports on each host.
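The placement behaviour can be illustrated with a toy simulation (agent names and port numbers are made up; real Marathon picks from its configured port range):

```python
def place(containers, agents, host_port=0, port_range=(5252, 5300)):
    """Toy model of Marathon placement: host_port 0 means 'pick any free
    port on the agent'; a fixed host_port can be used once per agent."""
    used = {a: set() for a in agents}
    placements = []
    for _ in range(containers):
        placed = False
        for a in agents:
            if host_port == 0:
                # Dynamic allocation: grab the next free port on this agent
                free = next(p for p in range(*port_range) if p not in used[a])
                used[a].add(free)
                placements.append((a, free))
                placed = True
                break
            elif host_port not in used[a]:
                # Fixed port: only one container per agent can claim it
                used[a].add(host_port)
                placements.append((a, host_port))
                placed = True
                break
        if not placed:
            placements.append(None)  # task cannot start: no free port anywhere
    return placements

print(place(3, ["agent1", "agent2"], host_port=80))  # third container cannot start
print(place(3, ["agent1", "agent2"], host_port=0))   # all three start
```

With a hard-coded host port, scale is capped at one container per agent; with hostPort 0, the cluster keeps placing containers as long as any port is free.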
But the next problem is: how will other containers call these 3 containers?
The marathon-lb service acts as the load balancer and load-balances web requests to these containers.
And what load-balancer address would you use to send these requests?
In order to answer this, we need to understand how Mesos DNS space works.
When this application is launched, it gets a DNS name of the form <task>.<framework>.mesos; in our case, the [web] app we launched using the JSON translates to web.marathon.mesos.
If this were a single instance of the [web] app with a hard-coded port 80, Mesos DNS would register web.marathon.mesos to the IP of the agent VM where it is deployed, and accessing http://web.marathon.mesos/ from the master VM would land on the UI of the web application.
But now we have three instances of the app sitting on different agent VMs, listening on arbitrary ports. This is where the Marathon load balancer comes into the picture. The servicePort declared in the app configuration above is used by marathon-lb to provide the endpoint on which the web service is reachable.
In the load-balanced scenario, marathon-lb provides the endpoint at <marathon-lb-name>.<framework-name>.mesos:<service port number>. In our case, this translates to marathonlb-default.marathon.mesos:10000, where 10000 is the service port configured on marathon-lb.
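The two naming schemes can be captured in a couple of helper functions (hypothetical helpers, just to make the patterns concrete):

```python
def mesos_dns_name(task, framework="marathon", domain="mesos"):
    # Mesos-DNS pattern: <task>.<framework>.<domain>
    return "{0}.{1}.{2}".format(task, framework, domain)

def marathon_lb_endpoint(service_port, lb_name="marathonlb-default"):
    # marathon-lb pattern: <marathon-lb-name>.<framework>.mesos:<servicePort>
    return "{0}:{1}".format(mesos_dns_name(lb_name), service_port)

print(mesos_dns_name("web"))          # web.marathon.mesos
print(marathon_lb_endpoint(10000))    # marathonlb-default.marathon.mesos:10000
```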
The complete communication path from browser to container:
  1. The browser hits the Azure load balancer on port 80
  2. The Azure load balancer forwards the request to the VMSS in the public subnet
  3. In our case, the public VMSS has just one VM, running marathon-lb
  4. Marathon-lb is based on HAProxy, which is configured to listen on port 80 and forward to marathonlb-default.marathon.mesos:10000
  5. marathonlb-default, created from the service-port definition, in turn load-balances the request to the child containers running in the private subnet of the cluster
The diagram below gives an overview.
 Marathon LB service discovery
Now, based on the Marathon app definition, the web app listens on service port 10000.
If a database server needs to hit a REST endpoint on the web service, it points to marathonlb-default.marathon.mesos:10000.
The database server can register itself on marathon-lb with port 10100.
The marathon-lb endpoint for the DB would then be marathonlb-default.marathon.mesos:10100.
The web app can simply access marathonlb-default.marathon.mesos:10100, and marathon-lb will route the traffic to one of the container instances running in the cluster.
Checking the HAProxy stats
Before that, in ACS-Mesos:
  1. Open port 9090 in the public network security group
  2. Add a port 9090 rule to the load balancer
Now, access the HAProxy stats at
To access the HAProxy config (in the LUA language)
There is more; you can refer to the links below.
This, in a nutshell, is ACS-Mesos service discovery using Mesos.

Solving the mystery of service discovery with Azure ACS DCOS – Part 1

I have been working with Azure Container Service, using Mesosphere DCOS, Marathon, and Mesos to design an insanely scalable architecture of 10,000 nodes.

There is a nice tutorial on the Azure website which shows how to deploy an app on a DCOS cluster with the Marathon load balancer.

If you are new to DCOS, Marathon, and Mesos, I recommend reading my previous post, which gives you a peep into the Docker cluster world.

This post is Level 300, a deep dive for people who need to understand how service discovery works in the Mesosphere DCOS ecosystem.

What is Service Discovery?

Service discovery allows network communication between services. In the Mesos world, containers are exposed as services. So service discovery is about knowing a well-known address for the other containers running in the cluster.

There is another post I wrote explaining service discovery in layman's terms. Check it out.

In DCOS Mesos, this happens in two ways:


Mesos DNS

Mesos-DNS is a stateless DNS server for Mesos. Contributed to open source by Mesosphere, it provides service discovery in datacenters or cloud environments managed by Mesos.

What does Mesos DNS offer?

Mesos-DNS offers a service discovery system purposely built for Mesos. It allows applications and services running on Mesos to find each other with DNS, similarly to how services discover each other throughout the Internet. Applications launched by Marathon or Aurora are assigned names like search.marathon.mesos or log-aggregator.aurora.mesos. Mesos-DNS translates these names to the IP address and port on the machine currently running each application. To connect to an application in the Mesos datacenter, all you need to know is its name. Every time a connection is initiated, the DNS translation will point to the right machine in the datacenter.


How does it work?


Mesos-DNS periodically queries the Mesos master and retrieves the state of all running applications for all frameworks. It uses the latest state to generate DNS records that associate application names to machine IP addresses and ports. Mesos-DNS operates as the primary DNS server for the datacenter. It receives DNS requests from all machines, translates the names for Mesos applications, and forwards requests for external names to other DNS servers. The configuration of Mesos-DNS is minimal. You simply point it to the Mesos masters at launch. Frameworks do not need to communicate with Mesos-DNS at all. As the state of applications is updated by the Mesos master, the corresponding DNS records are automatically updated as well.
Mesos-DNS is simple and stateless. Unlike Consul and SkyDNS, it does not require consensus mechanisms, persistent storage, or a replicated log. This is possible because Mesos-DNS does not implement heartbeats, health monitoring, or lifetime management for applications. This functionality is already provided by the Mesos master, slaves, and frameworks. Mesos-DNS builds on it by periodically retrieving the datacenter state from the master. Mesos-DNS can be made fault-tolerant by launching it with a framework like Marathon, which can monitor application health and re-launch it on failures.
Mesos-DNS defines the DNS top-level domain .mesos for Mesos tasks that are running on DC/OS. Tasks and services are discovered by looking up A and, optionally, SRV records within this Mesos domain. To enumerate all the DNS records that Mesos-DNS will respond to, take a look at the DNS naming documentation
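That query-and-generate cycle can be sketched as follows (the shape of the master state here is invented; the real Mesos master returns a much richer state document):

```python
def dns_records(master_state, domain="mesos"):
    """Toy version of the Mesos-DNS refresh loop: turn the master's task
    list into A records. Input shape is invented, for illustration only."""
    records = {}
    for task in master_state["tasks"]:
        name = "{0}.{1}.{2}".format(task["name"], task["framework"], domain)
        # Multiple tasks with the same name become multiple A records
        records.setdefault(name, []).append(task["agent_ip"])
    return records

state = {"tasks": [
    {"name": "web", "framework": "marathon", "agent_ip": "10.32.0.4"},
    {"name": "web", "framework": "marathon", "agent_ip": "10.32.0.5"},
]}
print(dns_records(state))
# {'web.marathon.mesos': ['10.32.0.4', '10.32.0.5']}
```

Re-running this against the latest master state on a timer is, in essence, what keeps the records current without any persistent storage.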

What is Marathon-LB?

Marathon-LB is a tool provided for containers launched via a Marathon app in Mesos. LB stands for load balancer; it helps dynamically add or remove containers from the load balancer running across the various Mesos slaves.

How does Marathon-LB work?

Marathon-lb is based on HAProxy, a fast proxy and load balancer. The real magic happens when marathon-lb subscribes to Marathon's event bus and updates the HAProxy configuration in real time.

It means that whenever a new container is instantiated, it is added to the load-balancer pool automatically within a fraction of a second, and HAProxy is reloaded with almost zero downtime to route traffic to the new container. The same goes for when a container dies.
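A toy version of that subscribe-and-update loop (the event shape is simplified and invented; real marathon-lb consumes Marathon's event stream and rewrites the full HAProxy config before reloading):

```python
def on_marathon_event(event, backends):
    """Toy handler for Marathon's event bus: keep the HAProxy backend
    pool in sync as tasks come and go."""
    host_port = (event["host"], event["port"])
    if event["type"] == "status_update_event" and event["state"] == "TASK_RUNNING":
        backends.add(host_port)      # new container: add to the pool
    elif event["state"] in ("TASK_KILLED", "TASK_FAILED", "TASK_FINISHED"):
        backends.discard(host_port)  # container gone: drop from the pool
    # Real marathon-lb would now rewrite haproxy.cfg and reload HAProxy
    return backends

pool = set()
on_marathon_event({"type": "status_update_event", "state": "TASK_RUNNING",
                   "host": "agent1", "port": 5252}, pool)
on_marathon_event({"type": "status_update_event", "state": "TASK_KILLED",
                   "host": "agent1", "port": 5252}, pool)
print(pool)  # set() -- the dead container was removed again
```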

Below is the architecture for Marathon-LB

Marathon Load Balancer

Marathon Load Balancer

You can read the nice documentation of Marathon-LB at Mesosphere blog.

Here is how Marathon-LB looks in the Marathon web UI.

The next post will be more interesting: solving the mystery of the Azure ACS load-balanced app.


Working with Azure Container Service

Have you ever wondered about the lesser-known Microsoft Azure Container Service (ACS), and what exactly ACS is?

Here is ACS in a nutshell:

  • Makes it simpler to create, configure, and manage a cluster of machines which are preconfigured to run containerized applications
  • Uses optimized configurations of scheduling and orchestration tools; ACS leverages the Docker container format
  • Simplifies cluster creation by providing a 'wizard' style experience, rather than setting up and configuring a set of coordinated machines and software on top of the Azure platform
  • Supports two platforms:
  • Docker Swarm, to scale to hundreds or thousands of containers
  • Marathon and DC/OS
  • Built on top of Azure VM Scale Sets

What, Why and How


What: Azure Container Service is an open-source-based container orchestration service.

Why: It is used to deploy and manage dockerized containers within the data center.

How: It does orchestration using either Docker Swarm or Apache Mesos.

Where it helps

Where it helps.png

Where it helps2

Where it helps3

There are two ways to work in ACS


Docker Swarm and Mesos are the orchestrators, which, in plain English, tell the host system which container to host where.

You can view the animated slides at

I will be posting more information about this in the posts to come.



Azure Application Gateway


Azure Application Gateway provides application-level routing and load balancing services which let you build a scalable and highly-available web front end in Azure. You control the size of the gateway and can scale your deployment based on your needs.
Application Gateway currently supports layer-7 application delivery for the following:

  • HTTP load balancing
  • Cookie-based session affinity
  • Secure Sockets Layer (SSL) offload


Demonstration Scenario

Here, to demonstrate the session affinity of the Azure Application Gateway load balancer (without the VM scaling option), we need to set up the infrastructure with the following architecture.
We have two VMs configured in different cloud services. We chose two cloud services to demonstrate that the App Gateway load balancer is distinct from the implicit load balancer that each cloud service has. Each cloud service has just one VM inside it. Both VMs are configured in the same Azure virtual network, so they can talk to each other and share the same IP convention as specified in the subnet.
Then we create the Application Gateway in the same virtual network and configure it to load-balance between these two VMs with cookie affinity.
The Azure Application Gateway receives requests from the internet at the configured DNS address or public IP and routes them to each VM in round-robin fashion. It monitors the health of the VMs by probing the configured port (which you specify in the BackendHttpSettings section) every 30 seconds. If the response from an endpoint doesn't lie in the 200-399 range, the endpoint is taken out of the pool until a later probe succeeds.
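The probe decision reduces to a status-range check, sketched here (the VM names are from this walkthrough; the probe statuses are invented):

```python
def probe_healthy(status_code, healthy_range=(200, 399)):
    # Application Gateway treats a probe response in 200-399 as healthy
    return healthy_range[0] <= status_code <= healthy_range[1]

pool = ["sbag1vm1", "sbag1vm2"]
statuses = {"sbag1vm1": 200, "sbag1vm2": 503}  # pretend sbag1vm2 is failing
healthy = [vm for vm in pool if probe_healthy(statuses[vm])]
print(healthy)  # ['sbag1vm1'] -- the 503 VM is out of rotation until a probe passes
```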

Configure Azure Application Gateway

To configure the Azure Application Gateway for the architecture above, the steps are:

  1. Create the virtual network in which the VMs need to be placed
  2. Create two identical VMs and configure them to be placed in the virtual network created above
  3. Configure each VM to host an ASP.NET web application and deploy the sample application on them
  4. Create the Application Gateway load balancer and configure it to load-balance the VMs
  5. Validate the application and check that the connection persists to the same machine for the entire session

Step 1

Create Virtual Network in Classic


Give it a logical name.

Create the address space:
Create two subnets:

  • 1st Subnet :
  • 2nd Subnet:

Step 2

Create two VMs named sbag1vm1 and sbag1vm2.

During the creation of these VMs,

the VMs were put in the virtual network "Shabs-VirNet" which we had configured earlier.

The subnet was from the address space we configured while creating Shabs-VirNet.


Step 3

Each VM needs to be configured as a web server hosting a sample ASP.NET web application. Here, we deployed the sample ASP.NET web application to demonstrate session affinity. An ASP.NET application generates a session cookie when a new request hits the server and maintains it through the lifetime of the user session on the server.

The sample app saves a counter in the in-memory session state and responds to your page refreshes with updated session data.
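What the sample app does can be modelled in a few lines (a toy stand-in for ASP.NET session state, not the actual app code):

```python
# Toy model of the sample ASP.NET app: an in-memory session store
# keyed by the session cookie, holding a per-session counter.
sessions = {}

def handle_request(session_id):
    state = sessions.setdefault(session_id, {"counter": 0})
    state["counter"] += 1          # incremented on every page refresh
    return state["counter"]

# Same cookie -> same session -> the counter keeps climbing
print(handle_request("abc123"), handle_request("abc123"))  # 1 2
# A different browser (new cookie) gets a fresh session
print(handle_request("xyz789"))  # 1
```

Because the state lives in one server's memory, the demo only works if the gateway keeps sending a given cookie back to the same VM, which is exactly what session affinity provides.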


Step 4

Create the new Application Gateway with the following PowerShell script.

I won't go into the details of each command in this PowerShell script, but on the whole it creates an Application Gateway named 'noscaleapplicationgateway' and configures it to load-balance across the two VMs.

# Create the gateway only if it does not already exist
$checkGateway = Get-AzureApplicationGateway noscaleapplicationgateway
if ($checkGateway -eq $null)
{
    New-AzureApplicationGateway -Name noscaleapplicationgateway -VnetName applicationgatewaynetwork -Subnets @("Subnet-1")
}
Get-AzureApplicationGateway noscaleapplicationgateway
# Set Application Gateway configuration
Set-AzureApplicationGatewayConfig -Name noscaleapplicationgateway -ConfigFile $GatewayconfigurationFilePath
# Start the gateway
Start-AzureApplicationGateway noscaleapplicationgateway
# Verify that the gateway is running
Get-AzureApplicationGateway noscaleapplicationgateway

Once the gateway starts (at which point billing also starts), the Get-AzureApplicationGateway command will return a result like the following. Note the DnsName field, which contains the URL users can access to reach the Application Gateway.


Step 5

Verify the application. Try accessing the Application Gateway URL from different browsers to hopefully hit the different VMs that you deployed (you can keep closing and reopening browsers until you succeed). Your screen should look similar to the following. Note that I am accessing the Application Gateway URL here.


If you view the cookies now, you will find the secret sauce that makes the Application Gateway tick.


The ARRAffinity cookie contains data that helps Azure Application Gateway determine the endpoint to which it should route the request (yes, it is ARR under the hood). The ASP.NET_SessionId cookie is a standard session cookie that contains the session identifier.

Azure Traffic Manager Overview

What is Azure Traffic Manager?

Microsoft Azure Traffic Manager allows you to control the distribution of user traffic to your specified endpoints, which can include Azure cloud services, websites, and other endpoints. Traffic Manager works by applying an intelligent policy engine to Domain Name System (DNS) queries for the domain names of your Internet resources. Your Azure cloud services or websites can be running in different datacenters across the world.

Where does Azure Traffic Manager help?

Improves availability

Traffic Manager improves the availability of applications by providing automatic failover capabilities when an Azure cloud service, Azure website, or other location goes down.

Reduces latency

Traffic Manager improves the responsiveness of your application and content delivery by directing end users to the endpoint with the lowest network latency from the client.

Scaling across the globe

Traffic Manager helps your application scale across the various Azure datacenters around the globe and load-balances traffic across the endpoints in those regions.

Traffic distribution for large, complex deployments

Traffic Manager supports nested profiles, which let you create configurations that optimize performance and distribution for larger, more complex deployments.

How does Azure Traffic Manager work?


This figure explains how Traffic Manager works in action. Follow the numbers in the blue bubbles to see the interactions in order.


This is what a typical network architecture looks like. Note that this is just a sample architectural slice; do not take it as definitive.


How do you create one?

Here is how you create a Traffic Manager on the web:


Here, you need to give the DNS name for the Traffic Manager and choose the load-balancing method.

Performance – shortest latency between the client and the region

Round Robin – distributes traffic equally across the regions

Failover – DR scenarios


Performance – When you choose Performance, Microsoft maintains a latency table for the routes, which gives the latency to the various data centers. It is not real time, but it is pretty quick at finding the shortest-latency endpoint from a given point.

Failover – All connections go to one set of servers; on failover, connections are sent to the second set of servers.
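The three methods boil down to different ways of answering a DNS query, which can be modelled in a toy function (endpoint names, priorities, and latencies are all invented):

```python
def pick_endpoint(method, endpoints, latency=None):
    """Toy model of a Traffic Manager DNS answer, for illustration only."""
    healthy = [e for e in endpoints if e["up"]]
    if method == "Performance":    # lowest latency from the client's vantage point
        return min(healthy, key=lambda e: latency[e["name"]])
    if method == "Failover":       # first healthy endpoint in priority order
        return min(healthy, key=lambda e: e["priority"])
    if method == "RoundRobin":     # rotates over healthy endpoints; model one answer
        return healthy[0]
    raise ValueError(method)

eps = [{"name": "east-us", "up": False, "priority": 1},
       {"name": "west-eu", "up": True, "priority": 2}]
print(pick_endpoint("Failover", eps)["name"])  # east-us is down, so west-eu answers
```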

This can be changed later, once you have defined the Traffic Manager.

Click on the Configure Tab


You can configure the endpoints for the Traffic Manager, where you can add websites or web apps.



In the Configuration tab, you can change the load-balancing method, and you can use PowerShell to create nested Traffic Manager profiles.


You can mask the DNS with a CNAME, which masks the * URL with your own DNS URL.