
Container Orchestrator

With enterprises containerizing their applications and moving them to the cloud, there is a growing demand for container orchestration solutions. While many solutions are available, some are simply re-distributions of well-established container orchestration tools, enriched with extra features but sometimes at the cost of flexibility.

Although not exhaustive, the list below provides a few different container orchestration tools and services available today:


What is a service mesh, really?

Figure 1: Service mesh overview

Figure 1 illustrates the service mesh concept at its most basic level. There are four service clusters (A-D). Each service instance is colocated with a sidecar network proxy. All network traffic (HTTP, REST, gRPC, Redis, etc.) from an individual service instance flows via its local sidecar proxy to the appropriate destination. Thus, the service instance is not aware of the network at large and only knows about its local proxy. In effect, the distributed system network has been abstracted away from the service programmer.

The data plane

In a service mesh, the sidecar proxy performs the following tasks:

  • Service discovery: What are all of the upstream/backend service instances that are available?
  • Health checking: Are the upstream service instances returned by service discovery healthy and ready to accept network traffic? This may include both active (e.g., out-of-band pings to a /healthcheck endpoint) and passive (e.g., using 3 consecutive 5xx as an indication of an unhealthy state) health checking.
  • Routing: Given a REST request for /foo from the local service instance, to which upstream service cluster should the request be sent?
  • Load balancing: Once an upstream service cluster has been selected during routing, to which upstream service instance should the request be sent? With what timeout? With what circuit breaking settings? If the request fails should it be retried?
  • Authentication and authorization: For incoming requests, can the caller be cryptographically attested using mTLS or some other mechanism? If attested, is the caller allowed to invoke the requested endpoint or should an unauthenticated response be returned?
  • Observability: For each request, detailed statistics, logging, and distributed tracing data should be generated so that operators can understand distributed traffic flow and debug problems as they occur.
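The passive health-checking rule mentioned above (three consecutive 5xx responses mark an instance unhealthy) can be sketched in a few lines of Java. The class and method names here are illustrative, not taken from any real proxy:

```java
// Sketch of passive health checking: an instance is marked unhealthy after
// a threshold of consecutive 5xx responses, and healthy again as soon as a
// non-5xx response is observed.
public class PassiveHealthChecker {
    private final int threshold;
    private int consecutive5xx = 0;

    public PassiveHealthChecker(int threshold) {
        this.threshold = threshold;
    }

    // Record the HTTP status code of a completed request to this instance.
    public void record(int statusCode) {
        if (statusCode >= 500 && statusCode < 600) {
            consecutive5xx++;
        } else {
            consecutive5xx = 0; // any non-5xx response resets the streak
        }
    }

    public boolean isHealthy() {
        return consecutive5xx < threshold;
    }

    public static void main(String[] args) {
        PassiveHealthChecker checker = new PassiveHealthChecker(3);
        checker.record(500);
        checker.record(503);
        System.out.println(checker.isHealthy()); // still healthy: only 2 consecutive 5xx
        checker.record(502);
        System.out.println(checker.isHealthy()); // unhealthy: 3 consecutive 5xx
    }
}
```

A real sidecar proxy combines this passive signal with active checks (the out-of-band /healthcheck pings mentioned above) before removing an instance from rotation.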

All of the previous items are the responsibility of the service mesh data plane. In effect, the sidecar proxy is the data plane. Said another way, the data plane is responsible for conditionally translating, forwarding, and observing every network packet that flows to and from a service instance.

The control plane

The network abstraction that the sidecar proxy data plane provides is magical. However, how does the proxy actually know to route /foo to service B? How is the service discovery data that the proxy queries populated? How are the load balancing, timeout, circuit breaking, etc. settings specified? How are deploys accomplished using blue/green or gradual traffic shifting semantics? Who configures systemwide authentication and authorization settings?

All of the above items are the responsibility of the service mesh control plane. The control plane takes a set of isolated stateless sidecar proxies and turns them into a distributed system.

The reason that I think many technologists find the split concepts of data plane and control plane confusing is that for most people the data plane is familiar while the control plane is foreign. We’ve been around physical network routers and switches for a long time. We understand that packets/requests need to go from point A to point B and that we can use hardware and software to make that happen. The new breed of software proxies are just really fancy versions of tools we have been using for a long time.

Figure 2: Human control plane

However, we have also been using control planes for a long time, though most network operators might not associate that portion of the system with a piece of technology. The reason for this is simple: most control planes in use today are… us.

Figure 2 shows what I call the “human control plane.” In this type of deployment (which is still extremely common), a (likely grumpy) human operator crafts static configurations — potentially with the aid of some scripting tools — and deploys them using some type of bespoke process to all of the proxies. The proxies then consume the configuration and proceed with data plane processing using the updated settings.

Figure 3: Advanced service mesh control plane

Figure 3 shows an “advanced” service mesh control plane. It is composed of the following pieces:

  • The human: There is still a (hopefully less grumpy) human in the loop making high level decisions about the overall system.
  • Control plane UI: The human interacts with some type of UI to control the system. This might be a web portal, a CLI, or some other interface. Through the UI, the operator has access to global system configuration settings such as deploy control (blue/green and/or traffic shifting), authentication and authorization settings, route table specification (e.g., when service A requests /foo what happens), and load balancer settings (e.g., timeouts, retries, circuit breakers, etc.).
  • Workload scheduler: Services are run on an infrastructure via some type of scheduling system (e.g., Kubernetes or Nomad). The scheduler is responsible for bootstrapping a service along with its sidecar proxy.
  • Service discovery: As the scheduler starts and stops service instances it reports liveness state into a service discovery system.
  • Sidecar proxy configuration APIs: The sidecar proxies dynamically fetch state from various system components in an eventually consistent way without operator involvement. The entire system composed of all currently running service instances and sidecar proxies eventually converge. Envoy’s universal data plane API is one such example of how this works in practice.

Ultimately, the goal of a control plane is to set policy that will eventually be enacted by the data plane. More advanced control planes will abstract more of the system from the operator and require less handholding (assuming they are working correctly!).

Data plane vs. control plane summary

  • Service mesh data plane: Touches every packet/request in the system. Responsible for service discovery, health checking, routing, load balancing, authentication/authorization, and observability.
  • Service mesh control plane: Provides policy and configuration for all of the running data planes in the mesh. Does not touch any packets/requests in the system. The control plane turns all of the data planes into a distributed system.

Current project landscape

With the above explanation out of the way, let’s take a look at the current service mesh landscape.

Instead of doing an in-depth analysis of each solution above, I’m going to briefly touch on some of the points that I think are causing the majority of the ecosystem confusion right now.

Linkerd was one of the first service mesh data plane proxies on the scene in early 2016 and has done a fantastic job of increasing awareness and excitement around the service mesh design pattern. Envoy followed about 6 months later (though it had been in production at Lyft since late 2015). Linkerd and Envoy are the two projects that are most commonly mentioned when discussing “service meshes.”

Istio was announced in May 2017. The project goals of Istio look very much like the advanced control plane illustrated in figure 3. The default proxy of Istio is Envoy. Thus, Istio is the control plane and Envoy is the data plane. In a short time, Istio has garnered a lot of excitement, and other data planes have begun integrations as a replacement for Envoy (both Linkerd and NGINX have demonstrated Istio integration). The fact that it’s possible for a single control plane to use different data planes means that the control plane and data plane are not necessarily tightly coupled. An API such as Envoy’s universal data plane API can form a bridge between the two pieces of the system.

Nelson and SmartStack help further illustrate the control plane vs. data plane divide. Nelson uses Envoy as its proxy and builds a robust service mesh control plane around the HashiCorp stack (i.e. Nomad, etc.). SmartStack was perhaps the first of the new wave of service meshes. SmartStack forms a control plane around HAProxy or NGINX, further demonstrating that it’s possible to decouple the service mesh control plane and the data plane.

The service mesh microservice networking space is getting a lot of attention right now (rightly so!) with more projects and vendors entering all the time. Over the next several years, we will see a lot of innovation in both data planes and control planes, and further intermixing of the various components. The ultimate result should be microservice networking that is more transparent and magical to the (hopefully less and less grumpy) operator.

Key takeaways

  • A service mesh is composed of two disparate pieces: the data plane and the control plane. Both are required. Without both the system will not work.
  • Everyone is familiar with the control plane — albeit the control plane might be you!
  • All of the data planes compete with each other on features, performance, configurability, and extensibility.
  • All of the control planes compete with each other on features, configurability, extensibility, and usability.
  • A single control plane may contain the right abstractions and APIs such that multiple data planes can be used.



Microservices Architectures: What Is Fault Tolerance?

In this article, we discuss an important property of microservices, called fault tolerance.

You Will Learn

  • What is Fault Tolerance?
  • Why is fault tolerance important in microservices architecture?
  • How do you achieve fault tolerance?

What Is Fault Tolerance?

Microservices need to be extremely reliable.

When we build a microservices architecture, there are a large number of small microservices, and they all need to communicate with one another.

Let’s consider the following example:

Basic microservices architecture

Let’s say Microservice5 is down at some point in time.

All the other microservices are directly or indirectly dependent on it, so they all go down as well.

The solution to this problem is to have a fallback in case a microservice fails. This aspect of a microservice is called fault tolerance.

Implementing Fault Tolerance With Hystrix

A popular framework used to implement fault tolerance is Hystrix, a Netflix open source framework. Here is some sample Hystrix code:

@HystrixCommand(fallbackMethod = "fallbackRetrieveConfiguration")
public LimitConfiguration retrieveConfiguration() {
    throw new RuntimeException("Not Available");
}

public LimitConfiguration fallbackRetrieveConfiguration() {
    return new LimitConfiguration(999, 9);
}

Hystrix enables you to specify, for each of your service methods, a fallback method: if the primary method throws an exception, the fallback’s result is returned to the service consumer.

Here, if retrieveConfiguration() fails, then fallbackRetrieveConfiguration() is called, which returns a hardcoded LimitConfiguration instance.

Hystrix and Alerts

With Hystrix, you can also configure alerts at the backend. If a service starts failing continuously, you can send alerts to the maintenance team.

Hystrix Is Not a Silver Bullet

Using Hystrix and fallback methods is appropriate for services that handle non-critical information.

However, it is not a silver bullet.

Consider, for instance, a service that returns the balance of a bank account. You cannot provide a default hardcoded value back.

Using Sufficient Redundancy

It is important to design critical services in a fail-safe manner, building enough redundancy into the system to ensure that the services do not fail.

Have Sufficient Testing

It is important to test for failure. Bring a microservice down. See how your system reacts.

Chaos Monkey from Netflix is a good example of this.


In this article, we discussed fault tolerance. We saw how fault tolerance is essential in a microservices architecture. We then saw how it can be implemented at the code level using frameworks such as Hystrix.


What Is Service Discovery?

When we talk about a microservices architecture, we refer to a system with a large number of small services, working with each other:

Basic Microservices Architecture


An important feature of such architectures is auto-scaling. The number of instances of a microservice varies based on the system load. Initially, you could have 5 instances of Microservice5, which go up later to 20, 100, or 1000!

Two important questions arise:

  • How does Microservice4 know how many instances of Microservice5 are present, at a given time?
  • In addition, how does it distribute the load among all of them?

Hardcoding URLs Is Not an Option

One way to do this is to hard-code the URLs of Microservice5 instances within Microservice4. That means every time the number of Microservice5 instances changes (with the addition of a new one or the removal of an existing one), the configuration within Microservice4 needs to change. This is a big headache.

Using Service Discovery

Ideally, you want to change the number of instances of Microservice5 based on the load, and make Microservice4 dynamically aware of the instances.

That’s where the concept of Service Discovery comes into the picture.

The component that provides this service is generally called a naming server.

All instances of all the microservices register themselves with the naming server. Whenever a microservice wants to talk to another microservice, it asks the naming server about the available instances.

In the example above, whenever a new instance of Microservice5 is launched, it registers with the naming server. When Microservice4 wants to talk to Microservice5, it asks the naming server: what are the available instances of Microservice5?
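The register-and-lookup flow described above can be sketched as a minimal in-memory naming server in Java. This is an illustration of the concept only; a real registry such as Eureka also handles heartbeats, instance expiry, and replication. The service and host names are made up:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal in-memory sketch of a naming server: instances register under a
// service name, and callers query the currently registered instances.
public class NamingServer {
    private final Map<String, List<String>> registry = new HashMap<>();

    public synchronized void register(String serviceName, String instanceUrl) {
        registry.computeIfAbsent(serviceName, k -> new ArrayList<>()).add(instanceUrl);
    }

    public synchronized void deregister(String serviceName, String instanceUrl) {
        registry.getOrDefault(serviceName, new ArrayList<>()).remove(instanceUrl);
    }

    // Returns a snapshot of the instances currently registered for a service.
    public synchronized List<String> lookup(String serviceName) {
        return new ArrayList<>(registry.getOrDefault(serviceName, new ArrayList<>()));
    }

    public static void main(String[] args) {
        NamingServer server = new NamingServer();
        server.register("microservice5", "http://host1:8000");
        server.register("microservice5", "http://host2:8000");
        // Microservice4 asks: what are the available instances of Microservice5?
        System.out.println(server.lookup("microservice5"));
    }
}
```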

Another Example of Service Discovery

Using Service Discovery to identify microservice instances helps keep things dynamic.

Let’s say there is a service for currency conversion:

The CurrencyConversionService (CCS) talks to the ForexService. At a certain point of time, these services have two instances each:

However, there could be a time where there are five instances of the ForexService (FS):

In that case, CurrencyConversionService needs to make sure that the load is evenly distributed across all the ForexService instances. It needs to answer two important questions:

  • How does the CurrencyConversionService know how many instances of ForexService are active?
  • How does the CurrencyConversionService distribute the load among those active instances?

When a CCS microservice instance is brought up, it registers itself with a naming server (here, Eureka). The same thing happens with all instances of FS as well.

When a CCS instance needs to talk to an FS instance, it requests information from Eureka. Eureka would then return the URLs of the two FS instances active at that time. Here, the application makes use of a client-side load-balancing framework called Ribbon. Ribbon ensures proper load distribution over the two FS instances for requests coming in from the CCS.


In this article, we talked about microservice service discovery. We saw that microservices need to be able to communicate with each other. The number of instances of a microservice changes over time, depending on the load. Service discovery enables us to dynamically adapt to new instances and distribute load among microservices.


Why Centralized Configuration?

When we talk about a microservices architecture, we visualize a large number of small microservices talking to each other. The number of microservices depends on the size of the enterprise.

Basic Microservices Architecture

The interesting part is that each of these microservices can have their own configuration.

Such configurations include details like:

  • Application configuration.
  • Database configuration.
  • Communication Channel Configuration – queues and other infrastructure.
  • URLs of other microservices to talk to.

In addition, each microservice will have a separate configuration for different environments, such as development, QA, and production.

If maintaining a single configuration for a large application is difficult, imagine maintaining configurations for hundreds of microservices in different environments.

Centralized Config Server to the Rescue

That’s where a centralized configuration server steps in.

Configuration for all microservices (for all environments) is stored at one place — a centralized configuration store.

When a microservice needs its configuration, it provides an ID at launch — a combination of the name of the microservice and the environment.

The centralized config server looks up the configuration and provides the configuration to the microservice.
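The lookup just described, keyed by the combination of service name and environment, can be sketched as follows. The service names and property keys are illustrative assumptions, not from any real deployment:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the lookup a centralized config server performs: configuration
// is stored per (service name, environment) pair, and a microservice asks
// for its own slice at startup by providing that combined ID.
public class ConfigServer {
    private final Map<String, Map<String, String>> store = new HashMap<>();

    private static String key(String service, String environment) {
        return service + "-" + environment;
    }

    public void put(String service, String environment, String property, String value) {
        store.computeIfAbsent(key(service, environment), k -> new HashMap<>()).put(property, value);
    }

    public Map<String, String> configFor(String service, String environment) {
        return store.getOrDefault(key(service, environment), Map.of());
    }

    public static void main(String[] args) {
        ConfigServer server = new ConfigServer();
        server.put("limits-service", "dev", "limits.max", "999");
        server.put("limits-service", "prod", "limits.max", "200");
        // The microservice identifies itself as "limits-service" in "dev":
        System.out.println(server.configFor("limits-service", "dev").get("limits.max"));
    }
}
```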

Ensure that the configuration in a centralized config server is secured and has role-based access.

Introducing Spring Cloud Config Server

Spring Cloud Config Server is one of the popular implementations of a cloud config server.

Spring Cloud Config Server enables you to store the configurations for multiple microservices, for different environments, in a Git or SVN repository. A set of folder structures and conventions needs to be followed for the setup to work.

Spring Cloud Config Server

A microservice can connect to the config server, identify itself, and specify the environment it runs in. This enables it to get the required configuration.

The setup ensures that the operations team does not need to take time out to configure individual microservices on a case-by-case basis. All they need to worry about is configuring the centralized config server and putting relevant configurations into the Git repository.

Automatically Picking Up Configuration Changes

An interesting feature of the Spring Cloud Config Server is auto refresh. Whenever a change is committed to the Git repository, the configuration in the application is refreshed automatically.


In this article, we looked at why we need centralized configuration in microservices-based applications. We looked at how the Spring Cloud Config Server manages centralized configuration.


The Need for API Gateways

Handling Cross Cutting Concerns

Whenever we design and develop a large software application, we make use of a layered architecture. For instance, in a web application, it is quite common to see an architecture similar to the following:

Web application architecture

Here, we see that the application is organized into a web layer, a business layer, and a data layer.

In a layered architecture, there are specific parts that are common to all these different layers. Such parts include:

  • Logging
  • Security
  • Performance
  • Auditing

All these features are applicable across layers, hence it makes sense to implement them in a common way.

Aspect-Oriented Programming (AOP) is a well-established way of handling these concerns. Constructs such as filters and interceptors are commonly used to implement them.

The Need for API Gateways

When we talk about a microservices architecture, we deal with multiple microservices talking to each other:

Basic Microservices Architecture

Where do you implement all the features that are common across microservices?

  • Authentication
  • Logging
  • Auditing
  • Rate limiting

That’s where the API Gateway comes into the picture.

How Does an API Gateway Work?

In microservices, we route all requests — both internal and external — through API Gateways. We can implement all the common features like authentication, logging, auditing, and rate limiting in the API Gateway.

For example, you may not want Microservice3 to be called more than 10 times within a given period by a particular client. You could enforce that as part of rate limiting in the API gateway.
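The rate-limiting idea above can be sketched with a simple fixed-window counter per client. This is a conceptual illustration, not the algorithm any particular gateway uses; the 10-call limit mirrors the example, while the window length and client IDs are assumptions:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of per-client rate limiting at an API gateway: each client gets a
// fixed budget of calls within a time window; calls beyond it are rejected.
public class RateLimiter {
    private final int maxCalls;
    private final long windowMillis;
    private final Map<String, Integer> counts = new HashMap<>();
    private long windowStart = System.currentTimeMillis();

    public RateLimiter(int maxCalls, long windowMillis) {
        this.maxCalls = maxCalls;
        this.windowMillis = windowMillis;
    }

    // Returns true if the call is allowed, false if the budget is exhausted.
    public synchronized boolean allow(String clientId) {
        long now = System.currentTimeMillis();
        if (now - windowStart >= windowMillis) {
            counts.clear();            // start a fresh window
            windowStart = now;
        }
        int used = counts.getOrDefault(clientId, 0);
        if (used >= maxCalls) {
            return false;
        }
        counts.put(clientId, used + 1);
        return true;
    }

    public static void main(String[] args) {
        RateLimiter limiter = new RateLimiter(10, 60_000);
        for (int i = 0; i < 10; i++) {
            limiter.allow("client-42");
        }
        System.out.println(limiter.allow("client-42")); // 11th call within the window is rejected
    }
}
```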

You can implement the common features across microservices in the API gateway. A popular API gateway implementation is the Zuul API gateway.


Just like AOP handles cross cutting concerns in standalone applications, API gateways manage common features for microservices in an enterprise.


Microservices Architecture: The Importance of Centralized Logging

The Need for Visibility

In a microservices architecture, there are a number of small microservices talking to each other:

Basic microservices communication

In the above example, let’s assume there is a problem with Microservice5, due to which Microservice1 throws an error.

How does a developer debug the problem?

They would like to know the details of what’s happening in every microservice from Microservice1 through Microservice5. From such a trace, it should be possible to identify that something went wrong at Microservice5.

The more you break things down into smaller microservices, the more visibility you need into what’s going on in the background. Otherwise, a lot of time and effort needs to be spent in debugging problems.

One of the popular ways to improve visibility is by using centralized logging.

Centralized Logging Using Log Streams

Using log streams is one way to implement centralized logging. The common approach is to stream microservice logs to a common queue. A distributed logging server listens to the queue and acts as the log store, providing search capabilities over the aggregated logs.
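The flow just described can be sketched in Java using an in-process queue as a stand-in for the message broker. In production the queue would be something like Kafka and the store something like Elasticsearch; the service names and messages here are made up:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch of centralized logging via a log stream: every microservice pushes
// log lines onto a shared queue, and a logging server drains the queue into
// a searchable store.
public class LogStream {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final List<String> store = new ArrayList<>(); // stand-in for the log store

    // Called by each microservice's log appender.
    public void emit(String service, String message) {
        queue.add(service + ": " + message);
    }

    // Called by the distributed logging server to drain the queue into the store.
    public void drain() {
        List<String> batch = new ArrayList<>();
        queue.drainTo(batch);
        store.addAll(batch);
    }

    // Simple substring search across the aggregated logs.
    public List<String> search(String term) {
        List<String> hits = new ArrayList<>();
        for (String line : store) {
            if (line.contains(term)) hits.add(line);
        }
        return hits;
    }

    public static void main(String[] args) {
        LogStream logs = new LogStream();
        logs.emit("Microservice1", "calling Microservice5");
        logs.emit("Microservice5", "ERROR: database unavailable");
        logs.drain();
        System.out.println(logs.search("ERROR"));
    }
}
```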

Popular Implementations

Some of the popular implementations include:

  • the ELK stack (Elasticsearch, Logstash, and Kibana) for centralized logging.
  • Zipkin, the OpenTracing API, and Jaeger for distributed tracing.


In this article, we had a look at centralized logging. We saw that there is a need for high visibility in microservices architecture. Centralized logging provides visibility for better debugging of problems. Using log streams is one way of implementing centralized logging.


Microservices Architecture: Introduction to Auto Scaling

The Load on Applications Varies

The load on your applications varies depending on the time of day, the day of the month, or the month of the year.

Take, for instance, an online shopping application: it might see very high loads during Thanksgiving, up to 20 times the normal load. However, during major sports events such as the Super Bowl or a FIFA World Cup, traffic could be considerably lower, because everybody is busy watching the event.

How can you setup infrastructure for applications to manage varying loads?

It is quite possible that the infrastructure needs to handle 10x the normal load.

If you have on-premise infrastructure, you need a large infrastructure in place to handle peak load.

During periods with less load, a lot of infrastructure would be sitting idle.

Cloud to the Rescue

That’s where cloud comes into the picture. With cloud, you can request more resources when the load is high and give them back to the cloud when you have less load.

This is called scale out (creating more instances as the load increases) and scale in (reducing instances as the load goes down).

How do you build applications that are cloud enabled, i.e. applications that work well in the cloud?

That’s where a microservices architecture comes into the picture.

Introducing Auto Scaling

Building your application using microservices enables you to increase the number of microservice instances during high load, and reduce them during times with less load.

Consider the following example of a CurrencyConversionService:

Basic Microservice Architecture

The CurrencyConversionService talks to the ForexService. The ForexService is concerned with calculating how many INR can result from 1 USD, or how many INR can result from 1 EUR.

The CurrencyConversionService takes a bag of currencies and amounts and produces the total amount in a currency of your choice. For example, it will tell the total worth in INR of 10 EUR and 25 USD.

The ForexService might also be consumed from a number of other microservices.

Scaling Infrastructure to Match Load

The load on the ForexService might be different from the load on the CurrencyConversionService. You might need to have a different number of instances of the CurrencyConversionService and ForexService. For example, there may be two instances of the CurrencyConversionService, and five instances of the ForexService:

Basic Microservice Architecture

At a later point in time, the load on the CurrencyConversionService could be low, needing just two instances. On the other hand, a much higher load on the ForexService could need 50 instances. The requests coming in from the two instances of CurrencyConversionService are distributed across the 50 instances of the ForexService.

That, in essence, is the requirement for auto scaling — a dynamically changing number of microservice instances, and evenly distributing the load across them.

Implementing Auto Scaling

There are a few important concepts involved in implementing auto scaling. The following sections discuss them in some detail.

Naming Server

Naming servers enable something called location transparency. Every microservice registers with the naming server. Any microservice that needs to talk to another microservice asks the naming server for its location.

Whenever a new instance of CurrencyConversionService or ForexService comes up, it registers with the naming server.

Basic Microservice Architecture Auto Scaling

When CurrencyConversionService wants to talk to ForexService, it asks the naming server for available instances.

Implementing Location Transparency

CurrencyConversionService knows that there are five instances of the ForexService.

How does it distribute the load among all these instances?

That’s where a load balancer comes into the picture.

A popular client-side load-balancing framework is Ribbon.

Basic Microservice Architecture

Let’s look at a diagram to understand what’s happening:

Load balancing framework

As soon as any instance of CurrencyConversionService or ForexService comes up, it registers itself with the naming server. If CCSInstance2 wants to know the URLs of ForexService instances, it talks to the naming server. The naming server responds with a list of all instances of the ForexService (FSInstance1 and FSInstance2) and their corresponding URLs.

The Ribbon load balancer does a round-robin among the ForexService instances to balance out the load among the instances.
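The round-robin strategy just described can be sketched in a few lines of Java. This illustrates the idea only; Ribbon's own implementation also handles dynamic instance lists and health state, and the instance URLs here are made up:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of round-robin client-side load balancing: successive calls cycle
// through the instance URLs returned by the naming server.
public class RoundRobinBalancer {
    private final List<String> instances;
    private final AtomicInteger next = new AtomicInteger(0);

    public RoundRobinBalancer(List<String> instances) {
        this.instances = instances;
    }

    // Pick the next instance in rotation; floorMod keeps the index valid
    // even if the counter eventually overflows.
    public String choose() {
        int index = Math.floorMod(next.getAndIncrement(), instances.size());
        return instances.get(index);
    }

    public static void main(String[] args) {
        RoundRobinBalancer balancer =
            new RoundRobinBalancer(List.of("http://fs1:8000", "http://fs2:8000"));
        System.out.println(balancer.choose()); // http://fs1:8000
        System.out.println(balancer.choose()); // http://fs2:8000
        System.out.println(balancer.choose()); // http://fs1:8000 again
    }
}
```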

Ribbon offers a wide variety of load-balancing algorithms to choose from.

When to Increase and Decrease Microservices Instances

There is one question we did not really talk about.

How do we know when to increase or decrease the number of instances of a microservice?

That is where application monitoring, containerization (using Docker), and container orchestration (using Kubernetes) come into the picture.

Auto scaling microservices

An application needs to be monitored to find out how much load it has. For this, the application has to expose metrics for us to track the load.

You can containerize each microservice using Docker and create an image.

Kubernetes has the capability to manage containers. Kubernetes can be configured to auto scale based on the load. Kubernetes can identify the application instances, monitor their loads, and automatically scale up and down.
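As a concrete sketch of load-based auto scaling, Kubernetes can express this with a HorizontalPodAutoscaler. The deployment name forex-service, the replica bounds, and the 70% CPU target below are illustrative assumptions, not values from the article:

```yaml
# Hypothetical HorizontalPodAutoscaler for a ForexService deployment:
# Kubernetes adds replicas when average CPU utilization exceeds the target
# and removes them when the load drops.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: forex-service
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: forex-service
  minReplicas: 2
  maxReplicas: 50
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```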


In this article, we talked about auto scaling. We looked at important parts of implementing auto scaling — naming server, load balancer, containers (Docker), and container orchestration (Kubernetes).