Platform Engineering

Elmira Hasanzadeh

Comparing Istio with Netflix Frameworks for Inter-Microservices Communications

Posted by Elmira Hasanzadeh on 04 September 2018

Architecture, Microservices, tech, api, istio, netflix, service mesh, gcp

Technical advancements in every aspect of the software development lifecycle make it clear that there is more than one solution to any problem. In this article I examine Istio’s service mesh capabilities for addressing issues that developers face while creating microservices, and compare them with the widely adopted Netflix frameworks. Istio takes many of these microservices concerns away from the developer and delegates them to operations, where collective behaviours can be managed better.

Service Meshes and Istio

Distributed architecture offers significant advantages over monolithic and layered architectures; however, it also brings increased complexity, such as maintaining contracts, service registry and discovery, service availability, responsiveness, securing access to services and managing distributed transactions, to name a few. Microservices architecture has gained popularity and evolved over the past decade. Several frameworks and standards have been created to overcome these complexities without compromising its benefits. A well-known example is Netflix, which introduced several frameworks such as Eureka, Hystrix and Ribbon to make microservices-based applications more resilient, maintainable and robust.

However, that isn’t the end of the story. The rise of microservices saw concepts like the service mesh gain rapid popularity, further propelled by the wider adoption of containerisation and container orchestrators. The term service mesh describes an infrastructure layer for handling service-to-service communications through the complex topology of services that comprise cloud native applications. Google, IBM and Lyft launched the Istio service mesh in 2017 under an open source license. Istio addresses the challenges developers and operators face while creating and deploying distributed services.

While Istio provides a wide range of features, in this article I will concentrate only on how microservices developers benefit when Istio relieves them of system-level tasks such as service addressing and timeouts. Delegating these decisions to Istio improves the overall agility, robustness and loose coupling of a microservices application. I will demonstrate the concepts using an example application and by comparing existing solutions from the Netflix frameworks with Istio’s approach.

Example Application and Problem Definition

Let’s assume that we have an “art gallery” application. With this app one can pick a gallery from a list of galleries, browse all the exhibitions currently held in that gallery and view the detailed information of a particular artwork. For the sake of simplicity I’ll explain the rest of this post by assuming that we have only two microservices to implement for this app: the GalleryInfo and ArtworkProfile services. Splitting the gallery app into these two services might not be the best design, but for the sake of simplicity let’s say it is.

GalleryInfo is the service responsible for managing the requests related to galleries, and it exposes three endpoints:

  • Get list of galleries: returns the name and general information of galleries
  • Get list of exhibitions in a gallery: returns the name and opening hours of all the exhibitions in a gallery
  • Get exhibition details: returns details of an exhibition such as name of the displayed artworks and their artists

The ArtworkProfile service is responsible for handling the requests that query a specific artwork or artist, and it exposes two endpoints:

  • Get artwork simple info: returns the (name, artistName) pair of an artwork
  • Get artwork details: returns detailed information about an artwork, including artist details

[Figure: sample microservice design]

Microservices are designed to be fully self-contained, with access only to their own tables and schemas. Microservices principles encourage loose coupling by removing cross-service dependencies and discouraging inter-service request/reply interactions. The resulting architecture becomes less chatty and more resilient. However, achieving zero chat and zero dependency between services is not always possible, and eventually some services need to communicate with others to fulfil their obligations. In this example the “Get exhibition details” endpoint, which is exposed by the GalleryInfo service, has to send artwork Ids to the ArtworkProfile service in order to compose a list of (artwork name, artist name) pairs. This is where we can point out a few challenges that have arisen as a result of microservices and distributed application development:

  • How much do developers need to be involved in service discovery?
  • How should the GalleryInfo service find and invoke the ArtworkProfile service?
  • What if ArtworkProfile is down or takes too long to respond, and how much do developers need to be involved in maintaining service robustness?

Solution

It is essential to abstract developers away from dealing with addresses and URLs when they need to implement inter-microservices communications. The main difference between Istio and existing frameworks is in the level of this abstraction.

Service Discovery

Netflix Eureka is a well-known framework that provides service discovery for microservices. Each microservice registers a unique name for itself in Eureka, which Eureka then uses to identify and locate it. However, the developer needs to get involved to a certain level to be able to use it. Below is a code example of using Eureka in our GalleryInfo service with Spring Boot and RestTemplate:

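package demo.microsrvice.istio;

import com.netflix.appinfo.InstanceInfo;
import com.netflix.discovery.EurekaClient;
import com.netflix.discovery.shared.Application;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.client.RestTemplate;
// imports for ExhibitionDetail and ArtworkInfo omitted

@RestController
public class GalleryInfoController {
    @Autowired
    private RestTemplate restTemplate;

    @Autowired
    private EurekaClient eurekaClient;

    // name under which the ArtworkProfile service is registered in Eureka
    @Value("${service.artwork.serviceId}")
    private String artworkServiceId;

    @RequestMapping(value="/galleries/{galleryId}/exhibitions/{exhibitionId}")
    public ExhibitionDetail getExhibition(@PathVariable int galleryId, @PathVariable int exhibitionId) {

        // find exhibition entity
        // find the artwork Ids (artworkIds) displayed in this exhibition from exhibition-artwork intermediate table

        // look up ArtworkProfile in the Eureka registry and take its first instance
        Application application = eurekaClient.getApplication(artworkServiceId);
        InstanceInfo instanceInfo = application.getInstances().get(0);

        String baseUrl = "http://" + instanceInfo.getIPAddr() + ":"
                + instanceInfo.getPort() + "/artworks/";

        for (int id : artworkIds) {
            String url = baseUrl + id;
            ArtworkInfo artworkInfo = restTemplate.getForObject(url, ArtworkInfo.class);

            // add to the response list
        }

        // build and return the ExhibitionDetail from the response list
    }
}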

First, the developer must add the Eureka client library as a dependency of the project, enable service discovery for both microservices with the @EnableDiscoveryClient annotation, inject the Eureka client into the GalleryInfo service code and resolve the ArtworkProfile service URL using the ArtworkProfile service name. RestTemplate then uses this URL to call the ArtworkProfile service and retrieve the artwork (name, artistName) pairs.
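The lookup above always takes the first registered instance, which fails when the list is empty and never spreads load. In the Netflix stack that job is normally handed to a client-side load balancer such as Ribbon; the sketch below only illustrates the kind of selection logic that ends up on the client side. It is plain Java and does not use the Eureka API: `RegisteredInstance` and `RoundRobinChooser` are hypothetical names, and the addresses are made up for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for one registry entry; this is NOT Eureka's InstanceInfo.
class RegisteredInstance {
    final String ipAddr;
    final int port;

    RegisteredInstance(String ipAddr, int port) {
        this.ipAddr = ipAddr;
        this.port = port;
    }

    String baseUrl() {
        return "http://" + ipAddr + ":" + port;
    }
}

// Picks instances in rotation instead of always taking index 0.
class RoundRobinChooser {
    private final AtomicInteger counter = new AtomicInteger();

    RegisteredInstance choose(List<RegisteredInstance> instances) {
        if (instances.isEmpty()) {
            throw new IllegalStateException("no instances registered");
        }
        // floorMod keeps the index valid even after the counter overflows
        int index = Math.floorMod(counter.getAndIncrement(), instances.size());
        return instances.get(index);
    }

    public static void main(String[] args) {
        // Example addresses only
        List<RegisteredInstance> instances = Arrays.asList(
                new RegisteredInstance("10.0.0.1", 8080),
                new RegisteredInstance("10.0.0.2", 8080));
        RoundRobinChooser chooser = new RoundRobinChooser();
        for (int id = 0; id < 3; id++) {
            System.out.println(chooser.choose(instances).baseUrl() + "/artworks/" + id);
        }
    }
}
```

Even in this reduced form, instance selection and the empty-registry case are decisions the service developer has to code and test.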

Now let’s look at how the GalleryInfo service code changes if we use the Istio service mesh.

@RequestMapping(value="/galleries/{galleryId}/exhibitions/{exhibitionId}")
public ExhibitionDetail getExhibition(@PathVariable int galleryId, @PathVariable int exhibitionId) {

    // find exhibition entity
    // find the artwork Ids (artworkIds) displayed in this exhibition from exhibition-artwork intermediate table

    // artworkServiceName is the name the ArtworkProfile service is deployed under
    String baseUrl = "http://" + artworkServiceName + "/artworks/";

    for (int id : artworkIds) {
        String url = baseUrl + id;
        ArtworkInfo artworkInfo = restTemplate.getForObject(url, ArtworkInfo.class);

        // add to the response list
    }

    // build and return the ExhibitionDetail from the response list
}

As you can see, there is no code dependency on any particular service discovery library, and the microservices refer to each other by name. These names are the same as the names used for deploying the microservices in their deployment configuration YAML files.
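To make the name-based addressing concrete without any framework, here is a minimal sketch that builds the artwork URLs from nothing but a logical service name. The service name `artworkprofile`, the `/artworks/{id}` path layout and the class name `ArtworkUrlBuilder` are assumptions made for this example; in a real mesh the name would be whatever the service is called in its deployment YAML.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds request URIs from a logical service name only.
// No registry lookup, IP address or port handling appears in application code.
class ArtworkUrlBuilder {
    private final String serviceName;

    ArtworkUrlBuilder(String serviceName) {
        this.serviceName = serviceName;
    }

    // Assumed path layout: /artworks/{id}
    URI urlFor(int artworkId) {
        return URI.create("http://" + serviceName + "/artworks/" + artworkId);
    }

    List<URI> urlsFor(List<Integer> artworkIds) {
        List<URI> urls = new ArrayList<>();
        for (int id : artworkIds) {
            urls.add(urlFor(id));
        }
        return urls;
    }

    public static void main(String[] args) {
        // "artworkprofile" is an assumed service name
        ArtworkUrlBuilder builder = new ArtworkUrlBuilder("artworkprofile");
        for (URI uri : builder.urlsFor(Arrays.asList(7, 42))) {
            System.out.println(uri);
        }
    }
}
```

The only thing the service needs to know is a stable name; which instance answers is decided outside the application.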

Service Robustness

Remote process responsiveness is one of the major concerns when designing distributed applications. If this concern is neglected, it will affect the overall performance of the application. As said before, no matter how closely we stick to the principles of microservices architecture, it is inescapable that services will need to communicate with each other. Therefore we need to decide what course of action should be taken if a callee service does not respond in a timely manner or fails to respond at all. Setting a timeout for remote service calls is a common practice, but on its own it can lead down a bad path.

Imagine that service A calls service B and the timeout value for this call is x seconds. If service B is down or unresponsive, for example because of heavy load, then service A needs to wait x seconds each time before it finds out that service B is not going to respond. This is where the “circuit breaker” pattern helps. The name comes from the electrical circuit breaker: when the breaker detects that a service is not responding, it opens, rejecting requests to that service. Once the service becomes responsive again, the breaker closes the circuit, allowing requests through. Depending on the implementation, the service consumer always checks with the circuit breaker first to see whether it is open or closed, so service A in our example doesn’t need to wait x seconds each time only to time out. You can read more about the circuit breaker pattern in Michael Nygard’s book “Release It!”.
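The open and closed behaviour described above can be sketched in a few lines of plain Java. This is a deliberately minimal illustration, not the Hystrix or Istio implementation: the class name `SimpleCircuitBreaker`, the failure threshold and the cool-down period are all assumptions chosen for the example, and the sketch is not thread-safe.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Minimal circuit breaker sketch: opens after a number of consecutive failures
// and lets a trial call through once a cool-down period has elapsed.
class SimpleCircuitBreaker {
    private final int failureThreshold;   // consecutive failures before opening
    private final long openMillis;        // how long calls are rejected once open
    private final LongSupplier clock;     // injected so the timing can be tested

    private int consecutiveFailures = 0;
    private long openedAt = -1;           // -1 means the circuit has not been opened

    SimpleCircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    boolean isOpen() {
        return openedAt >= 0 && clock.getAsLong() - openedAt < openMillis;
    }

    // Runs the remote call unless the circuit is open. Returns the fallback on
    // rejection or failure, so the caller never waits on a known-bad service.
    <T> T call(Supplier<T> remoteCall, T fallback) {
        if (isOpen()) {
            return fallback;              // fail fast, no timeout wait
        }
        try {
            T result = remoteCall.get();
            consecutiveFailures = 0;      // a success closes the circuit again
            openedAt = -1;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                openedAt = clock.getAsLong();   // open (or re-open) the circuit
            }
            return fallback;
        }
    }

    public static void main(String[] args) {
        SimpleCircuitBreaker breaker =
                new SimpleCircuitBreaker(2, 5_000, System::currentTimeMillis);
        Supplier<String> failingCall = () -> {
            throw new IllegalStateException("service B is down");
        };
        for (int i = 0; i < 3; i++) {
            System.out.println(breaker.call(failingCall, "fallback") + ", open=" + breaker.isOpen());
        }
    }
}
```

Once the threshold is reached, further calls return the fallback immediately without touching service B until the cool-down has passed.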

Netflix took the idea to the next level and created Hystrix, an open source fault tolerance library. Let’s have a look at the GalleryInfo code to see how it changes when we leverage Hystrix for fault tolerance.

@RequestMapping(value="/galleries/{galleryId}/exhibitions/{exhibitionId}")
@HystrixCommand(fallbackMethod="defaultMethod")
public ExhibitionDetail getExhibition(@PathVariable int galleryId, @PathVariable int exhibitionId) {

    // find exhibition entity
    // find the artwork Ids (artworkIds) displayed in this exhibition from exhibition-artwork intermediate table

    Application application = eurekaClient.getApplication(artworkServiceId);
    InstanceInfo instanceInfo = application.getInstances().get(0);

    String baseUrl = "http://" + instanceInfo.getIPAddr() + ":"
            + instanceInfo.getPort() + "/artworks/";

    for (int id : artworkIds) {
        String url = baseUrl + id;
        ArtworkInfo artworkInfo = restTemplate.getForObject(url, ArtworkInfo.class);

        // add to the response list
    }

    // build and return the ExhibitionDetail from the response list
}

// Fallback used when the circuit is open or the remote call fails or times out.
// Hystrix expects it to take the same parameters as the command method.
private ExhibitionDetail defaultMethod(int galleryId, int exhibitionId) {
    return new ExhibitionDetail();
}

Again, the developer must enable the circuit breaker for the application with @EnableCircuitBreaker, add the Hystrix library as a project dependency, create the fallback method to be used when the circuit is open or the call to the other service times out, and invoke the remote services within a Hystrix command. More importantly, the developer needs to configure the Hystrix circuit breaker: the timeout, the number of tries, the interval at which to check in with the called service and reassess an open circuit, and so on. Since it is normal for different microservices to have different circuit breaker configuration values, making developers responsible for creating and managing these configurations quickly becomes frustrating.

Now it is Istio’s turn. With Istio, the circuit breaker becomes an operational task that can be applied, changed or even removed on the fly without having to redeploy the services. In addition, the developer doesn’t need to make any changes to the microservices code. The usual exception handling for 5xx responses while communicating with remote services is sufficient. From the caller microservice’s perspective, other services are simply either available or not, and Istio returns a 503 (Service Unavailable) status code immediately if the callee service’s behaviour matches the Istio circuit breaker condition. Istio checks in with this service at certain intervals to find out whether it is behaving well again. If it is, Istio closes the circuit, allowing requests to go through. All of this is done by creating a “DestinationRule” configuration, which the ops team applies to the cluster.
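As a rough illustration of what “the usual exception handling for 5xx” can look like on the caller side, the sketch below maps a response status to either a body or an empty result. It is plain Java with no HTTP client involved; the class name `ArtworkResponseHandler`, the method name and the choice to treat every 5xx the same way are assumptions made for this example.

```java
import java.util.Optional;

// Hypothetical caller-side helper. With the circuit breaker living in the mesh,
// the service only needs ordinary handling of 5xx responses.
class ArtworkResponseHandler {

    // Returns the body for a 2xx response. Any 5xx, including the 503 returned
    // while the circuit is open, yields an empty result instead of an error.
    static Optional<String> bodyIfAvailable(int statusCode, String body) {
        if (statusCode >= 200 && statusCode < 300) {
            return Optional.ofNullable(body);
        }
        if (statusCode >= 500 && statusCode < 600) {
            return Optional.empty();
        }
        throw new IllegalArgumentException("Unhandled status code: " + statusCode);
    }

    public static void main(String[] args) {
        // Placeholder payload, for illustration only
        System.out.println(bodyIfAvailable(200, "{\"name\":\"example\"}").orElse("unavailable"));
        System.out.println(bodyIfAvailable(503, null).orElse("unavailable"));
    }
}
```

The calling code stays the same whether the 503 comes from the callee itself or from the mesh rejecting the request on its behalf.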

Set up Istio on GKE

There is little documentation and there are few tutorials explaining how to set up Istio for workloads running on Google Kubernetes Engine clusters. The integration between Istio and GKE is seamless, and even if you already have your apps up and running on GKE, adding Istio is as easy as injecting the sidecar into the pods. Below are the steps to set up an Istio service mesh on GKE, assuming that the cluster has already been created:

  1. Download the desired release from the Istio releases page

  2. Add istioctl to your PATH

  3. Deploy Istio in your cluster:

       kubectl apply -f your-istio-folder/install/kubernetes/istio-demo.yaml

     Alternatively, you can install the istio-demo-auth.yaml file, which enables TLS authentication between the sidecars.

  4. Deploy your microservices to GKE with the sidecar injected:

       kubectl apply -f <(istioctl kube-inject -f your-app-deployment-file.yaml)

     Alternatively, you can enable automatic injection for a particular namespace, which injects the sidecar into the pods within that namespace automatically during pod creation.

  5. Define the ingress gateway for the system to work. This allows Istio’s load balancer to route requests to the designated service. Inside the downloaded Istio folder there are a few sample gateway.yaml files, which can be modified and used according to your services’ ports and other information.

From this point on, microservices will be able to communicate with each other by name. To enable the circuit-breaking capability, a configuration file like the one below also needs to be applied to the cluster:

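apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  # Kubernetes object names must be lowercase
  name: test-circuit-breaking
spec:
  # name of the service this rule applies to
  host: artworkProfile
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
    # limits on connections and pending requests to the service
    connectionPool:
      tcp:
        maxConnections: 1
      http:
        http1MaxPendingRequests: 1
        maxRequestsPerConnection: 1
    # when to eject a misbehaving endpoint and for how long
    outlierDetection:
      consecutiveErrors: 1
      interval: 1s
      baseEjectionTime: 3m
      maxEjectionPercent: 100
  subsets:
  - name: v1
    labels:
      version: v1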
Istio keeps track of how the different endpoints in its load-balancing pool for a particular cluster are behaving. If it detects abnormal behaviour from an endpoint, it can remove that endpoint from the pool. The outlierDetection section, in combination with connectionPool, sets up a circuit-breaking “detection and action” rule for the artworkProfile service. The configuration says: “if we get one 5xx error in our communication with artworkProfile, we should temporarily remove it from our load-balancing pool for this cluster”.
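The “detection and action” rule can be pictured with a small model in plain Java. This is only a simplified sketch of the idea and not how Istio or its sidecar proxy implements it: it ignores `interval`, `maxEjectionPercent` and any growth of the ejection time, and the class name `OutlierPool` and the endpoint addresses are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified model of outlier detection: an endpoint is ejected from the pool
// after N consecutive 5xx responses and becomes eligible again once the
// ejection time has passed.
class OutlierPool {
    private final List<String> endpoints;
    private final int consecutiveErrors;     // mirrors consecutiveErrors
    private final long baseEjectionMillis;   // mirrors baseEjectionTime
    private final Map<String, Integer> errorCounts = new HashMap<>();
    private final Map<String, Long> ejectedUntil = new HashMap<>();

    OutlierPool(List<String> endpoints, int consecutiveErrors, long baseEjectionMillis) {
        this.endpoints = new ArrayList<>(endpoints);
        this.consecutiveErrors = consecutiveErrors;
        this.baseEjectionMillis = baseEjectionMillis;
    }

    // Records the outcome of one request to an endpoint at time nowMillis.
    void record(String endpoint, int statusCode, long nowMillis) {
        if (statusCode >= 500) {
            int errors = errorCounts.merge(endpoint, 1, Integer::sum);
            if (errors >= consecutiveErrors) {
                ejectedUntil.put(endpoint, nowMillis + baseEjectionMillis);
                errorCounts.put(endpoint, 0);
            }
        } else {
            errorCounts.put(endpoint, 0);    // any non-5xx response resets the streak
        }
    }

    // Endpoints that may currently receive traffic.
    List<String> healthy(long nowMillis) {
        List<String> result = new ArrayList<>();
        for (String endpoint : endpoints) {
            Long until = ejectedUntil.get(endpoint);
            if (until == null || nowMillis >= until) {
                result.add(endpoint);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Example addresses; 1 error and 3 minutes match the values in the rule above
        OutlierPool pool = new OutlierPool(Arrays.asList("10.0.0.1", "10.0.0.2"), 1, 180_000L);
        pool.record("10.0.0.1", 503, 0L);
        System.out.println(pool.healthy(1_000L));
        System.out.println(pool.healthy(180_000L));
    }
}
```

With a threshold of one error, a single 5xx is enough to take an endpoint out of rotation for the full ejection time, which is why such aggressive values are better suited to demonstrations than to production traffic.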

The beauty of using Istio for managing inter-microservices traffic is that you can bake configuration into the CI/CD process where applicable, and the rest can be performed and optimised by the ops team on demand.

Conclusion

Istio supports the service mesh concept by taking care of cross-cutting concerns in your microservices implementation. We looked at two of those concerns, service discovery and circuit breakers, to illustrate how Istio does this without adding extra burden to service developers. Compared with Eureka and Hystrix, Istio’s support for these concerns is configuration-based. If you are already using other frameworks for service discovery and circuit breakers, you can still leverage Istio for monitoring, telemetry and other features without conflicting with your existing solutions.

 
