Why Use Service Discovery
There are two main service discovery patterns: client-side discovery and server-side discovery.
Netflix OSS provides a great example of the client-side discovery pattern. Netflix Eureka is a
service registry. It provides a REST API for managing service-instance registration and for
querying available instances. Netflix Ribbon is an IPC client that works with Eureka to load
balance requests across the available service instances. We will discuss Eureka in more depth
later in this article.
The client-side discovery pattern has a variety of benefits and drawbacks. This pattern is
relatively straightforward and, except for the service registry, there are no other moving parts.
Also, since the client knows about the available service instances, it can make intelligent,
application-specific load-balancing decisions such as using consistent hashing. One significant
drawback of this pattern is that it couples the client with the service registry. You must
implement client-side service discovery logic for each programming language and framework
used by your service clients.
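As a minimal sketch of the kind of load-balancing decision a client can make once it holds the instance list, the code below shows a round-robin picker and a key-based picker. The instance addresses, the RoundRobin class, and the pick_by_key function are hypothetical names used for illustration; they are not part of Eureka or Ribbon. The key-based picker uses rendezvous (highest-random-weight) hashing, a simple technique with the same minimal-remapping property as consistent hashing, rather than a full hash ring.

```python
import hashlib
from typing import List


class RoundRobin:
    """Cycle through a fixed list of instances in order."""

    def __init__(self, instances: List[str]) -> None:
        self._instances = list(instances)
        self._next = 0

    def pick(self) -> str:
        instance = self._instances[self._next % len(self._instances)]
        self._next += 1
        return instance


def pick_by_key(instances: List[str], key: str) -> str:
    """Rendezvous hashing: the instance with the highest score for this key wins."""

    def score(instance: str) -> int:
        # Hash the (instance, key) pair so each key ranks the instances differently.
        digest = hashlib.sha256(f"{instance}|{key}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    return max(instances, key=score)
```

With this approach, removing an instance only remaps the keys that were assigned to that instance; every other key keeps its previous target.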
Now that we have looked at client-side discovery, let’s take a look at server-side discovery.
The client makes a request to a service via a load balancer. The load balancer queries the
service registry and routes each request to an available service instance. As with client-side
discovery, service instances are registered and deregistered with the service registry.
The AWS Elastic Load Balancer (ELB) is an example of a server-side discovery router. An ELB is
commonly used to load balance external traffic from the Internet. However, you can also use an
ELB to load balance traffic that is internal to a virtual private cloud (VPC). A client makes
requests (HTTP or TCP) via the ELB using its DNS name. The ELB load balances the traffic among
a set of registered Elastic Compute Cloud (EC2) instances or EC2 Container Service (ECS)
containers. There isn’t a separate service registry. Instead, EC2 instances and ECS containers are
registered with the ELB itself.
HTTP servers and load balancers such as NGINX Plus and NGINX can also be used as server-side
discovery load balancers. For example, this blog post describes using Consul Template to
dynamically reconfigure NGINX reverse proxying. Consul Template is a tool that periodically
regenerates arbitrary configuration files from configuration data stored in the Consul service
registry. It runs an arbitrary shell command whenever the files change. In the example described
by the blog post, Consul Template generates an nginx.conf file, which configures the reverse
proxying, and then runs a command that tells NGINX to reload the configuration.
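The regenerate-then-reload cycle described above can be sketched roughly as follows. This is not Consul Template's implementation or template syntax; render_upstream and apply_if_changed are hypothetical helpers, the instance list is passed in directly instead of being read from Consul, and the reload action is an injected callable.

```python
from typing import Callable, List


def render_upstream(name: str, instances: List[str]) -> str:
    """Render an NGINX upstream block with one server line per instance."""
    servers = "\n".join(f"    server {address};" for address in sorted(instances))
    return f"upstream {name} {{\n{servers}\n}}\n"


def apply_if_changed(current: str, rendered: str,
                     reload: Callable[[], None]) -> str:
    """Trigger the reload only when the rendered configuration differs."""
    if rendered != current:
        reload()
    return rendered
```

In a real setup the reload callable would write the file and then run a command such as nginx -s reload. Sorting the instances keeps the output stable, so an unchanged instance set does not trigger a spurious reload.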
The server-side discovery pattern has several benefits and drawbacks. One great benefit of this
pattern is that details of discovery are abstracted away from the client. Clients simply make
requests to the load balancer. This eliminates the need to implement discovery logic for each
programming language and framework used by your service clients. Also, as mentioned above,
some deployment environments provide this functionality for free.
This pattern also has some drawbacks, however. Unless the load balancer is provided by the
deployment environment, it is yet another highly available system component that you need to
set up and manage.
However, that information eventually becomes out of date and clients become unable to
discover service instances. Consequently, a service registry consists of a cluster of servers that
use a replication protocol to maintain consistency.
As mentioned earlier, Netflix Eureka is a good example of a service registry. It provides a REST API
for registering and querying service instances. A service instance registers its network location
using a POST request. Every 30 seconds it must refresh its registration using a PUT request. A
registration is removed by either using an HTTP DELETE request or by the instance registration
timing out. As you might expect, a client can retrieve the registered service instances by using
an HTTP GET request.
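The register, refresh, remove, and query lifecycle can be modeled with a small in-memory registry. This is only a sketch of the lease idea, not Eureka's code or API: the LeaseRegistry class, its method names, and the 90-second lease duration are assumptions made for illustration, and the clock is injectable so the expiry behavior can be exercised without waiting.

```python
import time
from typing import Callable, Dict, List


class LeaseRegistry:
    """In-memory registry whose entries expire unless they are renewed."""

    def __init__(self, lease_seconds: float = 90.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._lease_seconds = lease_seconds  # assumed timeout, not from the article
        self._clock = clock
        self._expires_at: Dict[str, float] = {}

    def register(self, address: str) -> None:
        """Plays the role of the POST request: add the instance."""
        self._expires_at[address] = self._clock() + self._lease_seconds

    def renew(self, address: str) -> bool:
        """Plays the role of the PUT refresh; False means the lease is gone."""
        if address not in self.instances():
            return False
        self._expires_at[address] = self._clock() + self._lease_seconds
        return True

    def cancel(self, address: str) -> None:
        """Plays the role of the DELETE request: remove the instance."""
        self._expires_at.pop(address, None)

    def instances(self) -> List[str]:
        """Plays the role of the GET request: list the unexpired instances."""
        now = self._clock()
        self._expires_at = {a: t for a, t in self._expires_at.items() if t > now}
        return sorted(self._expires_at)
```

An instance that stops sending refreshes simply disappears from the result of instances() once its lease runs out, which is how a crashed instance gets removed without an explicit DELETE.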
Netflix achieves high availability by running one or more Eureka servers in each Amazon EC2
availability zone. Each Eureka server runs on an EC2 instance that has an Elastic IP address.
DNS TEXT records are used to store the Eureka cluster configuration, which is a map from
availability zones to a list of the network locations of Eureka servers. When a Eureka server
starts up, it queries DNS to retrieve the Eureka cluster configuration, locates its peers, and
assigns itself an unused Elastic IP address.
Eureka clients – services and service clients – query DNS to discover the network locations of
Eureka servers. Clients prefer to use a Eureka server in the same availability zone. However, if
none is available, the client uses a Eureka server in another availability zone.
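The zone preference described above might look like the following sketch. The choose_server function, the zone names, and the is_up health callback are hypothetical; the real Eureka client reads the zone map from DNS, whereas here it is passed in as a dictionary.

```python
from typing import Callable, Dict, List, Optional


def choose_server(servers_by_zone: Dict[str, List[str]], my_zone: str,
                  is_up: Callable[[str], bool]) -> Optional[str]:
    """Return a reachable server, trying the caller's own zone first."""
    other_zones = sorted(zone for zone in servers_by_zone if zone != my_zone)
    for zone in [my_zone] + other_zones:
        for server in servers_by_zone.get(zone, []):
            if is_up(server):
                return server
    return None  # no server is reachable in any zone
```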
• etcd – A highly available, distributed, consistent, key-value store that is used for shared
configuration and service discovery. Two notable projects that use etcd are Kubernetes
and Cloud Foundry.
• Consul – A tool for discovering and configuring services. It provides an API that allows clients
to register and discover services. Consul can perform health checks to determine service
availability.
• Apache ZooKeeper – A widely used, high-performance coordination service for distributed
applications. Apache ZooKeeper was originally a subproject of Hadoop but is now a top-level
project.
Also, as noted previously, some systems such as Kubernetes, Marathon, and AWS do not have
an explicit service registry. Instead, the service registry is just a built-in part of the
infrastructure.
Now that we have looked at the concept of a service registry, let’s look at how service instances
are registered with the service registry.
A good example of this approach is the Netflix OSS Eureka client. The Eureka client handles all
aspects of service instance registration and deregistration. The Spring Cloud project, which
implements various patterns including service discovery, makes it easy to automatically register
a service instance with Eureka. You simply annotate your Java Configuration class with
an @EnableEurekaClient annotation.
The self-registration pattern has various benefits and drawbacks. One benefit is that it is
relatively simple and doesn’t require any other system components. However, a major
drawback is that it couples the service instances to the service registry. You must implement
the registration code in each programming language and framework used by your services.
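To make the coupling concrete, here is a minimal sketch of self-registration in which the service itself registers on startup and deregisters on shutdown. InMemoryRegistry and self_registered are hypothetical stand-ins for a real registry client such as the Eureka client, and heartbeats are omitted for brevity.

```python
from contextlib import contextmanager
from typing import Iterator, Set


class InMemoryRegistry:
    """Stand-in for a real service registry client."""

    def __init__(self) -> None:
        self.instances: Set[str] = set()

    def register(self, address: str) -> None:
        self.instances.add(address)

    def deregister(self, address: str) -> None:
        self.instances.discard(address)


@contextmanager
def self_registered(registry: InMemoryRegistry, address: str) -> Iterator[None]:
    """Register on entry and always deregister on exit, even after an error."""
    registry.register(address)
    try:
        yield
    finally:
        registry.deregister(address)
```

A service would wrap its main loop in the context manager, for example `with self_registered(registry, "10.0.0.1:8080"): serve()`, where serve is whatever function runs the service. Every service, in every language, needs its own equivalent of this code, which is the drawback noted above.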
The alternative approach, which decouples services from the service registry, is the third-party
registration pattern.
One example of a service registrar is the open source Registrator project. It automatically
registers and deregisters service instances that are deployed as Docker containers.
Registrator supports several service registries, including etcd and Consul.
Another example of a service registrar is Netflix OSS Prana. Primarily intended for services
written in non-JVM languages, it is a sidecar application that runs side by side with a service
instance. Prana registers and deregisters the service instance with Netflix Eureka.
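The core job of a registrar can be sketched as reconciling what is running against what is registered. The code below is an illustration under stated assumptions: it compares two snapshots passed in as sets, and the reconcile and apply_changes names are hypothetical. It does not reflect how Registrator or Prana are implemented.

```python
from typing import Set, Tuple


def reconcile(running: Set[str], registered: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (instances to register, instances to deregister)."""
    to_register = running - registered      # running but not yet in the registry
    to_deregister = registered - running    # in the registry but no longer running
    return to_register, to_deregister


def apply_changes(running: Set[str], registered: Set[str]) -> Set[str]:
    """Return the registry contents after one reconciliation pass."""
    to_register, to_deregister = reconcile(running, registered)
    return (registered | to_register) - to_deregister
```

Because the service instances never call the registry themselves, this logic lives in one place regardless of the languages the services are written in.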
In some deployment environments, the service registrar is a built-in component. The EC2 instances
created by an Auto Scaling group can be automatically registered with an ELB. Kubernetes
services are automatically registered and made available for discovery.
The third-party registration pattern has various benefits and drawbacks. A major benefit is that
services are decoupled from the service registry. You don’t need to implement
service-registration logic for each programming language and framework used by your
developers. Instead, service instance registration is handled in a centralized manner within a
dedicated service.
One drawback of this pattern is that unless it’s built into the deployment environment, it is yet
another highly available system component that you need to set up and manage.