Deploying Reactive Microservices
Strategies and Tools for Delivering
Resilient Systems
Edward Callahan
The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Deploying Reac‐
tive Microservices, the cover image, and related trade dress are trademarks of
O’Reilly Media, Inc.
While the publisher and the author have used good faith efforts to ensure that the
information and instructions contained in this work are accurate, the publisher and
the author disclaim all responsibility for errors or omissions, including without limi‐
tation responsibility for damages resulting from the use of or reliance on this work.
Use of the information and instructions contained in this work is at your own risk. If
any code samples or other technology this work contains or describes is subject to
open source licenses or the intellectual property rights of others, it is your responsi‐
bility to ensure that your use thereof complies with such licenses and/or rights.
978-1-491-98148-1
[LSI]
Table of Contents
1. Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Every Company Is a Software Company 2
Full-Stack Reactive 4
Deploy with Confidence 5
3. Deploying Reactively. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Getting Started 24
Developer Sandbox Setup 25
Clone the Example 26
Deploying Lagom Chirper 26
Reactive Service Orchestration 28
Elasticity and Scalability 29
Process Resilience 30
Rolling Upgrade 31
Dynamic Proxying 33
Service Locator 35
Consolidated Logging 39
Network Partition Resilience 41
4. Conclusion. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
CHAPTER 1
Introduction
This report aims to demonstrate not only that you should apply the Reactive patterns to your operational platforms as well as to your applications, but that in doing so you can enable teams to deliver software with precision and confidence. It is critical that these tools
be dependable, but it is equally important that they also be enjoyable
to work with in order to enable adoption by both developers and
operations. The deployment toolset must be a reliable engine, for it
is at the heart of iterative software delivery.
This report deploys the Chirper Lagom sample application using the
Lightbend Enterprise Suite. The Lightbend Enterprise Suite provides
advanced, out-of-the-box tools to help you build, manage, and
monitor microservices. These tools are themselves Reactive applica‐
tions. They were designed and developed using the very Reactive
traits and principles examined in this series. Collectively, this series
describes how organizations design, build, deploy, and manage soft‐
ware at scale in the data-fueled race of today’s marketplace with agil‐
ity and confidence using Reactive microservices.
Orchestration and serverless platforms such as Kubernetes, Mesosphere DC/OS, IBM OpenWhisk, and Amazon Web Services’ Lambda are seeing broad adoption within enterprises today.
Operations departments within organizations are increasingly becoming resource providers that provision and monitor computing resources and services of various forms. Their focus is shift‐
ing to the security, reliability, resilience, and efficient use of the
resources consumed by the organization. Those resources them‐
selves are configured by software and delivered as services using
very little or no human effort.
Having been tasked to satisfy many diverse needs and concerns, operations departments realize that they must modernize, but are
understandably hesitant to commit to an early leader. Consider the
serverless, event-driven, Function as a Service platforms that are
gaining popularity for their simplicity. Like the batch schedulers
before them, many of these systems will prove too limited for system
and service use cases which require a richer set of interfaces for
managing long-running components and state. Operations teams
must also consider the amount of vendor lock-in introduced in the
vendor-specific formats and processes. Should the organizations not
yet fully trust cloud services, they may require an on-premise con‐
tainer management solution. Building one’s own solution, however,
has another version of lock-in: owning that solution. These conflict‐
ing interests alone can make finding a suitable system challenging
for any organization.
At the same time, developers are increasingly becoming responsible
for the overall success of applications in deployment. “It works for
us” is no longer an acceptable response to problem reports. Devel‐
opment teams need to design, develop, and test in an environment
similar to production from the beginning. Multi-instance testing in a clustered environment is not a task performed just prior to shipping; it is how services are built and tested. Testing with three or more instances
must be performed during development, as that approach is much
more likely to detect problems in distributed systems than testing
only with single instances.
Once confronted with the operational tooling generally available, developers are frustrated and dismayed. Integration often imposes a cumbersome burden on the development process. Developers don’t want to
spend a lot of time setting up and running test environments. If something is too difficult to test and that test is not automated, the test is unlikely to be run at all.
Full-Stack Reactive
Reactive microservices must be deployed to a Reactive service
orchestration layer in order to be highly available. The Reactive
principles, as defined by the Reactive Manifesto, are the very foun‐
dation of this Reactive microservices series. In Reactive Microservi‐
ces Architecture, Jonas Bonér explains why principles such as acting autonomously, Asynchronous Message-Passing, and patterns like shared-nothing architecture are requirements for computing today.
Without the decoupling these provide, it is impossible to reach the
level of compartmentalization and containment needed for isolation
and resilience.
Just as a high-rise tower depends upon its foundation for stability,
Reactive microservices must be deployed to a Reactive deployment
system so that organizations building these microservices can get
the most out of them. You would seriously question the architect
who suggests building your new high-rise tower on an existing
foundation, as is. It may have been fine for the smaller structure, but
it is unlikely to be able to meet the weight, electrical, water, and
safety requirements of the new, taller structure. Likewise, you want
to use the best, purpose-built foundation when deploying your
Reactive microservices.
This report walks through the deployment of a sample Reactive
microservices-based application using the Developer Sandbox from
Lightbend Enterprise Suite, Lightbend’s offering for organizations
building, managing, and monitoring Reactive microservices. The
example application is built using Lagom, a framework that helps
Java and Scala developers easily follow the described requirements
for building distributed, Reactive systems.
ment. Why? As stated in the readme: “Even if you are confident that
your architecture can tolerate a system failure, are you sure it will
still be able to next week, how about next month?”
Persistent data storage is required in any application that handles
business transactions. It is also more complicated than working with
stateless services. Here as well, our Reactive principles help simplify
the solution. Event sourcing and CQRS isolate backend data storage
and streaming to engines like Apache Cassandra and Apache Kafka.
Their durable storage needs are likewise isolated. This can be done
using roles to direct those services to designated nodes, or by using a
specialized service cluster to provide the storage engine “as a ser‐
vice.” If using specialized nodes, those nodes and the services they
execute can have a different life cycle than that of stateless services.
Shards need time to synchronize, volumes need to be mounted, and
caches populated. Cluster roles enable application configuration to
specify the roles required of a node that is to execute the service.
Specialized clusters make persistence issues the concern of the ser‐
vice provider. That could be Amazon Kinesis or an in-house Cas‐
sandra team providing the organization with Cassandra as a service.
The storage-as-a-service approach offers the benefit that the many details of persistence become the provider’s problem.
Tomorrow’s upgrades require semantic versioning today for the smooth management of compatibility. Incompatible, major-version
upgrades use just-in-time record migration patterns instead of big
bang style, all-in-one migrations. Minor version, compatible
upgrades are rolled in as usual. Applications must be able to express
compatibility using system and version number declarations. Simple
string version tags lack the semantics needed to automatically deter‐
mine compatibility, limiting autonomy of the cluster services. Dur‐
ing an upgrade, API gateways and other anti-corruption layers can
operate with both service versions simultaneously during the transi‐
tion. This enables you to better control the migration to the new
version. Schema-incompatible upgrades can be further controlled
with schema upgrade-only releases or by using new keyspaces.
Either approach can be used to ensure there is always a rollback path
should the upgrade fail.
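As a concrete illustration of the system and version declarations described above, a ConductR bundle carries this information in its bundle configuration, alongside the roles discussed earlier. The following fragment is a sketch only; the values and the roles entry are illustrative rather than taken from Chirper:

name                 = "friend-impl"
system               = "chirper"
systemVersion        = "1"
compatibilityVersion = "1"
roles                = ["web"]

Because the compatibilityVersion is a declared, machine-readable value rather than a free-form tag, the cluster can determine for itself whether two bundle versions may coexist during an upgrade.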
The Reactive deployment uses the Reactive principles to embrace
failure and to be resilient to failure. With a fully Reactive stack
deployment, you enable confidence. Immutability provides the abil‐
ity to roll back to known good states. Confidence and usability
Distributed by Design
First and foremost, your deployment platform must be a Reactive
one. A highly available application should be deployed to a resilient
deployment platform if it is itself to be highly available. Systems are either designed well for distributed operation or are forever struggling to work around the realities of distribution. (In the physical
world, the speed of light is the speed limit. It doesn’t matter what
type of cable you run between data centers, the longer the cable
between the two ends, the longer it takes to send any message across
the cable.)
The implications of your services failing and not being available are
wide reaching. System outages and other software application–
caused disruptions are part of daily news cycles. On the other end of
the spectrum, consider the user experience when using old, slow,
and other aged systems. Like a blocking writer in data stream pro‐
cessing, you immediately notice the impact. If you need to make
multiple updates into a system that requires you to perform one
change at a time, you may reconsider how many changes you really
need. If the system further encumbers you with wait periods, refus‐
ing to input your next update until all writers have synchronized,
making many changes quickly becomes an exercise in patience.
Even if you discount these as inconveniences to be tolerated, you
cannot deny their impact on productivity. The experience is boring,
if not outright demotivating. If allowed, you become more likely to
accept “good enough” solely to avoid another agonizing experience
of applying those updates. You avoid interacting with the system.
Distributed system operation is difficult. In describing the architec‐
ture of Amazon’s Elastic Container Service (ECS), Werner Vogels
notes the use of a “Paxos-based transactional journal data store” to
provide reliable state management. Docker Engine, when in swarm
mode, uses a Raft Consensus Algorithm to manage cluster state.
Neither algorithm is known for its simplicity. The designers felt
these components were required to meet the challenges of dis‐
tributed operation.
The Lightbend Enterprise Suite’s Reactive Service Orchestration fea‐
ture is a masterless system. Conflict-free replicated data types, or
CRDTs, are used for reliably propagating state within the cluster,
even in the face of network failure. Everything from available agent
nodes to the location of service instance executions is shared across
all members using these CRDTs. This available/partition tolerance–
based eventual consistency enables coordination of data changes in
a scalable and resilient fashion.
Developer Friendly
A deployment control system should enable teams to push new
ideas out quickly and easily. It should nurture creativity, not inhibit
it. The greater a team’s velocity, the faster the team can realize its
vision. It should be straightforward to package, deploy, and run a
service in the cluster environment, be it locally or in the cloud. A
deployment system must be built from the ground up with the developer’s needs in mind.
Developer friendly means allowing developers to focus on the busi‐
ness end of the application instead of on how to build packages, find
Ease of testing
It must be simple for developers to test locally in an environment
that is highly consistent with production. Testing is fundamental to
the deployment process. Users must be able to test at all stages in an
appropriate production-like environment and do so easily. Hosted
services and other black-box systems can be difficult to mock in
development and generally require full duplicate deployments for
the most basic of integration testing.
For developers, particularly those accustomed to using language
platforms that do not provide dependency management, Docker
makes it simple to quickly test changes in the containerized environ‐
ment. Consider a typical single-service application that can be run
in place out of the source tree for development run and test such as
a common blog or wiki app. Setting up the host environment for
testing changes can require more effort than the changes them‐
selves. Virtual Machines (VMs) help, but are big, heavyweight
Continuous Delivery
A workflow-driven Continuous Delivery (CD) pipeline from devel‐
opment to production staging is a foundational part of any software
project. A reliable, easy-to-use CD pipeline is not only an important
stabilizer to the project, it is key to enabling innovative iteration.
After developers submit their PRs, CI will test the proposed revision. CI also uses the developer sandbox version of the cluster to test
the changes. Once accepted and merged, the update is deployed.
This will be as staging or test instances to the production cluster, or
sometimes to a dedicated test cluster with a test framework such as
Gatling.io running against it to validate performance under load.
For most teams this means that every time there is a new head revision of the release branch, it is delivered to a cluster in a pre-production environment.
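To make this concrete, a CI job targeting the Developer Sandbox might run steps along the following lines. This is a sketch only: the ConductR version is a placeholder, and your pipeline will differ.

# run the unit and service tests
sbt test
# start a local, production-like ConductR sandbox (version is a placeholder)
sandbox run <CONDUCTR_VERSION>
# build, load, and run the bundles in the sandbox
sbt install
# verify that every bundle reports a running instance
conduct info
sandbox stop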
Cluster conveniences
You want your teams to focus on addressing business needs, not
managing cluster membership, security, service lookup, and many
other moving parts. You will want libraries to provide helper functions and types for dealing with the common tasks in your primary languages, with REST and environment variables for the other needs. Good library and tool support may seem like conveniences
for lazy developers, but in reality they are optimizations that keep
the cluster concerns out of your services so your teams can focus on
their services.
Service Discovery, introduced in Reactive Microservices Architec‐
ture, is an essential part of a microservices-based platform. Eventu‐
ally consistent, peer gossip-based service registries are used for the
same reason strong consistency is avoided in your application serv‐
ices: because strong consistency comes at a cost and is avoidable in
many scenarios. Library support should include fallbacks for testing
outside of the clustering system. Other interstitial concerns include
mutual service authentication and peer-node discovery. If it is too
difficult to encrypt data streams that should be encrypted, they are more likely to be left unencrypted or, worse, encrypted improperly.
User quotas, or request rate limits, are a key part of keeping services
available by preventing abuse, intended or otherwise. A user-
Composability
You want a descriptive approach that enables you to treat your infra‐
structure as code, and apply the same techniques you apply to appli‐
cation code. You want to be able to pipe the output of one command
to another to create logical units of work. You want composability.
Composability is no accident. It generally requires a well-
implemented domain-driven design. It also requires real-world
usage: teams building solutions, overcoming obstacles, and enhanc‐
ing and fixing the user interfaces. When realized, “composability
enables incremental consumption or progressive discovery of new
concepts, tools and services.” Incremental consumption comple‐
ments the “just the right size” approach to Reactive microservices.
Operations Friendly
Operations teams also enjoy the benefits of the developer-friendly
features I noted. Meaningful application-specific data streams, such
as logging output and scheduling events, benefit all maintainers of
an application. Even accounting only for its service provider and reliability roles, operations has many needs beyond those of the developers.
A fundamental aspect of any deployment is where it will reside, on
which physical resources. Operations must integrate with both the
new and existing infrastructure while enforcing business rules and
best practices. Hybrid cloud solutions seek to augment on-premise
resources with cloud infrastructure. The latency introduced between
on-premise and cloud resources makes it difficult to scale a single
CHAPTER 3
Deploying Reactively
The Reactive Service Orchestration feature of Lightbend Enterprise
Suite is provided by a project called ConductR, which I discussed in
Chapter 2. ConductR also provides a Developer Sandbox for
deploying services into a local production-like environment. The
sandbox is lightweight and simple to run, making it ideal for devel‐
oper and CI validation testing.
Multinode clusters can be created in DC/OS using Universe, or in
Amazon Web Services using CloudFormation and Ansible play‐
books. Visit the getting started guide for additional details and
information.
This chapter assumes that you’re using the Lightbend Enterprise
Suite Developer Sandbox. If you are using a multinode cluster, you
can alternatively use a Windows, macOS, or Linux host to execute
the conduct CLI commands on a remote cluster, such as in the
cloud. Alternatively, you can SSH to one of the cluster nodes; the CLI is installed on all nodes by default. If you are attempting to access an
existing cluster, such as one installed on DC/OS, you may need to
run the CLI from a system with specific access to the cluster, such as
by VPN or SSH tunneling. In such cases you may need to contact
your IT department for details. Follow the instructions for installing the CLI in the ConductR documentation to install the latest CLI release.
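At the time of writing, the CLI is distributed as a Python 3 package. A typical installation looks like the following; consult the documentation if the packaging has since changed:

pip3 install conductr-cli
conduct --help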
Getting Started
The exercises in this chapter utilize Java applications. Please ensure
you have a recent Java 8 JRE available on the system. Both OpenJDK
and Oracle’s Java are widely used. You will need git and either Maven or sbt installed on your system in order to build the application. You will need to be able to execute docker commands in order to run the examples I present, as they utilize Docker images. ConductR’s
command-line tools include resolvers for the DockerHub and Bintray registries. This makes it simple to deploy both public and private images.
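Before building anything, it is worth confirming that the prerequisites are available on your path. A quick check might look like this (use mvn -v instead of the sbt command if you are building with Maven):

java -version
git --version
sbt sbtVersion
docker info
conduct --help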
The examples in this chapter use the HAProxy image from Docker‐
Hub to enable the dynamic proxying feature. ConductR’s dynamic
proxy feature routes advertised endpoints to worker instances for
public ingress. The proxy configuration is dynamically updated as
container state changes in the cluster. All proxying is done by HAP‐
roxy. The Lightbend-provided conductr-haproxy bundle subscribes
to bundle events and dynamically updates the HAProxy configura‐
tion in response. This ensures HAProxy is always current, Reac‐
tively. For more information about loading Docker and other OCI
container images into ConductR, see the project documentation.
I recommend registering at Lightbend.com for the free deployment license for running multiple instances of your services. This
license allows for the use of up to three agent nodes in production,
allowing for the scaling of up to three instances of each service.
Please note that at least three nodes are required to use the network
partition, or Split-Brain Resolution (SBR) feature, of Lightbend
Enterprise Suite. It is not possible to form an initial quorum with fewer than three nodes. Given the near certainty of numerous network
partitions during the deployment of a distributed application, if you
are delivering clustered production applications to the cloud, then
you’ll want to be certain to enable this feature.
SBT:
sbt install
When you deploy Chirper for the first time, it may take a few
minutes since the build tool will need to download all of the project’s
dependencies. On subsequent deploys, these tasks will complete sig‐
nificantly faster as the build tool does not need to re-fetch these files.
Once the artifacts are downloaded, the following tasks will be per‐
formed by the build tool:
• Build a bundle for each of the Activity Stream, Chirp, Friend, and Front-End services.
Once deployed, use the conduct info command to inspect the state
of the deployed services. When you run conduct info, you should
see something similar to this:
$ conduct info
ID NAME VER #REP #STR #RUN
89fe6ec activity-stream-impl v1 1 0 1
73595ec visualizer v2 1 0 1
bdfa43d-e5f3504 conductr-haproxy v2 1 0 1
6ac8c39 load-test-impl v1 1 0 1
9a2acf1-44e4d55 front-end v1 1 0 1
3349b6b eslite v1 1 0 1
01dd0af friend-impl v1 1 0 1
d842342 chirp-impl v1 1 0 1
1acac1d cassandra v3 1 0 1
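With every bundle reporting one running instance, the Chirper front end should already be reachable through the proxy at the sandbox address used throughout this chapter. A quick check from the command line:

curl -I http://192.168.10.1:9000/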
SBT:
sbt generateInstallationScript && cat target/install.sh
Output similar to the following will be displayed. The CLI tool will
wait for Activity Stream to be scaled to 2 instances:
$ conduct run activity-stream-impl --scale 2
Bundle run request sent.
Bundle 39f36b39adcd108 waiting to reach expected scale 2
Bundle 39f36b39adcd108 has scale 1, expected 2
Bundle 39f36b39adcd108 expected scale 2 is met
Stop bundle with: conduct stop 39f36b3
Print ConductR info with: conduct info
When services are scaled up, the declared resource profile will be
used to determine where the service instance will run. An agent
node that has the requested resources available, as defined in units
of CPU, memory, and disk space, and is not already executing the
same bundle will be selected to start a new instance. The resource
profile is declared as part of a bundle configuration. Bundle configu‐
ration includes roles for grouping workload with resources. When
Process Resilience
ConductR monitors the service processes that it has launched and
ensures that the requested number of instances is being satisfied.
When a service process terminates unexpectedly, new instances of
the bundle will be started, seeking to meet the requested scale. If
conditions require multiple instances of a service to be started, Con‐
ductR will ensure the processes are launched sequentially, one at a
time, avoiding unnecessary and often disruptive application cluster
state changes. In the situation where a service instance terminates
due to loss of a node, ConductR will attempt to start the service on
one of the remaining machines where the service is not already run‐
ning until the requested scale is met.
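If you would like to see this behavior firsthand, you can terminate one of the launched processes and watch ConductR restore the requested scale. The sketch below assumes the bundle name appears in the process command line; adjust the pgrep pattern to match your environment before using it:

# terminate one running friend-impl process (pattern is an assumption)
pgrep -f friend-impl | head -n 1 | xargs kill -9
# after a short delay, the #RUN column should return to the requested scale
conduct info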
In the node failure scenario, it’s possible that the number of reques‐
ted instances cannot be met due to an insufficient amount of resour‐
ces being available. In such cases, when a replacement node is
commissioned and joins the cluster, ConductR will automatically
attempt to start the interrupted service until the number of reques‐
ted instances is met again. If a service is loaded with a configuration
bundle, that bundle configuration will always be applied to new
instances of the service by the same bundle identifier.
This resilient behavior relieves operations from the burden of hav‐
ing to restart the services whenever there’s an unexpected termina‐
tion. Should the service interruption be caused by hardware failure,
operations can focus on node provisioning and let the cluster handle
the recovery of the service instances.
Once scaled, the overall state should be similar to the following. The
activity-stream-impl, which is the bundle name of the Activity
Stream service, now has 2 running instances:
$ conduct info
ID NAME VER #REP #STR #RUN
89fe6ec activity-stream-impl v1 1 0 2
73595ec visualizer v2 1 0 1
bdfa43d-e5f3504 conductr-haproxy v2 1 0 1
6ac8c39 load-test-impl v1 1 0 1
9a2acf1-44e4d55 front-end v1 1 0 1
3349b6b eslite v1 1 0 1
01dd0af friend-impl v1 1 0 1
d842342 chirp-impl v1 1 0 1
1acac1d cassandra v3 1 0 1
You can scale Activity Stream back down to 1 instance if you wish:
conduct run activity-stream-impl --scale 1
Rolling Upgrade
Next, you will perform a rolling upgrade of the Friend service. Roll‐
ing upgrades on ConductR are relatively straightforward. The new
version has a new bundle identifier, making the new and old ver‐
sions distinct. A new version of a service can be deployed and run
alongside an existing version. Should the new version expose the
same endpoint as the existing version, while running alongside each
Rolling Upgrade | 31
other the traffic from the proxy will be delivered to both new and
existing versions in round-robin fashion.
There is an instance of the Friend service already running; look for the bundle with friend-impl as its NAME. Note the ID of the friend-impl service, which has the value 01dd0af. This identifier is referred to as the bundle ID.
$ conduct info
ID NAME VER #REP #STR #RUN
89fe6ec activity-stream-impl v1 1 0 1
73595ec visualizer v2 1 0 1
bdfa43d-e5f3504 conductr-haproxy v2 1 0 1
6ac8c39 load-test-impl v1 1 0 1
9a2acf1-44e4d55 front-end v1 1 0 1
3349b6b eslite v1 1 0 1
01dd0af friend-impl v1 1 0 1
d842342 chirp-impl v1 1 0 1
1acac1d cassandra v3 1 0 1
First, build a new version of the Friend service. Even if you don’t
make code changes, a clean build will produce a new bundle which
you can use to perform a rolling upgrade.
Maven:
mvn clean package docker:build && \
docker save chirper/friend-impl | bndl --no-default-check \
--endpoint friend --endpoint akka-remote | conduct load
SBT:
sbt friend-impl/clean friend-impl/bundle:dist && \
conduct load -q $(find friend-impl/target -iname \
"friend-impl-*.zip" | head -n 1) | xargs conduct run
You now have a new instance of the Friend service running along‐
side the original one. The original Friend service has 01dd0af as its
bundle ID, while the new one has 87375e6.
$ conduct info
ID NAME VER #REP #STR #RUN
89fe6ec activity-stream-impl v1 1 0 1
73595ec visualizer v2 1 0 1
bdfa43d-e5f3504 conductr-haproxy v2 1 0 1
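Once the new version has been verified, complete the rolling upgrade by stopping, and optionally unloading, the old bundle. Using the bundle IDs from this example:

conduct stop 01dd0af
conduct unload 01dd0af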
Dynamic Proxying
Lightbend Enterprise Suite’s Dynamic Proxying feature provides
location transparency of your services to their clients. The caller of
your services only needs to know the address of the proxy in order
to access the services. Service location information is updated Reac‐
tively, absolving callers from needing to be aware of changes in ser‐
vice instance locations, such as those due to scaling changes or
rolling upgrades.
ConductR allows services to expose their endpoints to be accessible
via the proxy. The proxies are used for ingress traffic to the applica‐
tion services. The proxy configuration will be updated as services
are scaled up or down, ensuring that the requests being made to
these services via the proxy will be routed to an available instance.
Let’s scale the Lagom Chirper Activity Stream up and down, and
observe the automatic changes made to the proxy configuration. In
the Developer Sandbox, HAProxy is started as a Docker image. To
view the HAProxy configuration, execute the following command:
docker exec -ti sandbox-haproxy \
cat /usr/local/etc/haproxy/haproxy.cfg
Dynamic Proxying | 33
You should see a HAProxy frontend configuration similar to the
following:
frontend default-http-frontend
mode http
bind 0.0.0.0:9000
acl 109543-chirpsvc-acl-0 path_beg /api/chirps/live
use_backend 109543-chirpsvc-backend-0 if 109543-chirpsvc-acl-0
acl 109543-chirpsvc-acl-1 path_beg /api/chirps/history
use_backend 109543-chirpsvc-back-1 if 109543-chirpsvc-acl-1
acl f13f7a-activitysvc-acl-0 path_beg /api/activity
use_backend f13f7a-activsvc-back-0 if f13f7a-activitysvc-acl-0
acl 29006c-friendsvc-acl-0 path_beg /api/users
use_backend 29006c-friendsvc-back-0 if 29006c-friendsvc-acl-0
acl 64c310-44e4d5-web-acl-0 path_beg /
use_backend 64c310-44e4d5-web-back-0 if 64c310-44e4d5-web-acl-0
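To observe the configuration being maintained Reactively, scale the Activity Stream service and then inspect the file again; the backend entries for the activity endpoint will be updated to match the running instances:

conduct run activity-stream-impl --scale 2
docker exec -ti sandbox-haproxy \
  cat /usr/local/etc/haproxy/haproxy.cfg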
Service Locator
Lightbend Enterprise Suite’s Service Discovery feature allows a caller
of a service to look up the service’s address, thus allowing the request
to be made to the correct address. This pattern is particularly impor‐
tant in the deployment of microservice-based applications, as serv‐
ices are expected to be scaled up, down, and relocated for various
reasons. The list of addresses for all services running within the sys‐
tem needs to be maintained and kept current. When using an
orchestration product that comes with a built-in service registry
such as ConductR, our services can utilize the Service Discovery fea‐
ture without adding the complexity of additional daemons to our
cluster.
Let’s test the Service Discovery feature by attempting to look up a
particular user from the Friend service within the Lagom Chirper
example. First, you need to register a user. Visit the Front-End
Service Locator | 35
address at http://192.168.10.1:9000/, and click on the “Sign Up” but‐
ton to view the registration page. Enter joe for both the Username
and Name, and click the Submit button. The user joe can now be
looked up from the Friend service through the proxy URL:
curl http://192.168.10.1:9000/api/users/joe
Next, try to look up joe from the Friend service through the service
locator instead. To do that, you will need to look up the Friend API
from the service locator. Issue the following command to see the
endpoints that can be looked up via the service locator:
conduct service-names
The BUNDLE NAME for the Friend service is friend-impl, and it exposes an endpoint called friendservice. ConductR exposes its
service locator on port 9008, and in the Developer Sandbox the ser‐
vice locator is accessible on http://192.168.10.1:9008.
Find the addresses for friendservice by executing the following
command:
curl -v http://192.168.10.1:9008/service-hosts/friendservice
You should see a JSON list containing the host address and bind
port of the friendservice, similar to the following:
["192.168.10.2:10785"]
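The service locator also provides an HTTP redirection service, so the same lookup can be performed through it directly. The request below is examined in detail next:

curl -L http://192.168.10.1:9008/services/friendservice/api/users/joe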
Let’s examine the HTTP request in detail. The URL of the request is http://192.168.10.1:9008/services/friendservice/api/users/joe, and it comprises the following parts. http://192.168.10.1:9008/services is the base URL of the HTTP redirection service provided by the service locator. The next part of the URL is friendservice, which is
the name of the endpoint you would like to be redirected to. The
remaining part of the URL, /api/users/joe, forms the actual redirect
URL to the friendservice endpoint. You can view the request and
response by executing the curl command and passing the verbose
switch, -v:
curl http://192.168.10.1:9008/services/friendservice/api/users/joe \
-v -L
Service Locator | 37
< Server: akka-http/10.0.0
< Date: Tue, 13 June 2017 06:17:37 GMT
< Content-Type: text/plain; charset=UTF-8
< Content-Length: 50
<
* Ignoring the response-body
* Connection #0 to host 192.168.10.1 left intact
* Issue another request to this URL:
* 'http://192.168.10.2:10785/api/users/joe'
* Trying 192.168.10.2...
* Connected to 192.168.10.2 (192.168.10.2) port 10785 (#1)
> GET /api/users/joe HTTP/1.1
> Host: 192.168.10.2:10785
> User-Agent: curl/7.43.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Length: 42
< Content-Type: application/json; charset=utf-8
< Date: Tue, 13 June 2017 06:17:37 GMT
<
* Connection #1 to host 192.168.10.2 left intact
{"userId":"joe","name":"Joe","friends":[]}
Consolidated Logging
Reviewing application log files is part of regular support activities.
With applications built using microservices, the number of log files
to be inspected can grow significantly. The effort to inspect and
trace these log files grows tremendously when each log file is located
on separate machines.
ConductR provides an out-of-the-box solution to collect and consolidate the logs generated by applications deployed and launched through ConductR itself. Once consolidated, the logs
can be viewed using the conduct logs command. Let’s view the log
from the visualizer bundle by running the following command:
conduct logs visualizer
You should see the log entries from the visualizer application, simi‐
lar to the following:
$ conduct logs visualizer
TIME HOST LOG
14:05:34 les1 [info] play.api.Play - Application started (Prod)
14:05:34 les1 [info] application - Signalled start to ConductR
14:05:34 les1 [info] Listening for HTTP on /192.168.10.2:10609
Next, scale the visualizer bundle to three instances:
conduct run visualizer --scale 3
Once scaled to 3 instances, you can view the logs consolidated from
all the visualizer instances by executing the following command:
conduct logs visualizer
You should see something similar to the output below. The log
entries are consolidated from all three instances of visualizer run‐
ning on 192.168.10.1, 192.168.10.2, and 192.168.10.3:
$ conduct logs visualizer
TIME HOST LOG
14:05:34 les1 [info] application - Signalled start to ConductR
14:05:34 les1 [info] Listening for HTTP on /192.168.10.2:10609
14:16:31 les1 [info] play.api.Play - Application started (Prod)
14:16:31 les1 [info] application - Signalled start to ConductR
14:16:31 les1 [info] Listening for HTTP on /192.168.10.3:10166
14:16:35 les1 [info] play.api.Play - Application started (Prod)
14:16:35 les1 [info] application - Signalled start to ConductR
14:16:35 les1 [info] Listening for HTTP on /192.168.10.1:10822
When using RSYSLOG, apart from directing the logs into the RSY‐
SLOG logging service, the logs can be sent to any log aggregator that
speaks the syslog protocol, such as Humio.
SBT:
sbt install
After the scale request completes, the state should look similar to the
following:
$ conduct info
ID NAME VER #REP #STR #RUN
acc2d2b friend-impl v1 3 0 1
bdfa43d-e5f3504 conductr-haproxy v2 3 0 1
The output you see should be similar to the following (i.e., there
should be three core nodes running):
$ conduct members
UID ADDRESS STATUS REACHABLE
-1775534087 conductr@192.168.10.1 Up Yes
-56170110 conductr@192.168.10.2 Up Yes
-322524621 conductr@192.168.10.3 Up Yes
Given that core and agent instances are bound to addresses that are
address aliases for a loopback interface, the simplest way to simulate
a network partition is to pause the core and agent instances. When
the signal SIGSTOP is issued to both core and agent instances, they
will be paused and effectively frozen in execution. From the per‐
spective of the other core and agent nodes, the frozen core and agent
nodes have become unreachable, effectively simulating a network
partition from their point of view.
In order to demonstrate ConductR’s self-healing capability for net‐
work partitions, let’s simulate a network partition. To do this, first
pause the core and agent instances on 192.168.10.3 by executing the
following shell command:
pgrep -f "conductr.ip=192.168.10.3" | xargs kill -s SIGSTOP
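You can then watch the cluster’s view of its agents. The listing that follows shows the agent observations; in recent CLI releases this output is produced by the conduct agents command (the command name is an assumption based on the output shown):

conduct agents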
After at least one minute, you will see that the agent on
192.168.10.3 can no longer be observed by any remaining mem‐
ber:
ADDRESS OBSERVED BY
conductr-agent@192.168.10.1/client#165917 conductr@192.168.10.2
conductr-agent@192.168.10.2/client#-96672 conductr@192.168.10.2
conductr-agent@192.168.10.3/client#170693
Once this occurs, issue conduct info to see the state of the cluster. The #REP column indicates that the number of replicated copies of the bundle files has been reduced from 3 to 2 due to the missing core node reported by conduct members. The #RUN column of the front-end has likewise been reduced, as the instance that was running on the unreachable node is no longer counted.
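To end the simulated partition, resume the paused core and agent processes by sending them SIGCONT, mirroring the pause command used earlier:

pgrep -f "conductr.ip=192.168.10.3" | xargs kill -s SIGCONT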
When you do this, the core and agent instance on 192.168.10.3 will
realize that they have been split from the cluster, and will automati‐
cally restart.
Eventually, the conduct members command will indicate that a new
core instance on 192.168.10.3 has rejoined the cluster. Below, the
new core instance is indicated by the new UID value of -761520616,
while the previous core instance had a value of -322524621. Note
that you will observe different UID values on your screen than what I have shown here:
UID ADDRESS STATUS REACHABLE
-1775534087 conductr@192.168.10.1 Up Yes
-56170110 conductr@192.168.10.2 Up Yes
-761520616 conductr@192.168.10.3 Up Yes
CHAPTER 4
Conclusion
Open standards, such as those for the container image and runtime, continue to aid in mitigating vendor lock-in. Reactive microservices are best delivered using Reactive
deployment tools that are both operations and developer friendly.
Also in this report you deployed a Reactive application to a Reactive
delivery and deployment platform. You deployed the sample appli‐
cation Chirper using Lightbend Enterprise Suite. I encourage you to
continue experimenting with and exploring the deployment. In this
report you induced some failures so that you could observe the resil‐
ience and self-healing features, firsthand. Many additional failure
scenarios exist, and you’re welcome to test other use cases. Be certain to see the project repository on GitHub for other test cases.
Not that many years ago, the smallest of development test clusters
required a four-post server rack to hold all the parts. Today, presenters need only bring a small box with several Raspberry Pi boards and some switches to live-demonstrate a clustered solu‐
tion. In Chapter 3 we ran multiple instances of an application ser‐
vice using Lightbend Enterprise Suite in a production-like cluster
environment. We tested the cluster by inducing failures, including a network partition, so that you could be assured of its resilience and observe its self-healing.
Efforts continue to further simplify the task of delivering scalable
and resilient services with agility. Delta State Replicated Data Types,
for example, reduce the amount of state that needs handling when
performing updates across the clustered CRDT. We are likely to see
new ways of testing emerge as it becomes easier to define and
restore not only the collection of container services that compose an
application, revision information, and so on, but also the state of the
persistent actors in the running services. It is conceivable that, like
algorithmic traders testing new trading strategies against replays of
market data, we might apply Lineage-Driven Fault Injection and
machine learning to the events, commands, and facts from our own
systems to train them to be more resilient. Intelligent auto-scaling
utilizing self-tuning and predictive analysis will not be far behind.
Throughout this three-part series of reports on Reactive microservi‐
ces, we’ve seen how the Reactive principles are represented in the
design, development, and deployment of microservices. As data and data-driven software become essential to the success of organizations, those organizations are adopting the best practices of software development. It is the innovation, hardening, and re-architecting of over 40
years of research and real-world usage that bring us Reactive micro‐
services. We know that failures will happen, so you must embrace
them by considering them up front. When you do, you view deploy‐
ment in a whole new light. Instead of a weight that must be carried,
deployment can become the exciting delivery of the new, faster, and
better versions of your software to your customers and subscribers.
Flexible and composable, your deployment platform becomes a highly effective weapon in the fast-moving, high-demand markets that organizations compete in today. We hope that this series
has helped you better understand the critical importance of using a
Reactive deployment system when delivering Reactive applications.
This concludes the Reactive microservices series. It has been our sin‐
cere pleasure to introduce you to Reactive microservices and the
joys of a fully Reactive deployment. We hope that the Reactive
strategies and tools presented here are just the beginning of your
Reactive journey. Happy travels!
About the Author
Edward Callahan is a senior engineer at Lightbend. Ed started
delivering Java and JVM services into production environments
back when NoSQL databases were called object databases. At Light‐
bend he developed and deployed early versions of Reactive Micro‐
services using Scala, Akka, and Play with prerelease versions of
Docker, CI jobs, and shell scripts. Those “Sherpa” services went on
to become the first production deployment using the Lightbend
Enterprise Suite. He enjoys being able to share the joys of teaching
and learning while working to simplify building and delivering
streaming applications in distributed computing environments.
Acknowledgments
Ed would like to especially thank Christopher Hunt, Markus Jura,
Felix Satyaputra, and Jason Longshore for their contributions to this
report. This publication was a team effort and would not have been
possible without their contributions.