
Unit-5

AWS Essentials Review and System Design

High availability:======
When architecting a system, how often do you start with availability requirements or a service level agreement (SLA)? Most of us don’t. And most of us shouldn’t, because trying to build high availability into a system early on may slow things down and lead to over-engineering. Yet, we should consider the eventual desired level of availability and ensure that we can grow our system into it when the time comes.

In this article, I will use simple terms and a simple web application stack to discuss high availability, how it is calculated, and how it can be practically achieved. I will also briefly touch on scalability and how a similar approach can achieve higher availability and greater scalability for our simple web stack.

Start Simple

When we start a new system, it makes sense to keep things simple. At the early stage of any product, you don’t really know whether it is even going to fly. Speed of iteration and the ability to respond quickly to feedback are the most important attributes of your system.

AWS provides you with a variety of high-level services that manage entire parts of the system for you, and if I were to start today, I would maximize the use of these resources, and thus maximize the time spent doing that special thing that my product is good at. For the purposes of this article, though, let us assume that I am a bit of an old-school guy and I prefer a plain vanilla LAMP stack. In that case, the simplest way for me to start would be by keeping everything on a single box (or a single EC2 instance if I were to do it in the cloud). The main issues with this approach are, of course, scale and availability. You can only support as many customers as your single box can handle, and should this single box fail, you won’t support any at all. And while these may not be your primary concerns in the early stages, as your system matures and your customer base grows, availability and scale will become more important.

Split into Tiers

As the system grows, we often split it into tiers (also known as layers), which sets it on the path to greater scale and availability. Now every tier can be placed on its own box of the appropriate size and cost. We can choose bigger, better boxes with multiple levels of hardware redundancy for our database servers, and cheaper, commodity-grade hardware for our web and application servers.

Let’s say that our business requirements call for 99.5% uptime. In other words, the system is allowed to be down no more than roughly 44 hours in any given 12-month period. The total availability of a system of sequentially connected components is the product of the individual availabilities. Let’s, for example, assume that the individual servers that host our web and application tiers have 90% availability and the server hosting our database has 99%. For simplicity’s sake, let’s also assume that these availability numbers include hardware, OS, software, and connectivity.
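To make this arithmetic concrete, here is a minimal Python sketch (using the illustrative availability figures assumed above, not real measurements) that multiplies the availabilities of serially connected tiers and converts the result into downtime per year:

# Availability of serially connected components is the product
# of their individual availabilities.
web, app, db = 0.90, 0.90, 0.99  # assumed per-tier availabilities

system = web * app * db
print(f"System availability: {system:.1%}")            # ~80.2%
print(f"Downtime per year: {(1 - system) * 8760:.0f} hours")

# For comparison, the 99.5% target allows (1 - 0.995) * 8760 ≈ 44 hours.
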
Introduce Redundancy

At first glance, we are not doing too well: the availability of the whole system (0.90 × 0.90 × 0.99 ≈ 80.2%) is way below our desired uptime, and even below the availability of any single component of the system. However, by splitting the system into tiers and by pushing the state to the database tier, we have set the stage for solving (to some degree) the problems of availability and scale.

The simplest way to increase availability is by adding redundancy, which is much easier to do with the stateless parts of the system. As we add web servers to our web tier, the probability of its total failure decreases, pushing the availability of this tier and of the entire system up: adding even one extra server to the web tier raises the overall availability by about 8%. Unfortunately, adding another web server does not do as much for our overall availability, increasing it only by about 0.8%.
The cost of this tier grows linearly with every server, but availability grows logarithmically. Consequently, we will soon reach the point of diminishing returns, where the value of additional availability will be less than the cost of adding another server.

Thus, we have noticed the first pattern: adding redundancy to a single component or tier leads to a logarithmic increase in availability, eventually reaching the point of diminishing returns, at least from the availability standpoint.
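The diminishing returns are easy to verify numerically: a tier of n redundant servers fails only if all n fail, so its availability is 1 - (1 - a)^n. Continuing the sketch with the same assumed figures:

def tier_availability(a: float, n: int) -> float:
    """A tier of n redundant servers fails only if all n fail."""
    return 1 - (1 - a) ** n

app, db = 0.90, 0.99  # still a single app server and a single database

for n in range(1, 5):
    web = tier_availability(0.90, n)
    print(f"{n} web server(s): system availability {web * app * db:.1%}")

# 1 web server(s): system availability 80.2%
# 2 web server(s): system availability 88.2%   (+8.0 points)
# 3 web server(s): system availability 89.0%   (+0.8 points)
# 4 web server(s): system availability 89.1%   (+0.1 points)
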

Now, this is an oversimplification, of course, and there could be other reasons for adding more servers to your fleet. Over-scaling in order to reduce the impact of a server failure (AKA the “blast radius”) could be one; this is an example of a so-called N+1 deployment. Another reason could be scale (which we will talk about later).

Expand Redundancy to Other Tiers

As we have exhausted our ability to have a significant impact on the availability of our system by adding redundancy to the web tier, we need to look for new opportunities elsewhere. The next most logical place is the other stateless part of the system, the application tier: again, we have gained around 8% after adding a second box, and the incremental availability gains started diminishing quickly after that, eventually also reaching the point of diminishing returns.

At this point, we have noticed a second pattern: the availability of your entire stack cannot exceed that of its least-available component or tier.

It looks inevitable that we have to add redundancy to our data tier as well. And sure enough, doing so solves the problem, at least on paper. In reality, adding redundancy to a stateful component brings with it additional challenges, but that is a topic for another blog.
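Extending the same sketch, making every tier redundant is what finally clears the 99.5% bar, at least on paper (still using the assumed per-server figures, and ignoring the operational challenges of a redundant stateful tier):

def tier_availability(a: float, n: int) -> float:
    return 1 - (1 - a) ** n

web = tier_availability(0.90, 3)   # three web servers  -> 99.9%
app = tier_availability(0.90, 3)   # three app servers  -> 99.9%
db  = tier_availability(0.99, 2)   # two database nodes -> 99.99%

print(f"Fully redundant stack: {web * app * db:.2%}")  # ~99.79%, above 99.5%
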

Data Center Redundancy

Let’s now take this one step further and look at another part of the stack that has been assumed all along but never brought to light: the data center that hosts it. Should it go down due to a power outage, Internet connectivity issues, fire, or a natural disaster, our entire stack will become inaccessible, rendering all the investments of time and money that we made in our multi-node, multi-tiered redundant stack useless.

We can choose to host our stack in a Tier IV data center, which will often cost us a bundle and still may not offer sufficient protection against natural disasters. Or, we can apply the same principle of redundancy to the data center itself and spread our stack across multiple data centers. Wait, what? Multiple data centers? What about latency? What about maintenance overhead?

All these and other questions may cross your mind when you read the last suggestion. And under normal circumstances, you would be correct. In order for this approach to be successful and cost-effective, we need data centers that are:

• Built for high reliability, so that nothing short of a natural disaster brings one down
• Located close to one another and connected via low-latency, high-bandwidth links to ensure low latency between the facilities
• Located far enough from each other that a single natural disaster cannot bring all of them down at the same time

This definition sounds familiar to some of us, doesn’t it? It sounds like an AWS Availability Zone (or AZ for short)! By moving our stack to AWS, we can spread it across multiple data centers just as easily as within a single one, and it will cost us just about as much. If we add more bells and whistles, such as hosting our static assets in Amazon Simple Storage Service (S3), serving our content through a CDN (Amazon CloudFront), and adding the ability to scale both stateless tiers automatically (AWS Auto Scaling), we’ll arrive at the canonical highly available web architecture:

Web application architecture: http://aws.amazon.com/architecture/
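To illustrate the multi-AZ piece of that architecture, here is a hedged boto3 sketch that creates an Auto Scaling group spanning several Availability Zones. The region, zone names, and launch template name are hypothetical placeholders, not values from this article:

import boto3

autoscaling = boto3.client("autoscaling", region_name="us-east-1")

autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="web-tier",
    # "web-server" is a hypothetical, pre-created launch template.
    LaunchTemplate={"LaunchTemplateName": "web-server", "Version": "$Latest"},
    MinSize=2,   # keep at least two instances for redundancy
    MaxSize=6,
    # Spreading instances across AZs is what buys data-center redundancy.
    AvailabilityZones=["us-east-1a", "us-east-1b", "us-east-1c"],
)
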

A Few Words About Scalability

And to top things off, let’s also briefly talk about scalability. By splitting our stack into tiers, we made it possible to increase the scale of each tier independently. When scaling a component of a system, two approaches can be used: getting a bigger box (also known as scaling up, or vertical scaling) or getting more boxes (also known as scaling out, or horizontal scaling).

Vertical scaling may be easier from an operational standpoint, but it has two major limitations: first, bigger boxes tend to get more expensive faster than the additional scale they provide; second, there is a limit to how big a box you can get. Horizontal scaling offers more room to grow and better cost efficiency, but it introduces an additional level of complexity, requiring a higher level of operational maturity and a high degree of automation.

AWS automation and serverless architectures


Event-Driven Scaling:==============
Here's an example of an event-driven architecture for an e-commerce site. This architecture
enables the site to react to changes from a variety of sources during times of peak demand,
without crashing the application or over-provisioning resources.

An event-driven architecture is a software design pattern in which microservices react to changes in state, called events. Events can either carry a state (such as the price of an item or a delivery address) or be identifiers (a notification that an order was received or shipped, for example). The events trigger microservices that work together to achieve a common goal but do not have to know anything about each other except the event format. Although they operate together, each microservice can apply different business logic and emit its own output events.

An event has the following characteristics:

• It is a record of something that has happened.
• It captures an immutable fact that cannot be changed or deleted.
• It occurs whether or not a service applies any logic upon consuming it.
• It can be persisted indefinitely, at a large scale, and consumed as many times as necessary.

In an event-driven system, events are generated by event producers, ingested and filtered by an event router (or broker), and then fanned out to the appropriate event consumers (or sinks). The events are forwarded to subscribers defined by one or more matching triggers. These three components (event producers, event router, and event consumers) are decoupled and can be independently deployed, updated, and scaled.

The event router links the different services and is the medium through which messages are sent and received. It executes a response to the original event generated by an event producer and sends this response downstream to the appropriate consumers. Events are handled asynchronously, and their outcomes are decided when a service reacts to an event or is impacted by it.
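As a concrete sketch of the producer side, the snippet below publishes an order event to Amazon EventBridge (one possible AWS event router) with boto3. The bus name, source, and detail fields are hypothetical examples:

import json
import boto3

events = boto3.client("events")

# Publish an "order placed" event; consumers subscribe through router
# rules and never need to know who produced the event.
events.put_events(
    Entries=[
        {
            "EventBusName": "ecommerce-bus",   # hypothetical bus name
            "Source": "com.example.orders",    # hypothetical source
            "DetailType": "OrderPlaced",
            "Detail": json.dumps({"orderId": "1234", "total": 42.50}),
        }
    ]
)
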
When to use event-driven architectures

Consider the following use cases when designing your system.

• To monitor and receive alerts for any anomalies or changes to storage buckets, database tables, virtual machines, or other resources.
• To fan out a single event to multiple consumers. The event router will push the event to all the appropriate consumers, without you having to write customized code; each service can then process the event in parallel, yet differently (see the rule sketch after this list).
• To provide interoperability between different technology stacks while maintaining the independence of each stack.
• To coordinate systems and teams operating in, and deploying across, different regions and accounts. You can easily reorganize ownership of microservices. There are fewer cross-team dependencies, and you can react more quickly to changes that would otherwise be impeded by barriers to data access.
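For the fan-out case above, here is a hedged boto3 sketch of the router side: an EventBridge rule matches the hypothetical OrderPlaced events from the earlier snippet and pushes them to two independent consumers (the target ARNs are placeholders):

import boto3

events = boto3.client("events")

# Route every OrderPlaced event on the bus to all matching targets.
events.put_rule(
    Name="order-placed",
    EventBusName="ecommerce-bus",
    EventPattern='{"source": ["com.example.orders"], "detail-type": ["OrderPlaced"]}',
)

# Fan the event out; each consumer processes it in parallel, yet differently.
events.put_targets(
    Rule="order-placed",
    EventBusName="ecommerce-bus",
    Targets=[
        {"Id": "billing",  "Arn": "arn:aws:lambda:us-east-1:111122223333:function:billing"},
        {"Id": "shipping", "Arn": "arn:aws:sqs:us-east-1:111122223333:shipping-queue"},
    ],
)
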

Benefits of event-driven architectures

These are some of the advantages of building an event-driven architecture.

Loose coupling and improved developer agility

Event producers are logically separated from event consumers. This decoupling between the
production and consumption of events means that services are interoperable but can be
scaled, updated, and deployed independently of each other.

Loose coupling reduces dependencies and allows you to implement services in different
languages and frameworks. You can add or remove event producers and receivers without
having to change the logic in any one service. You do not need to write custom code to poll,
filter, and route events.

Asynchronous eventing and resiliency

In an event-driven system, events are generated asynchronously and can be issued as they happen without waiting for a response. Loosely coupled components mean that if one service fails, the others are unaffected. If necessary, you can log events so that the receiving service can resume from the point of failure, or replay past events.

Push-based messaging, real-time event streams, and lower costs

Event-driven systems allow for easy push-based messaging: clients can receive updates without needing to continuously poll remote services for state changes. These pushed messages can be used for on-the-fly data processing and transformation, and for real-time analysis. Moreover, with less polling there is a reduction in network I/O and decreased costs.

Simplified auditing and event sourcing

The centralized location of the event router simplifies auditing, and allows you to control
who can interact with a router, and which users and resources have access to your data. You
can also encrypt your data both in transit and at rest.

Additionally, you can make use of event sourcing, an architectural pattern that records all changes made to an application's state, in the same sequence in which they were originally applied. Event sourcing provides a log of immutable events that can be kept for auditing purposes, used to recreate historic states, or serve as a canonical narrative to explain a business-driven decision.
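The essence of event sourcing fits in a few lines of Python. This toy sketch (an in-memory list stands in for a durable event store) shows an append-only log of immutable events and state recreated by replaying them in their original order:

# State is never stored directly; it is derived by replaying the log.
log = []

def record(event_type, **data):
    log.append({"type": event_type, **data})  # append-only, never mutated

def current_balance():
    """Recreate state (an account balance) by replaying every event."""
    balance = 0
    for event in log:
        if event["type"] == "deposited":
            balance += event["amount"]
        elif event["type"] == "withdrawn":
            balance -= event["amount"]
    return balance

record("deposited", amount=100)
record("withdrawn", amount=30)
print(current_balance())  # 70, with the full audit trail preserved in the log
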
Well-architected best practices in security:==
Security:
The Security pillar includes the ability to protect data, systems, and assets to take advantage
of cloud technologies to improve your security.

The security pillar provides an overview of design principles, best practices, and questions. You can find prescriptive guidance on implementation in the Security Pillar whitepaper.

Design Principles

There are seven design principles for security in the cloud:

• Implement a strong identity foundation: Implement the principle of least privilege and enforce separation of duties with appropriate authorization for each interaction with your AWS resources. Centralize identity management, and aim to eliminate reliance on long-term static credentials.

• Enable traceability: Monitor, alert, and audit actions and changes to your environment in real time. Integrate log and metric collection with systems to automatically investigate and take action.

• Apply security at all layers: Apply a defense-in-depth approach with multiple security controls. Apply to all layers (for example, edge of network, VPC, load balancing, every instance and compute service, operating system, application, and code).

• Automate security best practices: Automated software-based security mechanisms improve your ability to securely scale more rapidly and cost-effectively. Create secure architectures, including the implementation of controls that are defined and managed as code in version-controlled templates.

• Protect data in transit and at rest: Classify your data into sensitivity levels and use mechanisms, such as encryption, tokenization, and access control, where appropriate.

• Keep people away from data: Use mechanisms and tools to reduce or eliminate the need for direct access to or manual processing of data. This reduces the risk of mishandling or modification and human error when handling sensitive data.

• Prepare for security events: Prepare for an incident by having incident management and investigation policies and processes that align to your organizational requirements. Run incident response simulations, and use tools with automation to increase your speed for detection, investigation, and recovery.

Definition

There are seven best practice areas for security in the cloud:

• Security
• Identity and Access Management
• Detection
• Infrastructure Protection
• Data Protection
• Incident Response
• Application Security

Before you architect any workload, you need to put in place practices that influence security.
You will want to control who can do what. In addition, you want to be able to identify
security incidents, protect your systems and services, and maintain the confidentiality and
integrity of data through data protection. You should have a well-defined and practiced
process for responding to security incidents. These tools and techniques are important
because they support objectives such as preventing financial loss or complying with
regulatory obligations.

The AWS Shared Responsibility Model enables organizations that adopt the cloud to achieve
their security and compliance goals. Because AWS physically secures the infrastructure that
supports our cloud services, as an AWS customer you can focus on using services to
accomplish your goals. The AWS Cloud also provides greater access to security data and an
automated approach to responding to security events.

Best Practices

Security
To operate your workload securely, you must apply overarching best practices to every area
of security. Take requirements and processes that you have defined in operational
excellence at an organizational and workload level, and apply them to all areas.

Staying up to date with AWS and industry recommendations and threat intelligence helps you
evolve your threat model and control objectives. Automating security processes, testing, and
validation allow you to scale your security operations.

The following questions focus on these considerations for security.

SEC 1: How do you securely operate your workload?

In AWS, segregating different workloads by account, based on their function and compliance
or data sensitivity requirements, is a recommended approach.

Identity and Access Management


Identity and access management are key parts of an information security program, ensuring
that only authorized and authenticated users and components are able to access your
resources, and only in a manner that you intend. For example, you should define principals
(that is, accounts, users, roles, and services that can perform actions in your account), build
out policies aligned with these principals, and implement strong credential management.
These privilege-management elements form the core of authentication and authorization.

In AWS, privilege management is primarily supported by the AWS Identity and Access
Management (IAM) service, which allows you to control user and programmatic access to
AWS services and resources. You should apply granular policies, which assign permissions
to a user, group, role, or resource. You also have the ability to require strong password
practices, such as complexity level, avoiding re-use, and enforcing multi-factor
authentication (MFA). You can use federation with your existing directory service.
For workloads that require systems to have access to AWS, IAM enables secure access
through roles, instance profiles, identity federation, and temporary credentials.

The following questions focus on these considerations for security.

SEC 2: How do you manage identities for people and machines?

SEC 3: How do you manage permissions for people and machines?

Credentials must not be shared between any user or system. User access should be granted
using a least-privilege approach with best practices including password requirements and
MFA enforced. Programmatic access including API calls to AWS services should be
performed using temporary and limited-privilege credentials such as those issued by
the AWS Security Token Service.
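To make the temporary-credentials recommendation concrete, here is a hedged boto3 sketch: a role is assumed through the AWS Security Token Service, and the short-lived credentials are used in place of long-term access keys. The role ARN and session name are hypothetical:

import boto3

sts = boto3.client("sts")

# Exchange the caller's identity for temporary, limited-privilege credentials.
response = sts.assume_role(
    RoleArn="arn:aws:iam::111122223333:role/ReadOnlyAnalytics",  # hypothetical role
    RoleSessionName="audit-session",
    DurationSeconds=900,  # credentials expire after 15 minutes
)

creds = response["Credentials"]
s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
print(s3.list_buckets()["Buckets"])  # acts only with the role's permissions
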

Reliability:=====
The Reliability pillar encompasses the ability of a workload to perform its intended function correctly and consistently when it’s expected to. This includes the ability to operate and test the workload through its total lifecycle. This paper provides in-depth, best-practice guidance for implementing reliable workloads on AWS.

The reliability pillar provides an overview of design principles, best practices, and questions.
You can find prescriptive guidance on implementation in the Reliability Pillar whitepaper.

Design Principles

There are five design principles for reliability in the cloud:

• Automatically recover from failure: By monitoring a workload for key performance indicators (KPIs), you can trigger automation when a threshold is breached. These KPIs should be a measure of business value, not of the technical aspects of the operation of the service. This allows for automatic notification and tracking of failures, and for automated recovery processes that work around or repair the failure. With more sophisticated automation, it’s possible to anticipate and remediate failures before they occur (see the sketch after this list).

• Test recovery procedures: In an on-premises environment, testing is often conducted to prove that the workload works in a particular scenario. Testing is not typically used to validate recovery strategies. In the cloud, you can test how your workload fails, and you can validate your recovery procedures. You can use automation to simulate different failures or to recreate scenarios that led to failures before. This approach exposes failure pathways that you can test and fix before a real failure scenario occurs, thus reducing risk.

• Scale horizontally to increase aggregate workload availability: Replace one large resource with multiple small resources to reduce the impact of a single failure on the overall workload. Distribute requests across multiple, smaller resources to ensure that they don’t share a common point of failure.

• Stop guessing capacity: A common cause of failure in on-premises workloads is resource saturation, when the demands placed on a workload exceed the capacity of that workload (this is often the objective of denial-of-service attacks). In the cloud, you can monitor demand and workload utilization, and automate the addition or removal of resources to maintain the optimal level to satisfy demand without over- or under-provisioning (see the sketch after this list). There are still limits, but some quotas can be controlled and others can be managed (see Manage Service Quotas and Constraints).

• Manage change in automation: Changes to your infrastructure should be made using automation. The changes that need to be managed include changes to the automation itself, which then can be tracked and reviewed.
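As hedged boto3 sketches of the first and fourth principles above, the snippet below creates a CloudWatch alarm on a hypothetical business KPI (recovery automation is triggered through the alarm action) and a target-tracking scaling policy that sizes the fleet from measured demand instead of guesses. All names and ARNs are placeholders:

import boto3

# "Automatically recover from failure": alarm on a KPI threshold and
# trigger automation (here, a placeholder SNS topic) when it is breached.
cloudwatch = boto3.client("cloudwatch")
cloudwatch.put_metric_alarm(
    AlarmName="checkout-latency-high",
    Namespace="Example/Checkout",      # hypothetical custom metric namespace
    MetricName="OrderLatencySeconds",
    Statistic="Average",
    Period=60,
    EvaluationPeriods=3,
    Threshold=2.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:111122223333:oncall-automation"],
)

# "Stop guessing capacity": a target-tracking policy adds or removes
# instances to hold average CPU near 50%, instead of a guessed fleet size.
autoscaling = boto3.client("autoscaling")
autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-tier",   # hypothetical group name
    PolicyName="hold-cpu-at-50",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 50.0,
    },
)
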

Performance Efficiency:=======
The Performance Efficiency pillar includes the ability to use computing resources efficiently
to meet system requirements, and to maintain that efficiency as demand changes and
technologies evolve.

The performance efficiency pillar provides an overview of design principles, best practices,
and questions. You can find prescriptive guidance on implementation in the Performance
Efficiency Pillar whitepaper.

Design Principles

There are five design principles for performance efficiency in the cloud:

• Democratize advanced technologies: Make advanced technology implementation easier for your team by delegating complex tasks to your cloud vendor. Rather than asking your IT team to learn about hosting and running a new technology, consider consuming the technology as a service. For example, NoSQL databases, media transcoding, and machine learning are all technologies that require specialized expertise. In the cloud, these technologies become services that your team can consume, allowing your team to focus on product development rather than resource provisioning and management.

• Go global in minutes: Deploying your workload in multiple AWS Regions around the world allows you to provide lower latency and a better experience for your customers at minimal cost.

• Use serverless architectures: Serverless architectures remove the need for you to run and maintain physical servers for traditional compute activities. For example, serverless storage services can act as static websites (removing the need for web servers) and event services can host code (see the sketch after this list). This removes the operational burden of managing physical servers, and can lower transactional costs because managed services operate at cloud scale.

• Experiment more often: With virtual and automatable resources, you can quickly carry out comparative testing using different types of instances, storage, or configurations.

• Consider mechanical sympathy: Understand how cloud services are consumed and always use the technology approach that aligns best with your workload goals. For example, consider data access patterns when you select database or storage approaches.
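As a minimal illustration of "event services can host code", here is a toy AWS Lambda handler in Python; the event payload shape is a hypothetical example, since it depends entirely on the trigger you configure:

# A minimal AWS Lambda handler: no server to manage, and the function
# runs only when an event arrives.
def handler(event, context):
    # Assumes a hypothetical payload like {"name": "world"}.
    name = event.get("name", "stranger")
    return {"statusCode": 200, "body": f"Hello, {name}!"}
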

Definition

There are four best practice areas for performance efficiency in the cloud:

• Selection
• Review
• Monitoring
• Tradeoffs

Take a data-driven approach to building a high-performance architecture. Gather data on all aspects of the architecture, from the high-level design to the selection and configuration of resource types.

Reviewing your choices on a regular basis ensures that you are taking advantage of the continually evolving AWS Cloud. Monitoring ensures that you are aware of any deviance from expected performance. Make trade-offs in your architecture to improve performance, such as using compression or caching, or relaxing consistency requirements.

Design patterns:===
This guide provides guidance for implementing commonly used modernization design
patterns by using AWS services. An increasing number of modern applications are designed
by using microservices architectures to achieve scalability, improve release velocity, reduce
the scope of impact for changes, and reduce regression. This leads to improved developer
productivity and increased agility, better innovation, and an increased focus on business
needs. Microservices architectures also support the use of the best technology for the service
and the database, and promote polyglot code and polyglot persistence.

Traditionally, monolithic applications run in a single process, use one data store, and run on
servers that scale vertically. In comparison, modern microservice applications are fine-
grained, have independent fault domains, run as services across the network, and can use
more than one data store depending on the use case. The services scale horizontally, and a
single transaction might span multiple databases. Development teams must focus on network
communication, polyglot persistence, horizontal scaling, eventual consistency, and
transaction handling across the data stores when developing applications by using
microservices architectures. Therefore, modernization patterns are critical for solving
commonly occurring problems in modern application development, and they help accelerate
software delivery.

This guide provides a technical reference for cloud architects, technical leads, application and
business owners, and developers who want to choose the right cloud architecture for design
patterns based on well-architected best practices. Each pattern discussed in this guide
addresses one or more known scenarios in microservices architectures. The guide discusses
the issues and considerations associated with each pattern, provides a high-level architectural
implementation, and describes the AWS implementation for the pattern. Open-source GitHub
samples and workshop links are provided where available.
