DSCC Notes
➢ CLOUD COMPUTING
Cloud computing delivers computing services including servers, storage, databases, networking,
software, analytics, and intelligence over the Internet (“the cloud”) to offer faster innovation,
flexible resources, and economies of scale.
Components of Cloud Computing:
1. Client
• It is an access device or software interface through which users access cloud services.
• If the client is a hardware device, it comes with resources like processor, memory, OS,
database, middleware, and applications to perform some user-related tasks and
processing.
• Three broad categories of clients:
- Mobile clients
- Thin clients: these rely on a network connection for computing and do little processing on the hardware itself. Examples include Google Docs, web applications, and Yahoo Messenger.
- Thick clients: systems that can keep working even without a network connection. Put simply, a thick client does not rely on server applications since it can process, store, and manage data, as well as perform different tasks independently.
2. Cloud Network
3. Cloud API
• It is a set of programming instructions and tools that provide an abstraction over a specific provider's cloud.
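For example (a minimal sketch, assuming the AWS SDK for Python, boto3, is installed and credentials are already configured; the provider and service shown are illustrative), a cloud API lets a program query provider resources through a few abstract calls instead of provider-specific plumbing:

    # Minimal sketch: querying a provider cloud through its API (boto3 assumed installed,
    # AWS credentials assumed configured via environment or ~/.aws/credentials).
    import boto3

    s3 = boto3.client("s3")           # abstraction over the provider's storage service
    response = s3.list_buckets()      # one API call instead of provider-specific plumbing
    for bucket in response["Buckets"]:
        print(bucket["Name"], bucket["CreationDate"])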
➢ CHARACTERISTICS OF CLOUD COMPUTING
1. On-demand self-service: Cloud computing services do not require any human administrators; users themselves are able to provision, monitor, and manage computing resources as needed.
2. Broad network access: The Computing services are generally provided over standard
networks and heterogeneous devices.
3. Rapid elasticity: The computing services should have IT resources that are able to scale out and in quickly, on an as-needed basis. Whenever the user requires a service it is provided, and it is scaled back in as soon as the requirement is over.
4. Resource pooling: The IT resources (e.g., networks, servers, storage, applications, and services) are shared across multiple applications and tenants in a non-dedicated manner. Multiple clients are served from the same physical resource.
5. Measured service: Resource utilization is tracked for each application and tenant, providing both the user and the resource provider with an account of what has been used. This is done for various reasons, such as monitoring, billing, and effective use of resources.
➢ GRID COMPUTING
• Grid computing is a computing infrastructure that combines computer resources spread over
different geographical locations to achieve a common goal.
• All unused resources on multiple computers are pooled together and made available for a
single task.
• Organizations use grid computing to perform large tasks or solve complex problems that are difficult to handle on a single computer.
• For example, meteorologists use grid computing for weather modeling.
• Weather modeling is a computation-intensive problem that requires complex data
management and analysis.
• Processing massive amounts of weather data on a single computer is slow and time-
consuming. That’s why meteorologists run the analysis over geographically dispersed grid
computing infrastructure and combine the results.
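The split-and-combine pattern behind grid computing can be pictured with a small local sketch (an analogy only: Python's multiprocessing pool stands in for geographically dispersed grid nodes, and the "analysis" is a made-up placeholder):

    # Conceptual sketch: split a large job into chunks, process them in parallel, and
    # combine the results, which is the same pattern a grid scheduler applies across sites.
    from multiprocessing import Pool

    def analyze_chunk(chunk):
        # placeholder "weather analysis": average value of one data slice
        return sum(chunk) / len(chunk)

    if __name__ == "__main__":
        readings = list(range(1_000_000))                      # simulated sensor data
        chunks = [readings[i:i + 100_000] for i in range(0, len(readings), 100_000)]
        with Pool() as pool:
            partial_results = pool.map(analyze_chunk, chunks)  # fan out to workers
        print("combined result:", sum(partial_results) / len(partial_results))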
The following are some common applications of grid computing:
i. Financial services: Financial institutions use grid computing primarily to solve problems
involving risk management. By harnessing the combined computing powers in the grid, they
can shorten the duration of forecasting portfolio changes in volatile markets.
ii. Gaming: The gaming industry uses grid computing to provide additional computational
resources for game developers. The grid computing system splits large tasks, such as creating
in-game designs, and allocates them to multiple machines. This results in a faster turnaround
for the game developers.
iii. Entertainment: Some movies have complex special effects that require a powerful computer
to create. The special effects designers use grid computing to speed up the production
timeline. They have grid-supported software that shares computational resources to render
the special-effect graphics.
➢ UTILITY COMPUTING
• Utility computing started in the 1960s when mainframe manufacturers provided a service
called time-sharing.
• It allowed organizations like banks to use computing power and database storage without
owning the infrastructure.
• In utility computing, resources such as CPU cycles, storage (measured in GBs), and network
data transfer are tracked and billed based on usage.
• This model charges users only for the resources they consume, making it cost-effective and
scalable.
• Cloud computing expanded the utility computing model to include software applications,
licenses, and self-service portals, allowing users to access and pay for services on-demand.
• Utility computing provides scalable resources, meaning users can increase or decrease their
usage based on their needs.
• It reduces the need for organizations to invest in expensive infrastructure, as they only pay
for what they use.
• Users can access services and applications from anywhere, making it convenient for businesses.
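A minimal sketch of the metered, pay-per-use idea (the resource names and rates below are made-up example values, not any provider's real pricing):

    # Illustrative sketch of utility-computing billing: resources are metered and
    # charged per unit of use. Rates are hypothetical example values.
    RATES = {"cpu_hours": 0.05, "storage_gb_months": 0.02, "network_gb": 0.01}

    def monthly_bill(usage):
        """usage: dict mapping resource name -> consumed quantity."""
        return sum(RATES[resource] * quantity for resource, quantity in usage.items())

    print(monthly_bill({"cpu_hours": 720, "storage_gb_months": 50, "network_gb": 200}))
    # -> 39.0 : the customer pays only for what was consumed, with no upfront hardware cost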
➢ CLIENT-SERVER ARCHITECTURE
• The client-server architecture refers to a system that hosts, delivers, and manages most of the
resources and services that the client requests. In this model, all requests and services are
delivered over a network, and it is also referred to as the networking computing model or
client-server network.
• Client-server architecture, alternatively called a client-server model, is a network application
that breaks down tasks and workloads between clients and servers that reside on the same
system or are linked by a computer network.
• Client-server architecture typically features multiple clients, such as users' workstations, PCs, or other devices, connected to a central server via an Internet connection or other network. The client sends a request for data, and the server accepts and accommodates the request, sending the requested data packets back to the user.
• For example, in hospital data processing, a client computer can be busy running an application
program for entering patient information, meanwhile, the server computer can be running
another program to fetch and manage the database in which the information is permanently
stored.
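A minimal sketch of the request/response flow in the client-server model (the host, port, and "patient record" payload are illustrative placeholders):

    # Minimal sketch of the client-server model: the server hosts a service on the
    # network, the client sends a request and receives the response.
    import socket, threading, time

    HOST, PORT = "127.0.0.1", 5000

    def server():
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
            srv.bind((HOST, PORT))
            srv.listen()
            conn, _ = srv.accept()                  # accept the client's request
            with conn:
                request = conn.recv(1024).decode()
                conn.sendall(("record for " + request).encode())   # send data back

    threading.Thread(target=server, daemon=True).start()
    time.sleep(0.5)                                 # give the server a moment to start

    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as cli:  # the client side
        cli.connect((HOST, PORT))
        cli.sendall(b"patient-42")                  # e.g., request a patient record
        print(cli.recv(1024).decode())              # "record for patient-42"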
➢ TYPES OF CLOUD COMPUTING (DEPLOYMENT MODELS)
i. Private Cloud Deployment Model
Private Cloud lets us use the infrastructure and resources for a single organization. Users and organizations do not share resources with other users. That is why it is also called the internal or corporate model. Private clouds are more costly than public clouds due to their costly maintenance.
Hybrid Cloud Deployment Model
Cloud vendors came up with a hybrid cloud that offers a smooth transition between public and private cloud facilities. Organizations keep the sensitive data in the private cloud and non-sensitive data in the public cloud.
➢ VIRTUALIZATION VS. CLOUD COMPUTING
1. Definition:
• Virtualization: It is the technology that creates multiple simulated environments or
dedicated resources from a single physical hardware system. For example, a server can be
divided into multiple virtual machines (VMs), each running its own operating system.
• Cloud Computing: It is a service delivery model that provides access to computing
resources (like servers, storage, and applications) over the internet. The cloud utilizes
virtualization to offer scalable, on-demand services.
2. Purpose:
• Virtualization: Primarily used to optimize hardware usage, increase flexibility, and run
multiple systems on one physical machine.
• Cloud Computing: Focused on delivering services such as Infrastructure as a Service (IaaS),
Platform as a Service (PaaS), and Software as a Service (SaaS) over the internet.
3. Resource Management:
• Virtualization: Manages resources within a single physical system (like one server or one
data center).
• Cloud Computing: Manages resources across distributed systems, providing access to a
pool of resources over the internet.
4. Scalability:
• Virtualization: Limited to the capacity of the host machine or data center.
• Cloud Computing: Highly scalable, allowing users to increase or decrease resources based
on demand, without worrying about hardware limitations.
5. Cost:
• Virtualization: Requires upfront investment in hardware, but offers cost savings through
better hardware utilization.
• Cloud Computing: Follows a pay-as-you-go model, reducing upfront costs and offering
flexibility based on usage.
➢ VIRTUALIZATION
• Virtualization is used to create a virtual version of an underlying service. With the help of virtualization, multiple operating systems and applications can run on the same machine and the same hardware at the same time, increasing the utilization and flexibility of the hardware. It was initially developed during the mainframe era.
• It is one of the main cost-effective, hardware-reducing, and energy-saving techniques used by
cloud providers.
• Virtualization allows sharing of a single physical instance of a resource or an application among multiple customers and organizations at one time. It does this by assigning a logical name to physical storage and providing a pointer to that physical resource on demand.
• The term virtualization is often synonymous with hardware virtualization, which plays a
fundamental role in efficiently delivering Infrastructure-as-a-Service (IaaS) solutions for cloud
computing.
• Moreover, virtualization technologies provide a virtual environment for not only executing
applications but also for storage, memory, and networking.
• Host Machine: The machine on which the virtual machine is going to be built is known as Host
Machine.
• Guest Machine: The virtual machine is referred to as a Guest Machine.
Work of Virtualization in Cloud Computing (Virtual Machines)
• Virtualization has a prominent impact on Cloud Computing. In the case of cloud computing,
users store data in the cloud, but with the help of Virtualization, users have the extra benefit
of sharing the infrastructure.
• Cloud vendors take care of the required physical resources, but they charge a significant amount for these services, which affects every user and organization.
• Virtualization helps users and organizations obtain the services a company requires through external (third-party) providers, which helps in reducing costs to the company. This is the way virtualization works in cloud computing.
Benefits of Virtualization
• More flexible and efficient allocation of resources.
• Enhance development productivity.
• It lowers the cost of IT infrastructure.
• Remote access and rapid scalability.
• High availability and disaster recovery.
• Pay-per-use of the IT infrastructure, on demand.
• Enables running multiple operating systems.
Drawback of Virtualization
• High Initial Investment: Clouds require a very high initial investment, but it is also true that they help reduce companies' costs over time.
• Learning New Infrastructure: As companies shift from servers to the cloud, they require highly skilled staff who can work with the cloud easily; for this, they have to hire new staff or provide training to current staff.
• Risk of Data: Hosting data on third-party resources can put the data at risk, since it has a higher chance of being attacked by a hacker or cracker.
➢ VMM DESIGN REQUIREMENTS AND PROVIDERS
• There are three requirements for a VMM.
• First, a VMM should provide an environment for programs which is essentially identical to the
original machine.
• Second, programs run in this environment should show, at worst, only minor decreases in
speed.
• Third, a VMM should be in complete control of the system resources. Any program run under a VMM should behave identically to the way it runs directly on the original machine.
• Two possible exceptions in terms of differences are permitted with this requirement:
differences caused by the availability of system resources and differences caused by timing
dependencies. The former arises when more than one VM is running on the same machine.
➢ HARDWARE-LEVEL VIRTUALIZATION
• Hardware virtualization is the method used to create virtual versions of physical desktops and
operating systems.
• It uses a virtual machine manager (VMM) called a hypervisor to provide abstracted hardware
to multiple guest operating systems, which can then share the physical hardware resources
more efficiently.
• Hardware virtualization offers many benefits, such as better performance and lower costs.
Components of hardware virtualization
i. Hardware Layer (Virtualization Host):
• This is the physical server that includes components like CPU, memory, network, and
storage (disk drives).
• It is the actual hardware where virtualization happens. The server must be an x86-
based system with one or more CPUs to support running multiple virtual operating
systems.
ii. Hypervisor:
• The hypervisor is the software layer between the operating system and the hardware.
• It allows multiple operating systems (or different OSes) to run on the same physical
machine at the same time.
• It isolates each virtual machine from the host hardware and from other virtual
machines, ensuring independent operations.
iii. Virtual Machines (VMs):
• Virtual machines are like software versions of physical computers.
• Each VM has its own virtual hardware, operating system, and applications, making it
work like a real computer but running on a shared physical server.
Working of H/W Virtualization:
• Hardware virtualization enables a single physical machine to operate like multiple virtual
machines (VMs), creating isolated environments.
• A hypervisor is used to manage the interaction between the physical hardware and virtual
machines, controlling the allocation of resources like CPU, memory, and storage.
• The hypervisor divides and shares the physical machine’s resources (CPU, memory, etc.)
among the virtual machines as needed.
• When virtualization is applied to servers, it is called server virtualization, allowing for
better resource utilization of physical servers.
• Virtual machines are isolated from each other, providing protection against malware and
ensuring secure environments for each VM.
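As a hedged illustration of a hypervisor managing guest machines (this assumes the libvirt Python bindings and a local QEMU/KVM host are available; on another setup the connection URI would differ), the host's guests can be listed through the hypervisor's management API:

    # Hedged sketch: listing the guest machines managed by a local hypervisor via libvirt.
    import libvirt

    conn = libvirt.open("qemu:///system")        # connect to the host's hypervisor
    try:
        for domain in conn.listAllDomains():     # each domain is one guest machine (VM)
            state = "running" if domain.isActive() else "shut off"
            print(domain.name(), state)
    finally:
        conn.close()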
Properties of hardware level virtualization:
• It allows multiple OSes and applications to run simultaneously
• It requires no system reboot or dual-boot setup
• It gives the appearance of having multiple separate machines, each of which can be used as a normal system
• Degree of isolation is high
• Implementation is less risky and maintenance is easy
Issues in hardware-level virtualization:
• A lot of time is spent on installation and administration of the virtual system before testing or running the applications
• When the physical and virtual OS are the same, this kind of virtualization results in duplication of effort and reduces efficiency
• To eliminate these issues, virtualization is implemented at the OS level
➢ VIRTUALIZATION AT OS LEVEL
• Operating system-based Virtualization refers to an operating system feature in which the
kernel enables the existence of various isolated user-space instances.
• Operating system-based virtualization also refers to the installation of virtualization software on a pre-existing operating system, which is then called the host operating system.
• In this virtualization, a user installs the virtualization software in his operating system like
any other program and utilizes this application to operate and generate various virtual
machines.
• Here, the virtualization software allows the user direct access to any of the created virtual
machines.
Virtualization structure:
• Hosted virtualization runs virtual machines (VMs) on top of a base operating system (like
Windows) using software called a hypervisor (VMM). This allows multiple guest operating
systems to run in separate windows on the same physical machine.
• Virtual machines in hosted virtualization only have limited access to physical I/O devices (like
network cards), as the host OS controls most of the hardware. The VMM provides an emulated
version of the hardware to each VM.
• Hosted virtualization is easy to install and configure, like VMWare Workstation, which can be
set up quickly. It runs on many different computers because it relies on the host OS for
hardware drivers.
• A drawback is that I/O requests must go through the host OS, which can slow down
performance. Additionally, real-time operating systems (RTOS) are not well-supported
because the host OS controls scheduling.
• Hosted virtualization is useful for testing software or running old applications on newer
systems. It's also convenient for running different operating systems on a single PC, especially
for engineers needing access to various platforms.
➢ BINARY TRANSLATION WITH FULL VIRTUALIZATION
• Depending on implementation technologies, hardware virtualization can be classified into two
categories: full virtualization and host-based virtualization.
• Full virtualization does not need to modify the host OS. It relies on binary translation to trap
and virtualize the execution of certain sensitive, nonvirtualizable instructions. The guest OSes
and their applications consist of noncritical and critical instructions.
• With full virtualization, noncritical instructions run on the hardware directly while critical
instructions are discovered and replaced with traps into the VMM to be emulated by software.
• Only critical instructions are trapped into the VMM. This is because binary translation can incur a large performance overhead.
• Noncritical instructions do not control hardware or threaten the security of the system, but
critical instructions do.
• Therefore, running noncritical instructions on hardware not only can promote efficiency but
also can ensure system security.
• This approach was implemented by VMware and many other software companies. As shown
in Figure 3.6, VMware puts the VMM at Ring 0 and the guest OS at Ring 1. The VMM scans the
instruction stream and identifies the privileged, control- and behavior-sensitive instructions.
When these instructions are identified, they are trapped into the VMM, which emulates the
behavior of these instructions. The method used in this emulation is called binary translation.
Therefore, full virtualization combines binary translation and direct execution. The guest OS
is completely decoupled from the underlying hardware. Consequently, the guest OS is
unaware that it is being virtualized.
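A toy sketch of the trap-and-emulate idea (purely conceptual, not VMware's actual mechanism: real binary translation rewrites machine code, while this just separates "critical" from "noncritical" instructions):

    # Toy illustration: noncritical instructions run "directly", critical (privileged or
    # sensitive) instructions are intercepted and emulated by the VMM.
    CRITICAL = {"load_cr3", "hlt", "out"}        # stand-ins for sensitive x86 operations

    def vmm_emulate(instr, vm_state):
        vm_state["traps"] += 1
        print(f"VMM: emulating '{instr}' on behalf of the guest")

    def run_guest(instruction_stream):
        vm_state = {"traps": 0}
        for instr in instruction_stream:
            if instr in CRITICAL:
                vmm_emulate(instr, vm_state)     # trap into the VMM
            else:
                pass                             # direct execution on the hardware
        return vm_state

    print(run_guest(["add", "mov", "load_cr3", "add", "hlt"]))   # -> {'traps': 2}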
➢ HYPERVISOR MODE
• CPU Rings: The x86 CPU architecture provides different levels of protection called rings,
with Ring 0 having the highest privilege, where the operating system kernel typically runs.
• Kernel Mode vs. User Mode: Code running in Ring 0 operates in kernel mode (full system
access), while applications typically run in Ring 3 with limited privileges (user mode).
Role of Hypervisor
• Paravirtualization: One approach to this challenge is to modify the guest operating systems so they can run efficiently in lower CPU rings by communicating with the hypervisor directly for privileged operations.
• In para-virtualization, the guest OS kernel is modified to replace these sensitive instructions with hypercalls, which are direct calls to the hypervisor or VMM. Xen is an example of a system that uses para-virtualization.
• In this architecture, the guest OS operates at Ring 1, rather than the highly privileged Ring 0,
meaning it cannot execute certain privileged instructions. These privileged tasks are managed
by the hypervisor through hypercalls. Once the OS is modified to replace these instructions,
the behavior of the guest OS mimics that of the original system.
• For example, in UNIX systems, a system call involves an interrupt or service routine. In para-
virtualized systems like Xen, hypercalls are processed through a dedicated service routine to
emulate the same system behavior.
➢ VIRTUALIZATION OF CPU
• A VM is a duplicate of an existing computer system in which a majority of the VM instructions
are executed on the host processor in native mode.
• Thus, unprivileged instructions of VMs run directly on the host machine for higher efficiency.
Other critical instructions should be handled carefully for correctness and stability.
• The critical instructions are divided into three categories: privileged instructions, control-
sensitive instructions, and behavior-sensitive instructions.
• Privileged instructions execute in a privileged mode and will be trapped if executed outside
this mode.
• Control-sensitive instructions attempt to change the configuration of resources used.
• Behavior-sensitive instructions have different behaviors depending on the configuration of
resources, including the load and store operations over the virtual memory.
• A CPU architecture is virtualizable if it supports the ability to run the VM’s privileged and
unprivileged instructions in the CPU’s user mode while the VMM runs in supervisor mode.
• When the privileged instructions including control- and behavior-sensitive instructions of a
VM are executed, they are trapped in the VMM.
• In this case, the VMM acts as a unified mediator for hardware access from different VMs to
guarantee the correctness and stability of the whole system.
• However, not all CPU architectures are virtualizable. RISC CPU architectures can be naturally
virtualized because all control and behavior-sensitive instructions are privileged instructions.
• On the contrary, x86 CPU architectures are not primarily designed to support virtualization.
This is because about 10 sensitive instructions, such as SGDT and SMSW, are not privileged.
When these instructions execute in virtualization, they cannot be trapped in the VMM.
➢ HARDWARE-ASSISTED CPU VIRTUALIZATION
• Hardware-assisted virtualization uses special support from processors like Intel and AMD,
which create a separate mode (Ring -1) for the hypervisor, allowing operating systems to run
normally in their highest mode (Ring 0).
• This method makes virtualization easier by automatically managing privileged tasks, so the
operating system doesn’t need to be modified and no complex binary translation is required.
• Although hardware-assisted virtualization improves efficiency, switching between the
hypervisor and guest OS can cause performance slowdowns due to high overhead.
• Some virtualization platforms, like VMware, use a mix of hardware and software (a hybrid
approach) to balance performance by offloading some tasks to hardware.
• Combining para-virtualization and hardware-assisted virtualization can boost performance
even further by optimizing how sensitive instructions are handled.
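A small practical sketch (Linux-only assumption): Intel VT-x and AMD-V show up as the "vmx" and "svm" CPU flags, so their presence indicates that the extra hypervisor mode described above is available:

    # Sketch: check /proc/cpuinfo for the hardware virtualization flags (Linux only).
    def has_hw_virtualization(cpuinfo_path="/proc/cpuinfo"):
        with open(cpuinfo_path) as f:
            flags = f.read()
        return ("vmx" in flags) or ("svm" in flags)   # Intel VT-x or AMD-V

    print("hardware-assisted virtualization available:", has_hw_virtualization())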
➢ MEMORY VIRTUALIZATION
• In virtual memory virtualization, the operating system maps virtual memory to physical
memory, and in virtual environments, this mapping extends to machine memory, requiring
two stages of mapping by both the guest OS and the hypervisor (VMM).
• The memory management unit (MMU) and translation lookaside buffer (TLB) optimize
memory performance by handling address translations, reducing latency during memory
access.
• The guest OS controls virtual-to-physical memory mapping, but the hypervisor maps this
physical memory to the actual machine memory through "shadow page tables," which serve
as intermediaries.
• The two-stage mapping in virtual memory creates extra layers, increasing overhead and
slowing performance, especially when the shadow page tables grow large, leading to potential
bottlenecks.
• Hardware assistance, like AMD's nested paging, allows direct memory translation from virtual
memory to machine memory, reducing overhead and improving performance in virtual
environments by minimizing the need for multiple address translations.
• Virtual memory virtualization helps to isolate and manage resources among different virtual
machines, ensuring that they operate independently without interfering with each other.
• Modern virtualization technologies enable dynamic memory allocation, allowing the VMM to
allocate memory to VMs based on current needs, which optimizes resource utilization and
improves overall efficiency.
• The complexity of managing multiple page tables can lead to increased memory consumption,
necessitating efficient management strategies to minimize the performance impact on the
host system.
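The two-stage mapping described above can be sketched with two toy page tables (illustrative addresses only; real page tables are hardware structures, not dictionaries):

    # Conceptual sketch of the two-stage mapping: the guest OS maps guest-virtual to
    # guest-"physical" pages, and the VMM maps guest-"physical" to real machine pages.
    guest_page_table = {0x1000: 0x2000, 0x3000: 0x4000}    # guest virtual -> guest physical
    vmm_page_table   = {0x2000: 0x9000, 0x4000: 0xA000}    # guest physical -> machine memory

    def translate(guest_virtual_addr):
        guest_physical = guest_page_table[guest_virtual_addr]   # stage 1: guest OS
        machine_addr   = vmm_page_table[guest_physical]         # stage 2: hypervisor
        return machine_addr

    # A shadow/nested table effectively pre-composes the two lookups into one:
    shadow_table = {gv: vmm_page_table[gp] for gv, gp in guest_page_table.items()}
    print(hex(translate(0x1000)), hex(shadow_table[0x1000]))    # 0x9000 0x9000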
➢ I/O VIRTUALIZATION
• I/O virtualization is essential for managing the routing of input/output requests between
virtual devices and shared physical hardware.
• There are three primary methods of implementing I/O virtualization: full device emulation,
para-virtualization, and direct I/O virtualization.
• Full device emulation replicates the functionality of physical devices entirely in software
managed by the Virtual Machine Monitor (VMM), allowing multiple VMs to share a single
hardware device.
• In full device emulation, tasks such as device enumeration, identification, and interrupts are
handled by the VMM, but this approach often results in slower performance compared to the
actual hardware due to high software overhead.
• Para-virtualization, commonly used in virtualization platforms like Xen, utilizes a split driver
model with a frontend driver in the guest domain and a backend driver in the control domain
(Domain 0).
• The frontend driver manages I/O requests from the guest operating systems, while the
backend driver interacts with the real I/O devices, achieving better performance than full
device emulation, but with increased CPU overhead.
• Direct I/O virtualization allows virtual machines to access physical devices directly, enabling
performance close to native hardware.
• While direct I/O virtualization provides efficiency, it faces challenges with commodity
hardware devices, particularly when physical devices are reclaimed for workload migration,
which may lead to instability.
• Hardware-assisted I/O virtualization reduces the overhead associated with software-based
emulation by using technologies like Intel VT-d, which supports remapping of I/O DMA
transfers and device-generated interrupts.
• VT-d enhances virtualization by allowing multiple guest operating systems, including
unmodified and virtualization-aware ones, to efficiently share hardware resources.
• Self-virtualized I/O (SV-IO) is another approach that utilizes multicore processor resources,
encapsulating tasks related to I/O device virtualization and providing virtual devices and
access APIs to VMs.
• Each virtual interface (VIF) in the SV-IO model corresponds to a type of virtualized I/O device,
such as virtual network interfaces or block devices, enabling efficient communication between
the guest OS and the virtual devices.
• VIFs include dedicated message queues for outgoing and incoming messages, along with
unique identifiers, facilitating effective management and communication within the
virtualization framework.
➢ VIRTUALIZATION IN MULTICORE PROCESSORS
➢ VIRTUAL HIERARCHY
• Many-core chip multiprocessors (CMPs) are changing how we think about computing by
allowing the use of numerous processor cores. Instead of just running time-sharing jobs on a
few cores, these chips can allocate multiple cores to different tasks at the same time, known
as space-sharing.
• This approach was proposed by researchers Marty and Hill, who suggested using virtual
hierarchies to improve performance. Unlike a fixed physical hierarchy, a virtual hierarchy can
adapt based on how workloads are distributed across cores.
• Modern many-core CMPs typically have a physical hierarchy with multiple cache levels, which
determine how memory is allocated. A virtual hierarchy, on the other hand, can change based
on the workloads being run. This allows for better data access and communication between
cores.
• In a virtual hierarchy, the first level of the cache is designed to keep data close to the cores
that need it, reducing access times and improving performance. If a cache miss occurs, the
system tries to find the data in the next level of cache. This design helps isolate different
workloads from each other, reducing interference.
• For example, in a tiled architecture, multiple virtual machines (VMs) can be assigned to
different clusters of virtual cores. Each VM operates independently, minimizing the time taken
to access data and ensuring that one workload does not negatively impact another.
• Operating system virtualization adds a layer of abstraction within an OS, allowing multiple
isolated virtual machines to share the same kernel. These are often referred to as virtual
execution environments (VEs) or containers.
• From the user's perspective, these containers function like individual servers, each with its
processes, file system, and network settings. However, they share the same OS kernel, which
is why this method is sometimes called single-OS image virtualization.
• OS-level virtualization creates isolated environments on a single physical server, allowing for
efficient resource allocation among many users. This method is particularly useful in hosting
environments where many users need separate, secure spaces.
• In containerization, programs run within these containers and can only access the resources
allocated to them, giving the illusion of having the entire computer at their disposal. Multiple
containers can exist on a single operating system, allowing programs to run concurrently and
interact with one another.
• This approach is beneficial for consolidating server hardware and managing resources
effectively, enabling services to run in a more organized and efficient manner.
➢ FEATURES OF OS-LEVEL VIRTUALIZATION
• Resource Isolation: Provides a high level of resource isolation, allowing each container to have
its own set of resources, including CPU, memory, and I/O bandwidth.
• Lightweight: Containers are lighter than traditional virtual machines since they share the same
host operating system, leading to faster startup and lower resource usage.
• Portability: Highly portable, making it easy to move containers between environments
without modifying the underlying application.
• Scalability: Easily scalable based on application requirements, allowing applications to
respond quickly to changes in demand.
• Security: Offers a high level of security by isolating containerized applications from the host
OS and other containers on the same system.
• Reduced Overhead: Containers incur less overhead than traditional VMs, as they do not need
to emulate a full hardware environment.
• Easy Management: Simple commands can be used to start, stop, and monitor containers,
making management straightforward.
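Building on the easy-management point above, a hedged sketch using the Docker SDK for Python (this assumes the "docker" package is installed and a Docker daemon is running; the image and command are illustrative):

    # Hedged sketch: containers share the host kernel but get an isolated view.
    import docker

    client = docker.from_env()
    container = client.containers.run("alpine", "echo hello from a container",
                                      detach=True)      # start an isolated environment
    container.wait()                                     # block until the process exits
    print(container.logs().decode().strip())             # -> hello from a container
    print([c.name for c in client.containers.list(all=True)])   # monitor containers
    container.remove()                                   # clean up the stopped container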
Advantages of Operating System-Based Virtualization:
• Resource Efficiency: Greater resource efficiency is achieved since containers do not emulate a
complete hardware environment, reducing resource overhead.
• High Scalability: Quick and easy scaling of containers in response to demand changes
enhances workload management.
• Easy Management: Managing containers through simple commands facilitates the
deployment and maintenance of numerous containers.
• Reduced Costs: Significantly lowers costs by requiring fewer resources and infrastructure
compared to traditional VMs.
• Faster Deployment: Rapid deployment of containers shortens the time needed to launch new
applications or update existing ones.
• Portability: The portability of containers simplifies moving them between environments
without needing application changes.
Disadvantages of Operating System-Based Virtualization:
• Security: Potential security risks arise since containers share the same host OS; a breach in
one container may affect others.
• Limited Isolation: Containers may not provide complete isolation between applications,
risking performance degradation or resource contention.
• Complexity: Setting up and managing OS-level virtualization can be complex, requiring
specialized skills and knowledge.
• Dependency Issues: Containers can experience dependency problems with other containers
or the host OS, leading to compatibility issues.
• Limited Hardware Access: Containers may have restricted access to hardware resources,
limiting their capability to perform tasks requiring direct hardware access.
➢ XEN HYPERVISOR
• Xen is an open-source hypervisor developed at Cambridge University, designed for efficient
virtualization.
• It operates as a micro-kernel hypervisor, including only essential functions like physical
memory management and processor scheduling.
• Device drivers and other variable components are kept outside the hypervisor, allowing for a
more modular architecture.
• A micro-kernel hypervisor has a smaller code size compared to a monolithic hypervisor,
which includes all functions and drivers.
• The main function of Xen is to convert physical devices into virtual resources that can be
allocated to virtual machines (VMs).
• Xen does not include native device drivers; instead, it provides mechanisms for guest
operating systems to access physical devices directly.
• This design choice helps maintain the lightweight nature of the Xen hypervisor while
providing effective virtualization capabilities.
• By separating the core hypervisor functions from device drivers, Xen enhances stability and
security.
TERMINOLOGY
ARCHITECTURE
Domain 0 (Dom0):
• Domain 0 is a privileged guest operating system that has direct access to the hardware and
can manage other guest operating systems (DomU). It is the first domain loaded when Xen
boots and plays a key role in managing device drivers, handling I/O operations, and managing
hardware resources (CPU, memory, etc.) for other guest VMs.
• Dom0 Responsibilities: Allocating and mapping hardware resources for guest domains,
creating and managing VMs, controlling device drivers, and allowing direct access to hardware
resources.
• If Domain 0 is compromised, the entire system becomes vulnerable, which is why securing
Dom0 is essential.
Domain U (DomU):
• DomU represents unprivileged guest VMs that run on top of the Xen hypervisor. These
domains do not have direct access to hardware and instead rely on Domain 0 to manage I/O
requests and hardware resources.
• PV Guest (Paravirtualized Guest): Paravirtualized guests have modified operating systems to
interact more efficiently with the hypervisor.
• HVM Guest (Hardware Virtualized Machine): Hardware virtualized guests run without modification and rely on hardware extensions (such as Intel VT or AMD-V) to communicate with the hypervisor.
Hardware Components:
• The hardware layer at the bottom includes the physical resources such as disk storage,
network interface cards (NICs), video graphics adapters (VGAs), processors, and memory.
• The Xen hypervisor virtualizes these physical resources and provides virtual versions of these
devices for each VM.
Security Considerations:
• The architecture of Xen introduces certain security challenges, particularly with Domain 0.
Since Domain 0 has privileged access to the hardware and manages all other domains, any
compromise in Dom0 can potentially lead to the entire system being at risk.
• Implementing strict security policies and regularly updating patches is crucial to ensuring the
security of Domain 0 and protecting the virtualized environment.
• Xen allows various VM operations, such as creating, copying, modifying, migrating, and rolling
back VMs. These functionalities provide users with flexibility in managing multiple VMs but
also introduce potential risks such as data leakage or VM state corruption.
➢ XaaS (ANYTHING AS A SERVICE)
Benefits of XaaS
i. Cost-effective: Cloud service models can cut costs and streamline IT deployments. Using
them, organizations can scale back IT infrastructure and provide on-demand services,
deploying fewer servers, storage devices, network switches and software deployments in
their data centers.
ii. Less environmental and staffing overhead: Using cloud-based services lets an
organization reduce its physical overhead, such as space, power and cooling. Contracted
services also enable IT staff to be reassigned to more important projects and business
processes. For budgeting, the use of outside services rather than deploying on-premises
technology shifts many capital expenditures to operational expenditures.
iii. Technical support: With XaaS, the third-party provider's staff provisions, maintains,
upgrades and troubleshoots the service so customers don't have to deploy their own on-
premises support personnel.
iv. Scalability: Cloud-based services can be scaled up or down easily depending on business
needs.
Disadvantages
i. Dependency on Internet Connectivity: XaaS solutions rely heavily on internet access. Any
connectivity issues can disrupt service availability and hinder business operations.
ii. Security Risks: Storing sensitive data in the cloud increases vulnerability to cyberattacks.
Users must trust the service provider's security measures, which can vary significantly.
iii. Limited Control: With XaaS, organizations have less control over their infrastructure and
software since they depend on third-party providers for maintenance, updates, and
management.
iv. Potential for Vendor Lock-In: Migrating from one XaaS provider to another can be
complicated and costly, leading to vendor lock-in and limiting flexibility.
v. Performance Variability: XaaS performance can vary based on the provider's
infrastructure, user demand, and network conditions, potentially affecting user
experience.
vi. Cost Over Time: While XaaS can reduce upfront costs, subscription fees can add up over
time, and organizations may end up paying more than they would with traditional on-
premises solutions.
Examples of XaaS:
i. Hardware as a Service (HaaS): Managed Service Providers (MSP) provide and install some
hardware on the customer’s site on demand. The customer uses the hardware according to
service level agreements. This model is very similar to IaaS, as computing resources at the MSP's site are provided to users as a substitute for on-premises physical hardware.
ii. Communication as a Service (CaaS): This model comprises solutions for different
communication like IM, VoIP, and video conferencing applications that are hosted in the
provider’s cloud. Such a method is cost-effective and reduces time expenses.
iii. Desktop as a Service (DaaS): DaaS provider mainly manages storing, security, and backing up
user data for desktop apps. A client can also work on PCs using third-party servers.
iv. Security as a Service (SECaaS): In this method, the provider integrates security services with
the company’s infrastructure through the internet which includes anti-virus software,
authentication, encryption, etc.
v. Healthcare as a Service (HaaS): The healthcare industry has opted for the model HaaS service
through electronic medical records (EMR). IoT and other technologies have enhanced medical
services like online consultations, health monitoring 24/7, medical service at the doorstep e.g.
lab sample collection from home, etc.
vi. Transport as a Service (TaaS): Nowadays, numerous apps help with mobility and transport in modern society. The model is both convenient and ecologically friendly; for example, Uber is planning to test flying taxis and self-driving planes in the future.
2) IaaS
• IAAS is like renting virtual computers and storage space in the cloud.
• You have control over the operating systems, applications, and development frameworks.
• Scaling resources up or down is easy based on your needs.
• Compute: Computing as a Service includes virtual central processing units and virtual main memory for the VMs that are provisioned to the end users.
• Storage: The IaaS provider provides back-end storage for storing files.
• Network: Network as a Service (NaaS) provides networking components such as routers, switches, and bridges for the VMs.
• Load balancers: It provides load balancing capability at the infrastructure layer.
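A hedged sketch of renting a virtual machine through an IaaS API (uses boto3 and assumes configured AWS credentials; the AMI ID is a placeholder, not a real image):

    # Sketch of the IaaS idea: the user rents a virtual machine with a chosen compute
    # size, while the provider owns and operates the physical hardware.
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")
    response = ec2.run_instances(
        ImageId="ami-xxxxxxxxxxxxxxxxx",   # placeholder image (operating system of choice)
        InstanceType="t3.micro",           # compute: vCPUs + virtual main memory
        MinCount=1,
        MaxCount=1,
    )
    print(response["Instances"][0]["InstanceId"])   # the provisioned virtual server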
3) PaaS
The following features will increase developer productivity if they are effectively implemented on
a PaaS site.
4) SaaS
• Business Services – The SaaS Provider provides various business services to start up the
business. The SaaS business services include ERP (Enterprise Resource
Planning), CRM (Customer Relationship Management), billing, and sales.
• Document Management - SaaS document management is a software application offered by
a third party (SaaS provider) to create, manage, and track electronic documents.
• Social Networks - As we all know, social networking sites are used by the general public, so
social networking service providers use SaaS for their convenience and handle the general
public's information.
• Mail Services - To handle the unpredictable number of users and load on e-mail services,
many e-mail providers offer their services using SaaS.
Advantages:
i. One to Many: SaaS services are offered on a one-to-many model, meaning a single instance of the application is shared by multiple users.
ii. Less hardware required for SaaS: The software is hosted remotely, so organizations do
not need to invest in additional hardware.
iii. Low maintenance required for SaaS: Software as a service removes the need for installation, set-up, and daily maintenance for organizations. The initial set-up cost for SaaS is typically lower than for enterprise software. SaaS vendors price their applications based on usage parameters, such as the number of users using the application, which makes SaaS easy to monitor and keeps updates automatic.
iv. No special software or hardware versions required: All users will have the same version
of the software and typically access it through the web browser. SaaS reduces IT support
costs by outsourcing hardware and software maintenance and support to the IaaS
provider.
v. Multidevice support: SaaS services can be accessed from any device such as desktops,
laptops, tablets, phones, and thin clients.
vi. API Integration: SaaS services easily integrate with other software or services through
standard APIs.
vii. No client-side installation: SaaS services are accessed directly from the service provider over an internet connection, so no software installation is required.
Disadvantages:
i. Security: Actually, data is stored in the cloud, so security may be an issue for some users.
However, cloud computing is not more secure than in-house deployment.
ii. Latency issue: Since data and applications are stored in the cloud at a variable distance
from the end-user, there is a possibility that there may be greater latency when
interacting with the application compared to local deployment. Therefore, the SaaS model
is not suitable for applications whose demand response time is in milliseconds.
iii. Total Dependency on the Internet: Without an internet connection, most SaaS
applications are not usable.
iv. Switching between SaaS vendors is difficult: Switching SaaS vendors involves the difficult and slow task of transferring very large data files over the internet and then converting and importing them into the new SaaS application.
➢ DBaaS
• Database as a service (DBaaS) is a cloud computing managed service offering that provides
access to a database without requiring the setup of physical hardware, the installation of
software or the need to configure the database.
• Most database administration and maintenance tasks are handled by the service provider,
enabling users to quickly benefit from the database service.
• The use of DBaaS is growing as more organizations shift from on-premises systems to cloud
databases.
• DBaaS vendors include cloud platform providers that sell database software and other
database makers that host their software on one or more of the cloud platforms. Most DBaaS
environments run on public cloud infrastructure, but some cloud providers will also install
their DBaaS technologies in on-premises data centers and manage them remotely for
customers in private clouds or hybrid cloud infrastructures.
DBaaS and on-premises database variations
i. On-Premises Management: In an on-premises setup, the organization's IT staff manages
the database server, including installation and maintenance.
ii. Role of DBA: A database administrator (DBA) is responsible for configuring and managing
the databases on the on-premises server.
iii. DBaaS Management: In a DBaaS model, the cloud provider handles all system
infrastructure and database management as a fully managed service.
iv. Administrative Functions: The provider manages high-level tasks like installation,
configuration, maintenance, backups, patching, and performance management in the
DBaaS model.
v. DBA Responsibilities: Under DBaaS, the DBA’s role shifts to monitoring database usage,
managing access, and coordinating with the provider for provisioning and maintenance.
vi. Cost Structure: DBaaS operates on a subscription basis, where customers pay for system
resources instead of buying software licenses like in on-premises setups.
vii. Flexibility in Resource Use: Customers can pay as they go for resources in DBaaS or
reserve instances for regular workloads at discounted prices.
viii. Survey Insights: A 2021 survey found that 49% of organizations used cloud-based relational databases, and 38% used NoSQL database services, indicating growing DBaaS adoption.
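From the application's point of view, a DBaaS database is just a network endpoint, as in this hedged sketch (psycopg2 is assumed to be installed; the hostname and credentials are placeholders handed out by a hypothetical provider):

    # Sketch: connecting to a managed cloud database; there is no server to install,
    # patch, or back up on the customer side.
    import psycopg2

    conn = psycopg2.connect(
        host="mydb.example-provider.com",   # placeholder endpoint from the DBaaS provider
        port=5432,
        dbname="appdb",
        user="app_user",
        password="app_password",
    )
    with conn, conn.cursor() as cur:
        cur.execute("SELECT version();")    # the provider runs and maintains the engine
        print(cur.fetchone())
    conn.close()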
Advantages:
• The DBaaS model offers some specific advantages over traditional on-premises database
systems, including the following:
• Reduced management requirements. The DBaaS provider takes on many of the routine
database management and administration burdens.
• Elimination of physical infrastructure. The underlying IT infrastructure required to run the
database is provided by the DBaaS vendor or the provider of the cloud platform that's hosting
the DBaaS environment, if they're different companies.
• Reduced IT equipment costs. Because the system infrastructure is no longer on premises,
users don't need to invest in database servers or plan for hardware upgrades on an ongoing
basis.
• Additional savings. In addition to lower capital expenditures, savings can come from
decreased electrical and HVAC operating costs and smaller space needs in data centers, as
well as possible IT staff reductions.
• More flexibility and easier scalability. The infrastructure supporting the database can be
elastically scaled up or down as database usage changes, as opposed to the more complex
and rigorous process required to scale on-premises systems.
Disadvantages:
• Lack of Control: Organizations have limited control over the IT infrastructure since they don't
have direct access to the servers and storage used by the DBaaS provider.
• Dependency on Provider: The organization must rely on the cloud provider to manage the
infrastructure effectively, which can be a risk if the provider does not meet expectations.
• Internet Dependency: If the organization’s internet connection fails or the DBaaS provider
experiences an outage, access to the database will be lost until the issue is resolved.
• Security Concerns: Security is primarily managed by the DBaaS provider, which can raise
concerns for organizations about the safety of their data and databases.
• Shared Responsibility Model: Organizations are responsible for certain aspects of data
security, while the vendor secures the database platform and infrastructure, which can
complicate accountability.
• Latency Issues: Accessing data over the internet can introduce latency, leading to slower
performance, especially when handling large datasets.
➢ CLOUD DEPLOYMENT
• Cloud deployment is the process of deploying an application through one or more hosting
models—software as a service (SaaS), platform as a service (PaaS) and/or infrastructure as a
service (IaaS)—that leverage the cloud.
• This includes architecting, planning, implementing and operating workloads on cloud.
1) Security
2) Performance
• An IT department should also consider what kind of applications the organization may put
into the cloud.
• Is it a database-heavy custom web application or a standard office productivity suite?
• There’s a huge difference in how IT staff would architect these solutions in terms of
ensuring their performance.
• For example, IT administrators might find that it is more cost-effective to run certain
applications in-house, potentially from a private cloud located onsite, instead of migrating
them to a remote cloud location.
• One way to find out how solutions should be architected for performance is to launch a
pilot and measure the application’s behavior under real-world conditions.
• However, this is not always feasible, such as when the pilot itself would take a massive
effort — perhaps 90 percent of the work for a full deployment.
• Fortunately, experts on cloud-based application performance can conduct an assessment of
planned cloud deployment and point out the likely shortcomings in its expected
performance under typical and peak conditions.
3) Integration
• If an application cannot be fully virtualized, it’s not a good candidate for cloud
deployment, although this happens rarely.
• Much more common is that organizations will deploy different apps in different clouds,
such as software-as-a-service applications that are provided by different third parties.
• Planning for integration is critically important for getting these disparate applications to
work together seamlessly.
• Enterprises seeking to integrate cloud applications with each other or with non-cloud
applications should pay particular attention to the application programming interfaces
(APIs) that the applications make available.
• These APIs can provide a simple and inexpensive way to connect application services and
data to each other. Third-party integration experts can also help IT staff understand how
dissimilar applications can be integrated.
4) Legal Requirements
• When sensitive information is stored in the cloud, questions arise about who is
responsible for any damages caused by a security breach.
• In some cases, organizations may require cloud providers to sign agreements, such as a
HIPAA business associate agreement (BAA), to make the provider liable for any non-
compliance or data loss.
• Different types of data are subject to different compliance regulations, such as HIPAA for
health data, PCI DSS for payment data, and SOX for financial reporting. Migrating to the
cloud, especially the public cloud, can create challenges in meeting these compliance
requirements.
• Both the organization and the cloud provider share responsibility for maintaining security
and compliance, but exact roles must be clearly defined and agreed upon.
➢ NETWORKING ISSUES IN CLOUD COMPUTING
1. Network Latency: Latency refers to the delay in data transmission from one point to another
in the network. In cloud services, latency can cause delays in accessing cloud resources,
leading to poor user experience, especially in real-time applications.
Resolution: Cloud providers must implement optimized routing algorithms, use content
delivery networks (CDNs), and strategically place data centers close to users to minimize
latency.
2. Network Congestion: Congestion occurs when the network is overwhelmed by too much
traffic, causing packet loss, delays, or slow service. This is particularly problematic in cloud
environments where multiple users share network resources.
Resolution: Providers can mitigate congestion by upgrading network infrastructure, using
load balancers, and implementing traffic prioritization and quality of service (QoS) rules.
3. Bandwidth Constraints: Insufficient bandwidth can limit data transfer rates, making it difficult
for users to access cloud services efficiently. This is critical for applications requiring high
bandwidth, such as video streaming or large data transfers.
Resolution: Cloud providers need to offer scalable bandwidth solutions, ensuring that
customers can adjust bandwidth according to their workload needs.
4. Packet Loss: Packet loss happens when data packets traveling across the network fail to reach
their destination. This can severely impact the performance of applications, especially those
dependent on continuous data streams (e.g., voice, video, or gaming).
Resolution: Implementing error-checking mechanisms, redundant paths for data
transmission, and enhanced network infrastructure can help minimize packet loss.
5. Network Security Breaches: Attacks such as DDoS attempts, data breaches, and unauthorized access can disrupt cloud networks and expose sensitive information.
Resolution: Cloud providers should implement robust security measures, such as firewalls,
DDoS protection, encryption, and intrusion detection systems (IDS) to safeguard their
networks.
6. Interoperability and Compatibility Issues: Interoperability issues arise when users attempt to
integrate or communicate between different cloud platforms or on-premises systems and the
network protocols, or APIs do not align.
Resolution: Cloud providers should ensure that their services are compatible with various
protocols and offer flexible APIs to enable seamless integration with other systems.
8. Data Transfer Costs and Efficiency: Transferring data between on-premises systems and the
cloud, or between different cloud environments, can be slow and costly, especially for large
datasets.
Resolution: Cloud providers should offer efficient data transfer services such as high-speed
direct connections (e.g., AWS Direct Connect), data compression, and data migration services
to reduce costs and improve transfer speeds.
➢ CLOUD NETWORK TOPOLOGIES
• Front-End or User Access Layer: This layer allows users to initiate connections to cloud
services, serving as the entry point for accessing cloud resources.
• Compute Layer: This layer includes the core infrastructure, such as cloud servers, storage,
load balancers, and security devices, responsible for managing and delivering cloud services.
• Network Layer: This can be based on either Layer 2 or Layer 3 network topology:
- Layer 2 Topology: Easier to implement and manage, it facilitates communication within a
local network.
- Layer 3 Topology: Used for transferring data packets from a source in one cloud to an
application layer in another cloud, allowing for wider connectivity across different
networks.
➢ AUTOMATION AND SELF-SERVICE FEATURES IN THE CLOUD
• Cloud automation enables IT admins and Cloud admins to automate manual processes and
speed up the delivery of infrastructure resources on a self-service basis, according to user or
business demand.
• Cloud automation can also be used in the software development lifecycle for code testing,
network diagnostics, data security, software-defined networking (SDN), or version control in
DevOps teams.
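A hedged sketch of demand-driven automation (assumes boto3 with configured AWS credentials; the Auto Scaling group name and the 80% CPU trigger are hypothetical):

    # Sketch: infrastructure is adjusted by an API call driven by demand, with no
    # manual ticket or human administrator in the loop.
    import boto3

    def scale_web_tier(desired_instances, group_name="web-asg"):
        autoscaling = boto3.client("autoscaling")
        autoscaling.set_desired_capacity(
            AutoScalingGroupName=group_name,    # placeholder group name
            DesiredCapacity=desired_instances,
            HonorCooldown=True,
        )

    # e.g., a monitoring script could call this when average CPU stays above 80%:
    # scale_web_tier(desired_instances=6)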
➢ CLOUD PERFORMANCE
• Cloud performance refers to how well your applications, workloads, and databases operate
on the cloud.
• It also measures the speed of the network and storage I/O.
• It is primarily measured by round-trip response time, i.e., the time interval between the user issuing a command and receiving the results from the cloud.
• Another performance impact is from the number of hops.
• Within a data center, resources need to communicate with one another; the number of network hops between the resources and applications adds significantly to response delays.
• The performance of your underlying cloud platform impacts not only your customers but your internal processes as well. Mission-critical workloads such as Microsoft SQL Server and Oracle keep your business running, so cloud performance needs to match the processing demand from internal users with fast response times.
Cloud performance metrics
i. IOPS (I/O Operations per Second): IOPS measures how many read and write operations your
cloud platform can handle each second. The performance of IOPS depends on factors like the
size of the data being processed and the number of pending operations. Although cloud
providers may advertise a fixed IOPS rate, actual performance will vary depending on your
workload.
ii. Latency: Latency refers to the time it takes for an operation to be completed on your cloud
platform. While cloud platforms aim for performance similar to on-premises servers,
throttling by providers in shared environments can slow things down. This delay is caused by
cloud providers managing resource usage to maintain balance across users.
iii. Resource Availability: Resource availability indicates whether your cloud instances are
running smoothly or if any requests are delayed. High availability is crucial for ensuring that
your applications are always accessible when needed, and it prevents downtime or
disruptions for users.
iv. Capacity: Capacity is the cloud platform's ability to provide sufficient storage and processing
power to meet your business needs. Adequate capacity ensures your cloud platform can
handle incoming requests and process data efficiently, without performance slowdowns.
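As a simple illustration of the round-trip response time metric described above, the following Python sketch (standard library only) times several requests to a service endpoint and averages them; the URL is a placeholder you would replace with your own cloud service.

    import time
    import urllib.request

    URL = "https://example.com/"   # placeholder; use your own cloud service endpoint

    def average_round_trip(url, samples=5):
        """Average round-trip response time: issue a request, wait for the full reply."""
        timings = []
        for _ in range(samples):
            start = time.perf_counter()
            with urllib.request.urlopen(url) as response:
                response.read()
            timings.append(time.perf_counter() - start)
        return sum(timings) / len(timings)

    if __name__ == "__main__":
        print(f"Average round-trip time: {average_round_trip(URL) * 1000:.1f} ms")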
• Cloud platforms offer high ease of accessibility, better replication to remote data centers, automation, and better elasticity.
• Problems that arise include security, data privacy, multi-tenancy, a low barrier to entry for malicious users, and reliance on a third-party provider for business-critical services.
➢ CLOUD ARCHITECTURE
The architecture of cloud computing is the combination of both SOA (Service Oriented
Architecture) and EDA (Event Driven Architecture). Client infrastructure, application, service,
runtime cloud, storage, infrastructure, management, and security are the components of cloud
computing architecture.
1. Frontend: Frontend of the cloud architecture refers to the client side of a cloud computing
system. This means it contains all the user interfaces and applications that are used by the
client to access the cloud computing services/resources. For example, use of a web browser
to access the cloud platform.
2. Backend: Backend refers to the cloud itself which is used by the service provider. It contains the
resources as well as manages the resources and provides security mechanisms. Along with this, it
includes huge storage, virtual applications, virtual machines, traffic control mechanisms,
deployment models, etc.
Benefits of Cloud Computing Architecture
• Cloud Service Provider (CSP): A Cloud Service Provider is a company that offers cloud-based
services such as infrastructure, platforms, and software over the Internet. Major CSPs include
AWS, Microsoft Azure, and Google Cloud. CSPs provide services like IaaS, PaaS, and SaaS,
enabling businesses to use resources such as storage, servers, and databases on a pay-as-you-
go basis. The CSP is responsible for maintaining the underlying infrastructure, ensuring high
availability, and managing services such as data storage, processing, and networking.
Businesses rely on the CSP for scalability, flexibility, and cost-efficiency when moving their
operations to the cloud. However, customization is limited to the services and features
provided by the CSP.
• Cloud Service Broker (CSB): A Cloud Service Broker acts as an intermediary between
businesses and multiple cloud service providers. CSBs focus on aggregating, managing, and
optimizing services from different CSPs to provide a solution that best fits a company’s specific
needs. Their role involves negotiating contracts with CSPs, integrating services, managing
workloads, and ensuring cost efficiency. CSBs also handle complex tasks like ensuring seamless
integration between services from different providers, managing security, and offering
centralized control for cloud management. Unlike CSPs, CSBs do not own or operate cloud
infrastructure, but instead, help organizations choose the right services, customize solutions,
and manage ongoing cloud operations efficiently.
➢ IDENTITY AND ACCESS MANAGEMENT (IAM)
• Regardless of where employees are working, they need to access their organization’s resources
like apps, files, and data. The traditional way of doing things was to have the vast majority of
workers work on-site, where company resources were kept behind a firewall. Once on-site and
logged in, employees could access the things they needed.
• Now, however, hybrid work is more common than ever and employees need secure access to
company resources whether they’re working on-site or remotely. This is where identity and access
management (IAM) comes in. The organization’s IT department needs a way to control what users
can and can’t access so that sensitive data and functions are restricted to only the people and
things that need to work with them.
• IAM gives secure access to company resources—like emails, databases, data, and applications—
to verified entities, ideally with a bare minimum of interference. The goal is to manage access so
that the right people can do their jobs and the wrong people, like hackers, are denied entry.
Working: There are two parts to granting secure access to an organization’s resources: Identity
management and access management.
Identity management checks a login attempt against an identity management database, which is
an ongoing record of everyone who should have access. This information must be constantly
updated as people join or leave the organization, their roles and projects change, and the
organization’s scope evolves.
Examples of the kind of information that’s stored in an identity management database include
employee names, job titles, managers, direct reports, mobile phone numbers, and personal email
addresses. Matching someone’s login information like their username and password with their
identity in the database is called authentication.
Access management is the second half of IAM. After the IAM system has verified that the person
or thing that’s attempting to access a resource matches their identity, access management keeps
track of which resources the person or thing has permission to access. Most organizations grant
varying levels of access to resources and data and these levels are determined by factors like job
title, tenure, security clearance, and project.
Granting the correct level of access after a user’s identity is authenticated is called authorization.
The goal of IAM systems is to make sure that authentication and authorization happen correctly
and securely at every access attempt.
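The authentication/authorization split described above can be illustrated with a toy sketch. The Python below is not a real IAM product or API; the identity store, roles, and permission names are all invented for the example.

    # Toy illustration only: not a real IAM system. All users, roles, and
    # permission names below are invented for the example.
    IDENTITY_DB = {
        "asha": {"password": "s3cret", "role": "engineer"},
        "ravi": {"password": "hunter2", "role": "intern"},
    }

    PERMISSIONS = {
        "engineer": {"read:db", "write:db", "read:reports"},
        "intern": {"read:reports"},
    }

    def authenticate(username, password):
        """Identity management: match the login attempt against the identity store."""
        user = IDENTITY_DB.get(username)
        return user is not None and user["password"] == password

    def authorize(username, action):
        """Access management: check what the verified identity is allowed to do."""
        role = IDENTITY_DB[username]["role"]
        return action in PERMISSIONS.get(role, set())

    if __name__ == "__main__":
        if authenticate("asha", "s3cret") and authorize("asha", "write:db"):
            print("Access granted")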
➢ GOOGLE FILE SYSTEM (GFS)
• Google Inc. developed the Google File System (GFS), a scalable distributed file system (DFS),
to meet the company’s growing data processing needs.
• GFS offers fault tolerance, dependability, scalability, availability, and performance to big
networks and connected nodes. GFS is made up of a number of storage systems constructed
from inexpensive commodity hardware parts.
• The search engine, which creates enormous volumes of data that must be kept, is only one
example of how it is customized to meet Google’s various data use and storage requirements.
• Without hindering applications, GFS is made to meet Google’s huge cluster requirements.
Hierarchical directories with path names are used to store files. The master is in charge of
managing metadata, including namespace, access control, and mapping data. The master
communicates with each chunk server by timed heartbeat messages and keeps track of its
status updates.
• The largest GFS clusters consist of more than 1,000 nodes with 300 TB of disk storage capacity, available for continuous access by hundreds of clients.
Components of GFS
A group of computers makes up GFS. A cluster is just a group of connected computers. There could
be hundreds or even thousands of computers in each cluster. There are three basic entities
included in any GFS cluster as follows:
• GFS Clients: They can be computer programs or applications which may be used to request
files. Requests may be made to access and modify already-existing files or add new files to the
system.
• GFS Master Server: It serves as the cluster’s coordinator. It preserves a record of the cluster’s
actions in an operation log. Additionally, it keeps track of the data that describes chunks, or
metadata. The chunks’ place in the overall file and which files they belong to are indicated by
the metadata to the master server.
• GFS Chunk Servers: They are the GFS’s workhorses. They store files as 64 MB-sized chunks. The chunk servers do not send chunks to the master server; instead, they deliver the requested chunks directly to the client. To ensure reliability, GFS makes multiple copies of each chunk and stores them on different chunk servers; the default is three copies. Each copy is referred to as a replica.
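To make the chunk model concrete, the following small Python sketch shows how a GFS-style client could map a byte offset in a large file to the 64 MB chunk that holds it before asking the master for that chunk's location; the helper functions and example numbers are illustrative, not part of GFS itself.

    CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB chunks, as described in the notes above

    def chunk_index(byte_offset):
        """Index of the chunk that contains the given byte offset."""
        return byte_offset // CHUNK_SIZE

    def chunk_byte_range(index):
        """(start, end) byte range covered by a chunk."""
        start = index * CHUNK_SIZE
        return start, start + CHUNK_SIZE - 1

    if __name__ == "__main__":
        offset = 200 * 1024 * 1024       # a read 200 MB into a large file
        idx = chunk_index(offset)        # -> 3, since chunks 0-2 cover the first 192 MB
        print(idx, chunk_byte_range(idx))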
Advantages of GFS:
• High Availability: Data remains accessible even if some nodes fail, thanks to replication. GFS
is designed to handle frequent component failures without data loss.
• High Throughput: GFS ensures high throughput by allowing many nodes to operate
concurrently, making it ideal for large-scale data processing tasks.
• Reliable Storage: GFS can detect and recover corrupted data by duplicating the data from
healthy copies, ensuring the reliability of the stored information.
Disadvantages of GFS:
• Not Ideal for Small Files: GFS is not optimized for handling small files, which can lead to
inefficiencies in the storage and retrieval of such data.
• Master Node Bottleneck: The master node can become a bottleneck as it manages metadata
and coordinates file access, limiting scalability in certain scenarios.
• Limited Random Writes: GFS does not support random writes well. It is more suited for
operations where data is written once and then read (or appended) later, such as logs or large
datasets.
➢ CLOUD ADOPTION GUIDELINES FOR SMBs
1) Assess Business Needs: SMBs should evaluate their specific business needs and goals to
determine what cloud services and models (IaaS, PaaS, SaaS) are most appropriate for their
operations.
2) Start Small and Scale Gradually: Begin with a small-scale cloud deployment to test its benefits,
and then gradually scale services as the business grows. Pay attention to the flexibility and
scalability offered by cloud providers.
3) Cost Management: Keep track of cloud usage and optimize costs by taking advantage of
pricing models, such as pay-as-you-go, reserved instances, and auto-scaling to avoid
overprovisioning.
4) Choose the Right Cloud Provider: Select a cloud provider that offers reliable service, strong
security, and comprehensive support. Consider factors like pricing, service-level agreements
(SLAs), compliance with regulations, and data security.
5) Ensure Data Security: Implement strong security practices, such as encryption, firewalls,
identity management, and regular backups. Understand the shared responsibility model,
ensuring SMBs secure their own data and applications while relying on the provider to secure
infrastructure.
6) Leverage Automation: Use automation tools offered by the cloud provider for tasks like
scaling, monitoring, and deploying updates. Automation reduces manual intervention and
improves efficiency.
7) Optimize for Performance: Monitor performance metrics such as uptime, latency, and data
access speeds to ensure smooth operations. Regularly review and adjust cloud configurations
to optimize performance.
8) Develop a Cloud Migration Strategy: When transitioning to the cloud, SMBs should create a
step-by-step plan, assessing which applications can be moved first and which ones require
reengineering.
9) Train Employees: Provide adequate training for staff on how to manage and use cloud services
effectively. A well-trained workforce can maximize the benefits of cloud adoption.
➢ AMAZON EC2
• Amazon Elastic Compute Cloud (Amazon EC2) provides on-demand, scalable computing
capacity in the Amazon Web Services (AWS) Cloud.
• Using Amazon EC2 reduces hardware costs so you can develop and deploy applications faster.
We can use Amazon EC2 to launch as many or as few virtual servers as we need, configure
security and networking, and manage storage. We can add capacity (scale up) to handle
compute-heavy tasks, such as monthly or yearly processes, or spikes in website traffic. When
usage decreases, we can reduce capacity (scale down) again.
• An EC2 instance is a virtual server in the AWS Cloud. When you launch an EC2 instance, the
instance type that you specify determines the hardware available to your instance. Each
instance type offers a different balance of compute, memory, network, and storage resources.
Amazon EC2 offers several high-level features that allow users to run scalable and customizable
virtual servers in the cloud:
1. Instances: These are virtual servers that run applications on the Amazon EC2 platform. You
can configure and manage these instances based on your computing needs.
2. Amazon Machine Images (AMIs): AMIs are preconfigured templates for launching EC2
instances. They include essential components such as the operating system, software
packages, and configuration settings.
3. Instance Types: Amazon EC2 provides different instance types, offering varying combinations
of CPU, memory, storage, networking capacity, and specialized options like GPU instances for
high-performance workloads.
4. Amazon EBS Volumes: Elastic Block Store (EBS) provides persistent storage volumes for
instances. Data stored in EBS persists independently of the lifecycle of the instance.
5. Instance Store Volumes: These are temporary storage volumes directly attached to an
instance. The data is lost when the instance is stopped, hibernated, or terminated.
6. Key Pairs: Key pairs provide secure login to EC2 instances. AWS retains the public key, while
the private key is stored by the user for secure access.
7. Security Groups: Security groups act as virtual firewalls. They control the inbound and
outbound traffic to instances by defining allowed protocols, ports, and IP address ranges.
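As a hedged illustration of the key pair and security group features just listed, the following Python sketch uses boto3 to create a key pair and a security group that allows only inbound HTTPS; it assumes configured AWS credentials, a default region and VPC, and the resource names are placeholders.

    import boto3  # assumes AWS credentials, a default region, and a default VPC

    ec2 = boto3.client("ec2")

    # Key pair: AWS keeps the public key; we save the private key for SSH access.
    key = ec2.create_key_pair(KeyName="demo-key")            # name is illustrative
    with open("demo-key.pem", "w") as f:
        f.write(key["KeyMaterial"])

    # Security group acting as a virtual firewall: allow inbound HTTPS only.
    sg = ec2.create_security_group(
        GroupName="demo-web-sg",                             # name is illustrative
        Description="Allow inbound HTTPS only",
    )
    ec2.authorize_security_group_ingress(
        GroupId=sg["GroupId"],
        IpPermissions=[{
            "IpProtocol": "tcp",
            "FromPort": 443,
            "ToPort": 443,
            "IpRanges": [{"CidrIp": "0.0.0.0/0"}],
        }],
    )
    print("Created security group:", sg["GroupId"])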
Parallelization
1. Multiple Cloud Instances: Tasks are distributed across multiple virtual machines (VMs) or
containers, allowing them to be processed concurrently. This reduces the overall execution
time, making applications more efficient, particularly for big data analysis, scientific
simulations, and machine learning workloads.
2. MapReduce Frameworks: Platforms like Hadoop or Spark enable parallel processing of data-
intensive workloads. The MapReduce model breaks down tasks into "map" functions that
operate in parallel and "reduce" functions that aggregate the results, making it highly efficient
for batch processing and large-scale data analysis.
3. Horizontal Scaling: Cloud platforms support horizontal scaling, where more compute nodes
can be added to a system, distributing the workload across these nodes to optimize resource
utilization.
4. Parallel Programming Models: Cloud applications leverage parallel programming models like
MPI (Message Passing Interface) and OpenMP, enabling them to run tasks simultaneously on
multiple nodes or cores within a cluster. This is essential in high-performance computing (HPC)
scenarios.
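The map-and-reduce pattern behind these approaches can be sketched on a single machine with Python's standard library; the chunk-scoring function below is just a stand-in for a real CPU-heavy subtask.

    from concurrent.futures import ProcessPoolExecutor

    def analyze_chunk(chunk):
        """Stand-in for a CPU-heavy subtask (e.g., scoring one slice of a dataset)."""
        return sum(x * x for x in chunk)

    def parallel_analysis(data, workers=4, chunk_size=1000):
        """Split work into independent chunks and process them concurrently."""
        chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
        with ProcessPoolExecutor(max_workers=workers) as pool:
            partial_results = pool.map(analyze_chunk, chunks)   # "map" step
        return sum(partial_results)                             # "reduce" step

    if __name__ == "__main__":
        print(parallel_analysis(list(range(10_000))))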
In-Memory Operations
In-memory computing refers to storing and processing data directly in the server's memory (RAM)
instead of reading it from slower disk-based storage. In-memory operations provide a significant
speed advantage, as memory access is much faster than disk access. Cloud platforms offer
specialized services that enable in-memory data processing:
1. In-Memory Data Stores: Solutions like Amazon ElastiCache, Redis, and Memcached allow data
to be stored in RAM. These data stores cache frequently accessed data, minimizing the need
to repeatedly access slower storage tiers. This is crucial for applications requiring low-latency
data retrieval, such as real-time analytics, gaming, and high-frequency trading.
2. In-Memory Databases: Cloud platforms provide databases specifically designed for in-
memory operations, such as Amazon Aurora, Google Cloud Spanner, and SAP HANA. These
databases store entire datasets in memory to support real-time data processing and analytics.
This allows applications to handle high transaction volumes and large datasets with minimal
delays.
3. Stream Processing: Frameworks like Apache Spark and Apache Flink process data streams in
memory, enabling real-time processing of continuous data flows. This is particularly important
for applications that rely on real-time decision-making, such as fraud detection, IoT analytics,
and real-time monitoring systems.
4. Distributed In-Memory Systems: Systems like Apache Ignite and Hazelcast distribute memory
across a cluster of cloud nodes, ensuring high availability and fault tolerance. By distributing
the memory workload across multiple instances, applications can achieve near-instantaneous
data access, even with large datasets.
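A common way to use an in-memory data store is the cache-aside pattern: check RAM first, fall back to the slower database, then populate the cache. The following Python sketch uses the redis-py client (which also works against Amazon ElastiCache for Redis); the host, key names, TTL, and the placeholder database function are assumptions for illustration.

    import redis  # redis-py client; assumes a reachable Redis/ElastiCache endpoint

    cache = redis.Redis(host="localhost", port=6379, decode_responses=True)

    def load_profile_from_database(user_id):
        """Placeholder for a slow call to the system of record."""
        return f"profile-for-{user_id}"

    def get_user_profile(user_id):
        """Cache-aside: try RAM first, fall back to the slower store, then cache it."""
        key = f"user:{user_id}"                  # key naming is illustrative
        cached = cache.get(key)
        if cached is not None:
            return cached                        # served from memory, low latency
        profile = load_profile_from_database(user_id)
        cache.setex(key, 300, profile)           # keep in memory for 5 minutes
        return profile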
➢ AMAZON SIMPLE DB
Amazon SimpleDB is a highly available and flexible NoSQL database service that allows users to
store and query structured data. It is designed to provide a simple interface for developers looking
to leverage cloud storage without the complexities of traditional database systems. Here are the
key characteristics of Amazon SimpleDB:
1. Schema-less Design: SimpleDB allows users to store data without defining a strict schema
beforehand. This flexibility makes it easy to accommodate changing data structures, as different
items can have different attributes.
2. Dynamic Data Model: Users can add or remove attributes from items dynamically, enabling the
database to adapt to evolving application requirements without downtime or complex migrations.
3. High Availability and Reliability: Amazon SimpleDB is designed to offer high availability and
durability, ensuring that data is accessible even in the event of hardware failures. Data is
automatically replicated across multiple servers.
4. Scalable Storage: SimpleDB can automatically scale to accommodate growing datasets without
requiring users to manage storage concerns. It can handle large volumes of data and support high
read and write throughput.
5. Easy Integration with Other AWS Services: Amazon SimpleDB integrates seamlessly with other
AWS services, such as Amazon EC2 and Amazon S3, allowing users to build applications that
leverage the broader AWS ecosystem.
6. Simple Query Language: SimpleDB provides a query language similar to SQL, allowing users to
perform queries on their data. Users can execute structured queries, filtering results based on
specific criteria.
7. Eventual Consistency: SimpleDB offers eventual consistency for read operations, meaning that
updates to data may not be immediately visible to all queries. This characteristic is common in
distributed systems and allows for higher availability.
8. Security and Access Control: SimpleDB supports AWS Identity and Access Management (IAM),
allowing users to define permissions and control access to their data at a granular level.
9. Cost-Effectiveness: SimpleDB follows a pay-as-you-go pricing model, where users only pay for
the resources they consume. This makes it a cost-effective solution for applications with varying
workloads.
10. Built-in Indexing: SimpleDB automatically indexes all item attributes, enabling fast and
efficient querying. This reduces the need for manual indexing and enhances query performance.
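A minimal sketch of working with SimpleDB through boto3 is shown below; note that SimpleDB is an older service and may not be available in all regions or accounts, and the domain, item, and attribute names are illustrative.

    import boto3  # assumes AWS credentials; SimpleDB availability varies by region/account

    sdb = boto3.client("sdb", region_name="us-east-1")

    sdb.create_domain(DomainName="products")      # domain name is illustrative

    # Schema-less write: each item can carry its own set of attributes.
    sdb.put_attributes(
        DomainName="products",
        ItemName="item-001",
        Attributes=[
            {"Name": "category", "Value": "book", "Replace": True},
            {"Name": "price", "Value": "19", "Replace": True},
        ],
    )

    # SQL-like select expression, as described above.
    result = sdb.select(
        SelectExpression="select * from `products` where category = 'book'"
    )
    for item in result.get("Items", []):
        print(item["Name"], item["Attributes"])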
➢ GOOGLE APP ENGINE (GAE)
Google App Engine (GAE) is a platform-as-a-service (PaaS) offering from Google Cloud that enables
developers to build and host web applications in Google-managed data centers. Here are the key
tasks performed by Google App Engine:
1. Application Hosting: GAE provides a fully managed environment for deploying applications.
Developers can host their applications without worrying about the underlying infrastructure.
2. Automatic Scaling: App Engine automatically adjusts the number of instances running based on
incoming traffic. This ensures that applications can handle fluctuations in user demand without
manual intervention.
3. Load Balancing: GAE includes built-in load balancing to distribute incoming traffic across
multiple instances, optimizing resource utilization and ensuring high availability.
5. Data Storage and Management: GAE integrates with various Google Cloud services for data
storage, including Google Cloud Datastore for NoSQL databases, Google Cloud SQL for relational
databases, and Google Cloud Storage for unstructured data.
6. Built-in Security Features: App Engine provides security features such as automatic updates,
SSL certificates, and identity and access management (IAM) to help secure applications and data.
7. Versioning and Traffic Splitting: Developers can deploy multiple versions of an application and
split traffic between them, facilitating A/B testing and gradual rollouts of new features.
8. Monitoring and Logging: GAE includes monitoring and logging capabilities through Google
Cloud Operations (formerly Stackdriver), allowing developers to track application performance,
diagnose issues, and analyze usage patterns.
9. Service Management: GAE allows developers to create and manage microservices, enabling the
development of modular applications that can be scaled independently.
10. Cron Jobs and Task Queues: App Engine supports background processing through task queues
and scheduled jobs (cron jobs), allowing developers to offload long-running tasks and execute
them asynchronously.
11. APIs and Services Integration: Google App Engine provides easy integration with other Google
Cloud services, such as Google Cloud Pub/Sub, Google Cloud Functions, and Google Maps APIs,
enhancing the functionality of applications.
12. Flexible Environment Options: GAE offers two environments: Standard and Flexible. The
Standard Environment provides a lightweight, sandboxed runtime, while the Flexible Environment
allows for more customization and supports additional programming languages and frameworks.
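A typical App Engine standard environment application is just a small web app plus configuration. The following minimal Python/Flask sketch assumes Flask is listed in requirements.txt and that an app.yaml declaring a Python runtime accompanies it; deployment would normally be done with gcloud app deploy.

    # main.py — minimal App Engine standard environment app (sketch)
    from flask import Flask

    app = Flask(__name__)

    @app.route("/")
    def index():
        return "Hello from App Engine"

    if __name__ == "__main__":
        # Local testing only; on App Engine the platform runs and scales the app.
        app.run(host="127.0.0.1", port=8080, debug=True)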
➢ CHALLENGES FACED BY SMBs
Small and Medium-sized Businesses (SMBs) often encounter various challenges that can hinder
their growth and development. Here are some key difficulties they may face:
1. Limited Financial Resources: SMBs typically operate with tighter budgets and may struggle to
secure funding for expansion, marketing, and technology investments. This limitation can restrict
their ability to compete with larger enterprises.
2. Access to Technology: Many SMBs find it challenging to adopt new technologies due to costs
or a lack of expertise. This can lead to outdated systems and processes, making it difficult to
operate efficiently or innovate.
3. Talent Acquisition and Retention: Attracting and retaining skilled employees can be particularly
challenging for SMBs, especially in competitive labor markets. They may struggle to offer the same
salaries and benefits as larger organizations.
4. Marketing and Visibility: SMBs often have limited marketing budgets and may lack the
knowledge or resources to effectively promote their products and services. This can result in low
brand awareness and customer engagement.
6. Scaling Operations: As SMBs grow, they may face difficulties in scaling their operations
efficiently. This includes managing increased customer demand, supply chain challenges, and
maintaining quality control.
7. Competition from Larger Companies: SMBs often compete with larger organizations that have
more resources, established brand recognition, and better economies of scale. This competition
can make it hard for SMBs to gain market share.
8. Economic Uncertainty: Fluctuations in the economy can significantly impact SMBs. Economic
downturns can lead to reduced consumer spending, making it challenging for these businesses to
maintain cash flow.
9. Customer Acquisition and Retention: Building a loyal customer base is essential for growth, but
SMBs may struggle to acquire new customers or retain existing ones due to limited marketing
reach and customer service resources.
10. Supply Chain Disruptions: SMBs can be vulnerable to supply chain disruptions, which can
impact their ability to deliver products and services on time. This is particularly critical in times of
global crises or local disasters.
11. Cybersecurity Risks: With increasing reliance on technology, SMBs are also at risk of cyber
threats. Many lack the necessary security measures and protocols, making them targets for
cyberattacks.
12. Changing Consumer Preferences: Rapid changes in consumer behavior and preferences can
catch SMBs off guard. They may struggle to adapt their offerings or marketing strategies in
response to these shifts.
➢ SERVICE-ORIENTED ARCHITECTURE (SOA)
1. Service provider: The service provider is the maintainer of the service and the organization
that makes available one or more services for others to use. To advertise services, the
provider can publish them in a registry, together with a service contract that specifies the
nature of the service, how to use it, the requirements for the service, and the fees charged.
2. Service consumer: The service consumer can locate the service metadata in the registry and
develop the required client components to bind and use the service.
Components of SOA:
Guiding Principles of SOA:
Resource-oriented SOA focuses on the resources that are managed by cloud services. These
resources are treated as entities (like files, objects, or data entries), and the interactions are
typically modeled using CRUD operations (Create, Read, Update, Delete). The primary
protocol used is usually HTTP or RESTful APIs.
Key Characteristics:
• Resources are the core concept and are usually represented as URIs (Uniform Resource
Identifiers).
• REST APIs or similar frameworks are used to manipulate these resources over HTTP.
• Clients interact with resources via standard HTTP methods (GET, POST, PUT, DELETE).
• Stateless interactions: Every request from a client to the server must contain all the
information needed to understand and process the request.
Example: In a cloud storage application, files (resources) are accessed and managed via API
endpoints like /file/123, where different methods are used to upload, download, or delete
files.
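A hedged sketch of such stateless, resource-oriented calls is shown below using the Python requests library; the base URL and resource path are hypothetical, mirroring the /file/123 example above.

    import requests  # assumes the requests package is installed

    BASE = "https://storage.example.com"   # hypothetical cloud storage API endpoint

    # Each call is stateless and addresses the resource by its URI (/file/123).
    requests.put(f"{BASE}/file/123", data=b"file contents")   # create or replace the file
    response = requests.get(f"{BASE}/file/123")               # download the file
    print(response.status_code, len(response.content))
    requests.delete(f"{BASE}/file/123")                       # delete the file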
Parallelization in cloud computing involves splitting computational tasks into smaller subtasks
that are processed simultaneously across multiple machines or cores in a distributed
environment. This helps to reduce processing time and efficiently utilize cloud resources.
• Key Concepts:
o Distributed Computing: Cloud applications can run across multiple servers in parallel,
reducing the total time for computation-heavy tasks like data analysis or machine
learning.
o Parallel Algorithms: Algorithms are designed to run concurrently by splitting tasks into
independent units of work that can be processed in parallel. This is common in
MapReduce frameworks like Hadoop.
o Load Balancing: Cloud platforms automatically distribute tasks across multiple
instances or regions, allowing cloud services to scale and meet the demands of
parallelized operations.
• Example: A data processing cloud application might use parallelization to process large
datasets faster by distributing them across different machines using Amazon EMR (Elastic
MapReduce) or Google Cloud Dataproc.
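As a concrete (illustrative) example of the MapReduce style that EMR or Dataproc run at scale, the following PySpark sketch counts words in a text file; it assumes pyspark is installed and the input path is a placeholder.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("word-count-sketch").getOrCreate()
    lines = spark.sparkContext.textFile("input.txt")   # input path is a placeholder

    counts = (
        lines.flatMap(lambda line: line.split())       # map: one record per word
             .map(lambda word: (word, 1))
             .reduceByKey(lambda a, b: a + b)           # reduce: aggregate per word
    )
    for word, count in counts.take(10):
        print(word, count)

    spark.stop()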
➢ AMAZON S3
• Amazon Simple Storage Service (Amazon S3) is an object storage service offering industry-
leading scalability, data availability, security, and performance.
• Customers of all sizes and industries store, manage, analyze, and protect any amount of data for virtually any use case, such as data lakes, cloud-native applications, and mobile apps.
• With cost-effective storage classes and easy-to-use management features, you can optimize
costs, organize and analyze data, and configure fine-tuned access controls to meet specific
business and compliance requirements.
Benefits:
1. Scalability: We can store virtually any amount of data with S3, scaling to exabytes, with unmatched performance. S3 is fully elastic, automatically growing and shrinking as we add and remove data. There’s no need to provision storage, and we pay only for what we use.
2. Durability and availability: Amazon S3 provides the most durable storage in the cloud and
industry-leading availability. Based on its unique architecture, S3 is designed to provide
99.999999999% (11 nines) data durability and 99.99% availability by default, backed by the
strongest SLAs in the cloud.
3. Security and data protection: Protect your data with unmatched security, data protection,
compliance, and access control capabilities. S3 is secure, private, and encrypted by default,
and also supports numerous auditing capabilities to monitor access requests to your S3
resources.
4. Lowest price and highest performance: S3 delivers multiple storage classes with the best
price-performance for any workload and automated data lifecycle management, so you can
store massive amounts of frequently, infrequently, or rarely accessed data in a cost-efficient
way. S3 delivers the resiliency, flexibility, latency, and throughput to ensure storage never limits performance.
Use cases
1. Build a data lake: A data lake is a centralized repository that allows us to store all our
structured and unstructured data at any scale. We can run data analytics, artificial intelligence
(AI), machine learning (ML), and high-performance computing (HPC) applications to unlock
the value of our data. With an Amazon S3 data lake, users in Salesforce’s organization can
discover, access, and analyze all their data, regardless of where it lives, in a secure and
governed way.
2. Backup and restore critical data: Meet your recovery time objective (RTO), recovery point
objective (RPO), and compliance requirements with S3's robust replication functionality, data
protection with AWS Backup, and various AWS Partner Network solutions. Ancestry uses the
Amazon S3 Glacier storage classes to restore terabytes of images in mere hours instead of
days.
3. Archive data at the lowest cost: Move data archives to the Amazon S3 Glacier storage classes
to lower costs, eliminate operational complexities, and gain new insights. The BBC, a UK public
service broadcaster, safely migrated its 100-year-old flagship archive to Amazon S3 Glacier
Instant Retrieval.
4. Put your data to work: Because Amazon S3 stores more than 350 trillion objects (exabytes of
data) for virtually any use case and averages over 100 million requests per second, it may be
the starting point of your generative AI journey. Grendene is creating a generative AI-based
virtual assistant for their sales team using a data lake built on Amazon S3.
Working:
Amazon S3 stores data as objects within buckets. An object is a file and any metadata that describes
the file. A bucket is a container for objects. To store our data in Amazon S3, we first create a bucket
and specify a bucket name and AWS Region. Then, we upload our data to that bucket as objects in
Amazon S3. Each object has a key (or key name), which is the unique identifier for the object within
the bucket.
S3 provides features that can be configured to support our specific use case. For example, we can use
S3 Versioning to keep multiple versions of an object in the same bucket, which allows us to restore
objects that are accidentally deleted or overwritten. Buckets and the objects in them are
private and can only be accessed with explicitly granted access permissions. We can use bucket
policies, AWS Identity and Access Management (IAM) policies, S3 Access Points, and access control
lists (ACLs) to manage access.
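The bucket/object/key workflow described above can be sketched with boto3 as follows; AWS credentials are assumed to be configured, and the bucket name, region, and object key are placeholders (bucket names must be globally unique).

    import boto3  # assumes AWS credentials are configured

    s3 = boto3.client("s3", region_name="us-east-1")
    BUCKET = "example-notes-bucket-12345"   # placeholder; bucket names are globally unique

    s3.create_bucket(Bucket=BUCKET)         # in us-east-1 no location constraint is needed

    # Enable S3 Versioning so overwritten or deleted objects can be restored.
    s3.put_bucket_versioning(
        Bucket=BUCKET,
        VersioningConfiguration={"Status": "Enabled"},
    )

    # Store an object; the key is its unique identifier within the bucket.
    s3.put_object(Bucket=BUCKET, Key="reports/2024/summary.txt", Body=b"hello s3")

    # Retrieve it again.
    obj = s3.get_object(Bucket=BUCKET, Key="reports/2024/summary.txt")
    print(obj["Body"].read())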
➢ GOOGLE CLOUD DATASTORE / FIRESTORE
Google’s cloud platform (GCP) offers a wide variety of database services. Of these, its NoSQL
database services are unique in their ability to rapidly process very large, dynamic datasets with
no fixed schema.
Google Cloud Datastore, now replaced by Firestore, is Google's NoSQL database service designed
to support the needs of cloud-native, scalable applications.
It is part of the Google Cloud Platform and provides a highly scalable, flexible database that can
handle a wide range of data structures and large volumes of transactions.
2. Scalability: It is designed to automatically scale as the size of your data grows, allowing it to
handle massive amounts of data and large numbers of requests without significant
performance degradation. This makes it ideal for web, mobile, and gaming applications.
4. Serverless Database: Firestore is a fully managed, serverless database, meaning Google Cloud
handles the database infrastructure, scaling, and patching. Users don’t need to worry about
managing servers, which reduces operational overhead.
Use Cases:
• Mobile and Web Applications: Ideal for applications where real-time data synchronization
and offline data storage are essential. For example, chat applications, collaboration tools, or
productivity apps.
• Gaming: Supports high-throughput, low-latency data access required in gaming platforms for
features like leaderboards, player statistics, and game state tracking.
• IoT (Internet of Things): Firestore can handle massive amounts of data generated by IoT
devices and provides real-time synchronization and querying for real-time analysis of sensor
data.
• E-commerce: Useful for handling product catalogs, inventory management, and user data in
large-scale e-commerce platforms with high traffic and large datasets.
Advantages:
• Managed Service: Being serverless, Google Firestore takes care of scaling, patches, and
management.
• Real-Time Capabilities: Real-time updates to data make it ideal for applications that require
live synchronization.
• Global Scalability: Automatically scales to handle traffic and workload changes without
additional configuration or management.
• Offline Access: With support for offline mode, it enhances the user experience for mobile and
web applications.
Disadvantages/Limitations:
• Limited Query Flexibility: Firestore, like many NoSQL databases, can be restrictive in terms of
complex querying, such as multi-collection joins and advanced filtering.
• Pricing: The pricing model is based on the number of read/write operations, which can be
expensive for high-traffic applications if not optimized.
• Lack of Relational Structure: Although flexible, Firestore’s lack of relational capabilities can
be a disadvantage for applications requiring complex relationships between entities.
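For a concrete feel of Firestore's document model, the following Python sketch uses the google-cloud-firestore client library; it assumes a GCP project with Firestore enabled and application default credentials, and the collection, document, and field names are illustrative. (Newer client versions prefer a FieldFilter object over the positional where() arguments used here.)

    from google.cloud import firestore  # assumes GCP credentials and a Firestore database

    db = firestore.Client()

    # Documents are schema-less: each one can hold its own set of fields.
    db.collection("players").document("alice").set({"score": 120, "level": 3})

    doc = db.collection("players").document("alice").get()
    if doc.exists:
        print(doc.to_dict())

    # Simple query: players above a score threshold.
    for snap in db.collection("players").where("score", ">", 100).stream():
        print(snap.id, snap.to_dict())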
➢ CLOUD STORAGE GATEWAYS
A cloud storage gateway sits between on-premises applications and cloud storage, translating local storage protocols into cloud storage API calls so that existing systems can use the cloud transparently. Gateways can also be used for archiving in the cloud. This pairs with automated storage tiering, in which data can be replicated between fast, local disk and cheaper cloud storage to balance space, cost, and data archiving requirements.
Types of Cloud Storage Gateways:
1. File-Based Gateways:
o These gateways use file-based protocols such as NFS (Network File System) or SMB
(Server Message Block) to communicate with local applications.
o They convert the file-based operations into API requests (such as AWS S3 API) to store
data in the cloud.
o Example: AWS Storage Gateway (File Gateway).
2. Block-Based Gateways:
o These gateways present cloud storage as local block storage (e.g., iSCSI targets) to
applications and servers.
o Suitable for applications that require block storage (such as databases or virtual
machines).
o Example: AWS Storage Gateway (Volume Gateway).
3. Tape Gateway:
o These are designed to integrate cloud storage with legacy tape backup workflows,
allowing businesses to continue using their existing tape-based backup software while
storing the actual data in the cloud.
o Useful for organizations transitioning from physical tape libraries to cloud storage for
archiving.
o Example: AWS Storage Gateway (Tape Gateway).
Benefits of Cloud Storage Gateways:
1. Seamless Cloud Integration: Allow businesses to move data to the cloud without disrupting
existing workflows or applications.
2. Cost Efficiency: Cloud storage gateways reduce the need for expensive on-premises storage
hardware by shifting data to the cloud, where storage is typically cheaper.
3. Enhanced Scalability: Organizations can scale their storage needs up or down based on
demand without worrying about purchasing additional physical storage hardware.
4. Improved Disaster Recovery: By replicating data to the cloud, businesses can ensure they have
off-site backups in case of a disaster, which improves overall data protection and recovery
options.
5. Optimized Data Transfers: Advanced data management features like compression,
deduplication, and caching reduce network bandwidth and improve access speeds when
retrieving data from the cloud.
Challenges:
1. Latency: Despite local caching, accessing cloud-based data over the internet can introduce
latency, particularly for larger files or high-transaction environments.
2. Security Concerns: Data sent to the cloud must be secured to prevent breaches. Organizations
need to ensure gateways implement strong encryption and access controls.
3. Complexity: Depending on the implementation, integrating cloud storage gateways may
introduce complexity in terms of setup, management, and maintenance.
4. Bandwidth Limitations: Performance can be impacted by available network bandwidth,
especially when large amounts of data are transferred between on-premises systems and the
cloud.
Examples of Cloud Storage Gateway Solutions:
1. AWS Storage Gateway: AWS offers multiple types of storage gateways (File, Volume, and Tape)
that integrate with S3, Glacier, and other AWS services to provide cloud storage as a seamless
extension of on-premises systems.
2. Microsoft Azure StorSimple: Microsoft’s StorSimple is a hybrid cloud storage solution that
integrates with Azure Storage, providing organizations with automatic data tiering between
on-premises and cloud storage.
3. Google Cloud Storage Gateway: Google Cloud also provides a cloud storage gateway service
that allows integration between on-premise systems and Google Cloud Storage.
4. NetApp Cloud Volumes ONTAP: NetApp’s Cloud Volumes ONTAP provides a storage gateway
to NetApp's data management services, allowing businesses to store, protect, and manage
their data across hybrid cloud environments.
➢ TRADITIONAL APPLICATION ARCHITECTURE V/S CLOUD COMPUTING
1. Traditional Application Architecture
Traditional application architecture typically follows a monolithic model, where all components
(such as user interface, business logic, and database) are tightly coupled and hosted on on-
premises infrastructure. It is often structured in a 3-tier architecture, which includes:
1. Presentation Layer (Client/Front-End):
o This layer handles the user interface (UI) and interacts with users.
o It can be a desktop application, web browser, or mobile app.
2. Application Layer (Business Logic/Middleware):
o Contains the core functionality or business logic of the application.
o Often built on servers managed by the organization.
o Manages interactions between the UI and the data layer.
3. Data Layer (Back-End/Database):
o Stores and retrieves data using relational databases such as SQL Server or Oracle.
o All data is usually stored in a centralized database server located on-premises.
Key features:
• On-premises hardware: Servers, storage, and networking components are housed in an
organization’s data center.
• Static scalability: Requires manual upgrades or purchasing of additional hardware to scale.
• Single-point-of-failure: If one part of the system fails (e.g., the server), the entire application
can become unavailable.
• Resource-intensive maintenance: IT teams are responsible for the entire infrastructure,
maintenance, and updates.
2. Cloud Application Architecture
Cloud architecture, in contrast, is based on distributed and flexible models where applications are
built to run on cloud platforms such as AWS, Azure, or Google Cloud. It often follows a
microservices or serverless architecture.
1. Frontend (User Interface) Layer:
o The user interface is accessed via a web browser or mobile app.
o Interacts with APIs to fetch data or invoke services in the cloud.
2. Microservices or Application Services Layer:
o Applications are split into smaller, independent components called microservices.
o Each microservice handles a specific business function and communicates through
APIs.
o Serverless Computing (e.g., AWS Lambda) can also be used to execute functions in response to events without managing underlying servers (a minimal handler sketch follows this list).
3. Cloud Storage & Database Layer:
o Cloud databases (e.g., Amazon RDS, Google Cloud SQL) are used to store data.
o Data is distributed across multiple data centers for reliability and availability.
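As a small illustration of the serverless option mentioned in the list above, the following Python sketch shows an AWS Lambda-style handler; the event shape and response format mimic an API Gateway integration and are illustrative only.

    import json

    def lambda_handler(event, context):
        """Respond to an API Gateway-style event without managing any servers."""
        name = (event.get("queryStringParameters") or {}).get("name", "world")
        return {
            "statusCode": 200,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps({"message": f"Hello, {name}!"}),
        }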
Key features:
• Dynamic scalability: Resources can automatically scale up or down depending on demand,
avoiding the need for manual interventions.
• High availability: Cloud platforms offer built-in fault tolerance, load balancing, and disaster
recovery solutions to ensure maximum uptime.
• Pay-as-you-go model: Organizations only pay for the computing resources they use, making it
more cost-effective.
• Security and compliance: Cloud providers offer comprehensive security frameworks and help
ensure compliance with industry standards (e.g., GDPR, HIPAA).