Unit3 - Cloud Data Storage

unit 3 AIMl for diploma

Uploaded by

dhanashree

Cloud Data Storage

 Cloud Storage Types:-

There are three main cloud storage types: object storage, file storage, and
block storage. Each offers its own advantages and has its own use cases.
1. Object storage
Organizations have to store a massive and growing amount of
unstructured data, such as photos, videos, machine learning (ML)
datasets, sensor data, audio files, and other types of web content.
Finding scalable, efficient, and affordable ways to store it can be
a challenge. Object storage is a data storage architecture for large
stores of unstructured data. It keeps data in the format it arrives
in and makes it possible to customize metadata in ways that make the
data easier to access and analyze. Instead of being organized in
file and folder hierarchies, objects are kept in secure buckets that
offer virtually unlimited scalability. Object storage is also
typically less costly for storing large data volumes.
Applications developed in the cloud often take advantage of the
vast scalability and metadata characteristics of object storage.
Object storage solutions are ideal for building modern applications
from scratch that require scale and flexibility, and can also be used
to import existing data stores for analytics, backup, or archive.
2. File storage
File-based storage, or file storage, is widely used by
applications and stores data in a hierarchical folder and file format.
This type of storage is often provided by a network-attached storage
(NAS) server and accessed through common file-level protocols:
Server Message Block (SMB) on Windows instances and Network File
System (NFS) on Linux.
3. Block storage
Enterprise applications like databases or enterprise resource
planning (ERP) systems often require dedicated, low-latency
storage for each host. This is analogous to direct-attached storage
(DAS) or a storage area network (SAN). In this case, you can use a
cloud storage service that stores data in the form of blocks. Each
block has its own unique identifier for quick storage and retrieval.
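The difference between object and block addressing can be sketched with a toy in-memory model. This is only an illustration under stated assumptions: the class names `ObjectStore` and `BlockStore`, the method names, and the sample keys are all hypothetical and do not correspond to any cloud provider's API.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectStore:
    """Flat bucket: each object is addressed by a key and carries metadata."""
    objects: dict = field(default_factory=dict)

    def put(self, key: str, data: bytes, **metadata) -> None:
        self.objects[key] = (data, metadata)

    def get(self, key: str) -> bytes:
        return self.objects[key][0]

    def find(self, **wanted) -> list:
        # Metadata query: keys whose metadata contains every wanted pair.
        return sorted(
            key for key, (_, meta) in self.objects.items()
            if all(meta.get(k) == v for k, v in wanted.items())
        )


@dataclass
class BlockStore:
    """Fixed-size blocks, each addressed by its own numeric identifier."""
    block_size: int = 4
    blocks: dict = field(default_factory=dict)

    def write(self, data: bytes) -> list:
        # Split the payload into blocks and return the identifiers used.
        ids = []
        for offset in range(0, len(data), self.block_size):
            block_id = len(self.blocks)
            self.blocks[block_id] = data[offset:offset + self.block_size]
            ids.append(block_id)
        return ids

    def read(self, ids: list) -> bytes:
        return b"".join(self.blocks[i] for i in ids)


bucket = ObjectStore()
bucket.put("photos/cat.jpg", b"...", content_type="image/jpeg", owner="ml-team")
bucket.put("logs/sensor.csv", b"...", content_type="text/csv", owner="ml-team")
jpeg_keys = bucket.find(content_type="image/jpeg")

volume = BlockStore(block_size=4)
block_ids = volume.write(b"database-page")
restored = volume.read(block_ids)
```

Note that a key such as `photos/cat.jpg` only looks like a path. In this model the bucket is flat, and the metadata, not a folder tree, is what makes objects searchable.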
 Cloud Data Governance :-
Cloud data governance is a concept that organizations need to understand
if they have data in the cloud or plan to migrate their data to the
cloud. It is a set of processes that ensures data stored in
cloud environments is secure, accurate, and compliant with all relevant
data regulations and policies. It also helps organizations identify and
classify sensitive data, define how data may be used, and manage access to it.
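As a rough sketch of what "classify sensitive data" and "manage access" can mean in practice, the snippet below labels field values and filters a record by role. Everything in it is an assumption made for illustration: the email-pattern rule, the two classification levels, and the role names are hypothetical, and real governance tooling is far richer.

```python
import re

# Hypothetical rule: a value that looks like an email address is sensitive.
EMAIL = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")

# Hypothetical policy: which roles may read each classification level.
READERS = {"public": {"analyst", "admin"}, "sensitive": {"admin"}}


def classify(value: str) -> str:
    """Label a field value as 'sensitive' or 'public'."""
    return "sensitive" if EMAIL.fullmatch(value) else "public"


def visible_fields(record: dict, role: str) -> dict:
    """Return only the fields that the given role is allowed to read."""
    return {
        name: value
        for name, value in record.items()
        if role in READERS[classify(str(value))]
    }


record = {"city": "Pune", "contact": "user@example.com"}
analyst_view = visible_fields(record, "analyst")
admin_view = visible_fields(record, "admin")
```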

Why Is It Important?
Cloud data governance is important for numerous reasons. Organizations
that deal with sensitive information and data assets need to ensure that
they are managing and governing data across their cloud platforms and
applications. Proper cloud data governance makes this simpler.
Cloud data governance also helps ensure data is more accurate and
reliable by reducing duplications, inconsistencies, and other errors. This
allows organizations to get more value from data and make better data-
driven decisions.

What Are the Benefits?

i. Data Security and Privacy
ii. Enhanced Collaboration and Data Sharing
iii. Data Quality and Integrity
iv. Scalability and Flexibility
 Key value databases:-
1. Key-value databases, also known as key-value stores, are a type
of non-relational (NoSQL) database that uses a key-value
method to store data. In a key-value database, data is stored as a
collection of key-value pairs, where a key acts as a unique
identifier for a value. The key can be a simple string, and the value
can be a simple object like a number or string, or a complex
compound object.
2. Key-value databases are often considered the simplest and fastest
type of NoSQL database. They are easy to design and implement,
and they don't require the schema to be constantly changed to
accommodate unstructured data. They are also highly partitionable
and scale horizontally more easily than many other types of
databases.

 Some key-value database features include:

i. Retrieving values: Users can retrieve values associated with
a given key.
ii. Deleting values: Users can delete values associated with a
given key.
iii. Setting, updating, and replacing values: Users can set,
update, or replace values associated with a given key.
iv. Links: Links can be used to map relationships between
key-value pairs.
v. Search: Some key-value databases, like Riak, have search
capabilities for full-text searches.
vi. Secondary indexes: Developers can tag a value with one or
more index fields, and applications can then query the
index to return a list of matching keys.
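The retrieve, set/update/replace, delete, and secondary-index features listed above can be sketched as a small in-memory store. This is a minimal illustration, not how any particular product works: the class `KeyValueStore`, its method names, and the sample keys are hypothetical.

```python
from collections import defaultdict


class KeyValueStore:
    """Toy in-memory key-value store with an optional secondary index."""

    def __init__(self):
        self._data = {}                  # key -> value
        self._tags = {}                  # key -> set of (field, value) tags
        self._index = defaultdict(set)   # (field, value) -> set of keys

    def put(self, key, value, **index_fields):
        # Setting, updating, and replacing all go through the same call.
        self.delete(key)
        self._data[key] = value
        self._tags[key] = set(index_fields.items())
        for tag in self._tags[key]:
            self._index[tag].add(key)

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        # Remove the value and any index entries that point at it.
        for tag in self._tags.pop(key, set()):
            self._index[tag].discard(key)
        self._data.pop(key, None)

    def query(self, field, value):
        # Secondary-index lookup: keys tagged with field == value.
        return sorted(self._index.get((field, value), set()))


store = KeyValueStore()
store.put("user:1", {"name": "Asha"}, city="Pune")
store.put("user:2", {"name": "Ravi"}, city="Pune")
store.put("user:2", {"name": "Ravi"}, city="Mumbai")   # replaces the old entry
store.delete("user:1")
```

Because every operation touches a single key, a store like this is easy to partition: keys can be spread across machines by hashing, which is the basis of the horizontal scaling mentioned earlier.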

 Batch data and Streaming data on Machine Learning:-

S.No. | BATCH PROCESSING | STREAM PROCESSING
------|------------------|------------------
01 | Batch processing refers to processing a high volume of data in batches within a specific time span. | Stream processing refers to processing a continuous stream of data immediately as it is produced.
02 | Batch processing processes a large volume of data all at once. | Stream processing analyzes streaming data in real time.
04 | In batch processing, the data size is known and finite. | In stream processing, the data size is not known in advance and is potentially unbounded.
05 | In batch processing, the data is processed in multiple passes. | In stream processing, the data is generally processed in a few passes.
06 | A batch processor takes longer to process data. | A stream processor takes a few seconds or milliseconds to process data.
07 | In batch processing, the input graph is static. | In stream processing, the input graph is dynamic.
08 | The data is analyzed on a snapshot. | The data is analyzed continuously.
09 | In batch processing, the response is provided after job completion. | In stream processing, the response is provided immediately.
10 | Examples are distributed programming platforms like MapReduce, Spark, GraphX, etc. | Examples are programming platforms like Spark Streaming and S4 (Simple Scalable Streaming System), etc.
11 | Batch processing is used in payroll and billing systems, food processing systems, etc. | Stream processing is used in the stock market, e-commerce transactions, social media, etc.
12 | Processes data in batches or sets, typically stored in a database or file system. | Processes data in real time, as it is generated or received from a source.
13 | Processes data in discrete, finite batches or jobs. | Processes data continuously and incrementally.
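The contrast in the table can be illustrated with one small computation done both ways. This is a sketch under simple assumptions: the function names are hypothetical, the "stream" is simulated with a Python iterator, and the statistic is a plain mean of the kind used for feature statistics in machine learning.

```python
from typing import Iterable, Iterator


def batch_mean(readings: list) -> float:
    """Batch: the whole finite data set is available before processing."""
    return sum(readings) / len(readings)


def stream_means(readings: Iterable[float]) -> Iterator[float]:
    """Stream: update the result incrementally as each reading arrives."""
    count, mean = 0, 0.0
    for value in readings:
        count += 1
        mean += (value - mean) / count   # running mean, one pass, O(1) memory
        yield mean


data = [10.0, 20.0, 30.0, 40.0]
batch_result = batch_mean(data)                    # one answer after the job ends
stream_results = list(stream_means(iter(data)))    # one answer per event
```

The batch version needs the complete list and answers once. The streaming version never stores the data and has an up-to-date answer after every event, which is why it also works when the input never ends.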
 Cloud data warehouse:-
A cloud data warehouse is a way of storing and managing large
amounts of data in a public cloud. It lets you quickly access and use your
data, which makes it a good fit for businesses that rely on data
and need agility, flexibility, and ease of use from their
infrastructure.
 Cloud Data Warehouse Benefits:-
1. Faster Insights: A cloud data warehouse provides more
powerful computing capabilities and can deliver near-real-time
analytics on data from diverse sources, typically
faster than an on-premises data warehouse, so
business users get better insights sooner.
2. Scalability: A cloud-based data warehouse offers immediate
and nearly unlimited storage, and it’s easy to scale as your
storage needs grow. Increasing cloud storage doesn’t require
you to purchase new hardware as an on-premises data
warehouse does, which often lowers the cost.
3. Lower overhead: Maintaining a data warehouse on-premises
requires a dedicated server room full of expensive hardware,
and experienced employees to oversee the system, perform
manual upgrades, and troubleshoot issues. A cloud data
warehouse requires no physical hardware or allocated office
space, making operational costs significantly lower.
Cloud Data Warehouse Vendors
There are many popular cloud-based data warehouse platforms to choose
from, including Amazon Redshift, Google BigQuery, Microsoft Azure
Synapse Analytics, and Snowflake, and there are just as many important
considerations when deciding on the right solution for your organization.

 Amazon Redshift:-
For many years, data warehousing was only available as an on-premises
solution. Then in November 2012, Amazon Web Services (AWS)
announced Redshift, a fully managed, petabyte-scale data warehouse
service in the cloud. Although not the first cloud-based data warehouse, it
was the first to gain widespread adoption. Redshift’s SQL
dialect is based on PostgreSQL, which is well understood by analysts
worldwide, and its architecture is familiar to many users of on-premises
data warehouses.

You can start with as little as a few gigabytes of data and scale to
petabytes. This empowers you to acquire new insights from your business
and customer data.

The first step to creating a Redshift data warehouse is to launch a set of
nodes, called an Amazon Redshift cluster. After you provision your
cluster, you upload your data set and then perform data analysis queries.
Regardless of the size of your data set, Amazon Redshift delivers fast
query performance using familiar SQL-based tools and business
intelligence applications.
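The provisioning step can be sketched with the AWS SDK for Python. This is a hedged illustration, not a recommended setup: the helper names `build_cluster_request` and `launch_cluster` are hypothetical, as are the cluster identifier, node type choice, user name, and database name. The dictionary keys are keyword arguments of boto3's Redshift `create_cluster` call, and `launch_cluster` only works with boto3 installed and AWS credentials configured.

```python
def build_cluster_request(identifier: str, node_type: str, nodes: int,
                          user: str, password: str, database: str) -> dict:
    """Assemble keyword arguments for boto3's Redshift create_cluster call."""
    request = {
        "ClusterIdentifier": identifier,
        "NodeType": node_type,
        "MasterUsername": user,
        "MasterUserPassword": password,
        "DBName": database,
    }
    if nodes > 1:
        # A multi-node cluster must state how many compute nodes it wants.
        request["ClusterType"] = "multi-node"
        request["NumberOfNodes"] = nodes
    else:
        request["ClusterType"] = "single-node"
    return request


def launch_cluster(request: dict) -> dict:
    """Send the request to AWS (needs boto3 and configured credentials)."""
    import boto3  # imported lazily so the sketch loads without the SDK
    return boto3.client("redshift").create_cluster(**request)


# Hypothetical values; never hard-code a real password.
request = build_cluster_request(
    identifier="demo-warehouse", node_type="ra3.xlplus", nodes=2,
    user="awsuser", password="<set-at-runtime>", database="analytics",
)
```

Once the cluster is available, data is loaded and queried with ordinary SQL tools, since Redshift's dialect is based on PostgreSQL.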

 GCP(Google Cloud Platform) BigQuery:-

BigQuery is a fully managed, serverless data warehouse that
automatically scales to match storage and computing needs.
Google does not expect you to manage your data warehouse infrastructure,
which is why BigQuery hides most of the underlying hardware, database,
node, and configuration details. Elasticity works out of
the box, and getting started is simply a matter of creating an account with
Google Cloud Platform (GCP), loading a table, and running a query.
Google takes care of the rest.

With BigQuery, you get a columnar database with ANSI-standard SQL that can
analyze terabytes to petabytes of data quickly. BigQuery also
lets you do spatial analysis using familiar SQL with BigQuery GIS. In
addition, you can quickly build and operationalize ML models on
large-scale structured or semi-structured data using simple SQL with
BigQuery ML. And you can support real-time interactive dashboarding with
BigQuery BI Engine.
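Why a columnar layout suits analytics can be shown with a toy model. This sketch does not use BigQuery or reflect its internal format; the table contents and the helper names `to_columns` and `sum_by` are hypothetical. It only illustrates that an aggregate query needs to read the columns it references and nothing else.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"order_id": 1, "region": "west", "amount": 120.0},
    {"order_id": 2, "region": "east", "amount": 80.0},
    {"order_id": 3, "region": "west", "amount": 50.0},
]


def to_columns(records: list) -> dict:
    """Pivot row records into one list per column (a columnar layout)."""
    return {name: [r[name] for r in records] for name in records[0]}


def sum_by(columns: dict, group_col: str, value_col: str) -> dict:
    """Rough equivalent of SELECT group_col, SUM(value_col) ... GROUP BY group_col."""
    totals = {}
    # Only the two referenced columns are read; the others are never touched.
    for group, value in zip(columns[group_col], columns[value_col]):
        totals[group] = totals.get(group, 0.0) + value
    return totals


columns = to_columns(rows)
totals = sum_by(columns, "region", "amount")
```

In a row layout the same query would have to scan every field of every record, including `order_id`, which is wasted work when tables are wide and large.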

The BigQuery architecture is composed of several components: Borg
provides the compute, Colossus is the distributed storage, Jupiter is
the network, and Dremel is the execution engine.
