Nosql Prepared

NoSQL is a non-relational database management system that allows for flexible schemas and is designed to handle large volumes of data, making it popular among major tech companies. It provides advantages such as scalability, fast performance, and the ability to manage various data types, while also facing challenges like limited query capabilities and a lack of standardization. The CAP theorem and BASE principles are key concepts in understanding NoSQL databases, which can be categorized into four types: key-value, column-oriented, graph, and document-based.

Uploaded by

akashbs
NOSQL

S SRIDEVI
NoSQL: Why, What and When?


What is NoSQL?

• A NoSQL database is a non-relational data management system that does not require a fixed schema.
• It avoids joins and is easy to scale.
• The major purpose of a NoSQL database is to serve distributed data stores with very large storage needs.
• NoSQL is used for big data and real-time web apps.
• For example, companies like Twitter, Facebook and Google collect terabytes of user data every single day.
Why NoSQL?

• The concept of NoSQL databases became popular with Internet giants like Google, Facebook, Amazon, etc., who deal with huge volumes of data. System response time becomes slow when you use an RDBMS for massive volumes of data.
• To resolve this problem, we could "scale up" our systems by upgrading our existing hardware. This process is expensive.
• The alternative is to distribute the database load across multiple hosts whenever the load increases. This method is known as "scaling out."
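As a rough illustration of scaling out, the sketch below spreads keys across several hosts with a stable hash. The host names and the `pick_host` and `distribute` helpers are hypothetical, and simple modulo hashing is only one possible routing rule; production systems often use consistent hashing so that adding a host moves fewer keys.

```python
import hashlib


def pick_host(key: str, hosts: list[str]) -> str:
    """Map a key to one host using a stable hash (hypothetical routing rule)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(hosts)
    return hosts[index]


def distribute(keys: list[str], hosts: list[str]) -> dict[str, list[str]]:
    """Group keys by the host that would store them."""
    placement: dict[str, list[str]] = {host: [] for host in hosts}
    for key in keys:
        placement[pick_host(key, hosts)].append(key)
    return placement


hosts = ["host-a", "host-b", "host-c"]  # hypothetical host names
keys = [f"user:{n}" for n in range(1000)]
placement = distribute(keys, hosts)

# Every key lands on exactly one host, and the same key always maps to the same host.
for host, stored in placement.items():
    print(host, len(stored))
```

Adding a fourth host to `hosts` spreads the same keys over more machines, which is the essence of scaling out rather than upgrading a single server.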
Advantages of NoSQL

• Can be used as a primary or analytic data source, including as the primary data source for online applications
• Big data capability: handles data velocity, variety, volume, and complexity
• No single point of failure
• Easy replication
• No need for a separate caching layer to store data
• Provides fast performance and horizontal scalability
• Can handle structured, semi-structured, and unstructured data with equal effect
• Object-oriented programming model that is easy to use and flexible
• Does not need a dedicated high-performance server
• Supports key developer languages and platforms
• Simpler to implement than an RDBMS
• Excels at distributed database and multi-data center operations
• Offers a flexible schema design that can easily be altered without downtime or service disruption
Disadvantages of NoSQL

• No standardization rules
• Limited query capabilities
• RDBMS databases and tools are comparatively mature
• Many NoSQL databases do not offer traditional database guarantees, such as consistency when multiple transactions are performed simultaneously
• As the volume of data increases, it becomes difficult to maintain unique keys
• Doesn't work as well with relational data
• The learning curve is steep for new developers
• Many options are open source, which has made them less popular with enterprises
What is the CAP Theorem?

• The CAP theorem is also called Brewer's theorem. It states that it is impossible for a distributed data store to offer more than two out of three guarantees:
1. Consistency
2. Availability
3. Partition Tolerance
• Consistency:
• The data should remain consistent even after the execution of an operation. This means that once data is written, any future read request should return that data. For example, after the order status is updated, all clients should be able to see the same data.
• Availability:
• The database should always be available and responsive: every request should receive a response, without downtime.
• Partition Tolerance:
• Partition tolerance means that the system should continue to function even if communication among the servers is unreliable. For example, the servers can be partitioned into multiple groups that cannot communicate with each other. Here, if part of the database is unavailable, the other parts should remain unaffected.
• Eventual Consistency
• The term "eventual consistency" arises from keeping copies of data on multiple machines to get high availability and scalability. Changes made to any data item on one machine have to be propagated to the other replicas.
• Data replication may not be instantaneous: some copies are updated immediately, while others are updated in due course. The copies may be mutually inconsistent for a while, but in due course they become consistent. Hence the name eventual consistency.
• BASE: Basically Available, Soft state, Eventual consistency
• Basically available means that the database stays available in the CAP theorem's sense, even when parts of it fail
• Soft state means that the system state may change even without input
• Eventual consistency means that the system will become consistent over time
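The eventual consistency described above can be sketched with a toy cluster in which a write lands on one replica first and reaches the others later. The `Replica` and `Cluster` classes and the explicit `propagate` step are assumptions made for illustration; real databases replicate in the background and handle conflicts, which this sketch ignores.

```python
from collections import deque


class Replica:
    """One copy of the data on one machine."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class Cluster:
    """Toy cluster: writes land on the first replica, the others catch up later."""

    def __init__(self, names):
        self.replicas = [Replica(name) for name in names]
        self.pending = deque()  # (replica, key, value) updates not yet applied

    def write(self, key, value):
        primary, *others = self.replicas
        primary.data[key] = value
        for replica in others:
            self.pending.append((replica, key, value))

    def read(self, replica_index, key):
        return self.replicas[replica_index].data.get(key)

    def propagate(self):
        """Apply all queued updates, after which the replicas agree."""
        while self.pending:
            replica, key, value = self.pending.popleft()
            replica.data[key] = value


cluster = Cluster(["r1", "r2", "r3"])
cluster.write("order:42", "shipped")
stale = cluster.read(1, "order:42")   # second replica has not caught up yet
cluster.propagate()
fresh = cluster.read(1, "order:42")   # now matches the first replica
print(stale, fresh)
```

Between `write` and `propagate` the replicas are mutually inconsistent, which is also the "soft state" in BASE; after propagation every replica returns the same value.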
Types of NoSQL Databases:

• Key-value pair based
• Column-oriented
• Graph based
• Document-oriented
Document based data Model
Key-Value Databases

• A key-value database is a type of non-relational database that uses a simple key-value method to store data.
• A key-value database stores data as a collection of key-value pairs in which a key serves as a unique identifier.
• Both keys and values can be anything, ranging from simple objects to complex compound objects.
• Key-value databases are highly partitionable and allow horizontal scaling at scales that other types of databases struggle to achieve.
• For example, Amazon DynamoDB allocates additional partitions to a table if an existing partition fills to capacity and more storage space is required.
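A minimal sketch of the key-value model follows, assuming an in-memory dictionary stands in for the database. The `KeyValueStore` class and the sample keys are hypothetical and do not reflect any particular product's API.

```python
class KeyValueStore:
    """In-memory stand-in for a key-value database."""

    def __init__(self):
        self._items = {}

    def put(self, key, value):
        # The key is the unique identifier; writing again overwrites the value.
        self._items[key] = value

    def get(self, key, default=None):
        return self._items.get(key, default)

    def delete(self, key):
        """Remove a key and report whether it existed."""
        existed = key in self._items
        self._items.pop(key, None)
        return existed


store = KeyValueStore()
store.put("visits:home", 1024)                                    # simple value
store.put("user:42", {"name": "Asha", "cart": ["book", "pen"]})   # compound value
profile = store.get("user:42")
print(profile["cart"])
```

The store never inspects the value, which is why simple and compound objects can sit side by side under different keys.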
Key-Value Databases

• Some key-value databases offer RESTful APIs as well as Protocol Buffers interfaces for data access. Key-value data stores like Riak also support the following additional features:
• Search: Distributed, full-text search engine with a query language.
• Secondary Indexes: Tag objects stored with additional values and query by exact match or range.
• MapReduce: Non-key-based querying for large datasets.
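The secondary index idea above (tag an object with an extra value, then query by exact match or range) can be sketched as follows. This is an illustrative in-memory structure, not Riak's API; the `IndexedStore` class, its method names, and the sample data are all hypothetical, and the sketch assumes each key is written only once.

```python
import bisect


class IndexedStore:
    """Key-value store with one secondary index kept as a sorted list."""

    def __init__(self):
        self._objects = {}   # primary key -> value
        self._index = []     # sorted list of (tag, primary key)

    def put(self, key, value, tag):
        # Assumes each key is written once; overwrites would need index cleanup.
        self._objects[key] = value
        bisect.insort(self._index, (tag, key))

    def query_range(self, low, high):
        """Return primary keys whose tag lies in [low, high]."""
        start = bisect.bisect_left(self._index, (low,))
        keys = []
        for tag, key in self._index[start:]:
            if tag > high:
                break
            keys.append(key)
        return keys

    def query_exact(self, tag):
        return self.query_range(tag, tag)


users = IndexedStore()
users.put("u1", {"name": "Asha"}, tag=25)    # tag is the user's age here
users.put("u2", {"name": "Ravi"}, tag=31)
users.put("u3", {"name": "Meena"}, tag=25)
users.put("u4", {"name": "Kiran"}, tag=40)

print(users.query_exact(25))       # ['u1', 'u3']
print(users.query_range(25, 31))   # ['u1', 'u3', 'u2']
```

Keeping the index sorted lets both query types reuse the same scan, at the cost of extra work on every write.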
Key-Value Databases - Benefits
• Scalability: One of the biggest benefits compared to a relational database is that key-value stores (like NoSQL databases in general) scale out horizontally with few practical limits.
• Relational databases typically expand vertically, which is finite, so horizontal scaling can be a big boon to larger and more complex databases. Key-value stores manage this through partitioning and replication.
• Many key-value stores also relax ACID guarantees in exchange for low-overhead server calls.
• Simpler Querying: In cases such as sessions, user profiles, shopping carts, and so on, key-value stores are cheaper to work with, since it takes just one request to read and one request to write (due to the blob-like nature of how the data is stored).
• Similarly, concurrency issues are easier to handle since you only need to resolve one key.
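The "one request to read, one request to write" pattern for sessions can be sketched as below, assuming a plain dictionary stands in for the key-value database and the session is stored as a single JSON blob. The `save_session` and `load_session` helpers and the session fields are hypothetical.

```python
import json

sessions = {}  # stands in for a key-value database


def save_session(session_id, session):
    """Store the whole session as one blob: a single write."""
    sessions[f"session:{session_id}"] = json.dumps(session)


def load_session(session_id):
    """Fetch the whole session back: a single read."""
    blob = sessions.get(f"session:{session_id}")
    return json.loads(blob) if blob is not None else None


save_session("abc123", {"user_id": 42, "cart": ["book"], "theme": "dark"})

session = load_session("abc123")
session["cart"].append("pen")
save_session("abc123", session)

print(load_session("abc123")["cart"])
```

Because the whole session lives under one key, an update only has to resolve that key, which is the concurrency point made above.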
• Mobility: Since they don't have a query language, key-value stores are easy to move from one system to another without the need for new architecture or code changes. As such, moving from an old operating system to a new one doesn't cause the severe disruption it would with a relational database.
When to Use Key-Value

• Traditional relational databases are not really designed to handle a very high volume of read/write operations, which is where key-value stores shine. Because they scale easily, key-value stores can handle thousands upon thousands of users at any given second. Additionally, with built-in redundancy, they can tolerate lost storage or data.
• As such, there are a few situations where key-value stores shine:
• User preferences and profile stores
• Large scale session management for users
• Product recommendations (such as in eCommerce platforms)
• Customized ad delivery to users based on their data profile
• Data cache for rarely updated data
Examples of Popular Key-Value Databases

• Amazon DynamoDB: Probably the most widely used key-value store. Amazon's research into Dynamo, the system that preceded DynamoDB, did much to make NoSQL popular.
• Aerospike: Open-source database that is optimized for in-memory storage.
• Berkeley DB: Another open-source, high-performance database storage library, although it's relatively basic.
• Memcached: Helps speed up websites by storing cache data in RAM, plus it's free and open-source.
• Riak: Made for developing apps, it works well with other databases and apps.
• Redis: A multi-purpose database that also acts as a memory cache and message broker.

• Reference: https://www.predictiveanalyticstoday.com/top-sql-key-value-store-databases/
Column store NoSQL database

• In column-oriented NoSQL databases, data is stored in cells grouped in columns of data rather than as rows of data.
• Columns are logically grouped into column families. Column families can contain a virtually unlimited number of columns that can be created at runtime or while defining the schema.
• Reads and writes are done using columns rather than rows. Column families are groups of similar data that are usually accessed together.
• As an example, we often access customers' names and profile information at the same time, but not the information on their orders.
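The column family idea can be modeled with nested dictionaries: row key, then column family, then column. The `ColumnFamilyTable` class, the family names, and the sample customer are assumptions for illustration only, not the API of any specific column store.

```python
from collections import defaultdict


class ColumnFamilyTable:
    """Toy column-family table: row key -> family -> column -> value."""

    def __init__(self, families):
        self.families = set(families)   # families are fixed up front
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, family, column, value):
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        # Columns inside a family can be created at write time.
        self.rows[row_key][family][column] = value

    def get_family(self, row_key, family):
        """Read one family of one row without touching the others."""
        return dict(self.rows.get(row_key, {}).get(family, {}))


customers = ColumnFamilyTable(["profile", "orders"])
customers.put("cust-1", "profile", "name", "Asha")
customers.put("cust-1", "profile", "city", "Pune")
customers.put("cust-1", "orders", "order-1001", "book")

print(customers.get_family("cust-1", "profile"))
```

Reading the "profile" family returns the name and city together while leaving the "orders" family untouched, mirroring the customer example above.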
• This column-centric view makes column stores ideal for running aggregate functions or for looking up records that match multiple columns.
• Column stores are also sometimes referred to as Bigtables or Bigtable clones, reflecting their common ancestor, Google's Bigtable.
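To see why a column-centric layout suits aggregate functions, the sketch below stores the same hypothetical records row by row and column by column. The field names and values are made up for illustration.

```python
# Row layout: each record keeps all of its fields together.
rows = [
    {"id": 1, "city": "Pune", "amount": 120},
    {"id": 2, "city": "Delhi", "amount": 80},
    {"id": 3, "city": "Pune", "amount": 200},
]

# Column layout: each field's values are stored together.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# Aggregating over rows touches every record, including unused fields.
total_from_rows = sum(row["amount"] for row in rows)

# Aggregating over a column reads only the values it needs.
total_from_columns = sum(columns["amount"])

print(total_from_rows, total_from_columns)
```

Both totals are the same, but the column layout reaches them by scanning one contiguous list instead of visiting every full record.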
• In a column store, each record (think row in an RDBMS) doesn't require a single value per column.
• Instead, it's possible to model column families. A single record may consist of an ID field, a column family for "customer" information, and another column family for "order item" information.
• Each one of these column families consists of several fields. One of these column families may have multiple "rows" in its own right.
Benefits of Column Databases

• Column stores are excellent at compression and are therefore efficient in terms of storage. This means you can reduce disk usage while holding massive amounts of information in a single column.
• Since the values of a column are stored together, aggregation queries are quite fast, which is important for projects that run a large number of queries in a short time.
• Scalability is excellent with column-store databases. They can be expanded across large clusters of machines, even numbering in the thousands, which also makes them a good fit for massively parallel processing.
• Load times are similarly excellent: it can be possible to load a billion-row table in a few seconds, so you can load and query with little delay.
• Columns offer a lot of flexibility, as they do not have to look like each other. That means you can add new and different columns without disrupting the whole database. That being said, entering completely new record queries requires a change to all tables.
• Overall, column-store databases are great for analytics and reporting: fast querying speeds and the ability to hold large amounts of data without adding a lot of overhead make them ideal.
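One reason a column can compress well is that its values share a type and often repeat. The slides do not name a compression technique, so the run-length encoding below is only an assumed example, with made-up column data.

```python
from itertools import groupby


def rle_encode(column):
    """Collapse consecutive repeats into (value, run length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(column)]


def rle_decode(runs):
    """Expand (value, run length) pairs back into the original column."""
    return [value for value, count in runs for _ in range(count)]


country = ["IN"] * 5 + ["US"] * 3 + ["IN"] * 2   # hypothetical column values
runs = rle_encode(country)

print(runs)                          # [('IN', 5), ('US', 3), ('IN', 2)]
print(rle_decode(runs) == country)   # True
```

Ten stored values shrink to three pairs here; a row layout would interleave these values with other fields and lose the runs.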
Disadvantages of Column Databases
• Designing an effective indexing schema is difficult and time consuming. Even then, the schema may still not be as effective as a simple relational database schema.
• While this may not be an issue for some users, incremental data loading is suboptimal and should be avoided if possible.
• Security is a concern for all NoSQL database types, not just columnar ones. Security vulnerabilities in web applications are ever present, and many NoSQL databases offer fewer built-in security features than relational systems. If security is your number one priority, you should either consider a relational database or employ a well-defined schema if possible.
• Online Transaction Processing (OLTP) applications are also a poor fit for columnar databases because of the way data is stored.
Column Databases
• Use cases
• Developers mainly use column databases in:
• Content management systems
• Blogging platforms
• Systems that maintain counters
• Services that have expiring usage
• Systems that require heavy write requests (like log
aggregators)
Examples of Column Database
• Examples of columnar databases are Bigtable, Cassandra, HBase, Vertica, Druid, Accumulo, and Hypertable.
Column Database
Graph Based Data Model
Graph Databases
• Data Model:
• Nodes and Relationships
• Examples:
• Neo4j, OrientDB, InfiniteGraph, AllegroGraph
Graph Databases: Pros and Cons
• Pros:
• Powerful data model, as general as RDBMS
• Connected data locally indexed
• Easy to query
• Cons:
• Sharding is hard (lots of people are working on this), although graph databases scale up reasonably well
• Requires rewiring your brain
What are graphs good for?
• Recommendations
• Business intelligence
• Social computing
• Geospatial
• Systems management
• Web of things
• Genealogy
• Time series data
• Product catalogue
• Web analytics
• Scientific computing (especially bioinformatics)
• Indexing your slow RDBMS
• And much more!
What is a Graph?
• An abstract representation of a set of objects where some pairs are
connected by links.

Object (Vertex, Node)

Link (Edge, Arc, Relationship)


Different Kinds of Graphs
• Undirected Graph
• Directed Graph

• Pseudo Graph
• Multi Graph

• Hyper Graph
More Kinds of Graphs
• Weighted Graph

• Labeled Graph

• Property Graph
What is a Graph Database?
• A database with an explicit graph structure
• Each node knows its adjacent nodes
• As the number of nodes increases, the cost of a local step (or hop)
remains the same
• Plus an Index for lookups
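The points above (each node knows its adjacent nodes, a hop has a fixed local cost, and an index is used for lookups) can be sketched with nodes that hold direct references to their neighbors. The `Node` and `Graph` classes and the sample names are hypothetical and are not how any specific graph database is implemented.

```python
class Node:
    """A vertex that holds direct references to its adjacent nodes."""

    def __init__(self, name):
        self.name = name
        self.neighbors = []


class Graph:
    def __init__(self):
        self.index = {}   # name -> Node, used only to find a starting point

    def add_node(self, name):
        if name not in self.index:
            self.index[name] = Node(name)
        return self.index[name]

    def connect(self, source, target):
        """Add a directed link from source to target."""
        self.add_node(source).neighbors.append(self.add_node(target))

    def neighbors_of(self, name):
        # One index lookup to find the node, then a purely local step:
        # the cost depends on this node's links, not on the graph's size.
        return [node.name for node in self.index[name].neighbors]


graph = Graph()
graph.connect("alice", "bob")
graph.connect("alice", "carol")
graph.connect("bob", "carol")

print(graph.neighbors_of("alice"))   # ['bob', 'carol']
```

Adding thousands of unrelated nodes would not change the work done by `neighbors_of("alice")`, since the hop follows references stored on the node itself.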
Relational Databases
Graph Databases
Neo4j
• Neo4j is a leading open source graph database developed in Java. It is highly scalable and schema-free (NoSQL).
Neo4j Tips
• Each entity table is represented by a label on nodes
• Each row in an entity table is a node
• Columns on those tables become node properties
• Join tables are transformed into relationships; columns on those tables become relationship properties
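The mapping rules above can be sketched with plain Python data, without a Neo4j driver. The table contents, the `rows_to_nodes` and `join_to_relationships` helpers, and the dictionary shapes used for nodes and relationships are all hypothetical.

```python
# Hypothetical relational tables, each a list of rows.
person_table = [{"id": 1, "name": "Asha"}, {"id": 2, "name": "Ravi"}]
company_table = [{"id": 10, "name": "Acme"}]
works_at_table = [{"person_id": 1, "company_id": 10, "since": 2019}]


def rows_to_nodes(label, rows):
    """Each row becomes a node; the table name becomes the label."""
    return {(label, row["id"]): {"label": label, "properties": dict(row)}
            for row in rows}


def join_to_relationships(rel_type, join_rows, start, end):
    """Each join row becomes a relationship; start/end are (label, fk column)."""
    start_label, start_fk = start
    end_label, end_fk = end
    relationships = []
    for row in join_rows:
        extra = {k: v for k, v in row.items() if k not in (start_fk, end_fk)}
        relationships.append({
            "type": rel_type,
            "start": (start_label, row[start_fk]),
            "end": (end_label, row[end_fk]),
            "properties": extra,   # remaining join columns
        })
    return relationships


nodes = {**rows_to_nodes("Person", person_table),
         **rows_to_nodes("Company", company_table)}
relationships = join_to_relationships(
    "WORKS_AT", works_at_table,
    start=("Person", "person_id"), end=("Company", "company_id"))

print(relationships[0])
```

The foreign key columns disappear into the relationship's endpoints, while the remaining join column (`since`) survives as a relationship property.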
Node in Neo4j
Relationships in Neo4j
• Relationships between nodes are a key part of Neo4j.
Twitter and relationships
Properties
• Both nodes and relationships can have properties.
• Properties are key-value pairs where the key is a string.
• Property values can be either a primitive or an array of one primitive type. For example, String, int and int[] values are valid for properties.
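The property rule above (string keys; values that are a primitive or an array of one primitive type) can be expressed as a small check. This is a Python approximation with an assumed set of primitive types, not Neo4j's own validation; the function name is hypothetical, and rejecting empty lists is a choice made for this sketch.

```python
PRIMITIVES = (str, int, float, bool)   # assumed stand-ins for primitive types


def is_valid_property(key, value):
    """Check a key-value pair against the property rule described above."""
    if not isinstance(key, str):
        return False
    if isinstance(value, PRIMITIVES):
        return True
    if isinstance(value, list) and value:
        first_type = type(value[0])
        # Every element must have exactly the same primitive type.
        return first_type in PRIMITIVES and all(
            type(item) is first_type for item in value)
    return False   # nested structures and empty lists are rejected here


print(is_valid_property("name", "Neo"))          # True
print(is_valid_property("scores", [1, 2, 3]))    # True
print(is_valid_property("mixed", [1, "a"]))      # False
print(is_valid_property("nested", {"a": 1}))     # False
```

The last two calls fail because the list mixes types and because a dictionary is neither a primitive nor an array of primitives.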
Paths in Neo4j
• A path is one or more nodes with connecting
relationships, typically retrieved as a query or traversal
result.
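A path as a traversal result can be sketched with a breadth-first search over directed relationships. The `find_path` function, the relationship triples, and the names are hypothetical; a real graph database would return paths from its own query or traversal engine.

```python
from collections import deque


def find_path(relationships, start, goal):
    """Breadth-first search over directed (start, type, end) triples.

    Returns a list alternating nodes and relationship types, or None.
    """
    outgoing = {}
    for source, rel_type, target in relationships:
        outgoing.setdefault(source, []).append((rel_type, target))

    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]   # the last element of a path is always a node
        if node == goal:
            return path
        for rel_type, target in outgoing.get(node, []):
            if target not in visited:
                visited.add(target)
                queue.append(path + [rel_type, target])
    return None


follows = [
    ("alice", "FOLLOWS", "bob"),
    ("bob", "FOLLOWS", "carol"),
    ("alice", "FOLLOWS", "dave"),
]

print(find_path(follows, "alice", "carol"))
# ['alice', 'FOLLOWS', 'bob', 'FOLLOWS', 'carol']
```

A search from a node to itself returns a path with a single node and no relationships, matching the "one or more nodes" definition above.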
Starting and Stopping
Print the data
Remove the data
The Matrix Graph Database
Summary

• NoSQL is a non-relational data management system that does not require a fixed schema, avoids joins, and is easy to scale
• The concept of NoSQL databases became popular with Internet giants like Google, Facebook, Amazon, etc., who deal with huge volumes of data
• In 1998, Carlo Strozzi used the term NoSQL for his lightweight, open-source relational database
• NoSQL databases do not follow the relational model; they are either schema-free or have relaxed schemas
• The four types of NoSQL database are 1) key-value pair based, 2) column-oriented, 3) graph based, and 4) document-oriented
• NoSQL can handle structured, semi-structured, and unstructured data with equal effect
• The CAP theorem covers three guarantees: Consistency, Availability, and Partition Tolerance
• BASE stands for Basically Available, Soft state, Eventual consistency
• The term "eventual consistency" refers to keeping copies of data on multiple machines to get high availability and scalability, with the copies becoming consistent over time
• NoSQL offers limited query capabilities
