S Harding


SHARDING

INTRODUCTION
•Sharding is the process of storing data records across multiple machines and it is
MongoDB's approach to meeting the demands of data growth.
•As the size of the data increases, a single machine may not be able to store all of
the data or provide acceptable read and write throughput.
•Sharding solves the problem with horizontal scaling.
•With sharding, you add more machines to support data growth and the demands of
read and write operations.
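The idea of horizontal scaling can be illustrated with a minimal sketch. It assumes a simple hash-and-modulo placement, which is not MongoDB's actual placement algorithm, and the names `shard_for`, `distribute`, and the `user…` keys are hypothetical. It only shows that spreading records over more machines reduces how many records each machine has to hold.

```python
import hashlib
from collections import Counter


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index with a stable hash (illustrative only)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def distribute(keys, num_shards):
    """Count how many records land on each shard."""
    return Counter(shard_for(k, num_shards) for k in keys)


# Hypothetical record keys; a real deployment would use a field of the document.
keys = [f"user{i}" for i in range(10_000)]

for n in (1, 2, 4):
    counts = distribute(keys, n)
    print(n, "shard(s): the largest shard holds", max(counts.values()), "records")
```

As the number of shards grows, the largest per-shard count shrinks, which is the effect the bullets above describe.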
WHY SHARDING?

•In replication, all writes go to the primary node


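•Latency-sensitive queries still go to the primary
•A single replica set is limited to 50 members, of which at most 7 can vote (the limit was 12 members before MongoDB 3.0)
•Memory cannot hold the working set when the active dataset is large
•Local disk is not large enough to hold all of the data
•Vertical scaling is too expensive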
SHARDING COMPONENTS

To achieve sharding in MongoDB, the following components are required:


A shard is a mongod instance that holds a subset of the original data. Each shard must
be deployed as a replica set.
Mongos is a routing process that acts as the interface between a client application
and the sharded cluster. It works as a query router, directing operations to the shards.
A config server is a mongod instance that stores the metadata and configuration
details of the cluster. MongoDB requires the config servers to be deployed
as a replica set.
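The division of labour between the three components can be sketched as a toy in-memory model. This is not MongoDB code: the classes `Shard`, `ConfigServer`, and `Router`, the `user_id` shard key, and the split points are all hypothetical, chosen only to show which component knows what.

```python
from bisect import bisect_right


class Shard:
    """Holds a subset of the documents (a replica set in a real cluster)."""

    def __init__(self, name):
        self.name = name
        self.docs = []


class ConfigServer:
    """Stores cluster metadata: which shard-key range lives on which shard."""

    def __init__(self, split_points, shard_names):
        # split_points [100, 200] with three shards means:
        #   key < 100 -> first shard, 100 <= key < 200 -> second, key >= 200 -> third
        assert len(shard_names) == len(split_points) + 1
        self.split_points = split_points
        self.shard_names = shard_names

    def shard_name_for(self, key):
        return self.shard_names[bisect_right(self.split_points, key)]


class Router:
    """Plays the role of mongos: the only component the client talks to."""

    def __init__(self, config, shards):
        self.config = config
        self.shards = {s.name: s for s in shards}

    def insert(self, doc, shard_key="user_id"):
        # Look up the owning shard in the metadata, then forward the write.
        target = self.config.shard_name_for(doc[shard_key])
        self.shards[target].docs.append(doc)
        return target


shards = [Shard("shardA"), Shard("shardB"), Shard("shardC")]
config = ConfigServer([100, 200], [s.name for s in shards])
router = Router(config, shards)

for uid in (5, 100, 150, 250):
    print(uid, "->", router.insert({"user_id": uid}))
```

The client code only ever calls `router.insert`; the shards hold data, and the config object holds nothing but the mapping.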
SHARDING ARCHITECTURE

•A MongoDB sharded cluster consists of a number of replica sets.


•Each replica set typically consists of at least three mongod instances.
•A sharded cluster may contain multiple shards, and each shard runs as its own
replica set.
•The application interacts with mongos, which in turn communicates with the shards.
Therefore, in a sharded cluster, applications should not connect directly to shard nodes.
•Data is partitioned among the shards according to the shard key, and the query router
uses the shard key to send each operation to the shards that hold the relevant data.
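How the shard key affects routing can be sketched as follows. The chunk table, the shard names, and the function `shards_for_range` are hypothetical; the sketch assumes range-based partitioning on a numeric shard key. A query that constrains the shard key can be sent to only the shards whose ranges overlap it, while a query that does not mention the shard key has to be sent to every shard.

```python
import math

# Hypothetical chunk table: (inclusive lower bound, exclusive upper bound, shard).
CHUNKS = [
    (-math.inf, 100, "shardA"),
    (100, 200, "shardB"),
    (200, math.inf, "shardC"),
]


def shards_for_range(low=None, high=None):
    """Return the shards whose chunks overlap the shard-key range [low, high].

    Both bounds are inclusive, so low == high models an equality match.
    With no bounds the query does not constrain the shard key, and every
    shard must be contacted.
    """
    if low is None and high is None:
        return sorted({shard for _, _, shard in CHUNKS})
    low = -math.inf if low is None else low
    high = math.inf if high is None else high
    # A chunk [lo, hi) overlaps [low, high] when lo <= high and low < hi.
    return sorted({shard for lo, hi, shard in CHUNKS if lo <= high and low < hi})


print(shards_for_range(100, 100))  # equality on the shard key: one shard
print(shards_for_range(50, 150))   # range spanning two chunks
print(shards_for_range())          # no shard key in the query: all shards
```

This is one reason the choice of shard key matters: queries that include it touch fewer shards.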
BENEFITS OF SHARDING OVER REPLICATION

•In replication, the primary node handles all write operations, whereas secondary
servers maintain copies of the data and can serve read-only operations. In sharding
combined with replica sets, the load is distributed across a number of servers.
•A single replica set is limited to 50 members (12 before MongoDB 3.0), but there is
no fixed limit on the number of shards.
•Replication alone requires high-end hardware (vertical scaling) to handle large
datasets, which is far more expensive than adding servers to a sharded cluster.
•In replication, only read performance can be improved by adding more secondary
servers, whereas in sharding both read and write performance improve as more
shards are added.
