Storage Tiering and Erasure Coding in Ceph - 150222
2
ARCHITECTURE
CEPH MOTIVATING PRINCIPLES
● All components must scale horizontally
● There can be no single point of failure
● The solution must be hardware agnostic
● Should use commodity hardware
● Self-manage whenever possible
● Open source (LGPL)
4
CEPH COMPONENTS
LIBRADOS
A library allowing apps to directly access RADOS (C, C++, Java, Python, Ruby, PHP)
RADOS
A software-based, reliable, autonomous, distributed object store comprised of
self-healing, self-managing, intelligent storage nodes and lightweight monitors
5
ROBUST SERVICES BUILT ON RADOS
ARCHITECTURAL COMPONENTS
7
THE RADOS GATEWAY
[Diagram: applications issue REST requests to RADOSGW instances, which use LIBRADOS over a socket to reach the RADOS cluster and its monitors (M)]
8
MULTI-SITE OBJECT STORAGE
[Diagram: web applications and app servers at multiple sites sharing the object store]
9
RADOSGW MAKES RADOS WEBBY
RADOSGW:
● REST-based object storage proxy
● Uses RADOS to store objects
– Stripes large RESTful objects across many RADOS objects
● API supports buckets, accounts
● Usage accounting for billing
● Compatible with S3 and Swift applications
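Because RADOSGW exposes an S3-compatible API, a stock S3 client can exercise it. Below is a minimal sketch using the boto library; the endpoint, credentials, and bucket name are placeholders for whatever your gateway is configured with.

    # Sketch: talking to RADOSGW through its S3-compatible API with boto.
    # Host, credentials, and bucket name are placeholders.
    import boto
    import boto.s3.connection

    conn = boto.connect_s3(
        aws_access_key_id='ACCESS_KEY',
        aws_secret_access_key='SECRET_KEY',
        host='rgw.example.com',          # RADOSGW endpoint
        is_secure=False,
        calling_format=boto.s3.connection.OrdinaryCallingFormat(),
    )
    bucket = conn.create_bucket('demo-bucket')
    key = bucket.new_key('hello.txt')
    key.set_contents_from_string('hello world')   # RGW stripes this over RADOS objects
    print(key.get_contents_as_string())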
11
ARCHITECTURAL COMPONENTS
12
STORING VIRTUAL DISKS
[Diagram: a VM's virtual disk is served by the hypervisor through LIBRBD, which talks to the RADOS cluster]
13
KERNEL MODULE
[Diagram: a Linux host maps an RBD image through the KRBD kernel module, which talks directly to the RADOS cluster]
14
RBD FEATURES
● Stripe images across entire cluster (pool)
● Read-only snapshots
● Copy-on-write clones
● Broad integration
– Qemu
– Linux kernel
– iSCSI (STGT, LIO)
– OpenStack, CloudStack, Nebula, Ganeti, Proxmox
● Incremental backup (relative to snapshots)
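As an illustration of the snapshot and copy-on-write clone features, here is a hedged sketch using the python-rados and python-rbd bindings; pool, image, and snapshot names are made up, and a real deployment may require different image features.

    # Sketch: create an RBD image, snapshot it, and make a copy-on-write clone.
    # Pool, image, and snapshot names are placeholders.
    import rados
    import rbd

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    ioctx = cluster.open_ioctx('rbd')                 # pool holding the images

    r = rbd.RBD()
    r.create(ioctx, 'base-image', 10 * 1024**3,       # 10 GiB image
             old_format=False, features=rbd.RBD_FEATURE_LAYERING)

    with rbd.Image(ioctx, 'base-image') as img:
        img.create_snap('gold')                       # read-only snapshot
        img.protect_snap('gold')                      # required before cloning

    r.clone(ioctx, 'base-image', 'gold', ioctx, 'vm-disk-1',
            features=rbd.RBD_FEATURE_LAYERING)        # copy-on-write clone

    ioctx.close()
    cluster.shutdown()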
15
ARCHITECTURAL COMPONENTS
16
SEPARATE METADATA SERVER
[Diagram: the kernel client on a Linux host sends metadata operations to the metadata server and file data directly to the RADOS cluster]
17
SCALABLE METADATA SERVERS
METADATA SERVER
● Manages metadata for a POSIX-compliant shared filesystem
– Directory hierarchy
– File metadata (owner, timestamps, mode, etc.)
● Snapshots on any directory
● Clients stripe file data in RADOS
– MDS not in data path
● MDS stores metadata in RADOS
● Dynamic MDS cluster scales to 10s or 100s
● Only required for shared filesystem
18
RADOS
ARCHITECTURAL COMPONENTS
20
RADOS
● Flat object namespace within each pool
● Rich object API (librados)
– Bytes, attributes, key/value data
– Partial overwrite of existing data (mutable objects)
– Single-object compound operations
– RADOS classes (stored procedures)
● Strong consistency (CP system)
● Infrastructure aware, dynamic topology
● Hash-based placement (CRUSH)
● Direct client to server data path
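A small, hedged sketch of that object API through the librados Python binding; the pool and object names are placeholders, and the key/value (omap) and compound-operation calls are omitted for brevity.

    # Sketch: bytes, partial overwrite, and attributes via the librados Python binding.
    # Pool and object names are placeholders.
    import rados

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    ioctx = cluster.open_ioctx('mypool')

    ioctx.write_full('greeting', b'hello')         # object data (bytes)
    ioctx.write('greeting', b'HE', 0)              # partial overwrite at offset 0
    ioctx.set_xattr('greeting', 'lang', b'en')     # per-object attribute
    print(ioctx.read('greeting'))                  # b'HEllo'
    print(ioctx.get_xattr('greeting', 'lang'))     # b'en'

    ioctx.close()
    cluster.shutdown()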
21
RADOS CLUSTER
[Diagram: an application talks directly to the OSDs and monitors (M) of a RADOS cluster]
22
RADOS COMPONENTS
OSDs:
● 10s to 1000s in a cluster
● One per disk (or one per SSD, RAID group…)
● Serve stored objects to clients
● Intelligently peer for replication & recovery
Monitors:
● Maintain cluster membership and state
23
OBJECT STORAGE DAEMONS
[Diagram: each OSD stores its objects through a local filesystem (FS) on its disk]
24
DATA PLACEMENT
WHERE DO OBJECTS LIVE?
[Diagram: an application holds an object; which node in the cluster should store it?]
26
A METADATA SERVER?
[Diagram: option 1: (1) ask a central metadata server where the object lives, then (2) contact that node]
27
CALCULATED PLACEMENT
[Diagram: option 2: compute the location from the object name itself (e.g., name ranges A-G, H-N, O-T, U-Z each map to a server); object "F" goes straight to the A-G server]
28
CRUSH
[Diagram: objects are hashed into placement groups (PGs), and CRUSH maps each PG onto a set of OSDs in the RADOS cluster]
30
CRUSH AVOIDS FAILED DEVICES
[Diagram: when an OSD fails, CRUSH maps the object's PG onto a different, healthy OSD]
31
CRUSH: DECLUSTERED PLACEMENT
● Each PG independently maps to a pseudorandom set of OSDs
● PGs that map to the same OSD generally have replicas that do not
● When an OSD fails, each PG it stored is re-replicated by a different set of surviving OSDs
– Highly parallel recovery
– Avoid single-disk recovery bottleneck
[Diagram: a failed OSD's PGs recover in parallel across the RADOS cluster]
32
CRUSH: DYNAMIC DATA PLACEMENT
CRUSH:
● Pseudo-random placement algorithm
– Fast calculation, no lookup
– Repeatable, deterministic
● Statistically uniform distribution
● Stable mapping
– Limited data migration on change
● Rule-based configuration
– Infrastructure topology aware
– Adjustable replication
– Weighted devices (different sizes)
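To illustrate the flavor of the properties above (illustration only: this is not the CRUSH algorithm, which also walks a hierarchical, rule-driven cluster map), here is a toy weighted rendezvous-hash placement that any client can compute locally with no lookup table.

    import hashlib

    def place(obj_name, osds, replicas=3):
        """Toy deterministic, weighted placement (rendezvous hashing).
        Illustrates 'fast calculation, no lookup, repeatable, weighted';
        it is NOT the real CRUSH algorithm."""
        def score(osd_id, weight):
            h = hashlib.sha1(('%s:%s' % (obj_name, osd_id)).encode()).hexdigest()
            u = int(h, 16) / float(16 ** 40)        # pseudorandom, uniform in [0, 1)
            return u ** (1.0 / weight)              # heavier devices win more often
        ranked = sorted(osds, key=lambda o: score(*o), reverse=True)
        return [osd_id for osd_id, _ in ranked[:replicas]]

    osds = [('osd.0', 1.0), ('osd.1', 1.0), ('osd.2', 2.0), ('osd.3', 1.0)]
    print(place('rbd_data.1234.0000000000000000', osds))  # same result on every client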
33
DATA IS ORGANIZED INTO POOLS
[Diagram: objects in pools A, B, C, and D hash into each pool's own placement groups (PGs), which are distributed across the cluster]
34
TIERED STORAGE
TWO WAYS TO CACHE
● Within each OSD
– Combine SSD and HDD under each OSD
– Make localized promote/demote decisions
– Leverage existing tools
● dm-cache, bcache, FlashCache
● Variety of caching controllers
– We can help with hints
● Cache on separate devices/nodes
– Different hardware for different tiers
● Slow nodes for cold data
● High performance nodes for hot data
– Add, remove, scale each tier independently
● Unlikely to choose right ratios at procurement time
[Diagram: a single OSD layering its filesystem over a block device that combines an HDD with an SSD cache]
36
TIERED STORAGE
[Diagram: application I/O goes to a cache tier layered in front of a base storage tier]
37
RADOS TIERING PRINCIPLES
● Each tier is a RADOS pool
– May be replicated or erasure coded
● Tiers are durable
– e.g., replicate across SSDs in multiple hosts
● Each tier has its own CRUSH policy
– e.g., map the cache pool to SSD devices/hosts only
● librados adapts to the tiering topology
– Transparently directs requests accordingly (e.g., to the cache)
– No changes to RBD, RGW, CephFS, etc.
38
READ (CACHE HIT)
[Diagram: the client's read is served directly from the cache tier]
39
READ (CACHE MISS)
[Diagram: on a miss, the cache tier proxies the read to the base tier and returns the data]
40
READ (CACHE MISS)
[Diagram: alternatively, the object is promoted from the base tier into the cache and then served]
42
WRITE (HIT)
[Diagram: the client writes to the cache tier and receives the ack; the cached object is now dirty]
43
WRITE (MISS)
[Diagram: the object is first promoted from the base tier into the cache, then the write is applied and acked]
44
WRITE (MISS) (COMING SOON)
[Diagram: instead of promoting, the write is proxied through to the base tier and acked]
45
ESTIMATING TEMPERATURE
● Each PG constructs in-memory bloom filters
– Insert records on both read and write
– Each filter covers configurable period (e.g., 1 hour)
– Tunable false positive probability (e.g., 5%)
– Store most recent N periods on disk (e.g., last 24 hours)
● Estimate temperature
– Has object been accessed in any of the last N periods?
– ...in how many of them?
– Informs the flush/evict decision
● Estimate “recency”
– How many periods have passed since the object was last accessed?
– Informs read miss behavior: proxy vs promote
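A toy model of the same bookkeeping, using plain Python sets where the OSD actually uses per-period bloom filters to bound memory; the period length and retention are illustrative.

    from collections import deque

    class HitSetHistory(object):
        """Toy per-PG hit-set history: one set per time period, newest first.
        Real OSDs use bloom filters per period; plain sets keep the sketch short."""
        def __init__(self, periods=24):
            self.history = deque(maxlen=periods)    # most recent N periods
            self.current = set()

        def record_access(self, obj):               # called on reads *and* writes
            self.current.add(obj)

        def rotate(self):                           # end of a period (e.g., one hour)
            self.history.appendleft(self.current)
            self.current = set()

        def temperature(self, obj):
            """In how many of the retained periods was the object accessed?"""
            return sum(1 for hs in self.history if obj in hs)

        def recency(self, obj):
            """How many periods ago was the most recent access? (proxy vs promote)"""
            for age, hs in enumerate(self.history):
                if obj in hs:
                    return age
            return len(self.history)                # not seen in any retained period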
46
FLUSH AND/OR EVICT COLD DATA
[Diagram: cold objects are flushed from the cache tier back to the base tier and then evicted]
47
TIERING AGENT
● Each PG has an internal tiering agent
– Manages PG based on administrator defined policy
● Flush dirty objects
– When pool reaches target dirty ratio
– Tries to select cold objects
– Marks objects clean when they have been written back
to the base pool
● Evict (delete) clean objects
– Greater “effort” as cache pool approaches target size
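A sketch of that policy as Python; the pg object and its methods (dirty_ratio, temperature, flush_to_base, evict, and so on) are hypothetical stand-ins for the OSD's internal interfaces, not a real API.

    def agent_pass(pg, target_dirty_ratio=0.4, target_full_ratio=0.8):
        """Illustrative tiering-agent pass for one cache-pool PG.
        All methods on `pg` are hypothetical stand-ins for OSD internals."""
        # Flush: copy dirty objects back to the base pool, coldest first,
        # until the pool is back under its target dirty ratio.
        if pg.dirty_ratio() > target_dirty_ratio:
            for obj in sorted(pg.dirty_objects(), key=pg.temperature):
                pg.flush_to_base(obj)               # write back, then mark clean
                if pg.dirty_ratio() <= target_dirty_ratio:
                    break
        # Evict: delete clean cached copies, trying harder as the cache
        # pool approaches its target size.
        if pg.full_ratio() > target_full_ratio:
            for obj in sorted(pg.clean_objects(), key=pg.temperature):
                pg.evict(obj)                       # base pool still holds the data
                if pg.full_ratio() <= target_full_ratio:
                    break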
48
CACHE TIER USAGE
● Cache tier should be faster than the base tier
● Cache tier should be replicated (not erasure coded)
● Promote and flush are expensive
– Best results when object temperatures are skewed
● Most I/O goes to small number of hot objects
– Cache should be big enough to capture most of the working set
● Challenging to benchmark
– Need a realistic workload (e.g., not 'dd') to determine
how it will perform in practice
– Takes a long time to “warm up” the cache
49
ERASURE CODING
ERASURE CODING
[Diagram: replication stores full copies of an object on several OSDs; erasure coding splits it into data shards (1-4) plus coding shards (X, Y), one per OSD]
53
ERASURE CODING SHARDS
Shard (one per OSD):   1    2    3    4    X    Y
Stripe 0:              0    1    2    3    A    A'
Stripe 1:              4    5    6    7    B    B'
Stripe 2:              8    9    10   11   C    C'
Stripe 3:              12   13   14   15   D    D'
Stripe 4:              16   17   18   19   E    E'
● Variable stripe size (e.g., 4 KB)
● Zero-fill shards (logically) in partial tail stripe
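To make the layout concrete, here is a simplified sketch that splits an object into k data shards of fixed stripe-unit size, zero-fills the partial tail stripe, and computes a single XOR parity shard; real plugins (jerasure, ISA-L, and friends) produce m Reed-Solomon-style coding shards instead.

    def encode_shards(data, k=4, stripe_unit=4096):
        """Split `data` into k data shards plus one XOR parity shard.
        Illustration only: real EC plugins generate m coding shards with
        Reed-Solomon or similar codes, not a single XOR parity."""
        stripe = k * stripe_unit
        if len(data) % stripe:                          # zero-fill the partial tail stripe
            data += b'\x00' * (stripe - len(data) % stripe)
        shards = [bytearray() for _ in range(k + 1)]    # k data shards + 1 parity shard
        for off in range(0, len(data), stripe):
            parity = bytearray(stripe_unit)
            for i in range(k):
                unit = data[off + i * stripe_unit: off + (i + 1) * stripe_unit]
                shards[i] += unit
                for j, b in enumerate(unit):
                    parity[j] ^= b
            shards[k] += parity
        return shards                                   # shards[0..k-1] data, shards[k] parity

    shards = encode_shards(b'hello, erasure coded world ' * 1000)
    # Any one lost data shard can be rebuilt by XORing the k surviving shards.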
54
PRIMARY COORDINATES
[Diagram: one OSD acts as the PG's primary and coordinates I/O across the shards (1-4, X, Y)]
55
EC READ
[Diagram: the client sends a READ for the object to the primary OSD of the erasure-coded PG]
56
EC READ
[Diagram: the primary reads the shards it needs from the other OSDs]
57
EC READ
[Diagram: the primary reassembles the object and returns the READ REPLY to the client]
58
EC WRITE
[Diagram: the client sends a WRITE for the object to the primary OSD]
59
EC WRITE
[Diagram: the primary encodes the data and distributes the shard writes (1-4, X, Y) to the other OSDs]
60
EC WRITE
[Diagram: once the shards are written, the primary sends the WRITE ACK to the client]
61
EC WRITE: DEGRADED
[Diagram: the write proceeds even though some shard OSDs are down (degraded)]
62
EC WRITE: PARTIAL FAILURE
[Diagram: the primary fails partway through distributing the shard writes]
63
EC WRITE: PARTIAL FAILURE
[Diagram: the erasure coded pool is left with shards at mixed versions: some at the new version (B), some still at the old version (A)]
64
EC RESTRICTIONS
● Overwrite in place will not work in general
● Log and 2PC would increase complexity, latency
● We chose to restrict allowed operations
– create
– append (on stripe boundary)
– remove (keep previous generation of object for some time)
● These operations can all easily be rolled back locally
– create → delete
– append → truncate
– remove → roll back to previous generation
● Object attrs preserved in existing PG logs (they are small)
● Key/value data is not allowed on EC pools
65
EC WRITE: PARTIAL FAILURE
[Diagram: after the failure, the surviving shards of the erasure coded pool are inconsistent (a mix of versions B and A)]
66
EC WRITE: PARTIAL FAILURE
[Diagram: the partial write is rolled back, leaving every shard at the consistent previous version (A)]
67
EC RESTRICTIONS
● This is a small subset of allowed librados operations
– Notably cannot (over)write any extent
● Coincidentally, unsupported operations are also
inefficient for erasure codes
– Generally require read/modify/write of affected stripe(s)
● Some can consume EC directly
– RGW (no object data update in place)
● Others can combine EC with a cache tier (RBD,
CephFS)
– Replication for warm/hot data
– Erasure coding for cold data
– Tiering agent skips objects with key/value data
68
WHICH ERASURE CODE?
● The EC algorithm and implementation are pluggable
– jerasure/gf-complete (free, open, and very fast)
– ISA-L (Intel library; optimized for modern Intel procs)
– LRC (local recovery code – layers over existing plugins)
– SHEC (trades extra storage for recovery efficiency – new from Fujitsu)
● Parameterized
– Pick “k” and “m”, stripe size
● OSD handles data path, placement, rollback, etc.
● Erasure plugin handles
– Encode and decode math
– Given these available shards, which ones should I fetch to satisfy a
read?
– Given these available shards and these missing shards, which ones
should I fetch to recover?
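One simple way to compare profiles is raw-space overhead, (k+m)/k, against the number of failures tolerated, m; the sketch below just prints that arithmetic for a few common choices.

    def overhead(k, m):
        """Raw bytes stored per byte of user data for a k+m erasure-code profile."""
        return float(k + m) / k

    for k, m in [(2, 1), (4, 2), (8, 3), (10, 4)]:
        print('k=%d m=%d: %.2fx raw space, tolerates %d failures'
              % (k, m, overhead(k, m), m))
    # For comparison, 3x replication uses 3.00x raw space and also tolerates 2 failures.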
69
COST OF RECOVERY
[Diagram: an OSD holding 1 TB of data]
70
COST OF RECOVERY
[Diagram: the 1 TB OSD fails]
71
COST OF RECOVERY (REPLICATION)
[Diagram: with replication, the lost 1 TB is re-read from surviving replicas]
72
COST OF RECOVERY (REPLICATION)
[Diagram: declustered placement spreads the work: many OSDs each recover ~0.01 TB in parallel]
73
COST OF RECOVERY (REPLICATION)
[Diagram: in total, about 1 TB of data is read and re-written to restore the lost replicas]
74
COST OF RECOVERY (EC)
[Diagram: with a k=4 erasure code, rebuilding the lost 1 TB of shards requires reading about 1 TB from each of the k surviving shards, roughly 4 TB in total]
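The same comparison as back-of-the-envelope arithmetic: under replication, rebuilding a failed OSD's 1 TB needs roughly 1 TB of reads from surviving copies, while under a k+m code each lost shard is reconstructed from k surviving shards, so roughly k times as much must be read.

    def recovery_reads_tb(lost_tb, k=None):
        """Approximate data read to rebuild a failed OSD holding `lost_tb`.
        Replication (k=None): read one surviving copy of everything.
        k+m erasure code: each lost shard is rebuilt from k surviving shards."""
        return lost_tb if k is None else k * lost_tb

    print(recovery_reads_tb(1.0))          # replication: ~1 TB read
    print(recovery_reads_tb(1.0, k=4))     # 4+2 erasure code: ~4 TB read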
75
LOCAL RECOVERY CODE (LRC)
[Diagram: LRC adds local parity shards (A, B, C) computed over subsets of the base code's shards (1-4, X, Y), so a single lost shard can be rebuilt from its small local group instead of k shards]
76
BIG THANKS TO
● Ceph
– Loic Dachary (CloudWatt, FSF France, Red Hat)
– Andreas Peters (CERN)
– Sam Just (Inktank / Red Hat)
– David Zafman (Inktank / Red Hat)
● jerasure / gf-complete
– Jim Plank (University of Tennessee)
– Kevin Greenan (Box.com)
● Intel (ISA-L plugin)
● Fujitsu (SHEC plugin)
77
ROADMAP
WHAT'S NEXT
● Erasure coding
– Allow (optimistic) client reads directly from shards
– ARM optimizations for jerasure
● Cache pools
– Better agent decisions (when to flush or evict)
– Supporting different performance profiles
● e.g., slow / “cheap” flash can read just as fast
– Complex topologies
● Multiple readonly cache tiers in multiple sites
● Tiering
– Support “redirects” to (very) cold tier below base pool
– Enable dynamic spin-down, dedup, and other features
79
OTHER ONGOING WORK
● Performance optimization (SanDisk, Intel, Mellanox)
● Alternative OSD backends
– New backend: hybrid key/value and file system
– leveldb, rocksdb, LMDB
● Messenger (network layer) improvements
– RDMA support (libxio – Mellanox)
– Event-driven TCP implementation (UnitedStack)
● CephFS
– Online consistency checking and repair tools
– Performance, robustness
● Multi-datacenter RBD, RADOS replication
80
FOR MORE INFORMATION
● http://ceph.com
● http://github.com/ceph
● http://tracker.ceph.com
● Mailing lists
– ceph-users@ceph.com
– ceph-devel@vger.kernel.org
● irc.oftc.net
– #ceph
– #ceph-devel
● Twitter
– @ceph
81
THANK YOU!
Sage Weil
CEPH PRINCIPAL ARCHITECT
sage@redhat.com
@liewegas