Class 3 Cassandra

The document provides a comprehensive overview of Apache Cassandra, a distributed NoSQL database designed for large-scale data management. It covers key features, architecture, data modeling, querying with CQL, performance optimization, and security measures. Additionally, it includes advanced topics and real-world use cases demonstrating Cassandra's application in various industries.

Uploaded by suojuhe

DSA 5600 – NoSQL Database Systems
Section 3: Apache Cassandra
Instructor: Yutong Zhao
Outline
1. Introduction to Cassandra
2. Architecture & Data Model
3. Working with Cassandra
4. Querying Data with CQL
5. Indexing & Performance Optimization
6. Replication, Consistency, and Security
7. Backup, Monitoring & Recovery
8. Advanced Topics & Real-world Use Cases
1. Introduction to Cassandra

What is Cassandra?
• Apache Cassandra is a distributed NoSQL database designed to handle large volumes of structured data across many servers with no single point of failure.

• Originally developed at Facebook to power its inbox search, it was later open-sourced and became an Apache Software Foundation project.

• Key Characteristics:
• Scalability: Scales horizontally by adding more nodes.
• High Availability: No single point of failure; data is replicated across multiple nodes.
• NoSQL Model: Uses a wide-column (column-family) storage model. Tables declared in CQL have a schema, but columns can be added later with ALTER TABLE.
• Optimized for Write-Intensive Workloads: Handles high-speed inserts and updates efficiently.
• Eventual Consistency: Favors availability over strict consistency (the AP side of the CAP theorem); the consistency level can be tuned per query.

• Used By: Netflix, Twitter, eBay, Uber, and many others for high-performance, globally distributed applications.
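
The trade-off between availability and consistency listed above can be made concrete with a little arithmetic. The sketch below is not Cassandra code, and the function names are hypothetical. It assumes the usual definition of a quorum as a majority of replicas, and checks whether a read is guaranteed to overlap the most recent write on at least one replica (read replicas + write replicas > replication factor).

```python
def quorum(replication_factor: int) -> int:
    """Majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def read_overlaps_write(rf: int, write_replicas: int, read_replicas: int) -> bool:
    """True when any read set must share at least one replica with any write set."""
    return read_replicas + write_replicas > rf


rf = 3                                   # hypothetical replication factor
q = quorum(rf)
print(q)                                 # 2
print(read_overlaps_write(rf, q, q))     # True: quorum writes + quorum reads
print(read_overlaps_write(rf, 1, 1))     # False: fast, but only eventually consistent
```

With a replication factor of 3, writing and reading at quorum touches 2 replicas each, so the two sets always intersect. Reading and writing a single replica is faster and stays available with more nodes down, at the cost of possibly stale reads.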

Key Features of Cassandra (1)

Key Features of Cassandra (2)

When to use Cassandra?

Cassandra vs. Relational Database (RDBMS)

Cassandra vs. Other NoSQL Databases

2. Architecture & Data Model

Architecture Overview: Nodes, Partitions

Architecture Overview: Replication

Gossip Protocol – How Nodes Communicate

Hinted Handoff & Read Repair

Cassandra’s Data Model

Key Features of Cassandra

3. Working with Cassandra

Creating a Keyspace

Creating a Table with Primary and Clustering Keys
• product_id is the partition key; its hashed value (token) determines which node stores the data.

• The schema is flexible: new columns can be added later with ALTER TABLE, without rewriting existing rows.
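
How a partition key such as product_id ends up on a particular node can be sketched as a lookup on a token ring. This is a simplified illustration, not Cassandra's implementation: the `TokenRing` class, the node names, and the sample key are hypothetical, MD5 stands in for the Murmur3 hash used by Cassandra's default partitioner, tokens are treated as unsigned 64-bit values, and each node owns a single token (no virtual nodes, no replication).

```python
import bisect
import hashlib


def token_for(partition_key: str) -> int:
    """Hash a partition key to a 64-bit token (MD5 is a stand-in for Murmur3)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class TokenRing:
    """Maps each token to the node that owns the range ending at that node's token."""

    def __init__(self, node_tokens: dict):
        # Sort (token, node) pairs so the ring can be searched with bisect.
        self._ring = sorted((tok, node) for node, tok in node_tokens.items())
        self._tokens = [tok for tok, _ in self._ring]

    def owner(self, partition_key: str) -> str:
        """Return the first node whose token is >= the key's token, wrapping around."""
        idx = bisect.bisect_left(self._tokens, token_for(partition_key))
        return self._ring[idx % len(self._ring)][1]


# Hypothetical three-node ring with hand-picked tokens.
ring = TokenRing({"node-a": 2**62, "node-b": 2**63, "node-c": 2**64 - 1})
print(ring.owner("product-1001"))
```

The same key always hashes to the same token, so every read and write for a given product_id is routed to the same owner. This is why the choice of partition key drives both data distribution and query routing.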

Best Practices for Schema Design

Best Practices for Schema Design

Partitioning Strategies for Performance Optimization

Denormalization vs. Normalization in Cassandra

4. Querying Data with CQL

Inserting and Querying Data

CRUD Operations in Cassandra

Query Data with CQL – Basic SELECT

• Basic Querying with SELECT:

Query Data with CQL – Filtering Data

• Filtering Data:

Query Data with CQL – Aggregation

• Aggregation & Counting Records:

Lightweight Transactions (LWT) – Ensuring Consistency

Using TTL (Time-to-Live) for Expiring Data

Batch Queries: Benefits and Pitfalls

5. Indexing & Performance Optimization

Secondary Index

Materialized View

SASI Indexes – Advanced Searching

Read/Write Path Internals – Memtables, SSTables, and Commitlogs

6. Replication, Consistency, and Security

Performance Optimization & Data Replication

Replication Strategies in Cassandra

Understanding Consistency Levels

Role-Based Access Control (RBAC) & User Management

TLS Encryption & Authentication for Secure Cassandra

Audit Logging & Monitoring Access

Tuning Read & Write Performance

Using Caching & Compaction Strategies

7. Backup, Monitoring & Recovery

Backup and Restore in Cassandra

Multi-Data Center Replication for Global Availability

Handling Node Failures in Production

Security Features in Cassandra

Monitoring and Troubleshooting Cassandra

Advanced Data Modeling in Cassandra

Integration with Other Tools

8. Advanced Topics & Real-world Use Cases

Case Studies of Cassandra in Production
