ADS_QB
Advantages:
Greater scalability, as it enables processing large
datasets in parallel.
Load balancing by distributing workload equally among
nodes.
Improved data isolation and fault tolerance.
Disadvantages:
Complex join operations across partitions.
Potential data skew if distribution is uneven.
Challenges in distributed transaction management.
2. Vertical Partitioning:
Description: Divides dataset based on columns or
attributes, with each partition containing a subset of
columns for each row.
Advantages:
Improved query performance by placing frequently
accessed columns in separate partitions.
Efficient data retrieval for queries needing only a subset
of columns.
Simplified schema management for adding or removing
columns.
Disadvantages:
Increased complexity in query execution plans.
Complex joins across partitions.
Limited scalability for datasets continuously growing in
columns.
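A minimal sketch of vertical partitioning, assuming a hypothetical
customer table held as a list of dictionaries. The column names and
the helper names vertical_partition and reconstruct are illustrative
only. Each fragment repeats the key column so that the original rows
can be rebuilt with a join on that key.

```python
# Hypothetical rows of a "customer" table; column names are illustrative.
rows = [
    {"id": 1, "name": "Asha", "email": "asha@example.com", "bio": "long text"},
    {"id": 2, "name": "Ravi", "email": "ravi@example.com", "bio": "more text"},
]


def vertical_partition(rows, key, column_groups):
    """Split each row into one fragment per column group, repeating the key."""
    fragments = [[] for _ in column_groups]
    for row in rows:
        for fragment, columns in zip(fragments, column_groups):
            fragment.append({key: row[key], **{c: row[c] for c in columns}})
    return fragments


def reconstruct(fragments, key):
    """Join the fragments back together on the shared key."""
    merged = {}
    for fragment in fragments:
        for part in fragment:
            merged.setdefault(part[key], {}).update(part)
    return list(merged.values())


# Frequently read columns go in one fragment, the bulky column in another.
hot, cold = vertical_partition(rows, "id", [["name", "email"], ["bio"]])
rebuilt = reconstruct([hot, cold], "id")
```

A query that needs only name and email reads the hot fragment alone,
while a query that needs every column pays for the join in
reconstruct.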
3. Key-based Partitioning:
Description: Divides data based on a particular key or
attribute value, ensuring data with the same key value is
stored in the same partition.
Advantages:
Efficient key lookups, since all rows for a key are in one
partition; distribution is even when keys are evenly spread.
Scalability and load balancing across partitions.
Disadvantages:
Potential skew and hotspots if key distribution is
uneven.
Limited query flexibility for queries involving multiple
keys or range queries.
Requires careful partition management.
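A minimal sketch of key-based partitioning, assuming a hypothetical
directory (PARTITION_OF) that maps each key value to a partition name.
The table, the key column and the partition names are made up for
illustration; real systems keep such a directory in their catalog.

```python
from collections import defaultdict

# Hypothetical directory: which partition owns each key value.
PARTITION_OF = {"IN": "p_asia", "JP": "p_asia", "DE": "p_europe", "FR": "p_europe"}

orders = [
    {"order_id": 1, "country": "IN"},
    {"order_id": 2, "country": "DE"},
    {"order_id": 3, "country": "IN"},
    {"order_id": 4, "country": "JP"},
    {"order_id": 5, "country": "IN"},
]


def partition_by_key(rows, key, directory):
    """Place every row in the partition that owns its key value."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[directory[row[key]]].append(row)
    return dict(partitions)


partitions = partition_by_key(orders, "country", PARTITION_OF)
sizes = {name: len(part) for name, part in partitions.items()}
```

All orders for one country land in the same partition, so a lookup by
country touches one partition. In this sample most orders share the
key IN, so p_asia ends up larger than p_europe, which is the skew
described above.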
4. Range Partitioning:
Description: Divides dataset based on predetermined
ranges of values, suitable for datasets with natural
ordering based on a specific attribute.
Advantages:
Natural ordering for efficient data retrieval based on
ranges.
Even data distribution across partitions, provided the
range boundaries match how the values are distributed.
Simplified query planning for range-based conditions.
Disadvantages:
Uneven data distribution across ranges.
Challenges in managing data growth and adjusting
ranges.
Complexity in joins and range queries across partitions.
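A minimal sketch of range partitioning, assuming hypothetical
boundaries (BOUNDS) on a numeric attribute. The function names are
illustrative. Python's bisect module finds the partition for a value,
and the same boundaries tell a range query which partitions it can
skip.

```python
from bisect import bisect_right

# Hypothetical exclusive upper bounds of the first three partitions;
# a fourth partition holds every value from 3000 upward.
BOUNDS = [1000, 2000, 3000]


def range_partition(value, bounds=BOUNDS):
    """Return the index of the partition whose range contains value."""
    return bisect_right(bounds, value)


def partitions_for_range(low, high, bounds=BOUNDS):
    """Return the partitions a query low <= value <= high must read."""
    first = range_partition(low, bounds)
    last = range_partition(high, bounds)
    return list(range(first, last + 1))


touched = partitions_for_range(1500, 2500)
```

Here the query for values between 1500 and 2500 reads only partitions
1 and 2. Changing BOUNDS later means moving rows between partitions,
which is the range-adjustment cost listed under the disadvantages.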
5. Hash-based Partitioning:
Description: Applies a hash function to a chosen key and
uses the result to assign each row to a partition, which
spreads rows pseudo-randomly but deterministically.
Advantages:
Even data distribution for load balancing.
Scalability and parallel processing.
Relatively simple implementation.
Disadvantages:
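Inefficient for range queries, since consecutive key
values are scattered across partitions.
Potential hotspots when many rows share the same key
value.
Requires redistributing data when the number of
partitions changes as the dataset grows.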
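A minimal sketch of hash-based partitioning, assuming string keys and
a simple "hash modulo number of partitions" rule. The function name
hash_partition and the sample keys are illustrative. SHA-256 from
hashlib is used here only because Python's built-in hash() for strings
is randomized per process; a production system would pick its own hash
function.

```python
import hashlib
from collections import Counter


def hash_partition(key, num_partitions):
    """Map a key to a partition number using a stable hash."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


keys = ["user-%d" % i for i in range(1000)]

# How many keys each of 4 partitions receives.
load = Counter(hash_partition(k, 4) for k in keys)

# Keys whose partition changes when a fifth partition is added.
moved = sum(1 for k in keys if hash_partition(k, 4) != hash_partition(k, 5))
```

The same key always maps to the same partition, and load shows how the
keys spread over the four partitions. The moved count illustrates the
management cost: with plain modulo hashing, adding a partition
reassigns many existing keys.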
6. Round-robin Partitioning:
Description: Distributes data evenly across partitions in
a cyclic manner.
Advantages:
Simple implementation.
Basic load balancing.
Scalability and parallel processing.
Disadvantages:
Unequal partition sizes in bytes if row sizes vary, even
though row counts stay balanced.
Inefficient data retrieval for lookups by value, since any
partition may hold the matching row.
Limited query optimization, as partitions cannot be
pruned for specific patterns.
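A minimal sketch of round-robin partitioning, assuming rows arrive as
a simple list. The function names are illustrative. Row i goes to
partition i modulo the number of partitions, without looking at the
row's contents.

```python
def round_robin_partition(rows, num_partitions):
    """Deal rows out to partitions in cyclic order."""
    partitions = [[] for _ in range(num_partitions)]
    for i, row in enumerate(rows):
        partitions[i % num_partitions].append(row)
    return partitions


def find(partitions, value):
    """Locate a value; every partition may have to be searched."""
    for number, partition in enumerate(partitions):
        if value in partition:
            return number
    return None


partitions = round_robin_partition(list(range(10)), 3)
location = find(partitions, 8)
```

Because placement ignores the data, nothing about the value 8 says
where it is stored, so find has to check partitions one by one.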
2. Explain Horizontal fragmentation with
example.
Horizontal fragmentation refers to the process of dividing a
table horizontally by assigning subsets of rows to different
fragments. Each fragment contains a portion of the original
table's rows, and together, these fragments cover the entire
table. Horizontal fragmentation is typically used in
distributed databases to partition data across multiple sites
or servers for improved performance and scalability.
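Example (a minimal sketch, assuming a hypothetical EMPLOYEE table
fragmented by city, with made-up rows and site names): each fragment
is defined by a selection predicate on the rows, and the union of the
fragments gives back the original table.

```python
# Hypothetical EMPLOYEE table; names and cities are illustrative.
employees = [
    {"emp_id": 1, "name": "Asha", "city": "Mumbai"},
    {"emp_id": 2, "name": "Ravi", "city": "Delhi"},
    {"emp_id": 3, "name": "Meera", "city": "Mumbai"},
    {"emp_id": 4, "name": "John", "city": "Pune"},
]

# One selection predicate per site. The predicates do not overlap and
# together they cover every possible row.
predicates = {
    "site_mumbai": lambda r: r["city"] == "Mumbai",
    "site_delhi": lambda r: r["city"] == "Delhi",
    "site_other": lambda r: r["city"] not in ("Mumbai", "Delhi"),
}

# Each fragment keeps all columns but only the rows matching its predicate.
fragments = {
    site: [row for row in employees if matches(row)]
    for site, matches in predicates.items()
}

# Reconstruction: the union of all fragments is the original table.
reconstructed = sorted(
    (row for fragment in fragments.values() for row in fragment),
    key=lambda r: r["emp_id"],
)
```

A query for Mumbai employees can be answered at site_mumbai alone,
while a query over all employees has to combine the three fragments.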
Advantages of ORDBs
2. Query Optimization:
- Finding the Best Way: Imagine you're trying to get from
point A to point B. There could be many different routes
you could take, each with its own advantages and
disadvantages. Similarly, query optimization is like finding
the best route to get the information you need from the
database.
- Making it Faster: Just like finding a faster way to get
from point A to point B can save you time, optimizing a
query can make it run faster and more efficiently.
- Using the Right Tools: Sometimes, you might need to
use a map or GPS to find the best route. In query
optimization, the system uses tools like indexes, statistics,
and algorithms to figure out the best way to retrieve the data
you're asking for.
- Getting the Same Answer: Optimization changes only how
the data is retrieved, not what is returned; every plan the
optimizer considers must produce the same correct result.
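The points above can be seen in a small sketch using Python's built-in
sqlite3 module. The emp table, its data and the index name
idx_emp_dept are assumptions made for illustration, and the exact
wording of the plan text differs between SQLite versions. The same
query is planned before and after an index is created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO emp (dept, salary) VALUES (?, ?)",
    [("dept%d" % (i % 50), 1000 + i) for i in range(5000)],
)

QUERY = "SELECT * FROM emp WHERE dept = 'dept7'"


def plan(sql):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Without an index, the only available route is to scan the whole table.
plan_before = plan(QUERY)
rows_before = conn.execute(QUERY).fetchall()

# With an index on dept, the optimizer can choose an index search instead.
conn.execute("CREATE INDEX idx_emp_dept ON emp (dept)")
plan_after = plan(QUERY)
rows_after = conn.execute(QUERY).fetchall()

print(plan_before)
print(plan_after)
```

The query text never changes; only the route chosen by the optimizer
does, and both routes return the same rows.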
9. Write a short note on OID.
OID, or Object Identifier, is a unique identifier assigned to
each object in a database. It acts like a special code or tag
that helps the database keep track of different objects. Just
like each book in a library has its own unique number, each
object in a database has its own OID. This makes it easy for
the database to organize and find specific objects quickly
when needed. OIDs play a crucial role in object-oriented
databases by uniquely identifying objects and facilitating
efficient data retrieval and manipulation.
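A minimal sketch of the idea, assuming a toy in-memory ObjectStore
class (hypothetical, not the API of any real object database) that
hands out a new OID on every insert. Two objects with identical
attribute values still get different OIDs, and an object keeps its OID
when its attributes change.

```python
import itertools


class ObjectStore:
    """Toy object store; a real OODBMS generates and manages OIDs itself."""

    def __init__(self):
        self._next_oid = itertools.count(1)
        self._objects = {}

    def insert(self, obj):
        """Store an object and return the system-generated OID."""
        oid = next(self._next_oid)
        self._objects[oid] = obj
        return oid

    def get(self, oid):
        """Fetch an object directly by its OID."""
        return self._objects[oid]


store = ObjectStore()
first = store.insert({"title": "Database Systems", "copies": 2})
second = store.insert({"title": "Database Systems", "copies": 2})

# Updating one object leaves its OID, and the other object, untouched.
store.get(first)["copies"] = 5
```

This mirrors the library analogy: two copies of the same book carry
different accession numbers even though their titles match.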
Disadvantages:
• Limited or no ACID transactions in many systems,
sacrificing consistency for scalability and performance.
• Limited query capabilities compared to SQL
databases.
• Learning curve for developers new to NoSQL
concepts and query languages.
• Eventual consistency model may not be sufficient for
applications requiring strong consistency.
• Less mature tooling and ecosystem compared to SQL
databases.
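A minimal sketch of the eventual consistency point, assuming a toy
store with one primary and one secondary copy and an explicit
replicate step. The class and method names are hypothetical and do not
model any particular NoSQL product.

```python
class EventuallyConsistentStore:
    """Toy store: writes reach the secondary copy only after replicate()."""

    def __init__(self):
        self.primary = {}
        self.secondary = {}
        self.pending = []

    def write(self, key, value):
        """Apply the write to the primary and queue it for the secondary."""
        self.primary[key] = value
        self.pending.append((key, value))

    def read_from_secondary(self, key):
        """Read from the secondary copy, which may lag behind."""
        return self.secondary.get(key)

    def replicate(self):
        """Apply all queued writes to the secondary copy."""
        for key, value in self.pending:
            self.secondary[key] = value
        self.pending.clear()


store = EventuallyConsistentStore()
store.write("balance", 100)
stale = store.read_from_secondary("balance")   # read before replication
store.replicate()
fresh = store.read_from_secondary("balance")   # read after replication
```

Between the write and the replicate call, a reader of the secondary
copy does not see the new balance, which is the window that
applications needing strong consistency cannot tolerate.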