ADBMS - Unit3 (Autosaved)
• In this model, multiple copies of data are tagged with timestamps or change identifiers
that allow the database to construct a snapshot of the database at a given point in time.
• In this way, MVCC provides for transaction isolation and consistency while
maximizing concurrency.
Multi-version Concurrency Control
• For example, in MVCC, if a database table is subjected to modifications
between the time a session starts reading the table and the time the session
finishes, the database will use previous versions of table data to ensure that the
session sees a consistent version.
• MVCC also means that until a transaction commits, other sessions do not see
the transaction’s modifications—other sessions look at older versions of the
data. These older copies of the data are also used to roll back transactions that
do not complete successfully.
Multi-version Concurrency Control - Example
• Figure 9-1 illustrates the MVCC model.
• A database session initiates a transaction at time t1 (1).
• At time t2, the session updates data in a database table (2);
• this results in a new version of that data being created (3).
• At about the same time, a second database session queries the database table, but because
the transaction from the first session has not yet been committed, it sees the previous
version of the data (4).
• After the first session commits the transaction (5), the second database session will read
from the modified version of the data (6).
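The sequence of steps above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular database stores versions: the class `VersionedCell` and the session names are hypothetical, and a single data item stands in for the table.

```python
class VersionedCell:
    """One data item holding a committed version plus uncommitted per-session versions."""

    def __init__(self, value):
        self.committed = value   # version visible to every session
        self.pending = {}        # session id -> uncommitted new version

    def update(self, session, value):
        # Steps (2) and (3): the writer creates a new version instead of overwriting.
        self.pending[session] = value

    def read(self, session):
        # A session sees its own uncommitted version; all others see the
        # committed version (4).
        return self.pending.get(session, self.committed)

    def commit(self, session):
        # Step (5): the new version becomes the committed one.
        if session in self.pending:
            self.committed = self.pending.pop(session)

    def rollback(self, session):
        # The old version is still intact, so rollback only discards the new one.
        self.pending.pop(session, None)


cell = VersionedCell("v1")
cell.update("session1", "v2")            # first session modifies the data
before_commit = cell.read("session2")    # second session still reads the old version
cell.commit("session1")
after_commit = cell.read("session2")     # second session now reads the new version (6)
```

Note that the reader is never blocked: it is simply handed the older version until the writer commits.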
Multi-version Concurrency Control
• The big advantage of MVCC is a reduction in lock overhead.
• In the example shown in Figure 9-1, without MVCC the update would
have created a blocking lock that would have prevented the second session
from reading the data until the transaction was completed.
Global Transaction Sequence Numbers
• MVCC can use transaction timestamps to determine which versions of data
should be made visible to specific queries; alternatively, a database can use a
global transaction sequence number for the same purpose.
• This sequence number is called the system change number (SCN) in Oracle and
the transaction sequence number in Microsoft SQL Server.
• This sequence number is incremented whenever a transaction is initiated, and it is
recorded in the structure of modified rows (or database blocks).
• When a query commences, it looks for rows that have a sequence number less than
or equal to the value of the sequence number that was current when the query began.
• If the query encounters a row with a higher sequence number, it knows it must
request an older version of that row.
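The visibility rule described above can be sketched as follows. This is an illustrative model only: the `Row` class, its `read_as_of` method, and the sequence numbers used are assumptions made for the example, not the storage format of Oracle or SQL Server.

```python
from dataclasses import dataclass, field


@dataclass
class Row:
    # (sequence_number, value) pairs, appended in increasing sequence order
    versions: list = field(default_factory=list)

    def write(self, seq, value):
        # Record the sequence number of the modifying transaction with the new version.
        self.versions.append((seq, value))

    def read_as_of(self, query_seq):
        # Walk from newest to oldest and return the first version whose sequence
        # number is less than or equal to the one current when the query began.
        for seq, value in reversed(self.versions):
            if seq <= query_seq:
                return value
        return None  # the row did not exist yet at that point


row = Row()
row.write(10, "a")
row.write(15, "b")

older = row.read_as_of(12)    # version 15 is too new, so the query gets "a"
newest = row.read_as_of(15)   # sequence 15 is visible, so the query gets "b"
missing = row.read_as_of(5)   # no version existed at sequence 5
```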
Two-phase Commit
• MVCC works with the ACID transaction model to provide isolation
between transactions running on a single system.
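• Across multiple nodes, the two-phase commit (2PC) protocol coordinates a
transaction in two phases.
• Commit-request phase, in which the coordinator asks all nodes to prepare their
transactions and report whether they are able to commit.
• Commit phase, in which the coordinator signals all nodes to commit their
transactions if the commit-request phase succeeded across all nodes.
Alternatively, if any node experiences difficulties, a rollback request is
sent to all nodes and the transaction fails.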
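A coordinator following the commit-request and commit phases might look roughly like the sketch below. The `Node` class, its `can_commit` flag, and the function `two_phase_commit` are hypothetical names for illustration; a real implementation would also need durable logging, timeouts, and recovery, which are omitted here.

```python
class Node:
    """A participant in the distributed transaction (illustrative only)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit   # simulates whether this node can prepare
        self.state = "active"

    def prepare(self):
        # Commit-request phase: the node votes yes or no.
        self.state = "prepared" if self.can_commit else "failed"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled back"


def two_phase_commit(nodes):
    # Phase 1: ask every node to prepare and collect all votes.
    votes = [node.prepare() for node in nodes]

    # Phase 2: commit everywhere only if every node voted yes.
    if all(votes):
        for node in nodes:
            node.commit()
        return True

    # Any failure means a rollback request goes to all nodes.
    for node in nodes:
        node.rollback()
    return False


healthy = [Node("a"), Node("b")]
succeeded = two_phase_commit(healthy)

mixed = [Node("a"), Node("b", can_commit=False)]
failed = two_phase_commit(mixed)
```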
Consistency in MongoDB
• By default—in a single-server deployment—a MongoDB database
provides strict single-document consistency.
• Locks are used to ensure that two writes do not attempt to modify a
document simultaneously, and also that a reader will not see an
inconsistent view of the data.
MongoDB Locking
• We saw earlier how a multi-version concurrency control (MVCC) algorithm can be
used to allow readers to continue to read consistent versions of data concurrently
with update activity.
• MongoDB's original storage engine (MMAPv1) does not implement an MVCC system,
and therefore readers are prevented from reading a document that is being updated.
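The contrast with MVCC can be illustrated with a lock-based sketch. This is not MongoDB code: `LockedDocument` is a hypothetical class, and a plain mutex stands in for the server's locks. The read is attempted without blocking only so that the example cannot hang; a real reader would wait for the lock.

```python
import threading


class LockedDocument:
    """A document guarded by a single lock, with no older versions kept."""

    def __init__(self, data):
        self._lock = threading.Lock()
        self._data = dict(data)

    def begin_update(self):
        # The writer takes the lock for the duration of the update.
        self._lock.acquire()

    def end_update(self, changes):
        self._data.update(changes)
        self._lock.release()

    def try_read(self):
        # Without MVCC there is no previous version to fall back on, so a reader
        # that cannot get the lock gets nothing (None) in this sketch.
        if not self._lock.acquire(blocking=False):
            return None
        try:
            return dict(self._data)
        finally:
            self._lock.release()


doc = LockedDocument({"qty": 1})
doc.begin_update()
during_update = doc.try_read()   # reader is locked out while the update is in progress
doc.end_update({"qty": 2})
after_update = doc.try_read()    # reader now sees the updated document
```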
• The granularity of MongoDB locks has changed during its history.
• In versions prior to MongoDB 2.0, a single global lock serialized all write
activity, blocking all concurrent readers and writers of any document
across the server for the duration of any write.
• Lock scope was narrowed to the database level in 2.2, and to the
collection level in 2.8.
• By default, all reads are directed to the primary server, which will always have the latest
version of a document. However, we saw in the previous chapter that we can configure the
MongoDB read preference to allow reads from secondary servers, which might return stale data.
• Eventually all secondary servers should receive all updates, so this behavior can loosely
be described as “eventually consistent.”
HBase Consistency
• HBase provides strong consistency for individual rows: HBase clients
cannot simultaneously modify a row in a way that would cause it to
become inconsistent.
• During an update to any column or column family within a row, the entire
row will be locked by the RegionServer to prevent a conflicting update to
any other column.
• Read operations do not acquire locks and reads are not blocked by write
operations.
• When read and write operations occur concurrently, the read will read a
previous version of the row rather than the version being updated.
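The combination of a row-wide write lock and lock-free reads can be sketched as below. This is a simplified model, not the RegionServer implementation: `HRow` and its methods are hypothetical, and the second writer uses a non-blocking lock attempt only so that the example cannot hang.

```python
import threading


class HRow:
    """A row whose writers lock the whole row while readers never lock."""

    def __init__(self, columns):
        self._lock = threading.Lock()    # one lock covers every column in the row
        self._current = dict(columns)    # the version that readers see

    def read(self):
        # Readers do not touch the lock; they get the last completed version.
        return dict(self._current)

    def begin_update(self, changes):
        # Returns None if another update already holds the row lock.
        if not self._lock.acquire(blocking=False):
            return None
        # Build the new version on the side, leaving the current one untouched.
        return {**self._current, **changes}

    def finish_update(self, new_version):
        # Swap in the new version, then release the row lock.
        self._current = new_version
        self._lock.release()


row = HRow({"cf:a": 1, "cf:b": 1})
pending = row.begin_update({"cf:a": 2})
conflicting = row.begin_update({"cf:b": 9})   # rejected: the entire row is locked
seen_during_write = row.read()                # previous version, not blocked
row.finish_update(pending)
seen_after_write = row.read()
```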
Eventually Consistent Region Replicas
• In earlier versions of HBase, strong consistency for all reads was
guaranteed—you were always certain to read the most recently written
version of a row.
• However, if consistency for a read is configured for timeline consistency, then a read
request will first be sent to the primary RegionServer, followed shortly by duplicate
requests to the secondary RegionServer.
• The first server to return a result completes the request. Remember that the primary gets a
head start in this contest, so if the primary is available it will usually be the first to return.
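The race between the primary and a secondary can be modelled deterministically, without real network calls. In this sketch the function `timeline_read`, the latency figures, and the 10 ms head start are all illustrative assumptions; a latency of `None` stands for an unavailable server.

```python
def timeline_read(primary_latency_ms, secondary_latency_ms, head_start_ms=10):
    """Return which replica would answer a timeline-consistent read first."""
    candidates = []
    if primary_latency_ms is not None:
        # The primary receives the request immediately.
        candidates.append((primary_latency_ms, "primary"))
    if secondary_latency_ms is not None:
        # The duplicate request reaches the secondary only after the head start.
        candidates.append((head_start_ms + secondary_latency_ms, "secondary"))
    if not candidates:
        raise RuntimeError("no replica available")
    # The earliest completion time wins the contest.
    return min(candidates)[1]


both_healthy = timeline_read(5, 5)        # primary finishes at 5 ms, secondary at 15 ms
primary_down = timeline_read(None, 5)     # only the secondary can answer
primary_slow = timeline_read(50, 5)       # secondary finishes at 15 ms, primary at 50 ms
```

With equal latencies the head start means the primary wins, which matches the observation that an available primary is usually the first to return.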
Timeline Consistency
• The scheme is called timeline consistency because the secondary RegionServer
always receives region updates in the same sequence as the primary.
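One way to picture this guarantee: a secondary may lag behind, but what it holds is always a state the primary passed through earlier. The sketch below models that with a shared, ordered update log; the `Replica` class and the log format are hypothetical, not HBase internals.

```python
class Replica:
    """Applies entries from an ordered update log, never skipping or reordering."""

    def __init__(self):
        self.state = {}
        self.applied = 0   # how many log entries have been applied so far

    def apply_up_to(self, log, upto):
        upto = min(upto, len(log))
        # Apply only the entries not yet seen, in log order.
        for key, value in log[self.applied:upto]:
            self.state[key] = value
        self.applied = max(self.applied, upto)


log = [("row1", "a"), ("row1", "b"), ("row1", "c")]

primary = Replica()
primary.apply_up_to(log, 3)      # the primary has every update

secondary = Replica()
secondary.apply_up_to(log, 2)    # the secondary lags by one update
stale_value = secondary.state["row1"]   # stale, but a value the primary once held

secondary.apply_up_to(log, 3)    # the secondary catches up
caught_up_value = secondary.state["row1"]
```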