Unit IV
IMPLEMENTATION TECHNIQUES
JAYAPRIYA K N
AP/CSE
Column Oriented Storage
• Column-oriented storage, also called a
columnar database management system,
stores data by column rather than by row.
• Its advantage is that a query which needs only
a few columns can be answered efficiently,
because only those columns have to be read.
• This also improves disk I/O performance.
EXAMPLE
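To make the difference between the two layouts concrete, here is a minimal Python sketch. The table contents and the names `row_store`, `column_store` and the two helper functions are assumptions made up for illustration; a real columnar system stores columns on disk, not in Python lists.

```python
# Hypothetical three-row student table, used only for illustration.
rows = [
    {"id": 1, "name": "Asha", "marks": 82},
    {"id": 2, "name": "Ravi", "marks": 74},
    {"id": 3, "name": "Meena", "marks": 91},
]

# Row-oriented layout: all fields of one record are stored together.
row_store = [(r["id"], r["name"], r["marks"]) for r in rows]

# Column-oriented layout: all values of one column are stored together.
column_store = {
    "id": [r["id"] for r in rows],
    "name": [r["name"] for r in rows],
    "marks": [r["marks"] for r in rows],
}


def average_marks_row_store(store):
    # Every full record is touched although only one field is needed.
    return sum(record[2] for record in store) / len(store)


def average_marks_column_store(store):
    # Only the "marks" column is read; "id" and "name" are never touched.
    marks = store["marks"]
    return sum(marks) / len(marks)
```

Both functions return the same average; the column version simply reads less data, which is the source of the I/O saving.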
Indexing and Hashing
• An index is a data structure that organizes
data records on the disk to make the retrieval
of data efficient.
• The search key for an index is a collection of
one or more fields of the records, using which
we can efficiently retrieve the data that satisfy
the search conditions.
• Indexes are required to speed up search
operations on a file of records.
• There are two types of indices:
• Ordered Indices: This type of indexing is based
on a sorted ordering of the search-key values.
• Hash Indices: This type of indexing is based on
a uniform distribution of values across a range
of buckets. The address of the bucket is
obtained using a hash function.
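The two kinds of index can be sketched side by side in Python. This is only an illustration under stated assumptions: the records, `NUM_BUCKETS` and the character-sum hash function `bucket_of` are hypothetical, and list positions stand in for disk addresses.

```python
import bisect

# Hypothetical data file; the list position stands in for a disk address.
records = [("R3", "Kumar"), ("R1", "Anita"), ("R2", "Bala")]

# Ordered index: (search key, record position) pairs kept sorted by key.
ordered_index = sorted((key, pos) for pos, (key, _) in enumerate(records))
ordered_keys = [key for key, _ in ordered_index]


def ordered_lookup(key):
    # Binary search over the sorted keys.
    i = bisect.bisect_left(ordered_keys, key)
    if i < len(ordered_keys) and ordered_keys[i] == key:
        return records[ordered_index[i][1]]
    return None


# Hash index: a hash function maps each key to one of NUM_BUCKETS buckets.
NUM_BUCKETS = 4  # assumed number of buckets


def bucket_of(key):
    # Assumed toy hash function: sum of character codes modulo bucket count.
    return sum(ord(c) for c in key) % NUM_BUCKETS


buckets = [[] for _ in range(NUM_BUCKETS)]
for pos, (key, _) in enumerate(records):
    buckets[bucket_of(key)].append((key, pos))


def hash_lookup(key):
    # Only the single bucket chosen by the hash function is searched.
    for k, pos in buckets[bucket_of(key)]:
        if k == key:
            return records[pos]
    return None
```

The ordered index also supports range queries by walking `ordered_keys` in order, which the hash index cannot do.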
FACTORS
• These techniques are evaluated based on the
following factors:
• Access Types: The types of access that are
supported efficiently, such as finding records
with a given value or within a given range.
• Access Time: The time it takes to find a
particular data item or set of items.
• Insertion Time: The time required to insert a
new data item.
FACTORS
• Deletion Time: The time required to delete
the desired data item.
• Space Overhead: The additional space
occupied by the index structure. Allocating
this extra space is usually worthwhile for the
improved performance.
Ordered Indices
• Primary index:
• An index on a set of fields that includes the
primary key is called a primary index. The
primary index file should always be in sorted
order.
• A primary index has the primary key as its
search key.
• Once the first entry in the block containing
the record is located, the other entries follow
it contiguously.
Example
Clustered index:
• The index is created on non-primary-key
columns, whose values may not be unique for
each record.
• In such cases, to identify the records faster,
two or more columns are grouped together to
get unique values, and the index is created on
them.
• This method is known as a clustering index.
Example
• Basically, records with similar characteristics
are grouped together and indexes are created
for these groups.
• For example, students studying in each
semester are grouped together, i.e. 1st
semester students, 2nd semester students,
3rd semester students, etc.
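The semester grouping can be sketched as follows. The student names and the helper `students_in` are hypothetical, and the sketch assumes the file is already sorted on the semester field so that each group is contiguous.

```python
# Hypothetical student file, sorted (clustered) on the non-unique
# field "semester"; each tuple is (semester, name).
students = [
    (1, "Anu"), (1, "Bala"),
    (2, "Chitra"), (2, "Dev"), (2, "Esha"),
    (3, "Farid"),
]

# Clustering index: one entry per distinct semester, pointing at the
# position of the first record of that group.
clustering_index = {}
for pos, (semester, _) in enumerate(students):
    clustering_index.setdefault(semester, pos)


def students_in(semester):
    # Jump to the first record of the group, then read sequentially
    # until the semester value changes.
    start = clustering_index.get(semester)
    if start is None:
        return []
    names = []
    for sem, name in students[start:]:
        if sem != semester:
            break
        names.append(name)
    return names
```

For instance, `students_in(2)` follows one index entry and then reads the three adjacent records of the 2nd semester group.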
Dense and Sparse Indices
• Dense index:
• An index record appears for every search-key
value in the file.
• This record contains the search-key value and
a pointer to the actual record.
Dense index
Sparse index:
• Index records are created only for some of the
search-key values.
• To locate a record, we find the index record
with the largest search-key value less than or
equal to the search-key value we are looking
for, and then scan the file sequentially from
the record it points to.
Example
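A minimal Python sketch of both lookups, assuming a small hypothetical data file that is sorted on the search key and split into blocks of three records. The names `blocks`, `dense_index` and `sparse_keys` are illustrative only.

```python
import bisect

# Hypothetical data file, sorted on the search key and split into blocks.
blocks = [
    [(10, "A"), (20, "B"), (30, "C")],
    [(40, "D"), (50, "E"), (60, "F")],
    [(70, "G"), (80, "H"), (90, "I")],
]

# Dense index: one entry for every search-key value in the file.
dense_index = {
    key: (b, i)
    for b, block in enumerate(blocks)
    for i, (key, _) in enumerate(block)
}

# Sparse index: one entry per block, holding the first key of the block.
sparse_keys = [block[0][0] for block in blocks]


def dense_lookup(key):
    # The index entry points directly at the record.
    location = dense_index.get(key)
    if location is None:
        return None
    b, i = location
    return blocks[b][i][1]


def sparse_lookup(key):
    # Find the largest index key that is <= the key being searched for.
    b = bisect.bisect_right(sparse_keys, key) - 1
    if b < 0:
        return None
    # Scan sequentially from the record that index entry points to.
    for k, value in blocks[b]:
        if k == key:
            return value
    return None
```

The dense index needs nine entries here and the sparse index only three, at the cost of a short sequential scan inside the block.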
Single and Multilevel Indices
• Single level indexing:
• A single-level index is an auxiliary file that makes
it more efficient to search for a record in the
data file.
• The index is usually specified on one field of the
file.
• A binary search on the index yields a pointer to
the file record.
• A single-level index can be a primary index, a
clustering index or a secondary index.
Example
Multilevel indexing:
• If a single-level index is used, a large index
cannot be kept in memory, which leads to
multiple disk accesses.
• A multilevel index breaks the index down into
several smaller indices, so that the outermost
level is small enough to be stored in a single
disk block, which can easily be accommodated
anywhere in main memory.
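The idea can be sketched with two levels in Python. The key values and the block capacity `ENTRIES_PER_BLOCK` are assumptions chosen so that the outer level fits in exactly one block.

```python
import bisect

ENTRIES_PER_BLOCK = 4  # assumed capacity of one index block

# Hypothetical sorted search keys of the data file: 0, 10, ..., 150.
data_keys = list(range(0, 160, 10))

# Inner level: the full index, split into blocks of ENTRIES_PER_BLOCK keys.
inner_blocks = [
    data_keys[i:i + ENTRIES_PER_BLOCK]
    for i in range(0, len(data_keys), ENTRIES_PER_BLOCK)
]

# Outer level: the first key of each inner block. It is small enough to
# fit in a single block that can stay in main memory.
outer_block = [block[0] for block in inner_blocks]


def lookup(key):
    """Return (inner block number, slot) for key, or None if absent."""
    # Step 1: search the in-memory outer block for the right inner block.
    b = bisect.bisect_right(outer_block, key) - 1
    if b < 0:
        return None
    # Step 2: search only that one inner block (one disk access).
    block = inner_blocks[b]
    i = bisect.bisect_left(block, key)
    if i < len(block) and block[i] == key:
        return (b, i)
    return None
```

With more data, further levels can be added in the same way until the outermost level again fits in one block.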
Multilevel indexing:
Secondary Indices
• In this technique, two levels of indexing are
used in order to reduce the mapping size of
the first level.
• Initially, for the first level, a large range of
numbers is selected so that the mapping size
is small. Each range is then divided into
smaller sub-ranges.
• It is used to optimize query processing and to
access records in a database using some
information other than the usual search key.
Example
B+ Tree Index Files
• The B+ tree is similar to a binary search tree. It is a balanced tree in which
the internal nodes direct the search.
• The leaf nodes of B+ trees contain the data entries.
Structure of B+ Tree
• The typical structure of a B+ tree node is as follows:
• It contains up to n – 1 search-key values K1, K2, ..., Kn-1 and n pointers P1,
P2, ..., Pn.
• The search-key values within a node are kept in sorted order; thus, if i < j,
then Ki < Kj.
• To retrieve all the leaf pages efficiently, we link them using page
pointers. The sequence of leaf pages is also called the sequence set.
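The node layout and the linked leaf level can be sketched in Python as below. The class `Node`, its field names and the hand-built three-leaf tree are assumptions for illustration; the sketch also assumes the convention that a key equal to a separator Ki is found in the subtree to the right of Ki.

```python
import bisect
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    keys: List[int] = field(default_factory=list)         # K1 .. Kn-1, sorted
    children: List["Node"] = field(default_factory=list)  # P1 .. Pn (internal)
    next_leaf: Optional["Node"] = None                    # page pointer (leaf)

    @property
    def is_leaf(self):
        return not self.children


def find_leaf(root, key):
    # Internal nodes only direct the search; data entries live in leaves.
    node = root
    while not node.is_leaf:
        node = node.children[bisect.bisect_right(node.keys, key)]
    return node


def scan_leaves(first_leaf):
    # Follow the page pointers to read the whole sequence set in key order.
    node, out = first_leaf, []
    while node is not None:
        out.extend(node.keys)
        node = node.next_leaf
    return out


# Hand-built hypothetical tree with n = 5 pointers per node.
leaf3 = Node(keys=[30, 31, 32])
leaf2 = Node(keys=[24, 28, 29], next_leaf=leaf3)
leaf1 = Node(keys=[22, 23], next_leaf=leaf2)
root = Node(keys=[24, 30], children=[leaf1, leaf2, leaf3])
```

Here `find_leaf(root, 28)` descends through the root to the middle leaf, and `scan_leaves(leaf1)` visits every key in sorted order without going back to the root.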
Example
Characteristics of B+ Tree
Following are the characteristics of B+ tree.
• 1) The B+ tree is a balanced tree, and the
insertion and deletion operations keep the
tree balanced.
• 2) A minimum occupancy of 50 percent is
guaranteed for each node except the root.
• 3) Searching for a record requires just a
traversal from the root to the appropriate leaf.
Insertion Operation
• Algorithm for insertion:
• Step 1: Find the correct leaf L.
• Step 2: Put the data entry onto L.
• i) If L has enough space, done!
• ii) Else, L must be split (into L and a new node L2):
• Allocate a new node.
• Redistribute the entries evenly.
• Copy up the middle key.
• Insert an index entry pointing to L2 into the parent of L.
Continuation
• Step 3: Splitting can happen recursively, since
the parent may overflow in turn.
• i) To split an index node, redistribute the
entries evenly, but push up the middle key.
(Contrast with leaf splits.)
• Step 4: Splits "grow" the tree; a root split
increases its height.
• i) Tree growth: the tree gets wider or one level
taller at the top.
Example
• Construct a B+ tree for the following data:
30, 31, 23, 32, 22, 28, 24, 29, where the number of
pointers in each node is 5.
• Solution: Each node of the B+ tree is allowed
to have 5 pointers. That means at most 4 key
values are allowed in each node.
• Step 1: Insert 30, 31, 23, 32. The key values
are kept in ascending order within the node:
23, 30, 31, 32.
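Step 2: Now if we insert 22, the sequence will be 22, 23, 30, 31, 32, which is
more than 4 keys, so the node is split. The middle key, 30, will go up.
Step 3: Insert 28 and 24. Both are less than 30, so they go into the left leaf,
which becomes 22, 23, 24, 28.
Step 4: Insert 29. The sequence becomes 22, 23, 24, 28, 29. The middle key,
24, will go up. Thus we get the B+ tree with root keys 24 and 30.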
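The insertion steps and the worked example above can be reproduced with a short Python sketch. It is one possible implementation under stated assumptions, not a definitive one: the class `Node`, the helpers `_insert`, `insert` and `leaves`, and the choice of splitting at position `len(keys) // 2` are all illustrative.

```python
import bisect

MAX_KEYS = 4  # 5 pointers per node, as in the example above


class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []  # empty for a leaf
        self.next = None                # link to the next leaf

    @property
    def is_leaf(self):
        return not self.children


def _insert(node, key):
    """Insert key below node. Return (separator, new right node) if the
    node was split, otherwise None."""
    if node.is_leaf:
        bisect.insort(node.keys, key)
        if len(node.keys) <= MAX_KEYS:
            return None
        mid = len(node.keys) // 2
        right = Node(keys=node.keys[mid:])
        node.keys = node.keys[:mid]
        right.next, node.next = node.next, right
        return right.keys[0], right  # copy up: the key also stays in the leaf
    i = bisect.bisect_right(node.keys, key)
    split = _insert(node.children[i], key)
    if split is None:
        return None
    separator, right_child = split
    node.keys.insert(i, separator)
    node.children.insert(i + 1, right_child)
    if len(node.keys) <= MAX_KEYS:
        return None
    mid = len(node.keys) // 2
    up = node.keys[mid]
    right = Node(keys=node.keys[mid + 1:], children=node.children[mid + 1:])
    node.keys = node.keys[:mid]
    node.children = node.children[:mid + 1]
    return up, right  # push up: the key leaves the index node


def insert(root, key):
    split = _insert(root, key)
    if split is None:
        return root
    separator, right = split
    # Root split: the tree grows one level taller at the top.
    return Node(keys=[separator], children=[root, right])


def leaves(root):
    # Walk down to the leftmost leaf, then follow the leaf links.
    node = root
    while not node.is_leaf:
        node = node.children[0]
    out = []
    while node is not None:
        out.append(list(node.keys))
        node = node.next
    return out


root = Node()
for k in [30, 31, 23, 32, 22, 28, 24, 29]:
    root = insert(root, k)
```

After the loop, `root.keys` holds 24 and 30, and `leaves(root)` lists the three leaves 22, 23 / 24, 28, 29 / 30, 31, 32, matching the steps of the example.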