B+ Tree Indexing
An illustration of a B+tree. | Image: Dhanushka Madushan
The non-leaf values 13, 30, 9, 11, 16 and 38 are repeated in the
leaf nodes.
The leaf nodes hold every value, and all of the records are in
sorted order. A B+tree supports the same search as a B-tree, but it
also lets you travel through all the values in order if each leaf
node stores a pointer to the next leaf node, as follows.
Illustration of a B+tree with leaf node referencing. | Image:
Dhanushka Madushan
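The leaf-to-leaf pointers can be sketched in a few lines of Python.
This is only an illustration: the Leaf class, its next_leaf field and
the scan function are hypothetical names, and the way the keys are
grouped into leaves is made up for the example.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class Leaf:
    """Hypothetical leaf node: sorted keys plus a pointer to the next leaf."""
    keys: List[int]
    next_leaf: Optional["Leaf"] = None


def scan(first: Leaf) -> Iterator[int]:
    """Yield every key in ascending order by following the leaf pointers."""
    leaf: Optional[Leaf] = first
    while leaf is not None:
        yield from leaf.keys
        leaf = leaf.next_leaf


# Three leaves chained left to right (the grouping is an assumption).
c = Leaf([30, 38])
b = Leaf([13, 16], next_leaf=c)
a = Leaf([9, 11], next_leaf=b)
all_keys = list(scan(a))
```

Because the scan never goes back up to the internal nodes, reading a
range of values costs one pointer hop per leaf.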
Structure of B+ Trees
B+ Trees contain two types of nodes, where n is the order of the
tree:
• Internal Nodes: Internal nodes store only keys and tree
pointers. Every internal node except the root node has at
least ⌈n/2⌉ tree pointers.
• Leaf Nodes: Leaf nodes store the keys together with the
record pointers, and each has up to n pointers.
The Structure of the Internal Nodes of a B+ Tree of Order
‘a’ is as Follows
1. Each internal node is of the form: <P1, K1, P2, K2, ...,
Pc-1, Kc-1, Pc> where c <= a, each Pi is a tree pointer
(i.e. it points to another node of the tree), and each Ki is a
key value (see diagram-I for reference).
2. Every internal node has: K1 < K2 < ... < Kc-1
3. For each search field value ‘X’ in the sub-tree pointed at by
Pi, the following condition holds: X <= Ki for i = 1;
Ki-1 < X <= Ki for 1 < i < c; and Ki-1 < X for i = c (see
diagram-I for reference).
4. Each internal node has at most ‘a’ tree pointers.
5. The root node has at least two tree pointers, while the other
internal nodes have at least ⌈a/2⌉ tree pointers each.
6. If an internal node has ‘c’ pointers, c <= a, then it has ‘c –
1’ key values.
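The rules above translate almost directly into code. The following is
a minimal sketch, not a full implementation: child_index and
is_valid_internal are hypothetical helper names, keys are assumed to
be integers, and pointers are counted from 0 instead of 1.

```python
from bisect import bisect_left
from typing import List


def child_index(keys: List[int], x: int) -> int:
    """Return the 0-based index of the tree pointer to follow for value x.

    With keys K1 < K2 < ... < Kc-1, pointer Pi covers Ki-1 < x <= Ki,
    so the answer is the number of keys strictly smaller than x.
    """
    return bisect_left(keys, x)


def is_valid_internal(keys: List[int], n_pointers: int, a: int,
                      is_root: bool = False) -> bool:
    """Check rules 2, 4, 5 and 6 for one internal node of a tree of order a."""
    min_pointers = 2 if is_root else -(-a // 2)  # integer form of ceil(a / 2)
    strictly_sorted = all(k1 < k2 for k1, k2 in zip(keys, keys[1:]))
    return (
        strictly_sorted
        and min_pointers <= n_pointers <= a
        and len(keys) == n_pointers - 1
    )
```

For a node with keys [13, 30], child_index sends 13 to the first
pointer, 14 to the second and 31 to the third, which matches rule 3.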
B+ Tree | B Tree
Data stored on the leaf node makes the search more accurate and faster. | Searching is slow due to data stored on Leaf and internal nodes.
Deletion is not difficult as an element is only removed from a leaf node. | Deletion of elements is a complicated and time-consuming process.
Linked leaf nodes make the search efficient and quick. | You cannot link leaf nodes.
Search Operation
In a B+ Tree, search is one of the simplest operations to
execute: it visits one node per level of the tree.
The following search algorithm is applicable:
• To find the required record, start at the root and run a
binary search on the keys of each node to choose the pointer
to follow, until you reach a leaf node.
• In case of an exact match with the search key in the leaf
node, the corresponding record is returned to the user.
• In case the exact key is not located in the leaf node reached
by the search, a “not found” message is displayed to the
user.
• The search can be re-run with a different key at any time;
every lookup starts again from the root.
Search Operation Algorithm
1. Starting at the root, call the binary search method on the
keys of the current node and follow the matching pointer until a
leaf node is reached.
2. If the search key matches a key in the leaf node exactly
The matching record is returned and displayed to the user
Else, if the leaf node has been searched and the exact key is
not found by the algorithm
Display the statement "Recordset cannot be found."
Output:
The record that matches the exact key is displayed to the user;
otherwise, a failed attempt is shown to the user.
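A minimal sketch of this lookup in Python is shown below. The Node
class and the record strings are hypothetical, and the sketch assumes
that a key equal to a separator is stored in the subtree to the right
of that separator (the same assumption as copying the minimum key of
a new leaf into its parent); a tree built with the opposite convention
would use bisect_left in the descent instead.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical node: a leaf holds records, an internal node holds children."""
    keys: List[int]
    children: List["Node"] = field(default_factory=list)  # internal nodes only
    records: List[str] = field(default_factory=list)      # leaves only

    @property
    def is_leaf(self) -> bool:
        return not self.children


def search(root: Node, key: int) -> Optional[str]:
    """Return the record stored under key, or None if the key is absent."""
    node = root
    while not node.is_leaf:
        # Binary search inside the node picks the pointer to follow.
        node = node.children[bisect_right(node.keys, key)]
    i = bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return node.records[i]
    return None  # the caller can report "Recordset cannot be found."


leaves = [
    Node(keys=[9, 11], records=["r9", "r11"]),
    Node(keys=[13, 16], records=["r13", "r16"]),
    Node(keys=[30, 38], records=["r30", "r38"]),
]
root = Node(keys=[13, 30], children=leaves)
found = search(root, 16)
missing = search(root, 12)
```

Returning None instead of printing keeps the "not found" message a
decision of the caller, which is a design choice of this sketch only.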
Insert Operation
The following algorithm is applicable for the insert operation
when the target leaf is already full:
• 50 percent of the elements in the full leaf are moved to a new
leaf for storage.
• The new leaf is linked to its parent by its minimum key value
and its new location in the Tree.
• Split the parent node as well in case it gets fully utilized.
• When a parent node is split, its center key is moved up to the
node on the level above it.
• Keep on iterating the process explained in the above steps
until a node is found that does not need to be split.
Insert Operation Algorithm
1. If the leaf node has room for at least one more entry, add
the record.
2. Else, divide the node to make room for more records.
a. Allocate a new leaf and transfer 50 percent of the node
elements to it.
b. The minimum key of the new leaf and the address of the new
leaf are inserted into the top-level (parent) node.
c. Divide the top-level node if it gets full of keys and
addresses.
i. Similarly, insert the key at the center of the divided
node into the node above it in the hierarchy of the Tree.
d. Continue to execute the above steps until a top-level node
is found that does not need to be divided anymore.
3. If the root node itself is divided, build a new top-level root
node of 1 key and 2 pointers.
Output:
The algorithm locates the required leaf node and successfully
inserts the element in it.
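The insert steps can be sketched as follows. This is an illustrative
sketch under stated assumptions, not a production implementation:
ORDER (the maximum number of keys per node), the Node class and the
helper names are hypothetical, only keys are stored (no records), and
duplicate keys are ignored.

```python
from bisect import bisect_left, bisect_right
from typing import List, Optional, Tuple

ORDER = 4  # assumed maximum number of keys per node, chosen for illustration


class Node:
    def __init__(self, leaf: bool) -> None:
        self.leaf = leaf
        self.keys: List[int] = []
        self.children: List["Node"] = []         # used by internal nodes only
        self.next_leaf: Optional["Node"] = None  # used by leaves only


def _insert(node: Node, key: int) -> Optional[Tuple[int, Node]]:
    """Insert key below node; return (separator, new_right_node) if node split."""
    if node.leaf:
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return None                       # duplicates are ignored here
        node.keys.insert(i, key)
        if len(node.keys) <= ORDER:
            return None                       # the leaf had room
        mid = len(node.keys) // 2             # move half to a new leaf
        right = Node(leaf=True)
        right.keys, node.keys = node.keys[mid:], node.keys[:mid]
        right.next_leaf, node.next_leaf = node.next_leaf, right
        return right.keys[0], right           # minimum key goes to the parent

    i = bisect_right(node.keys, key)
    split = _insert(node.children[i], key)
    if split is None:
        return None
    sep, right_child = split
    node.keys.insert(i, sep)
    node.children.insert(i + 1, right_child)
    if len(node.keys) <= ORDER:
        return None
    mid = len(node.keys) // 2                 # the parent is full: divide it
    up = node.keys[mid]                       # the center key moves up
    right = Node(leaf=False)
    right.keys, node.keys = node.keys[mid + 1:], node.keys[:mid]
    right.children, node.children = node.children[mid + 1:], node.children[:mid + 1]
    return up, right


def insert(root: Node, key: int) -> Node:
    """Insert key and return the (possibly new) root."""
    split = _insert(root, key)
    if split is None:
        return root
    sep, right = split
    new_root = Node(leaf=False)               # new root: 1 key, 2 pointers
    new_root.keys = [sep]
    new_root.children = [root, right]
    return new_root


def leaf_keys(root: Node) -> List[int]:
    """Collect all keys by walking the linked leaves from left to right."""
    node = root
    while not node.leaf:
        node = node.children[0]
    out: List[int] = []
    while node is not None:
        out.extend(node.keys)
        node = node.next_leaf
    return out


root = Node(leaf=True)
for k in [13, 30, 9, 11, 16, 38, 20]:
    root = insert(root, k)
```

The recursion returns a split to its caller instead of keeping parent
pointers in every node; that keeps the sketch short, at the cost of
using the call stack to remember the path from the root.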
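1. Copy Up: When a key is inserted into a leaf node that is
already full, the leaf is split and the minimum key of the new
leaf is copied up to the parent node. This process is called
"copying up" because the key stays in the leaf node and a copy
of it is propagated up the tree to maintain the B+ tree's
structure and balance.
2. Push Up: When an internal node becomes full and is split, its
center key is moved to the parent node and no longer appears in
either half. This process is called "pushing up" because the key
is removed from its original level.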
Both "copy up" and "push up" operations are important for
maintaining the integrity and structure of a B+ tree index, which
is commonly used in database systems for efficient data
retrieval.
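The difference between the two operations is easiest to see on plain
key lists. In the sketch below, split_leaf and split_internal are
hypothetical helpers; it assumes the usual meaning of "push up" (the
center key of a split internal node moves to the parent) and that the
key copied up from a leaf is the minimum key of the new right leaf.

```python
from typing import List, Tuple


def split_leaf(keys: List[int]) -> Tuple[List[int], int, List[int]]:
    """Copy up: the separator is sent to the parent and stays in the right leaf."""
    mid = len(keys) // 2
    left, right = keys[:mid], keys[mid:]
    return left, right[0], right


def split_internal(keys: List[int]) -> Tuple[List[int], int, List[int]]:
    """Push up: the center key moves to the parent and leaves this level."""
    mid = len(keys) // 2
    return keys[:mid], keys[mid], keys[mid + 1:]


leaf_left, copied, leaf_right = split_leaf([9, 11, 13, 16, 30])
node_left, pushed, node_right = split_internal([10, 20, 30, 40, 50])
```

After the leaf split the key 13 exists both in the parent and in the
right leaf, while after the internal split the key 30 exists only in
the parent.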
Advantages of Indexing
Important pros/advantages of Indexing are:
• It helps you to reduce the total number of I/O operations
needed to retrieve the data, since the data can be read from
the index structure without a separate access to a row in the
database.
• It offers faster search and retrieval of data to users.
• Indexing also helps you to reduce tablespace, as there is no
need to store the ROWID in the index to link to a row in a
table.
• You can’t sort the data in the leaf nodes in any other order,
as the value of the primary key determines its position.
Disadvantages of Indexing
Important drawbacks/cons of Indexing are:
• To build the index, the database management system needs a
primary key on the table with a unique value.
• You can’t create any other indexes in the database on the
indexed data.
• You are not allowed to partition an index-organized table.
• SQL indexing decreases the performance of INSERT, DELETE,
and UPDATE queries, because the index must be updated along
with the data.
Summary:
• An index is a small table which consists of two columns.
• Two main types of indexing methods are 1) Primary
Indexing 2) Secondary Indexing.
• A primary index is an ordered file of fixed-length records
with two fields.
• Primary indexing is further divided into two types:
1) Dense Index 2) Sparse Index.
• In a dense index, an index record is created for every search
key value in the database.
• A sparse indexing method helps you to resolve the issues of
dense indexing by storing index records for only some of the
search key values.
• The secondary index in DBMS is an indexing method
whose search key specifies an order different from the
sequential order of the file.
• A clustering index is defined on an ordered data file.
• Multilevel indexing is created when a primary index does
not fit in memory.
• The biggest benefit of indexing is that it helps you to
reduce the total number of I/O operations needed to retrieve
the data.
• The biggest drawback of indexing is that you need a primary
key on the table with a unique value.