B - Tree

B-Trees and B+ Trees are multi-level index structures that allow efficient retrieval of records from disk-based databases by organizing data pointers in leaf nodes and index keys in internal nodes, with B+ Trees additionally having leaf nodes linked through sibling pointers to allow sequential access to records. The structures are balanced through splitting and merging of nodes as records are inserted and deleted to maintain a minimum occupancy and keep search paths equal in length.

Uploaded by

Devinder Kumar

B+-Trees
Indexing Problem
Solution – Simple Index File
B-Trees
[Figure: B-Tree diagram; node labels A, B, C, D, E, F, G, M, X]
B+ Trees – Multi-level Index
Structure
Internal nodes
Leaf Nodes
Order
• The order of a B+-Tree is the maximum number of
pointers that an internal node can hold.
• A node in a B+-Tree of order m holds:
– At most m – 1 keys and m pointers.
– At least ceil(m / 2) pointers.
• Example: A node of order 4 holds at most 3 keys and
4 pointers, and at least 2 pointers.


Order Size & Tree Height

[Figure: Small Order Size vs. Large Order Size]

• When two B+-Trees have the same number of keys,
the order size determines the height.
Calculating the Order Size
• B = Size of block
• K = Size of Key
• P = Size of Pointer
• O = Order of tree

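$$B \ge O \cdot P + (O - 1) \cdot K \quad\Rightarrow\quad O \le \frac{B + K}{P + K}$$

• Example: B = 512, K = 9, P = 6

$$O \le \frac{512 + 9}{6 + 9} = \frac{521}{15} \approx 34.73, \quad \text{so } O = 34$$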
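The order calculation can be checked with a few lines of Python. This is a minimal sketch; the function name `max_order` is chosen here for illustration and does not come from the slides.

```python
def max_order(block_size: int, key_size: int, pointer_size: int) -> int:
    """Largest order O such that O pointers and O - 1 keys fit in one block.

    From O*P + (O - 1)*K <= B it follows that O <= (B + K) / (P + K).
    """
    return (block_size + key_size) // (pointer_size + key_size)


order = max_order(block_size=512, key_size=9, pointer_size=6)
print(order)  # 34

# Sanity check: order 34 fits in the block, order 35 does not.
print(34 * 6 + 33 * 9, 35 * 6 + 34 * 9)  # 501 516
```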
Size
• On average, nodes are about 69% full.
• If the order is 34:
– Average number of pointers = 0.69 x 34 ≈ 23
– That is, 23 pointers and 22 keys per node.

Level      Nodes     Entries    Pointers
Root           1          22          23
Level 1       23         506         529
Level 2      529      11,638      12,167
Leaf      12,167     267,674
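The figures in the table follow from multiplying out the average fan-out level by level. A small sketch, assuming the 69% fill factor and the rounding to 23 pointers used above (`capacity_by_level` is an illustrative name):

```python
def capacity_by_level(order: int, fill: float = 0.69, levels: int = 4):
    """Return (nodes, entries, pointers) per level for an average-filled tree."""
    pointers_per_node = int(order * fill)   # 0.69 * 34 = 23.46, truncated to 23
    keys_per_node = pointers_per_node - 1   # one key fewer than pointers
    rows = []
    nodes = 1                               # the root level has a single node
    for _ in range(levels):
        entries = nodes * keys_per_node
        pointers = nodes * pointers_per_node
        rows.append((nodes, entries, pointers))
        nodes = pointers                    # each pointer leads to one node below
    return rows


for name, row in zip(["Root", "Level 1", "Level 2", "Leaf"], capacity_by_level(34)):
    print(name, row)
```

The last row also yields a pointer count for the leaf level, which the table leaves blank because leaf pointers refer to records rather than to further nodes.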
Height
• Let m be the order of the B+-Tree; consider the minimum
number of nodes at each level.
Contd…
• A B+-Tree with N keys has N + 1 leaf nodes,
• Therefore,
Example
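As a numeric illustration of the height bound, the sketch below assumes the standard worst-case argument: the root has at least 2 children, every other node has at least ceil(m / 2) children, and the N + 1 "leaves" are the pointers below the lowest level of keys. Under that assumption a tree with d levels needs N + 1 ≥ 2 · ceil(m / 2)^(d – 1), so d ≤ 1 + log_ceil(m/2)((N + 1) / 2). The function name `max_levels` is hypothetical.

```python
from math import ceil


def max_levels(n_keys: int, order: int) -> int:
    """Largest d with 2 * ceil(order / 2) ** (d - 1) <= n_keys + 1.

    Assumes n_keys >= 1 and order >= 3 (so the minimum fan-out is at least 2).
    """
    half = ceil(order / 2)      # minimum fan-out of a non-root node
    d = 1
    # A tree with d + 1 levels needs at least 2 * half**d pointers at the bottom.
    while 2 * half ** d <= n_keys + 1:
        d += 1
    return d


print(max_levels(1_000_000, 34))  # 5
```

Integer arithmetic is used instead of a floating-point logarithm so that exact powers of ceil(m / 2) are not affected by rounding.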
Properties
• All paths from root to leaf are of the same length.
• The root node points to at least two nodes (unless it is a leaf).
• All non-root nodes are at least half full.
• A B+-Tree grows upwards: a new level is added only when the root splits.
• A B+-Tree is balanced.
• Sibling pointers allow sequential searching.
B+-Tree Queries
• Exact match queries.
• Search for record with key 42.
– 42 < 51? Yes. Choose left sub-tree.
– 42 < 40? No. Choose right sub-tree.
– 42 == 40? No. Move to next key in the leaf.
– 42 == 42? Found. Access the record with key 42.
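The walk-through can be sketched in Python as follows. The `Node` class, the `search` function, and all keys other than 40, 42 and 51 are assumptions made for illustration; the original figure is not reproduced here.

```python
from bisect import bisect_right


class Node:
    def __init__(self, keys, children=None, values=None):
        self.keys = keys            # sorted keys
        self.children = children    # internal nodes only: len(keys) + 1 children
        self.values = values        # leaf nodes only: one record per key

    @property
    def is_leaf(self):
        return self.children is None


def search(node, key):
    """Exact-match query: descend to a leaf, then scan the leaf's keys."""
    while not node.is_leaf:
        # key < keys[0] selects children[0]; key >= keys[i] moves right of keys[i].
        node = node.children[bisect_right(node.keys, key)]
    for k, v in zip(node.keys, node.values):
        if k == key:
            return v
    return None


# Hypothetical tree shaped like the walk-through: root key 51, left index key 40.
left = Node([40], children=[Node([10, 25], values=["r10", "r25"]),
                            Node([40, 42], values=["r40", "r42"])])
right = Node([70], children=[Node([51, 60], values=["r51", "r60"]),
                             Node([70, 80], values=["r70", "r80"])])
root = Node([51], children=[left, right])

print(search(root, 42))  # r42
```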
Contd…
• Range queries.
• Search for records with keys >= 40 and <= 80.
– Start a search for the lower bound, i.e. 40.
– 40 < 51? Yes. Choose left sub-tree.
– 40 < 40? No. Choose right sub-tree.
– 40 == 40? Yes. Access the record with key 40 and keep moving
right using the sibling pointers while the key is <= 80.
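The sibling-pointer scan can be sketched as below. It assumes the starting leaf has already been found by an exact-match descent for the lower bound; the `Leaf` class, the `range_scan` function and the sample keys are hypothetical.

```python
from bisect import bisect_left


class Leaf:
    def __init__(self, keys):
        self.keys = keys    # sorted keys stored in this leaf
        self.next = None    # sibling pointer to the leaf on the right


def range_scan(start_leaf, low, high):
    """Collect all keys k with low <= k <= high, starting at start_leaf."""
    result = []
    leaf = start_leaf
    i = bisect_left(leaf.keys, low)     # first key >= low in the starting leaf
    while leaf is not None:
        while i < len(leaf.keys):
            if leaf.keys[i] > high:     # passed the upper bound: stop
                return result
            result.append(leaf.keys[i])
            i += 1
        leaf, i = leaf.next, 0          # follow the sibling pointer
    return result


# Hypothetical leaf chain.
a, b, c = Leaf([40, 42]), Leaf([51, 55]), Leaf([60, 70, 85])
a.next, b.next = b, c

print(range_scan(a, 40, 80))  # [40, 42, 51, 55, 60, 70]
```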
Insertion
• Similar to B-Trees, i.e. a new value is inserted at
a leaf only.
• Use post-split:
– First insert the key,
– Then check for overflow.
• How to handle an overflow?
– Depends on the type of node that overflows:
• Leaf node, or
• Internal node.
Overflow – Leaf Node
• Split into two nodes:
– First node contains ceil((m-1)/2) values.
– Second node contains the remaining values.
– Copy the smallest search-key value of the second
node to the parent node.
• Example: Insert 8 in the leaf node.
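A leaf split under this rule can be sketched as follows. The function name `split_leaf` and the sample keys are assumptions; only the ceil((m-1)/2) rule and the copy-up step come from the slide.

```python
from math import ceil


def split_leaf(keys, order):
    """Split an overflowing leaf (order keys, one more than it may hold).

    Returns (first_node, second_node, copy_up_key).
    """
    cut = ceil((order - 1) / 2)
    first, second = keys[:cut], keys[cut:]
    # Copy-up: the key stays in the second leaf and is also sent to the parent.
    return first, second, second[0]


# Hypothetical order-4 leaf holding 5, 7, 10 that overflows when 8 is inserted.
print(split_leaf([5, 7, 8, 10], order=4))  # ([5, 7], [8, 10], 8)
```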


Contd…
Overflow – Internal Node
• Split into two nodes:
– First node contains ceil(m/2) – 1 values.
– Move the smallest of the remaining values,
together with a pointer to the second node, to the parent.
– Second node contains the remaining values.
• Example: Insert 8 in the internal node.
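An internal-node split differs from a leaf split in that the middle key moves up instead of being copied. A minimal sketch, with `split_internal` and the child labels as illustrative assumptions:

```python
from math import ceil


def split_internal(keys, children, order):
    """Split an overflowing internal node (order keys, order + 1 children).

    Returns ((left_keys, left_children), push_up_key, (right_keys, right_children)).
    """
    cut = ceil(order / 2) - 1
    push_up = keys[cut]                              # moves up; kept in neither half
    left = (keys[:cut], children[:cut + 1])
    right = (keys[cut + 1:], children[cut + 1:])
    return left, push_up, right


# Order-5 node with keys 6, 11, 19, 26, 28; child labels "a".."f" are placeholders.
left, up, right = split_internal([6, 11, 19, 26, 28], list("abcdef"), order=5)
print(left, up, right)
# ([6, 11], ['a', 'b', 'c']) 19 ([26, 28], ['d', 'e', 'f'])
```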


Contd…
Insertion
• Construct a B+ tree for (1, 4, 7, 10, 17, 21, 31, 25,
19, 20, 28, 42) with order 4.
• Insert 1, 4, 7.
• Insert 10.
• Insert 17, 21.
• Insert 31, 25.
• Insert 19, 20.
• Insert 28, 42.
Example 2
• Insert 4 into the following B+Tree (order = 3).
Contd…
Construct B-Tree and B+-Tree of order 3
• 8, 5, 1, 7, 3, 12, 9, 6.
• At most 2 keys and 3 pointers.
• At least (ceil(3/2) – 1 = 1) key and (ceil(3/2) = 2)
pointers.
• Odd order: use post split for B-tree as well.
B-tree:
            [5 8]
[1 3]      [6 7]      [9 12]

B+-tree:
                     [7]
        [5]                      [8 9]
[1* 3*]   [5* 6*]      [7*]   [8*]   [9* 12*]
Insert 30
Before:
[31 60]
    [6 11 19 26]
        [1* 2*] [6* 8* 10*] [11* 13* 15*] [19* 25*] [26* 27* 28* 29*]
    [38 45]
        [31* 35*] [38* 42*] …
    …
Contd…
After:
[19 31 60]
    [6 11]
        [1* 2*] [6* 8* 10*] [11* 13* 15*]
    [26 28]
        [19* 25*] [26* 27*] [28* 29* 30*]
    [38 45]
        [31* 35*] [38* 42*] …
    …
Algorithm Insert(root, k, v)
• Input: The root pageID of a B+-tree, and the key k and value
v of a new object.
• Prerequisite: No object with key = k exists in the
tree.
• Action: Insert the new object into the B+-tree.
1. Starting at the root node, perform an exact-match search for
key = k down to a leaf node. Let the search path be x_1, x_2, ...,
x_h, where x_1 is the root node, x_h is the leaf node into which
the new object should be inserted, and x_i is the
parent node of x_{i+1} for 1 ≤ i ≤ h - 1.
2. Insert the new object with key = k and value = v into x_h.
Contd…
3. Let i = h.
while x_i overflows
a. Split x_i into two nodes by moving the larger half of the keys
into a new node x'_i. If x_i is a leaf node, link x'_i into the
doubly linked list of leaf nodes.
b. Identify a key kk to be inserted into the parent level along
with the child pointer to x'_i. The choice of kk
depends on the type of node x_i. If x_i is a leaf node, perform
Copy-up: the smallest key in x'_i is copied as kk to
the parent level. If x_i is an index node,
perform Push-up: the smallest key in x'_i is
removed from x'_i and stored as kk in the parent node.
Contd…
c. if i == 1 /* the root node overflows */
i. Create a new index node as the new root. In the new root,
store one key = kk and two child pointers, to x_i and x'_i.
ii. return
d. else
i. Insert the key kk and a child pointer to x'_i into node
x_{i-1}.
ii. i = i – 1.
e. end if
end while
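The Insert algorithm can be sketched in Python as below. This is a simplified illustration, not a definitive implementation: it stores keys only (no values or page IDs), keeps a singly linked leaf chain rather than a doubly linked list, and uses the split sizes given in the overflow slides. The class and method names are assumptions.

```python
from bisect import bisect_right, insort
from math import ceil


class Node:
    def __init__(self, leaf):
        self.leaf = leaf
        self.keys = []
        self.children = []   # internal nodes: len(keys) + 1 children
        self.next = None     # leaves: right sibling


class BPlusTree:
    def __init__(self, order):
        self.order = order
        self.root = Node(leaf=True)

    def insert(self, key):
        # Step 1: descend to the leaf, remembering the path of index nodes.
        path, node = [], self.root
        while not node.leaf:
            path.append(node)
            node = node.children[bisect_right(node.keys, key)]
        # Step 2: insert into the leaf.
        insort(node.keys, key)
        # Step 3: split upwards while the current node overflows.
        while len(node.keys) > self.order - 1:
            if node.leaf:
                cut = ceil((self.order - 1) / 2)
                new = Node(leaf=True)
                new.keys, node.keys = node.keys[cut:], node.keys[:cut]
                new.next, node.next = node.next, new
                kk = new.keys[0]                  # Copy-up
            else:
                cut = ceil(self.order / 2) - 1
                new = Node(leaf=False)
                kk = node.keys[cut]               # Push-up
                new.keys, node.keys = node.keys[cut + 1:], node.keys[:cut]
                new.children = node.children[cut + 1:]
                node.children = node.children[:cut + 1]
            if not path:                          # the root overflowed
                root = Node(leaf=False)
                root.keys, root.children = [kk], [node, new]
                self.root = root
                return
            parent = path.pop()
            pos = bisect_right(parent.keys, kk)
            parent.keys.insert(pos, kk)
            parent.children.insert(pos + 1, new)
            node = parent

    def leaves(self):
        """Return the key lists of all leaves, left to right."""
        node = self.root
        while not node.leaf:
            node = node.children[0]
        out = []
        while node is not None:
            out.append(node.keys)
            node = node.next
        return out


tree = BPlusTree(order=3)
for k in [8, 5, 1, 7, 3, 12, 9, 6]:
    tree.insert(k)
print(tree.root.keys, tree.leaves())
# [7] [[1, 3], [5, 6], [7], [8], [9, 12]]
```

The usage lines rebuild the order-3 B+-tree from the construction exercise (keys 8, 5, 1, 7, 3, 12, 9, 6).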
Deletion
• Actual values are present only at the leaf nodes, as
internal nodes are index nodes only, so deletion
also takes place at a leaf.
• Use post-merge:
– First delete the key,
– Then check for underflow.
• How to handle an underflow?
– Depends on the type of node that underflows:
• Leaf node, or
• Internal node.
Underflow – Leaf Node
• Delete 10. n = 5. Sibling has spare keys, so borrow.
Contd…
• Delete 10. n = 5. Sibling has no spare keys, so
merge.

• Discard the separating key in the parent.
Underflow – Internal Node
• Delete 10. n = 5. Sibling has spare keys, so borrow.
Contd…
• Delete 10. n = 5. Sibling has no spare keys, so
merge.

• Keep the separating key from the parent (it moves down into the merged node).
Example (Delete 3. n = 3)
Example - Delete 28, 31, 21, 25, 19.

• Delete 28. Merging.
• Delete 31 (simple deletion), 21 (Merging at two levels).
• Delete 25 (simple deletion), 19 (Merging).
Algorithm Delete(root, k)
• Input: The root pageID of a B+-tree, the key k of the object
to be deleted.
• Prerequisite: The object with key = k exists in the tree.
• Action: Delete the object with key = k from the B+-tree.
1. Starting at the root node, perform an exact-match search for
key = k. Let the search path be x_1, x_2, ..., x_h, where x_1 is
the root node, x_h is the leaf node that contains the object
with key = k, and x_i is the parent node of x_{i+1} for
1 ≤ i ≤ h - 1.
2. Delete the object with key = k from x_h.
Contd…
3. If h == 1, return. The tree then has only one node,
the root, which is allowed to underflow.
4. Let i = h.
while x_i underflows
a. if an immediate sibling node s of x_i has at least one more entry
than the minimum occupancy
i. Re-distribute entries evenly between s and x_i.
ii. Corresponding to the re-distribution, a key kk in the parent
node x_{i-1} needs to be modified. If x_i is an index node, kk is
dragged down to x_i and a key from s is pushed up to take
the place of kk. Otherwise, kk is simply replaced by a key
in s.
iii. return.
Contd…
b. else
i. Merge x_i with a sibling node s. Delete the
corresponding child pointer in x_{i-1}.
ii. If x_i is an index node, drag the key in x_{i-1} that
previously divided x_i and s into the new node x_i.
Otherwise, delete that key in x_{i-1}.
iii. i = i – 1.
c. end if
end while
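The leaf-level part of the Delete algorithm can be sketched as below. It makes several simplifying assumptions that are not in the slides: the tree has only two levels (one root index node over the leaves), the minimum occupancy of a leaf is taken as ceil(m / 2) – 1 keys, and a borrow moves a single key instead of re-distributing evenly. The function name and sample data are hypothetical.

```python
from bisect import bisect_right
from math import ceil


def delete_key(parent_keys, leaves, key, order):
    """Delete key from a two-level B+-tree, modifying both lists in place.

    parent_keys: separator keys in the root; leaves: sorted key lists,
    with len(leaves) == len(parent_keys) + 1.
    """
    min_keys = ceil(order / 2) - 1          # assumed minimum occupancy
    i = bisect_right(parent_keys, key)      # index of the leaf that holds key
    leaf = leaves[i]
    leaf.remove(key)
    if len(leaf) >= min_keys or len(leaves) == 1:
        return                              # no underflow (or root-only tree)
    if i > 0 and len(leaves[i - 1]) > min_keys:
        # Borrow the largest key of the left sibling; fix the separator.
        leaf.insert(0, leaves[i - 1].pop())
        parent_keys[i - 1] = leaf[0]
    elif i + 1 < len(leaves) and len(leaves[i + 1]) > min_keys:
        # Borrow the smallest key of the right sibling; fix the separator.
        leaf.append(leaves[i + 1].pop(0))
        parent_keys[i] = leaves[i + 1][0]
    else:
        # Merge with a sibling and discard the separating key in the parent.
        j = i - 1 if i > 0 else i
        leaves[j].extend(leaves.pop(j + 1))
        parent_keys.pop(j)


# Hypothetical order-5 tree (each leaf must hold at least 2 keys).
keys, leaves = [7, 21], [[1, 4], [7, 10, 17], [21, 25]]
delete_key(keys, leaves, 4, order=5)    # right sibling has a spare key: borrow
print(keys, leaves)                     # [10, 21] [[1, 7], [10, 17], [21, 25]]
delete_key(keys, leaves, 10, order=5)   # no sibling can spare a key: merge
print(keys, leaves)                     # [21] [[1, 7, 17], [21, 25]]
```

A full implementation would continue the loop upwards when the merge leaves the parent below its own minimum occupancy.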
