Assignment 2 SMBD
Course: INDEXING
Name: Muhammad Bachtyar Rosyadi
NRP: 5214201011
1. A PARTS file with Part# as the key field includes records with the
following Part# values: 23, 65, 37, 60, 46, 92, 48, 71, 56, 59, 18, 21, 10,
74, 78, 15, 16, 20, 24, 28, 39, 43, 47, 50, 69, 75, 8, 49, 33, 38. Suppose
that the search field values are inserted in the given order in a B+-tree
of order p = 4 and p_leaf = 3; show how the tree will expand and what
the final tree will look like.
Answer:
A B+-tree of order p = 4 implies that each internal node in the tree
(except possibly the root) should have at least 2 keys (3 pointers) and
at most 4 pointers. For p_leaf = 3, leaf nodes must have at least 2 keys
and at most 3 keys. The figure on page 50 shows how the tree
progresses as the keys are inserted. We will only show a new tree when
insertion causes a split of one of the leaf nodes, and then show how the
split propagates up the tree. Hence, step 1 below shows the tree after
insertion of the first 3 keys 23, 65, and 37, and before inserting 60
which causes overflow and splitting. The trees given below show how
the keys are inserted in order. Below, we give the keys inserted for each
tree:
1: 23, 65, 37; 2: 60; 3: 46; 4: 92; 5: 48, 71; 6: 56; 7: 59, 18; 8: 21; 9: 10; 10: 74;
11: 78; 12: 15; 13: 16; 14: 20; 15: 24; 16: 28, 39; 17: 43, 47; 18: 50, 69; 19: 75;
20: 8, 49, 33, 38.
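The leaf split that starts at step 2 can be illustrated with a minimal sketch. It assumes one common convention, which is not spelled out in this answer: on overflow the first ceil((p_leaf + 1) / 2) keys stay in the original leaf, the rest move to a new leaf, and the rightmost key of the left leaf is copied up as the separator. The function name `insert_into_leaf` is hypothetical.

```python
import bisect
import math

P_LEAF = 3  # maximum number of keys in a leaf, taken from the exercise


def insert_into_leaf(leaf, key, p_leaf=P_LEAF):
    """Insert key into a sorted leaf and split the leaf if it overflows.

    Returns (left, right, separator); right and separator are None when
    the leaf still fits. The split rule is an assumed convention.
    """
    keys = list(leaf)
    bisect.insort(keys, key)           # keep the leaf sorted
    if len(keys) <= p_leaf:
        return keys, None, None        # no overflow, no split
    j = math.ceil((p_leaf + 1) / 2)    # number of keys kept in the original leaf
    left, right = keys[:j], keys[j:]
    return left, right, left[-1]       # rightmost key of the left leaf is copied up


# Step 1 leaf holds 23, 37, 65; inserting 60 overflows it.
print(insert_into_leaf([23, 37, 65], 60))
```

Under these assumptions the leaf splits into [23, 37] and [60, 65], with 37 copied into the parent. Propagation of splits into internal nodes is not covered by this sketch.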
2. Repeat Exercise 1, but use a B-tree of order p = 4 instead of a B+-tree.
3. Suppose that the following search field values are deleted, in the given order,
from the B+-tree of Exercise 1; show how the tree will shrink and show
the final tree. The deleted values are 65, 75, 43, 18, 20, 92, 59, 37.
Answer:
An important note about a deletion algorithm for a B+-tree is that
deletion of a key value from a leaf node will result in a reorganization of
the tree if: (i) the leaf node becomes less than half full, in which case we
combine it with the next leaf node (other algorithms combine it with
either the next or the previous leaf node, or both); or (ii) the key value
deleted is the rightmost (last) value in the leaf node, in which case its
value also appears in an internal node, and the key value to the
left of the deleted key in the leaf node replaces the deleted key value in
the internal node. The following describes what happens to tree number 19
after the specified deletions.
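Before the walkthrough, conditions (i) and (ii) above can be checked mechanically for a single leaf. The sketch below is only an illustration: the function name `delete_from_leaf` and the sample leaf contents are hypothetical and are not taken from tree number 19. It assumes "less than half full" means fewer than ceil(p_leaf / 2) keys.

```python
import math


def delete_from_leaf(leaf, key, p_leaf=3):
    """Delete key from one leaf and report which reorganization rules apply.

    Returns (keys, underflow, replacement):
      underflow   -- True if the leaf is now less than half full (rule i)
      replacement -- key that would replace the deleted key in the internal
                     node when the deleted key was the rightmost (rule ii),
                     else None
    """
    keys = sorted(leaf)
    if key not in keys:
        raise KeyError(key)
    was_rightmost = keys[-1] == key
    keys.remove(key)
    underflow = len(keys) < math.ceil(p_leaf / 2)   # assumed half-full threshold
    replacement = keys[-1] if was_rightmost and keys else None
    return keys, underflow, replacement


# Hypothetical leaves, for illustration only.
print(delete_from_leaf([10, 20, 30], 30))   # rightmost key deleted, no underflow
print(delete_from_leaf([10, 20], 10))       # leaf drops below half full
```

Merging the underflowing leaf with its neighbor and updating the internal nodes are left out; the sketch only flags when those steps would be needed.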
Deleting 65 will only affect the leaf node. Deleting 75 will cause a leaf
node to be less than half full, so it is combined with the next node; also,
75 is removed from the internal node leading to the following tree:
Deleting 43 causes a leaf node to be less than half full, and it is combined with
the next node. Since the next node has 3 entries, its leftmost (first) entry 46 can replace 43
in both the leaf and internal nodes, leading to the following tree:
Next, we delete 18, which is a rightmost entry in a leaf node and hence appears in an internal
node of the B+-tree. The leaf node is now less than half full, and is combined with the next
node. The value 18 must also be removed from the internal node, causing underflow in the
internal node. One approach for dealing with underflow in internal nodes is to reorganize the
values of the underflow node with its child nodes, so 21 is moved up into the underflow node,
leading to the following tree:
Deleting 20 and 92 will not cause underflow. Deleting 59 causes underflow, and the remaining
value 60 is combined with the next leaf node. Hence, 60 is no longer a rightmost entry in a
leaf node and must be removed from the internal node. This is normally done by moving 56 up
to replace 60 in the internal node, but since this leads to underflow in the node that used to
[...] to the left subtree. In this case, the resulting tree may look as follows:
4. A PARTS file with Part# as the hash key includes records with the
following Part# values: 2369, 3760, 4692, 4871, 5659, 1821, 1074,
7115, 1620, 2428, 3943, 4750, 6975, 4981, and 9208. The file uses
eight buckets, numbered 0 to 7. Each bucket is one disk block and holds
two records. Load these records into the file in the given order, using
the hash function h(K) = K mod 8. Calculate the average number of
block accesses for a random retrieval on Part#.
Answer:
The records will hash to the following buckets:

Record      K      h(K) (bucket number)
Record 1    2369   1
Record 2    3760   0
Record 3    4692   4
Record 4    4871   7
Record 5    5659   3
Record 6    1821   5
Record 7    1074   2
Record 8    7115   3
Record 9    1620   4
Record 10   2428   4   overflow
Record 11   3943   7
Record 12   4750   6
Record 13   6975   7   overflow
Record 14   4981   5
Record 15   9208   0
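The average number of block accesses asked for in the question can be worked out with a short sketch. It loads the keys with h(K) = K mod 8 into buckets of two records and assumes, as a simplification not stated in the exercise, that a record in its home bucket costs one block access and a record in overflow costs exactly two (home bucket plus one overflow block). The function names are hypothetical.

```python
from collections import defaultdict

KEYS = [2369, 3760, 4692, 4871, 5659, 1821, 1074, 7115, 1620,
        2428, 3943, 4750, 6975, 4981, 9208]
NUM_BUCKETS = 8       # buckets 0..7, from the exercise
BUCKET_CAPACITY = 2   # records per bucket, from the exercise


def load(keys):
    """Place each key in bucket K mod 8; keys that do not fit go to overflow."""
    buckets = defaultdict(list)
    overflow = []
    for k in keys:
        b = k % NUM_BUCKETS
        if len(buckets[b]) < BUCKET_CAPACITY:
            buckets[b].append(k)
        else:
            overflow.append(k)
    return buckets, overflow


def average_block_accesses(keys):
    """Assumed cost model: 1 access for a home-bucket record, 2 for overflow."""
    _, overflow = load(keys)
    in_home = len(keys) - len(overflow)
    return (1 * in_home + 2 * len(overflow)) / len(keys)


buckets, overflow = load(KEYS)
print(overflow)                      # keys that ended up in overflow
print(average_block_accesses(KEYS))  # average accesses per random retrieval
```

Under this cost model, 13 of the 15 records need one access and 2 need two, so the average is (13 * 1 + 2 * 2) / 15 = 17/15, about 1.13 block accesses.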
Record      K      h(K) (bucket number)   Binary h(K)
Record 1    2369   65                     1000001
Record 2    3760   48                     0110000
Record 3    4692   84                     1010100
Record 4    4871   7                      0000111
Record 5    5659   27                     0011011
Record 6    1821   29                     0011101
Record 7    1074   50                     0110010
Record 8    7115   75                     1001011
Record 9    1620   84                     1010100
Record 10   2428   124                    1111100
Record 11   3943   103                    1100111
Record 12   4750   14                     0001110
Record 13   6975   63                     0111111
Record 14   4981   117                    1110101
Record 15   9208   120                    1111000
6. Load the records of Exercise 4 into an expandable hash file, using linear
hashing. Start with a single disk block, using the hash function h_0 = K
mod 2^0, and show how the file grows and how the hash functions
change as the records are inserted. Assume that blocks are split
whenever an overflow occurs, and show the value of n at each stage.
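One way to explore this exercise is a small simulation. The sketch below is not a worked answer. It assumes hash functions of the form h_j(K) = K mod 2^j, a block capacity of two records (carried over from the static hashing exercise), and that overflow records stay chained to their home bucket until that bucket is split. The class name `LinearHashFile` and its attributes are hypothetical.

```python
BLOCK_CAPACITY = 2  # assumed: two records per block, as in the static hashing exercise

KEYS = [2369, 3760, 4692, 4871, 5659, 1821, 1074, 7115, 1620,
        2428, 3943, 4750, 6975, 4981, 9208]


class LinearHashFile:
    def __init__(self):
        self.level = 0       # j: the current round uses h_j(K) = K mod 2**j
        self.n = 0           # split pointer: next bucket to be split
        self.buckets = [[]]  # one list per bucket; entries beyond capacity model overflow

    def address(self, key):
        m = key % (2 ** self.level)
        if m < self.n:       # bucket m was already split in this round
            m = key % (2 ** (self.level + 1))
        return m

    def insert(self, key):
        bucket = self.buckets[self.address(key)]
        bucket.append(key)
        if len(bucket) > BLOCK_CAPACITY:   # any overflow triggers one split
            self._split()

    def _split(self):
        old = self.buckets[self.n]
        nxt = 2 ** (self.level + 1)
        # Redistribute bucket n between itself and new bucket n + 2**level.
        self.buckets[self.n] = [k for k in old if k % nxt == self.n]
        self.buckets.append([k for k in old if k % nxt != self.n])
        self.n += 1
        if self.n == 2 ** self.level:      # every bucket of this round is split
            self.level += 1
            self.n = 0


f = LinearHashFile()
for k in KEYS:
    f.insert(k)
    print(k, "level =", f.level, "n =", f.n, f.buckets)
```

Printing the state after each insertion shows how n advances and when the file moves to the next hash function. Note that the bucket that gets split is always bucket n, which is not necessarily the bucket that overflowed.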