Indexing: Contents
7. Indexing
Contents:
• Single-Level Ordered Indexes
• Multi-Level Indexes
• B+ Tree based Indexes
• Index Definition in SQL
Basic Concepts
• Index files are typically much smaller than the original file
because only search key values and pointers are stored.
• There are two basic types of indexes:
– Ordered indexes: Search keys are stored in a sorted order
(main focus here in class).
– Hash indexes: Search keys are distributed uniformly across
“buckets” using a hash function.
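The difference between the two index types can be sketched in a few lines of Python. This is only an illustration, not a real storage layout: the records, the bucket count NUM_BUCKETS and all function names are hypothetical, and list positions stand in for record pointers.

```python
import bisect

# Hypothetical data file; a record's list position acts as its pointer.
records = [("Rome", "rec 0"), ("Bern", "rec 1"), ("Oslo", "rec 2"), ("Kiel", "rec 3")]

# Ordered index: (search key, pointer) entries kept sorted by search key.
ordered_index = sorted((name, pos) for pos, (name, _) in enumerate(records))
ordered_keys = [key for key, _ in ordered_index]

# Hash index: a hash function spreads the search keys over buckets.
NUM_BUCKETS = 4
hash_index = [[] for _ in range(NUM_BUCKETS)]
for pos, (name, _) in enumerate(records):
    hash_index[hash(name) % NUM_BUCKETS].append((name, pos))


def lookup_ordered(key):
    """Binary search in the sorted index entries."""
    i = bisect.bisect_left(ordered_keys, key)
    if i < len(ordered_keys) and ordered_keys[i] == key:
        return records[ordered_index[i][1]]
    return None


def lookup_hash(key):
    """Inspect only the bucket the key hashes to."""
    for k, pos in hash_index[hash(key) % NUM_BUCKETS]:
        if k == key:
            return records[pos]
    return None


def range_ordered(lo, hi):
    """All records with lo <= key <= hi, in search key order."""
    i = bisect.bisect_left(ordered_keys, lo)
    j = bisect.bisect_right(ordered_keys, hi)
    return [records[pos] for _, pos in ordered_index[i:j]]
```

Both lookups find a single key, but only the ordered index answers a range query such as range_ordered("B", "L") without visiting every bucket.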
Secondary Indexes
Multi-Level Index
[Figure: two-level index. Labels: outer index, inner index, record file;
index block 0, index block 1; data block 0, data block 1.]
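A lookup through a two-level index might be sketched as follows. The sketch assumes sparse indexes (one entry per block, holding the first search key of that block); the block contents and the names last_entry_at_most and lookup are made up for illustration.

```python
import bisect

# Hypothetical record file, sorted by search key and split into data blocks.
data_blocks = [[2, 5], [8, 11], [14, 17], [21, 25]]

# Inner index: one (first key, data block number) entry per data block,
# itself stored in index blocks of two entries each.
inner_index = [[(2, 0), (8, 1)], [(14, 2), (21, 3)]]

# Outer index: one (first key, inner index block number) entry per inner block.
outer_index = [(2, 0), (14, 1)]


def last_entry_at_most(entries, key):
    """Return the pointer of the last entry whose key is <= key, or None."""
    keys = [k for k, _ in entries]
    i = bisect.bisect_right(keys, key) - 1
    return entries[i][1] if i >= 0 else None


def lookup(key):
    """Return the number of the data block holding key, or None if absent."""
    inner_no = last_entry_at_most(outer_index, key)   # step 1: outer index
    if inner_no is None:
        return None                                   # key is below all stored keys
    block_no = last_entry_at_most(inner_index[inner_no], key)  # step 2: inner index
    return block_no if key in data_blocks[block_no] else None  # step 3: data block
```

Each lookup reads one outer index block, one inner index block and one data block, regardless of how many data blocks the file has.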
Example of a B+-Tree
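[Figure: B+-tree with a non-leaf node holding the search keys S25 and S70
and leaf nodes Li, Lj chained by Pn pointers. Leaf entries shown as search
key, count, TIDs: (S51, 2, TID, TID), (S55, n, TID, TID, ..., TID),
(S60, 2, TID, TID).]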
• If Li, Lj are leaf nodes and i < j, Li’s search key values are
less than Lj’s search key values.
• Pn points to next leaf node in search key order.
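Because the leaves are ordered and chained, a range of search keys can be read by walking the Pn pointers. A minimal sketch, with a hypothetical Leaf class, integer search keys instead of the S-values in the figure, and invented TIDs:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Leaf:
    keys: list                            # search key values, sorted within the leaf
    tids: list                            # tids[i] is the list of TIDs for keys[i]
    next_leaf: Optional["Leaf"] = None    # plays the role of Pn


# Hypothetical leaf chain l1 -> l2 -> l3, in search key order.
l3 = Leaf([70, 80], [["t6"], ["t7"]])
l2 = Leaf([51, 55, 60], [["t3", "t4"], ["t5"], ["t8", "t9"]], l3)
l1 = Leaf([10, 20], [["t1"], ["t2"]], l2)


def scan_from(leaf, lo, hi):
    """Collect (key, TIDs) pairs with lo <= key <= hi, starting at leaf."""
    out = []
    while leaf is not None:
        for key, tids in zip(leaf.keys, leaf.tids):
            if key > hi:
                return out        # keys only grow from here on, so stop early
            if key >= lo:
                out.append((key, tids))
        leaf = leaf.next_leaf     # follow Pn to the next leaf
    return out
```

The early return is only correct because every key in a later leaf is larger than every key in an earlier one.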
Queries on B+-Trees
Find all records with a search key value of k
• Start with the root node
– Examine the node for the smallest search key value > k.
– If such a value exists, assume it is Ki. Then follow Pi to
the child node.
– Otherwise, k ≥ Km−1, where there are m pointers in the node.
Then follow Pm to the child node.
• Further comments:
– If there are V search key values in the file, the path from
the root to a leaf node is no longer than ⌈log_⌈n/2⌉(V)⌉,
where n is the maximum number of pointers per node.
– In general a node has the same size as a disk block, typically
4KB, and n ≈ 100 (40 bytes per index entry).
– With 1,000,000 search key values and n = 100, at most
⌈log_50(1,000,000)⌉ = 4 nodes are accessed in the lookup!
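The lookup rule and the path-length bound above can be sketched as follows. The Node class, the tiny example tree and the function names are assumptions made for illustration; a real B+-tree node would be a disk block, not a Python object.

```python
import bisect
import math


class Node:
    def __init__(self, keys, children=None, tids=None):
        self.keys = keys          # sorted search key values K1..K(m-1)
        self.children = children  # non-leaf node: the m child pointers P1..Pm
        self.tids = tids          # leaf node: tids[i] lists the TIDs for keys[i]

    @property
    def is_leaf(self):
        return self.children is None


def find(root, k):
    """Return (TIDs for k, number of nodes visited)."""
    node, visited = root, 1
    while not node.is_leaf:
        # Position of the smallest key > k; equals len(keys) if there is none,
        # which selects the last pointer Pm.
        i = bisect.bisect_right(node.keys, k)
        node = node.children[i]
        visited += 1
    i = bisect.bisect_left(node.keys, k)
    if i < len(node.keys) and node.keys[i] == k:
        return node.tids[i], visited
    return [], visited


def max_path_length(num_keys, n):
    """Bound from the text: ceil(log to the base ceil(n/2) of num_keys)."""
    return math.ceil(math.log(num_keys) / math.log(math.ceil(n / 2)))


# Hypothetical two-level tree with integer search keys.
root = Node(
    keys=[25, 70],
    children=[
        Node([10, 20], tids=[["t1"], ["t2"]]),
        Node([25, 51, 60], tids=[["t0"], ["t3", "t4"], ["t5"]]),
        Node([70, 80], tids=[["t6"], ["t7"]]),
    ],
)
```

Here find(root, 51) descends through the second pointer because 70 is the smallest key greater than 51, and max_path_length(1_000_000, 100) evaluates the bound for the numbers used in the text.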
• Example:
create index city_name_idx on CITY(name);
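To try the statement out, one option is an in-memory SQLite database from Python's standard library. This is a sketch under assumptions: the index name is written here as city_name_idx, the country column and the sample rows are hypothetical, and other database systems report their query plans differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: the example only tells us that CITY has a column name.
conn.execute("CREATE TABLE CITY (name TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO CITY VALUES (?, ?)",
    [("Bern", "CH"), ("Oslo", "NO"), ("Rome", "IT")],
)

conn.execute("CREATE INDEX city_name_idx ON CITY(name)")

# Ask SQLite how it would evaluate an equality lookup on the indexed column.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM CITY WHERE name = ?", ("Oslo",)
).fetchall()
plan_text = " ".join(row[-1] for row in plan)  # last column holds the description
print(plan_text)
```

If the index is used, the printed plan mentions city_name_idx rather than a full scan of CITY.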