CP264 - Final Review
Queue
Abstract Queue: Linearly ordered collection of data elements with enqueue, dequeue, and peek operations
o Enqueue operation inserts element into collection
Linear order of elements is determined by the time they're inserted
Front element is earliest inserted element; Rear is last inserted element
o Dequeue operation removes/deletes front element
o Peek operation gets front element
Queue Data Structure: Implementation of abstract queue
Queue Characteristic → First-In-First-Out (FIFO): Deletes first inserted element
Time and space complexities for queue operations → time O(1), space O(1)
Array Queue
Simple Array Queue: Queue implementation with array representation, where the front and rear
variables represent the index positions at which deletions and insertions are done, respectively
Drawbacks:
o Length of queue is bound by length of its array
o Wastes space if length of queue is shorter than length of array
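A minimal sketch of a simple array queue in C (assuming int elements; the names queue_t and QUEUE_SIZE are illustrative, not from the course code). Note how rear reaching the array length makes the queue full even when front slots have been freed, which is the wasted-space drawback above:
#define QUEUE_SIZE 8          /* fixed capacity: length of queue bound by array */

typedef struct {
    int data[QUEUE_SIZE];
    int front;                /* index where the next dequeue happens */
    int rear;                 /* index where the next enqueue happens */
} queue_t;

void queue_init(queue_t *q) { q->front = 0; q->rear = 0; }

int enqueue(queue_t *q, int value) {     /* O(1); -1 when array is full */
    if (q->rear == QUEUE_SIZE) return -1;
    q->data[q->rear++] = value;
    return 0;
}

int dequeue(queue_t *q, int *value) {    /* O(1); -1 when queue is empty */
    if (q->front == q->rear) return -1;
    *value = q->data[q->front++];
    return 0;
}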
Linked List Queue
Fixes drawbacks of array queues → length is dynamic + doesn’t waste space
Linked List Queue: Uses singly linked list to store queue data values; uses two pointers front and
rear to represent front and rear positions
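A sketch of the linked list queue in C, with front and rear pointers as described (node_t and llqueue_t are hypothetical names):
#include <stdlib.h>

typedef struct node {
    int data;
    struct node *next;
} node_t;

typedef struct {
    node_t *front;            /* dequeue at the front */
    node_t *rear;             /* enqueue at the rear */
} llqueue_t;

void enqueue(llqueue_t *q, int value) {  /* O(1) insert at the rear */
    node_t *n = malloc(sizeof(node_t));
    n->data = value;
    n->next = NULL;
    if (q->rear) q->rear->next = n;
    else q->front = n;                   /* queue was empty */
    q->rear = n;
}

int dequeue(llqueue_t *q, int *value) {  /* O(1) delete at the front */
    if (!q->front) return -1;            /* empty queue */
    node_t *n = q->front;
    *value = n->data;
    q->front = n->next;
    if (!q->front) q->rear = NULL;       /* queue became empty */
    free(n);
    return 0;
}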
Priority Queue
Priority Queue: Queue where each element is assigned a priority; priority of element used to
determine order in which elements are processed
Rule of processing elements of priority queue:
1. Element with higher priority is processed before element with lower priority
2. Two elements of same priority processed by first come first serve order
Applications of Queues
Whenever an algorithm needs to remember data items and FIFO order is required for item processing,
a queue data structure can be an option
1. Used to store intermediate data within algorithm for FIFO retrieval (e.g. Breadth-first search)
2. Used as waiting lists for single shared resource (e.g. printer, disk, CPU, network device)
3. Used in OS for handling interrupts; If interrupts must be handled in order of arrival, FIFO queue
is appropriate data structure
4. Used to transfer data asynchronously (e.g. Pipes, file I/O, sockets)
5. Used in simulations for services
6. Used for playlists, adding songs to the end, playing from the front of the list.
7. Priority queues are used in the OS for process execution management.
8. Used to remember the path of explorations if FIFO retrieval is needed.
Stack
Abstract Stack: Linearly ordered collection of data elements with push, pop, and peek operations
Stack Characteristic → Last-In-First-Out (LIFO): Pops the most recently pushed element
Stack Data Structure: Implementation of abstract stack
Stack: Linear data structure with push and pop operations satisfying LIFO property
Why is only one variable needed for accessing a stack data structure?
1. Stack operations only work on one side of the stack list, so one variable does the job.
2. It's more time efficient to maintain one variable than more variables.
3. It's more space efficient to use one variable than multiple variables.
Time and space complexities for stack operations → time O(1), space O(1)
Array Stack
Array Stack: Stack represented by an array; the top variable represents the index position where
push, pop, and peek operations are done
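A sketch of an array stack in C; all three operations work at the top index only, which is why one variable suffices (astack_t and STACK_SIZE are illustrative names):
#define STACK_SIZE 100

typedef struct {
    int data[STACK_SIZE];
    int top;                  /* index of the top element; -1 when empty */
} astack_t;

void stack_init(astack_t *s) { s->top = -1; }

int push(astack_t *s, int value) {       /* O(1); -1 on overflow */
    if (s->top == STACK_SIZE - 1) return -1;
    s->data[++s->top] = value;
    return 0;
}

int pop(astack_t *s, int *value) {       /* O(1); -1 on underflow */
    if (s->top == -1) return -1;
    *value = s->data[s->top--];
    return 0;
}

int peek(const astack_t *s, int *value) { /* O(1); read without removing */
    if (s->top == -1) return -1;
    *value = s->data[s->top];
    return 0;
}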
Linked List Stack
Linked List Stack: Use singly linked list to implement stack, pointer top points to first node
Array vs. Linked List Stack
Call Stack and Recursive Calls
Call Stack: Array stack
Stack Region: Array holding call stack
When function is called → function data including arguments, local variables, registers, and
return addresses pushed onto call stack
When function call is done → function data popped off
Recursion: Method of solving problem by recursive function; has two major cases:
o Base Case - where problem is simple enough to solve directly without making further
calls to the same function
o Recursive Case -
Problem is divided into smaller sub-parts
Function calls itself but with sub-parts of problem obtained in first step
Result obtained by combining solution of sub-parts
Recursion can be classified based on:
o Direct vs. Indirect Recursion: Whether function calls itself directly or indirectly
Direct Recursion: Function explicitly calls itself
Example: Recursive factorial is a directly recursive function
Indirect Recursion: Contains call to another function that ultimately calls the
function
o Tail-Recursive vs. Not: Whether any operation is pending after each recursive call
Tail Recursion: No operations are pending when the recursive function returns to
its caller, i.e. when the called function returns, the returned value is
immediately returned from the calling function
Tail-recursive functions are preferred because they can be optimized into an
iterative-style function (see the sketch after this list)
o Linear vs. Tree-Recursive: Structure of calling pattern
Linear Recursion: Pending operation involves one recursive call to the function,
i.e. recursion tree is a path
Tree Recursion: Pending operation involves more than one recursive call to the
function
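A sketch contrasting the two forms with factorial (the accumulator parameter acc in the tail-recursive version is a standard trick, not from the notes):
/* direct recursion, NOT tail-recursive: the multiplication by n is
   still pending when the recursive call returns */
long factorial(int n) {
    if (n <= 1) return 1;                /* base case */
    return n * factorial(n - 1);         /* recursive case */
}

/* tail-recursive: nothing is pending, the callee's result is returned
   immediately, so a compiler can optimize it into a loop */
long factorial_tail(int n, long acc) {
    if (n <= 1) return acc;
    return factorial_tail(n - 1, acc * n);   /* acc carries the product so far */
}
/* usage: factorial(5) == factorial_tail(5, 1) == 120 */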
Why is stack used to manage function calls?
o After a function call, it needs to return to the call position of the calling function and all
local data of the calling function should be available
The running time of a recursive function call is proportional to?
o The number of its recursive function calls
The space usage of a recursive function call is proportional to?
o The depth of the recursive function calls
Why should recursive functions be avoided in practical programming?
o Not time efficient due to the number of function calls.
o Memory usage is proportional to the depth of the recursive calls.
Linked List
Singly Linked List
Singly Linked Lists: Linear collection of nodes with the following properties
o Node contains data and one pointer pointing to next node, last node points to NULL
o Accessed by pointer pointing to first node of linked list
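A sketch of a singly linked list node and basic operations in C, in both iterative and recursive form (node_t and the function names are illustrative):
#include <stdlib.h>

typedef struct node {
    int data;
    struct node *next;        /* NULL in the last node */
} node_t;

node_t *push_front(node_t *head, int value) {  /* returns the new head */
    node_t *n = malloc(sizeof(node_t));
    n->data = value;
    n->next = head;
    return n;
}

int list_length(const node_t *head) {    /* iterative traversal */
    int count = 0;
    for (const node_t *p = head; p != NULL; p = p->next)
        count++;
    return count;
}

int list_length_r(const node_t *head) {  /* same operation, recursive */
    return head == NULL ? 0 : 1 + list_length_r(head->next);
}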
Array vs. Linked List
Array cons:
o Maximum number of array elements must be given at compile time either by a constant or by a variable.
o Wastes memory space if there are not many data objects to store, and runs out of memory space when there are more data objects to store.
o Costs more in insert and delete operations due to shifting.
Linked list cons:
o Needs extra space to store the links and costs more time maintaining the links in operations.
o Needs to traverse the linked list to access a specific node, so random access on linked lists is less efficient than on arrays.
Doubly Linked List
Doubly Linked List: Linear collection of nodes with the following properties
o Node structure contains data members and two pointers
next pointing to next node
prev pointing to previous node
o First node has prev pointing to NULL and last node has next pointing to NULL
o Has two accessing pointers, start pointing to first node and end pointing to last node
Circular Linked List
o Circular Linked List: Linear collection of nodes with the following properties
Node structure consists of data members and one address pointer next pointing to the
next node
Last node has next pointing to first node
Accessed by pointer start pointing to start node
o Circular Doubly Linked Lists: Has the following properties
next of the last node points to the first node
prev of the first node points to the last node
Has one access pointer start pointing to any node
Advantage → supports traversing cyclically in both directions efficiently
Need to know all operations and functions on the above data structures, both iterative and recursive
Tree Data Structures
Abstract Tree: Collection of nodes connected in a tree structure with set of operations
Tree Data Structure: Implementation of abstract tree in programming language where parent-child
relations of nodes represented by specific method
o Provides more efficient way of executing certain algorithms (e.g. Search)
Terms and Notation
Parent: If node N has subtree T1, N1 is root of T1 (i.e. (N, N1) is an edge) then N is parent of N1
Child: If node N has subtree T1, N1 is root of T1 (i.e. (N, N1) is an edge) then N1 is child of N
Siblings: Nodes n2 and n3 have the same parent n1, i.e. Both (n1, n2) and (n1, n3) are edges of the tree
Ancestor: If node n3 is in the sub-tree rooted at n1, then n1 is an ancestor of n3
Descendent: If node n3 is in the sub-tree rooted at n1, then n3 is a descendent of n1
Leaf: Node without child
Root: Node without ancestor
Path: Sequence of different edges such that child node of edge is parent node of next edge
o Length of path is number of edges in path
o Example: (n1, n2),(n2, n3), ..., (nk, nk+1)
Cycle: Path + edge connecting last node to first node in path
o Example: Edge sequence (n1, n2),(n2, n3),(n3, n1) is a cycle of length 3
Depth of Node: Length of path from root to node
o Root of tree has depth 0
Level: Set of nodes of the same depth
o Level i consists of all nodes of depth i
Width: Maximum number of nodes over all tree levels
Height of Tree: 1 + Maximum height of sub-trees
o Height of tree = height of its root
o If tree contains one node (the root) height is 1
Forest: Disjoint union of general trees, i.e. Ordered set of zero or more general trees
Classification of Trees
General Trees: No restrictions, each node has zero or more children
Binary Trees: Each node has at most two children
Ternary Tree: Each node has at most three children
Binary Search Trees: Binary trees for efficient searching
o AVL Tree: Height balanced binary search trees
o Red-black-trees: Type of balanced binary search tree
o Splay Trees: Binary search trees efficient for searching recent accessed nodes
Multi-way Search Trees: Generalization to binary search trees
o B-trees: Balanced multi-way search trees
Complete Binary Tree: Binary tree where all levels but the last are fully filled and every node in the
last level is as far left as possible
Perfect Binary Tree: Complete binary tree whose last level is fully filled, properties:
o Number of nodes in perfect binary tree is determined by its height
o Level k has 2^k nodes, ∴ the last level has 2^(h-1) nodes where h is the height
o Width of perfect binary tree of n nodes is (n+1)/2, i.e. 2^(h-1)
Tree Theorems
Theorem 1.
1. A perfect binary tree of height h has 2^h − 1 nodes
2. A perfect binary tree of n nodes has height log₂(n+1)
3. A perfect binary tree of n nodes has width (n+1)/2
Theorem 2. Let M(h) be the maximum number of nodes over all binary trees of height h, then M(h) ≤ 2^h − 1
Theorem 3. The height of a binary tree of n nodes is at least log₂(n+1)
Note: If given an expression tree you should be able to write the in-order and pre-order traversals, AND you should be able to draw a Huffman tree
Application of Binary Trees
Expression Tree
Expression Tree: Binary tree where each non-leaf node represents operator and each leaf node
represents operand
o Provides alternative method to represent algebra expression
Huffman Tree
ASCII code uses fixed-length code (i.e. every character has a code of 8 bits)
Huffman Coding: Variable length coding method to code symbols depending on frequencies of
symbols
o Codes of symbols by Huffman coding have different lengths
High-frequency symbols have shorter codes
Low-frequency symbols have longer codes
o Huffman coding is often used in data compression
How to draw Huffman tree:
1. Get and order frequency of each character
2. Start from the bottom of the list; draw least frequent items from most to least frequent
from right to left
3. Starting from the right and moving left, give each pair of nodes a parent whose value is
equal to the sum of their frequencies
4. Continue doing step (3) until each node has a parent
Huffman code of symbol is derived from Huffman tree from path connecting root to leaf node of symbol
o Left child edge gives 0
o Right child edge gives 1
Other
1. Store data in a hierarchical structure, representing the relations of data elements
2. Represent collection of data objects for efficient search
3. Implement other types of data structures like hash tables, sets, and maps
4. Binary search trees, self-balancing trees, AVL trees, red-black trees used to store record data for
efficient search, insert and delete operations
5. B-trees used to store tree structures on disk and to index a large number of records, and
secondary indexes in databases, where the index facilitates a select operation to answer some
range criteria
6. Compiler construction
7. Database design
8. File system directories
9. Binary trees used to represent algebraic expressions and Huffman trees for encoding/decoding
Binary Trees and Operations
Pre/In/Post-Order Traversal
Pre-order Traversal: Defined recursively as → visit tree in order of root, left sub-tree, and right
sub-tree; the same rule applies when traversing the left and right sub-trees
In-order Traversal: Visiting tree in order of left-subtree, root, right-subtree
Post-order Traversal: Visiting tree in order of left sub-tree, right sub-tree, root
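A sketch of the three recursive traversals in C, assuming a node with left/right pointers (tnode_t is an illustrative name):
#include <stdio.h>

typedef struct tnode {
    int data;
    struct tnode *left, *right;
} tnode_t;

void preorder(const tnode_t *r) {        /* root, left, right */
    if (!r) return;
    printf("%d ", r->data);
    preorder(r->left);
    preorder(r->right);
}

void inorder(const tnode_t *r) {         /* left, root, right */
    if (!r) return;
    inorder(r->left);
    printf("%d ", r->data);
    inorder(r->right);
}

void postorder(const tnode_t *r) {       /* left, right, root */
    if (!r) return;
    postorder(r->left);
    postorder(r->right);
    printf("%d ", r->data);
}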
Depth-First Order
Depth-first Order: Explores nodes starting at the root's leftmost sub-tree until all its nodes are
visited, then moves to the right sub-trees
Depth-First-Search (DFS) Algorithm: Using pre-order traversal method that checks data value of
node against search key and, if matched, returns node and stops traversal
Breadth-First Order
Breadth-first Order: Visits tree nodes using queue by level from low level to high level, and at
each level from left to right
Breadth-First-Search (BFS): Using breadth-first traversal with action of checking node data
against search key and returning node if it matches
Insert
// your code here
Time: O(h), space: O(1)
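The notes leave the code blank; a minimal sketch of an iterative BST insert, which matches the O(h) time / O(1) space stated above (assuming an int-keyed node; duplicates are ignored):
#include <stdlib.h>

typedef struct tnode {
    int key;
    struct tnode *left, *right;
} tnode_t;

tnode_t *bst_insert(tnode_t *root, int key) {
    tnode_t *n = malloc(sizeof(tnode_t));
    n->key = key;
    n->left = n->right = NULL;
    if (!root) return n;                 /* empty tree: new node is root */
    tnode_t *p = root;
    for (;;) {                           /* walk down one path: O(h) */
        if (key < p->key) {
            if (!p->left) { p->left = n; break; }
            p = p->left;
        } else if (key > p->key) {
            if (!p->right) { p->right = n; break; }
            p = p->right;
        } else { free(n); break; }       /* duplicate key: do nothing */
    }
    return root;
}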
Delete
// your code here
time: O(h), space: O(1)
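Again the notes leave this blank; a recursive sketch of BST delete (reusing tnode_t from the insert sketch) that handles the two-child case with the in-order successor. Note this version uses O(h) stack space; an iterative version achieves the O(1) space stated above:
tnode_t *bst_delete(tnode_t *root, int key) {
    if (!root) return NULL;
    if (key < root->key) {
        root->left = bst_delete(root->left, key);
    } else if (key > root->key) {
        root->right = bst_delete(root->right, key);
    } else if (!root->left || !root->right) {
        tnode_t *child = root->left ? root->left : root->right;
        free(root);                      /* 0 or 1 child: splice node out */
        return child;
    } else {
        tnode_t *min = root->right;        /* 2 children: find in-order   */
        while (min->left) min = min->left; /* successor in right sub-tree */
        root->key = min->key;              /* copy it up, then delete it  */
        root->right = bst_delete(root->right, min->key);
    }
    return root;
}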
AVL Tree
Height-balanced Tree
Balanced Binary Tree: Left and right sub-tree heights are equal or don't differ greatly at any node
Balancedness is defined and measured precisely by the balance factor
Balance Factor: Height of left sub-tree minus height of right sub-tree
o Balance Factor = height(left sub-tree) − height(right sub-tree)
Left Heavy: Node's balance factor is greater than 0
Right Heavy: Node's balance factor is less than 0
Balanced: Balance factor is -1, 0, or 1
Unbalanced: Balance factor is not -1, 0, or 1
Height Balanced: Binary tree where all nodes are balanced
Theorem 1. Let T be a height balanced tree of height h, then the number of nodes in T is at least 2^(h/2) − 1
Theorem 2. The height of a height balanced tree of n nodes is O(log n)
Theorem 3-4. The self-balancing insert/delete operation on AVL tree derives an AVL tree.
1. Case 1: Balance-factor(x) = 2, balance-factor(x->left) > 0
Action: Do right-rotation, rotate_right(x)
2. Case 2: Balance-factor(x) = 2, balance-factor(x->left) < 0
Action: Do left-right-rotation, x->left = rotate_left(x->left); rotate_right(x)
3. Case 3: Balance-factor(x) = -2, balance-factor(x->right) < 0
Action: Do left-rotation, rotate_left(x)
4. Case 4: Balance-factor(x) = -2, balance-factor(x->right) > 0
Action: Do right-left-rotation, x->right = rotate_right(x->right); rotate_left(x)
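A sketch of the rotation primitives used by these cases, with heights stored in the nodes (anode_t and the height bookkeeping are illustrative assumptions):
typedef struct anode {
    int key, height;                     /* height of a single node is 1 */
    struct anode *left, *right;
} anode_t;

static int ht(const anode_t *n) { return n ? n->height : 0; }
static int imax(int a, int b) { return a > b ? a : b; }

anode_t *rotate_left(anode_t *x) {       /* returns new sub-tree root */
    anode_t *y = x->right;
    x->right = y->left;
    y->left = x;
    x->height = 1 + imax(ht(x->left), ht(x->right));
    y->height = 1 + imax(ht(y->left), ht(y->right));
    return y;
}

anode_t *rotate_right(anode_t *x) {      /* mirror image of rotate_left */
    anode_t *y = x->left;
    x->left = y->right;
    y->right = x;
    x->height = 1 + imax(ht(x->left), ht(x->right));
    y->height = 1 + imax(ht(y->left), ht(y->right));
    return y;
}

anode_t *rotate_right_left(anode_t *x) { /* Case 4 above as one function */
    x->right = rotate_right(x->right);
    return rotate_left(x);
}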
Concepts of Splay Trees
Splay Tree: Binary search tree (BST) with self-adjusting search, insert, and delete operations;
moves most recently accessed nodes closer to the root node → nodes are ∴ accessed faster
Splay Node: Last accessed node
Amortized Analysis: Evaluates the worst-case average resource usage of an algorithm over a sequence of runs, i.e.
if a sequence of M operations takes O(M f(n)) time, we say the amortized runtime is O(f(n))
o Worst Case → Splay trees are not balanced BSTs, worst-case time per operation is O(n)
o Splay tree amortized time per operation for search, insertion, and deletion is O(log n)
Self-adjusting: Search/insert/delete operation consists of two major steps
1. Do BST search/insert/delete
2. Move splay node N up to root by splay operation, which consists of sequence of zig-zig
or zig-zag operations, followed by zig operation in case of need
Splay Operations
o Zig-zig Operation: Right-right-rotation or left-left-rotation, moves splay node up by two levels
o Zig-zag Operation: Left-right-rotation or right-left-rotation, moves splay node up by two levels
o Zig Operation: Single right or left rotation, moves splay node up by one level
Concepts of B-trees
When m > 2, a balanced m-way search tree is shorter than a balanced binary search tree, so it's
more efficient for the search operation than a BST
B-tree of order m: Self-balancing m-way search tree; balancedness of B-trees is defined by the following
properties
1. Every node in the B-tree except the root node and leaf nodes has at least ⌈m/2⌉ children
2. Root node has at least two children if it isn't a leaf node
3. All leaf nodes are on the same level
Properties guarantee height of B-tree of order m is O(log n); search operation has time
complexity O(log n)
Self-balancing means insert and delete operations on B-tree are done by:
1. Doing m-way search tree insertion/deletion
2. Doing self-balancing operation to restore B-tree balancedness
Hash Tables
Hash Tables: Data structures that have average time O(1) for search, insert, and delete operations
Concepts of Hash Tables
Abstract Hash Table: Defined by following properties
1. Stores collection of data values of certain type
One field is used as a key, the others (if any) are used as the value, so a data record can be viewed as a key-value pair
2. Array of the data type of length m is used to store data records, where m is the size of the hash table
3. Function h is used to map key k to an integer h(k) between 0 and m-1
4. h(k) is the hash/index value of k
5. Insert operation puts key-value pair (k, x) at the array position of hash value index h(k),
namely hash_array[h(k)]; if the array position is taken, an alternative position is used
6. Delete element by key k to remove data element of key k at hash_array[h(k)]
7. Search element by key k (called look-up) is to check the data element at array position h(k).
If the key value of hash_array[h(k)] matches k, then return hash_array[h(k)]; otherwise
check the data elements at alternative positions
Hash Table: Data structure that implements abstract hash table with concrete hash_data_type,
hash table size, hash function h, insert, look-up, and delete operations
Hash Functions: Division Method
Hash Function: Mathematical function which maps any key k in set U to an integer h(k) of a
given modulus m; h(k) is called the hash value of k by h
o i.e. h : U → {0, 1, ..., m-1} where m is a given positive integer
Division function: h(k) = k mod m, where k is any natural number, and mod is the modulo
operator. i.e. h(k) is the remainder of k divided by m
o How to choose m? Best to choose m to be a prime number not too close to an exact
power of 2
o Example with m = 10: when k = 18, h(k) = h(18) = 18 mod 10 = 8
Drawbacks of division method:
o Consecutive keys map to consecutive hash values; this ensures consecutive keys don't collide,
but consecutive locations will be occupied, which may lead to degradation of performance
Collisions and How to Handle Them
Collided/Collision: If two keys are hashed to same hash value h
o Ideal Situation -> No collisions with given hash function and set of keys, meaning insert,
search, and delete operation can be done in O(1) time
o BUT collisions are unavoidable
Collision Handling/Resolving: In hash table, two collided keys will have the same hash value
index, but their corresponding data records can't be stored at the same position in the array.
Alternative locations must be found to store collided records
Solutions
Solution 1: When collision happens, find open position by probing and store data at open position.
Probe by:
Linear Probing
o For value stored at location generated by h(k), use hash function h(k, i) = [h(k) + i] mod m
where
m is the size of the hash table
i is the probe number, varying from 0 to m-1
to find the first i such that position h(k, i) is open
o We know array position is open because hash table contains two types of values:
sentinel (e.g. -1) or data values
o Presence of sentinel value means location contains no data value ∴ open for storage
o Pros and cons of linear probing
Pros:
o Linear probing finds an empty location by doing a linear search in the array beginning from position h(k).
o The algorithm provides good memory caching, through good locality of reference.
o Good performance if no deletion is applied.
Cons:
o It results in clustering, and thus a higher risk that where there has been one collision there will be more.
o The performance of linear probing is sensitive to the distribution of input values.
o The performance gets degraded after a sequence of insert and delete operations.
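A sketch of linear probing on a table of non-negative int keys, using -1 as the open-slot sentinel as described above (TABLE_SIZE and the function names are illustrative):
#define TABLE_SIZE 11                    /* m: a prime table size */
#define EMPTY (-1)                       /* sentinel marking an open slot */

int table[TABLE_SIZE];                   /* initialize every slot to EMPTY */

int probe(int k, int i) { return (k % TABLE_SIZE + i) % TABLE_SIZE; }

int lp_insert(int k) {                   /* returns slot used, -1 if full */
    for (int i = 0; i < TABLE_SIZE; i++) {
        int pos = probe(k, i);
        if (table[pos] == EMPTY) { table[pos] = k; return pos; }
    }
    return -1;
}

int lp_search(int k) {                   /* returns slot of k, -1 if absent */
    for (int i = 0; i < TABLE_SIZE; i++) {
        int pos = probe(k, i);
        if (table[pos] == k) return pos;
        if (table[pos] == EMPTY) return -1;  /* open slot ends the probe */
    }
    return -1;
}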
Solution 2: Use alternative data structure to store all data records at the same index value
Criteria of a Good Hash Function
1. Efficient to compute hash value for any given key
o Efficient algorithm to calculate checksum of data record together with division
2. Small number of collisions for given set of keys
Chained Hash Table and Operations
Linked Hash Table: Each location in the hash table array stores a pointer to a linked list or BST that
stores all data records of the same hash value
o If no key hashes to an index, pointer at hash table index position is NULL
Chained Hash Table: Linked data structure used to store data records
Insert, search, and delete operations on linked hash table have two steps
o Compute hash index value of given key
o Do insert/search/delete operation on linked data structure at hash index position
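A sketch of a chained hash table with linked list chains, following the two steps above (entry_t, M, and the function names are illustrative):
#include <stdlib.h>

#define M 13                             /* hash table size */

typedef struct entry {
    int key, value;
    struct entry *next;                  /* chain of same-hash records */
} entry_t;

entry_t *table[M];                       /* NULL when no key hashes here */

int hash(int key) { return ((key % M) + M) % M; }

void chained_insert(int key, int value) {
    int h = hash(key);                   /* step 1: compute hash index */
    entry_t *e = malloc(sizeof(entry_t));
    e->key = key;
    e->value = value;
    e->next = table[h];                  /* step 2: push onto the chain */
    table[h] = e;
}

entry_t *chained_search(int key) {
    for (entry_t *e = table[hash(key)]; e; e = e->next)  /* walk the chain */
        if (e->key == key) return e;
    return NULL;
}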
Application of Hash Tables
1. Store key-value pair data records for fast search by key
2. Implement symbol tables for variable names and their values
3. Compilers for symbol tables
4. Database indexing
5. Caching information for quick look-up
6. Internet search engines to store keyword-URL as key-value pairs in databases
Potential final exam question:
o Applying min/max heap properties
o Heapify
o Heap sort
Heaps
Heaps: Data structure efficient for finding and deleting the minimum element and inserting any element
Heap Data Structure: Implementation of abstract heap
Three heap data structures:
1. Binary heaps
2. Binomial heaps
3. Fibonacci heaps
Min-heap
Minimum Heap (aka min-heap): Abstract min-heap has following properties:
1. One component/property of data element used as a key
Denoted as key(A), where A is a node
2. Data elements (called nodes) of min-heap are connected by trees (called heap trees or
heap-ordered trees)
If node A has child node B, then key(A) <= key(B)
3. Basic heap operation is to find the element of the minimum key value (find-min)
4. Other min-heap operations include:
1. Create: Create a new min-heap
2. Insert: Insert a data element into a min-heap
3. Delete-min: Delete the minimum key node from a min-heap
4. Extract-min: Delete and return the minimum element from a min-heap
5. Decrease-key/Increase-key: Decrease/increase key value of a node
6. Merge: Merge two min-heaps into one min-heap
Lemma 1. A node of a min-heap has the minimum key value over all nodes in the sub-tree
rooted at that node.
Max-heap
At abstract level, maximum heap (aka max-heap) has following properties:
1. One component/property of data element used as a key
Denoted as key(A), where A is a node
2. Data elements (nodes) of max-heap connected by trees (aka heap/heap-ordered trees)
If node A has child node B, then key(A) >= key(B)
3. Basic heap operation is to find the element of the maximum key value (find-max)
4. Other max-heap operations include:
1. Create: Create a new max-heap
2. Insert: Insert a data element into a max-heap
3. Delete-max: Delete the maximum key node from a max-heap
4. Extract-max: Delete and return the maximum element from a max-heap
5. Decrease-key/Increase-key: Decrease/increase key value of a node
6. Merge: Merge two max-heaps into one max-heap
Binary Heap
Binary Heap: Heap whose heap tree is a complete binary tree
Complete Binary Tree: All levels but the last one are fully filled; all nodes in the last level are
filled from the left
Linked Binary Heap: Uses linked binary tree to represent complete binary tree
o Flexible on the number of elements
o More efficient when the number of heap elements is unknown.
Array Implementation
Advantage of using complete binary trees for heap -> efficient array representation
Implementation of array binary heap uses array of given maximum length and variable n to
denote number of elements in heap
Complete binary tree of n elements can be represented by array of n elements
o Order complete binary tree node data elements in breadth-first and left-first order, put
element into array in same order
o Root has index 0
Node of index i has left child at 2i+1 and right child at 2i+2; parent at index ⌊(i−1)/2⌋
Height of complete binary tree of n nodes is O(log n)
Element at index 0 is the min element (for a min-heap)
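A sketch of the array min-heap index arithmetic plus an insert that percolates up (minheap_t and HEAP_MAX are illustrative names):
#define HEAP_MAX 100

typedef struct {
    int data[HEAP_MAX];
    int n;                               /* number of elements in the heap */
} minheap_t;

int parent(int i) { return (i - 1) / 2; }
int left(int i)   { return 2 * i + 1; }
int right(int i)  { return 2 * i + 2; }

int heap_insert(minheap_t *h, int key) { /* O(log n): climbs one path */
    if (h->n == HEAP_MAX) return -1;
    int i = h->n++;
    h->data[i] = key;                    /* place at the next free slot */
    while (i > 0 && h->data[parent(i)] > h->data[i]) {
        int tmp = h->data[i];            /* swap with parent while smaller */
        h->data[i] = h->data[parent(i)];
        h->data[parent(i)] = tmp;
        i = parent(i);
    }
    return 0;
}
/* find-min is h->data[0]: index 0 always holds the minimum element */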
Application of Heaps
1. Heap Sort
o Sorting problem:
Input: array x[n]
Output: sorted array of x[n] in increasing order
o Sorting solution:
Insert each x[i] into a binary max-heap, then repeatedly extract the max to fill the
array from the back (see the sketch after this list)
2. Used as priority queue to improve the performance of Prim’s and Dijkstra’s algorithms
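A sketch of in-place heap sort: the notes build the heap by repeated insert; this equivalent variant heapifies the array first (sift_down is an illustrative helper name):
void sift_down(int x[], int n, int i) {  /* restore max-heap below index i */
    for (;;) {
        int largest = i, l = 2 * i + 1, r = 2 * i + 2;
        if (l < n && x[l] > x[largest]) largest = l;
        if (r < n && x[r] > x[largest]) largest = r;
        if (largest == i) break;         /* heap property holds here */
        int tmp = x[i]; x[i] = x[largest]; x[largest] = tmp;
        i = largest;
    }
}

void heap_sort(int x[], int n) {
    for (int i = n / 2 - 1; i >= 0; i--) /* build max-heap bottom-up */
        sift_down(x, n, i);
    for (int end = n - 1; end > 0; end--) {
        int tmp = x[0]; x[0] = x[end]; x[end] = tmp;  /* extract max */
        sift_down(x, end, 0);            /* re-heapify the prefix */
    }
}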
Graph Data Structures
Graph:
o Set of vertices together with set of edges, where an edge is a binary relation of two
vertices
o Generalization of trees
Graphs used to model anything where entities are related to each other in pairs
o Example: Family trees, web graph, social network
Abstract Graph Data Structure: Specification of collection of data objects related in graph
structure and set of operations on data objects through graph
o Used in algorithm design and analysis
Graph Data Structure: Implementation of abstract graph data structure; used in programs
Node → Vertex: in graph terminology, nodes are called vertices
Graph Definitions
Graphs
Graph G is an ordered pair (V, E) where V is a set of vertices (nodes) and E is a collection of
vertex pairs (edges)
o Written as G = (V, E)
V(G) represents node set; |V(G)| represents order (# of nodes) of graph
o Order of graph: Number of nodes
E(G) represents edge set; |E(G)| represents size (# of edges) of graph
o Size of graph: Number of edges
Dense: m = O(n²), where n is the order and m the size of the graph
Sparse: m = O(n)
Loop Edge: Edge where both endpoints are the same
Multiple Edges: Two edges have the same pair of nodes
Subgraphs
Subgraph: Graph H = (V', E') is a subgraph of G = (V, E) if V' is a subset of V and E' is a subset of E
Supergraph: G = (V, E) contains / is a supergraph of H if H = (V', E') is a subgraph of G
Spanning Subgraph: H is a spanning subgraph of G if V' = V
Paths
Path: List of nodes and distinct edges: v0, e0, v1, e1, … , vk-1, ek-1, vk, such that vi and vi+1 are end
nodes of ei for i = 0, ..., k-1
o Path connects first node v0 and last node vk
o Path can be represented by list of edges: e0, e1, …, ek-1
o Path can be represented by list of nodes: v0, v1, …, vk
Terms used and General Knowledge:
o Length of path = Number of edges in the list
o Simple Path: v0, v1, …, vk-1, vk are distinct
o Closed Path: v0 and vk represent the same node
o Cycle: Path is closed and v0, v1, …, vk-1, vk are distinct
o Graph G contains path P if nodes and edges of P are from G
o Reachable: Node v is reachable from node u if some nodes and edges of G form a path with u as
the first node and v as the last node
Undirected Graphs
o Undirected Graph: Edges have no orientation (non-ordered pair of nodes)
o Edge represented by set of its two nodes
o Simple Graph: Undirected and does not contain loop or multiple edges
o For undirected edge e = {u, v}, following sayings used:
o e connects u and v
o u and v are adjacent
o u is a neighbour of v
o u and v are incident with e
o e is incident with u and v
o Degree of node u: Total number of edges incident with u
o Isolated: deg(u) = 0, i.e. u isn't incident with any edges
o Handshaking Theorem: Sum of degrees of all nodes of graph G is equal to 2|E(G)|
o K-Regular: Every node has degree k
o Complete Graph: Graph is simple and there is an edge between every pair of nodes
o Complete graph of order n is (n-1)-regular and has n(n-1)/2 edges
o Connected: Any two nodes of the graph have a path connecting them
o Tree: Graph is connected and contains no cycles
o Tree has n-1 edges, so a tree is a sparse graph
o Component: Subgraph H of G is a component of G if H is connected and G contains no edge
from V(H) to V(G)-V(H)
o If G is not connected, then it can be decomposed into the disjoint union of the components of G
Directed graphs
o Directed Graph (Digraph): Edges are ordered pair of nodes
o Directed edge e connecting node u to node v written as (u, v) or e = (u, v)
o Arrow from u to v is used in drawing of directed edge e
o For directed edge e = (u, v), following sayings used:
o u is tail/origin/initial of e
o v is head/target/terminal of e
o u is parent/predecessor of v
o v is child/successor of u
o e is directed edge from u to v
o e is from u to v
o e connects u to v
o e begins/originates from u; ends/terminates at v
o e is an outgoing edge of u
o e is an incoming edge of v
o Out-degree of node u: Number of outgoing edges of u
o Notation: outdeg(u) or deg+(u)
o In-degree of node u: Number of incoming edges of u
o Notation: indeg(u) or deg-(u)
o Degree of node u: deg(u) = indeg(u) + outdeg(u)
o Source: Node with indeg(u) = 0
o Sink: Node with outdeg(u) = 0
o Directed Path: v0, e0, v1, e1, … , vk-1, ek-1, vk is a path where each edge connects the previous
node to the next node in the list, i.e. ei = (vi, vi+1) for i = 0, …, k-1
o P connects v0 to vk, and P is a path from v0 to vk
o Reachable: Node v is reachable from node u if G contains a directed path connecting u to v
o Connected (or Strongly Connected): For any pair of nodes u and v there is a directed path
from u to v and a directed path from v to u
o Directed Acyclic Graph (DAG): Digraph does not contain a directed cycle
o Bidirectional Graph: Undirected graph transformed to a digraph by replacing each edge with
two directed edges of opposite directions
Weighted Graphs
o Weighted Graph: Graph where each edge is associated with value (weight)
o Graph Notation: G = (V, E, w), where w is a function w: E -> W, where W is the set of
weights
o Edge Notation: Weight of edge e written as w(e)
o Weight of G: Sum of the weights of all edges of G
o Notation: w(G)
o Weight can be applied to both undirected and directed graphs
Graph Representations
Adjacency Matrix
Adjacency Matrix: Adjacency matrix A (or AG) of a simple graph G with vertices v₁ , v₂ ,…, vₙ is a
n×n 0-1 matrix whose (i, j)ᵗʰ entry is 1 when vᵢ and vⱼ are adjacent and 0 when they are not
adjacent
o For directed graph G = ({v0, v1 ,… , vn-1} , E): Adjacency matrix of G is an n by n matrix M =
[ai,j]n x n and ai,j = 1 if (vi, vj) is an edge of G; otherwise ai,j = 0
o For undirected graph G = ({v0, v1 ,… , vn-1} , E): Adjacency matrix of G is matrix M = [ai,j]n x
n, ai,j = aj,i = 1 if {vi, vj} is an edge of G; otherwise ai,j = aj,i = 0
i.e. adjacency matrix of an undirected graph is a symmetric matrix
o For weighted digraph G = ({v0, v1 ,… , vn-1} , E, w): Adjacency matrix of G is matrix M =
[ai,j]n x n, ai,j = w((vi, vj)) if (vi, vj) is an edge of G; otherwise ai,j = 0 (a sentinel value)
o For weighted undirected graph, G = ({v0, v1 ,… , vn-1} , E, w): Adjacency matrix of G is
matrix M = [ai,j]n x n, ai,j = aj,i = w((vi, vj)) if (vi, vj) is an edge of G; otherwise ai,j = aj,i = 0
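A sketch of an adjacency matrix for a small directed graph (N and the names are illustrative):
#define N 4                              /* order of the graph */

int adj[N][N];                           /* adj[i][j] = 1 iff (vi, vj) in E */

void add_edge(int i, int j) { adj[i][j] = 1; }

int out_degree(int i) {                  /* count the 1s in row i */
    int d = 0;
    for (int j = 0; j < N; j++)
        d += adj[i][j];
    return d;
}
/* for an undirected graph, also set adj[j][i] = 1 (symmetric matrix) */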
Adjacency List
Adjacency List Representation: Represents graph by a list of nodes and their neighbours
Edge List
Edge List: List of edges
Graph Traversal
BF-traversal
Begins at start node and explores all neighbour nodes
For each of the neighbour nodes, algorithm explores their unexplored neighbour nodes, and so
on until all reachable nodes are explored
Needs queue data structure to remember nodes in order of their distance from the start
Needs status array to store the state of each node
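A sketch of breadth-first traversal over an adjacency matrix, using an array queue and a status array as described (names are illustrative):
#include <stdio.h>

#define N 6

int adj[N][N];                           /* adjacency matrix of the graph */

void bf_traversal(int start) {
    int queue[N], front = 0, rear = 0;   /* each node enqueued at most once */
    int visited[N] = {0};                /* status array: 0 = unexplored */
    visited[start] = 1;
    queue[rear++] = start;               /* enqueue the start node */
    while (front < rear) {
        int u = queue[front++];          /* dequeue the next node */
        printf("%d ", u);                /* visit u */
        for (int v = 0; v < N; v++)
            if (adj[u][v] && !visited[v]) {
                visited[v] = 1;          /* mark before enqueueing */
                queue[rear++] = v;
            }
    }
}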
DF-traversal
Depth-first traversal starts from a given node and explores unvisited nodes deeper and deeper
until a node has no unvisited neighbours (i.e. a dead-end)
When a dead-end is reached, the algorithm backtracks, returning to the most recent node that
hasn't been completely explored
o ∴ it needs to remember a path for backtracking; think pre-order traversal
Graph Algorithms
Kruskal’s Algorithm
Idea of Kruskal's algorithm:
1. Sort edges of G in increasing order of weights
2. Starting from a forest with no edges, it adds edges in sorted order to the forest as long as
each edge connects two different trees in the forest, until a spanning tree is derived or no
more edges are left
time: O(m log m), space: O(m)
Theorem. If input weighted graph G is connected, Kruskal's algorithm outputs minimum
spanning tree of G
Prim’s Algorithms for MST
Drawback of Kruskal's algorithm -> Sorting of edges because it costs more time and space
Prim's algorithm avoids this drawback using efficient data structures with better time and space
performance
Idea of Prim's Algorithm:
Grow a tree T starting from a node by adding edges until a spanning tree is derived
In each iteration, it adds an edge of minimal weight from a node of T to a node not in T
time: O(nm), space: O(n)
Prim's Theorem. If the input weighted graph G is connected, then Prim's algorithm gives a
minimum spanning tree of G
Dijkstra's Algorithm for Shortest Path/Tree
Shortest Path Tree (SPT): Tree subgraph rooted at source node such that path from source node
to each node in the tree is the shortest path connecting the source and destination node in the
super-graph (application: find shortest route from point A to B on a map)
Dijkstra's Algorithm: Efficient algorithm to find costs of shortest path from source node to
destination node
o Greedy algorithm
Idea of Dijkstra's Algorithm:
1. Starts with tree T consisting of source node s
2. In each iteration, it adds the node with minimum distance to s through T
3. Iteration stops when no more nodes can be added
Like Prim's algorithm, Dijkstra's algorithm grows the current SPT by a greedy strategy, adding the
closest reachable node and its edge