Advanced Data Structures
Definition:
Linear data structures are data structures where elements are arranged in a sequential manner,
and each element is connected to its previous and next element (except the first and last).
🔗 Key Characteristics:
Note: Actual complexity may vary depending on implementation (e.g., circular queue, doubly
linked list).
🧠 Use Cases
Examples in Python
from collections import deque

# Array (Python list)
arr = [1, 2, 3]
arr.append(4)
# Stack
stack = []
stack.append(5)
stack.pop()
# Queue
queue = deque()
queue.append(1)
queue.popleft()
# Linked list node
class Node:
    def __init__(self, val):
        self.val = val
        self.next = None
🔗 Key Characteristics:
Ideal for representing real-world relationships such as maps, organization charts, and file
systems.
Binary Tree: Each node has at most two children. Used for expression trees and Huffman coding.
🧠 Detailed Overview:
1. Tree
No cycles.
3. Heap
Min-Heap: Parent ≤ Children
4. Trie
5. Graph
Can be:
o Directed or Undirected
o Weighted or Unweighted
o Cyclic or Acyclic
Represented using:
📝 Example in Python
class TreeNode:
    def __init__(self, val):
        self.val = val
        self.left = None
        self.right = None

# Graph stored as an adjacency list
graph = {
    'B': ['D'],
    'C': ['E'],
    'D': [],
    'E': ['B']
}
Linked Lists
A Linked List is a linear data structure where each element (called a node) contains:
1. Data (value)
2. A reference (pointer) to the next node
Unlike arrays, linked lists do not use contiguous memory, allowing dynamic memory allocation.
Singly Linked List: Each node points to the next node only. Example: A → B → C → None
Doubly Linked List: Each node points to both next and previous nodes. Example: None ← A ⇄ B ⇄ C → None
Circular Linked List: Last node points to the first node. Example: A → B → C → A
Search O(n)
Traverse O(n)
📝 Python Example
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

class LinkedList:
    def __init__(self):
        self.head = None

    # Insert at head
    def insert_at_head(self, data):
        new_node = Node(data)
        new_node.next = self.head
        self.head = new_node

    # Print list
    def print_list(self):
        current = self.head
        while current:
            print(current.data, end=" -> ")
            current = current.next
        print("None")

# Example usage
ll = LinkedList()
ll.insert_at_head(3)
ll.insert_at_head(2)
ll.insert_at_head(1)
ll.print_list()  # 1 -> 2 -> 3 -> None
Applications
Stacks
A Stack is a linear data structure that follows the LIFO principle — Last In, First Out.
This means the last element inserted is the first one to be removed.
🧠 Real-World Analogies:
Stack of plates
Using List:
stack = []
# Push elements
stack.append(10)
stack.append(20)
# Peek top
print(stack[-1])  # Output: 20
# Pop element
stack.pop()  # Removes 20
# Check empty
print(len(stack) == 0)  # False
Using collections.deque:
from collections import deque

stack = deque()
stack.append(1)
stack.append(2)
stack.pop()  # Removes 2
💡 Applications of Stacks
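def is_balanced(expr):
    stack = []
    pairs = {')': '(', ']': '[', '}': '{'}
    for char in expr:
        if char in '([{':
            stack.append(char)
        elif char in pairs:
            # A closing bracket must match the most recent opening bracket
            if not stack or stack.pop() != pairs[char]:
                return False
    return not stack

# Test
print(is_balanced("([{}])"))  # True
print(is_balanced("([)]"))  # False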
Stacks are essential in evaluating arithmetic expressions, especially when converting between or
evaluating:
Infix (e.g., 3 + 4)
Postfix (Reverse Polish Notation - RPN) (e.g., 3 4 +)
Algorithm:
3. If it's an operator, pop two elements, apply the operator, and push the result.
Python Example:
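import operator

OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul, '/': operator.truediv}

def evaluate_postfix(expr):
    stack = []
    for token in expr.split():
        if token.isdigit():
            stack.append(int(token))
        else:
            # The right operand is popped first
            b = stack.pop()
            a = stack.pop()
            stack.append(OPS[token](a, b))
    return stack[0]

# Test
print(evaluate_postfix("3 4 +"))  # 7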
Steps:
Goal: Sort a stack such that the smallest element is on top using only stack operations.
✅ Python Example
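def sort_stack(input_stack):
    temp_stack = []
    while input_stack:
        tmp = input_stack.pop()
        # Move smaller elements back so temp_stack keeps larger values lower down
        while temp_stack and temp_stack[-1] < tmp:
            input_stack.append(temp_stack.pop())
        temp_stack.append(tmp)
    return temp_stack

# Test (sample input)
s = [3, 1, 2]
sorted_stack = sort_stack(s)
print(sorted_stack)  # [3, 2, 1]: the smallest element (1) is on top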
Queues
A Queue is a linear data structure that follows the FIFO (First In, First Out) principle.
This means the first element added to the queue will be the first one to be removed.
🧱 Structure
✅ Basic Operations
🧠 Types of Queues
Type Description
Circular Queue Connects the end back to the front to use space efficiently
Using collections.deque:
from collections import deque

q = deque()
# Enqueue
q.append(10)
q.append(20)
# Dequeue
print(q.popleft())  # Output: 10
# Peek
print(q[0])  # Output: 20
Using queue.Queue:
from queue import Queue

q = Queue()
q.put(1)  # enqueue
q.put(2)
print(q.get())  # dequeue → 1
print(q.qsize())  # Output: 1
💡 Applications of Queues
from collections import deque

def bfs(graph, start):
    visited = set()
    q = deque([start])
    while q:
        node = q.popleft()
        if node not in visited:
            print(node, end=' ')
            visited.add(node)
            q.extend(graph[node])

graph = {
    'B': ['D'],
    'C': ['E'],
    'D': [],
    'E': ['B']
}
The goal is to simulate LIFO (Last-In-First-Out) behavior of a Stack using FIFO (First-In-First-Out)
behavior of Queue(s).
🔧 Logic:
🧪 Code:
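from collections import deque

class StackUsingQueues:
    def __init__(self):
        self.q1 = deque()
        self.q2 = deque()

    def push(self, x):
        self.q1.append(x)

    def pop(self):
        if not self.q1:
            return None
        while len(self.q1) > 1:
            self.q2.append(self.q1.popleft())
        popped = self.q1.popleft()
        self.q1, self.q2 = self.q2, self.q1
        return popped

    def top(self):
        if not self.q1:
            return None
        while len(self.q1) > 1:
            self.q2.append(self.q1.popleft())
        top = self.q1.popleft()
        self.q2.append(top)
        self.q1, self.q2 = self.q2, self.q1
        return top

    def isEmpty(self):
        return not self.q1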
🔧 Logic:
1. Push:
🧪 Code:
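from collections import deque

class StackUsingOneQueue:
    def __init__(self):
        self.q = deque()

    def push(self, x):
        self.q.append(x)
        # Move every older element behind the new one
        for _ in range(len(self.q) - 1):
            self.q.append(self.q.popleft())  # rotate

    def pop(self):
        return self.q.popleft()

    def top(self):
        return self.q[0]

    def isEmpty(self):
        return len(self.q) == 0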
⚖️Comparison
Page Replacement Algorithms can be implemented using queues to simulate memory frame usage.
This works well especially for FIFO, and with some modifications for LRU and Clock (Second
Chance).
FIFO (First-In First-Out) removes the page that was added first to memory.
🔧 Logic:
🧪 Python Code:
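from collections import deque

def fifo_page_replacement(pages, capacity):
    memory = deque()
    page_faults = 0
    for page in pages:
        if page not in memory:
            page_faults += 1
            if len(memory) == capacity:
                memory.popleft()  # evict the oldest page
            memory.append(page)
    return page_faults

# Test
pages = [7, 0, 1, 2, 0, 3, 0, 4]
capacity = 3
print(fifo_page_replacement(pages, capacity))  # 7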
Least Recently Used (LRU) evicts the page that hasn't been used for the longest time.
🔧 Logic:
If the page is accessed again, move it to the rear (most recently used).
🧪 Python Code:
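from collections import deque

def lru_page_replacement(pages, capacity):
    memory = deque()
    page_faults = 0
    for page in pages:
        if page in memory:
            memory.remove(page)
        else:
            page_faults += 1
            if len(memory) == capacity:
                memory.popleft()  # evict the least recently used page
        memory.append(page)  # rear = most recently used
    return page_faults

# Test
pages = [7, 0, 1, 2, 0, 3, 0, 4]
capacity = 3
print(lru_page_replacement(pages, capacity))  # 6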
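The overview of page replacement also mentions Clock (Second Chance). A minimal sketch of that policy follows, assuming one reference bit per frame and a circular "hand". The name `clock_page_replacement` and the choice to set the reference bit when a page is first loaded are illustrative assumptions, not taken from the text above.

```python
def clock_page_replacement(pages, capacity):
    """Count page faults under a simple Clock (Second Chance) policy."""
    frames = [None] * capacity   # page held by each frame
    ref_bits = [0] * capacity    # one reference bit per frame
    hand = 0                     # next frame the clock hand inspects
    page_faults = 0
    for page in pages:
        if page in frames:
            # Hit: mark the page as recently used
            ref_bits[frames.index(page)] = 1
            continue
        page_faults += 1
        # Skip frames whose bit is set, clearing the bit as the hand passes
        while frames[hand] is not None and ref_bits[hand] == 1:
            ref_bits[hand] = 0
            hand = (hand + 1) % capacity
        frames[hand] = page
        ref_bits[hand] = 1       # assumption: a freshly loaded page starts referenced
        hand = (hand + 1) % capacity
    return page_faults


print(clock_page_replacement([7, 0, 1, 2, 0, 3, 0, 4], 3))  # 6
```

On this reference string the sketch produces the same fault count as the LRU version, which is the usual motivation for Clock: it approximates LRU without reordering a queue on every hit.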
A Deque (pronounced "deck") is a linear data structure that allows insertion and deletion from
both ends — front and rear.
🧠 Key Features:
Operation       Front  Rear
Insertion       ✅     ✅
Deletion        ✅     ✅
Access (peek)   ✅     ✅
This makes it more powerful than stacks (LIFO) and queues (FIFO).
🔧 Types of Deques:
from collections import deque

dq = deque()
# Insert at rear
dq.append(10)
dq.append(20)
# Insert at front
dq.appendleft(5)
# Remove from rear and front
dq.pop()  # 20
dq.popleft()  # 5
# Peek at both ends
print("Front:", dq[0])
print("Rear:", dq[-1])
📦 Use Cases:
Palindrome checking
Undo/Redo functionality
Job scheduling
Maximum/Minimum in a subarray
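As an illustration of the first use case, palindrome checking can be sketched by removing characters from both ends of a deque and comparing them. The function name `is_palindrome` and the decision to ignore case and non-alphanumeric characters are assumptions made for this example.

```python
from collections import deque


def is_palindrome(text):
    """Compare characters taken from both ends until they meet."""
    dq = deque(ch.lower() for ch in text if ch.isalnum())
    while len(dq) > 1:
        if dq.popleft() != dq.pop():
            return False
    return True


print(is_palindrome("Racecar"))  # True
print(is_palindrome("deque"))    # False
```

Both `popleft()` and `pop()` are O(1) on a deque, so the whole check is linear in the length of the text.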
🧪 Example: Sliding Window Maximum (Using Deque)
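from collections import deque

def max_sliding_window(nums, k):
    q = deque()  # indices whose values are in decreasing order
    result = []
    for i in range(len(nums)):
        # Drop the front index once it leaves the window
        if q and q[0] <= i - k:
            q.popleft()
        # Drop smaller values from the rear
        while q and nums[q[-1]] < nums[i]:
            q.pop()
        q.append(i)
        if i >= k - 1:
            result.append(nums[q[0]])
    return result

# Example:
print(max_sliding_window([1,3,-1,-3,5,3,6,7], 3))  # [3, 3, 5, 5, 6, 7]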
✅ Approach
Use one end of the deque for both push and pop operations.
For a stack, typically use the rear (right side) of the deque.
🧪 Python Code
from collections import deque

class StackUsingDeque:
    def __init__(self):
        self.dq = deque()

    def push(self, x):
        self.dq.append(x)

    def pop(self):
        if self.is_empty():
            raise IndexError("pop from empty stack")
        return self.dq.pop()

    def top(self):
        if self.is_empty():
            raise IndexError("top of empty stack")
        return self.dq[-1]

    def is_empty(self):
        return len(self.dq) == 0

# Usage example:
stack = StackUsingDeque()
stack.push(10)
stack.push(20)
print(stack.top())  # Output: 20
print(stack.pop())  # Output: 20
print(stack.pop())  # Output: 10
⚡ Explanation
pop() removes from the right end, behaving exactly like a stack.
This implementation has O(1) time complexity for push, pop, and top.
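A Heap is a specialized complete binary tree that satisfies the heap property: in a max-heap every parent is greater than or equal to its children, and in a min-heap every parent is less than or equal to its children.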
🧠 Key Characteristics:
Complete binary tree: all levels filled except possibly the last, which is filled from left to
right.
Operations
extract_max / extract_min Remove and return root element (max or min) O(log n)
Parent at index i
import heapq

min_heap = []
heapq.heappush(min_heap, 5)
heapq.heappush(min_heap, 3)
heapq.heappush(min_heap, 8)
print(heapq.heappop(min_heap))  # 3 (smallest element)
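class MaxHeap:
    def __init__(self):
        self.heap = []

    def insert(self, val):
        self.heap.append(val)
        self._heapify_up(len(self.heap) - 1)

    def _heapify_up(self, index):
        parent = (index - 1) // 2
        if index > 0 and self.heap[index] > self.heap[parent]:
            self.heap[index], self.heap[parent] = self.heap[parent], self.heap[index]
            self._heapify_up(parent)

    def extract_max(self):
        if not self.heap:
            return None
        if len(self.heap) == 1:
            return self.heap.pop()
        root = self.heap[0]
        self.heap[0] = self.heap.pop()
        self._heapify_down(0)
        return root

    def _heapify_down(self, index):
        largest = index
        left = 2 * index + 1
        right = 2 * index + 2
        if left < len(self.heap) and self.heap[left] > self.heap[largest]:
            largest = left
        if right < len(self.heap) and self.heap[right] > self.heap[largest]:
            largest = right
        if largest != index:
            self.heap[index], self.heap[largest] = self.heap[largest], self.heap[index]
            self._heapify_down(largest)

# Usage
h = MaxHeap()
h.insert(10)
h.insert(20)
h.insert(5)
print(h.extract_max())  # Output: 20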
Min-Heap
A Min-Heap is a complete binary tree where each node’s value is less than or equal to the values of
its children.
The minimum element is always at the root (top).
🧠 Properties:
Complete binary tree (all levels fully filled except possibly the last, filled left to right).
Parent ≤ Children
🔢 Array Representation
For a node at index i (0-based):
o Parent at (i-1)//2
o Left child at 2i+1
o Right child at 2i+2
🧪 Python Example: Min-Heap using heapq module
import heapq

min_heap = []
# Insert elements
heapq.heappush(min_heap, 10)
heapq.heappush(min_heap, 5)
heapq.heappush(min_heap, 20)
# Remove the smallest element
print(heapq.heappop(min_heap))  # Output: 5
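class MinHeap:
    def __init__(self):
        self.heap = []

    def insert(self, val):
        self.heap.append(val)
        self._heapify_up(len(self.heap) - 1)

    def _heapify_up(self, index):
        parent = (index - 1) // 2
        if index > 0 and self.heap[index] < self.heap[parent]:
            self.heap[index], self.heap[parent] = self.heap[parent], self.heap[index]
            self._heapify_up(parent)

    def extract_min(self):
        if not self.heap:
            return None
        if len(self.heap) == 1:
            return self.heap.pop()
        root = self.heap[0]
        self.heap[0] = self.heap.pop()
        self._heapify_down(0)
        return root

    def _heapify_down(self, index):
        smallest = index
        left = 2 * index + 1
        right = 2 * index + 2
        if left < len(self.heap) and self.heap[left] < self.heap[smallest]:
            smallest = left
        if right < len(self.heap) and self.heap[right] < self.heap[smallest]:
            smallest = right
        if smallest != index:
            self.heap[index], self.heap[smallest] = self.heap[smallest], self.heap[index]
            self._heapify_down(smallest)

# Example Usage
h = MinHeap()
h.insert(15)
h.insert(10)
h.insert(20)
print(h.extract_min())  # Output: 10
print(h.extract_min())  # Output: 15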
A Max-Heap is a complete binary tree where each node’s value is greater than or equal to the
values of its children.
The maximum element is always at the root (top).
🧠 Properties:
Parent ≥ Children.
🔢 Array Representation
For a node at index i (0-based):
o Parent at (i-1)//2
o Left child at 2i+1
o Right child at 2i+2
class MaxHeap:
    def __init__(self):
        self.heap = []

    def insert(self, val):
        self.heap.append(val)
        self._heapify_up(len(self.heap) - 1)

    def _heapify_up(self, index):
        parent = (index - 1) // 2
        if index > 0 and self.heap[index] > self.heap[parent]:
            self.heap[index], self.heap[parent] = self.heap[parent], self.heap[index]
            self._heapify_up(parent)

    def extract_max(self):
        if not self.heap:
            return None
        if len(self.heap) == 1:
            return self.heap.pop()
        root = self.heap[0]
        self.heap[0] = self.heap.pop()
        self._heapify_down(0)
        return root

    def _heapify_down(self, index):
        largest = index
        left = 2 * index + 1
        right = 2 * index + 2
        if left < len(self.heap) and self.heap[left] > self.heap[largest]:
            largest = left
        if right < len(self.heap) and self.heap[right] > self.heap[largest]:
            largest = right
        if largest != index:
            self.heap[index], self.heap[largest] = self.heap[largest], self.heap[index]
            self._heapify_down(largest)

# Example Usage
h = MaxHeap()
h.insert(10)
h.insert(20)
h.insert(5)
print(h.extract_max())  # Output: 20
print(h.extract_max())  # Output: 10
To simulate max-heap with heapq, insert negative values and negate them back when
popping.
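The negation trick can be sketched in a few lines. The sample values below are the same ones used in the MaxHeap usage example; the variable names are illustrative.

```python
import heapq

values = [10, 20, 5]

# Push negated values so the largest original value sits at the root
max_heap = []
for v in values:
    heapq.heappush(max_heap, -v)

# Negate again on the way out to recover the original values
largest = -heapq.heappop(max_heap)
second = -heapq.heappop(max_heap)
print(largest, second)  # 20 10
```

This only works directly for numeric keys; for other types a common alternative is to push tuples whose first field is a negated priority.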
Binomial Heap
A Binomial Heap is a collection of binomial trees that are linked together, supporting efficient
priority queue operations. It generalizes a heap by allowing fast merge (union) operations.
🧠 Key Concepts:
o 2^k nodes
o Height = k
o Constructed recursively by linking two B_(k-1) trees, where the root of one
becomes the leftmost child of the other.
Structure Summary
Merge two root lists (like merging two sorted linked lists by order).
Combine trees of the same order by linking one tree under the other (min-heap order).
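The linking step is the heart of the structure, so a small sketch may help. It assumes a child/sibling node layout; the names `BinomialNode`, `link`, and `size` are hypothetical and chosen only for this illustration.

```python
class BinomialNode:
    def __init__(self, key):
        self.key = key
        self.degree = 0      # order k of the tree rooted here
        self.child = None    # leftmost child
        self.sibling = None  # next child to the right


def link(a, b):
    """Link two binomial trees of equal order k into one tree of order k + 1.

    The root with the larger key becomes the leftmost child of the other,
    which preserves min-heap order.
    """
    if a.degree != b.degree:
        raise ValueError("can only link trees of the same order")
    if b.key < a.key:
        a, b = b, a
    b.sibling = a.child
    a.child = b
    a.degree += 1
    return a


def size(node):
    """Count the nodes in the tree rooted at node."""
    count, child = 1, node.child
    while child:
        count += size(child)
        child = child.sibling
    return count


b1 = link(BinomialNode(3), BinomialNode(1))            # order 1
b2 = link(b1, link(BinomialNode(7), BinomialNode(4)))  # order 2
print(b2.key, b2.degree, size(b2))  # 1 2 4
```

The printed size matches the 2^k node count for k = 2. A full heap would additionally keep a root list sorted by order and apply `link` while merging two such lists.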
A full implementation is too long to include here, but the core operations revolve around:
Heap sort is a comparison-based sorting algorithm that uses a heap data structure (usually a max-
heap) to sort elements in ascending order.
4. Reduce the heap size by one (exclude the last element from the heap).
Complexity:
Heap sort is in-place (it needs only constant extra space beyond the input array) and not stable (it does not preserve the relative order of equal elements).
Visual Explanation:
Swap root with last → largest element moves to the sorted position.
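def heapify(arr, n, i):
    largest, left, right = i, 2 * i + 1, 2 * i + 2
    if left < n and arr[left] > arr[largest]:
        largest = left
    if right < n and arr[right] > arr[largest]:
        largest = right
    if largest != i:
        arr[i], arr[largest] = arr[largest], arr[i]
        heapify(arr, n, largest)

def heap_sort(arr):
    n = len(arr)
    # Build a max-heap from the bottom up
    for i in range(n // 2 - 1, -1, -1):
        heapify(arr, n, i)
    # Move the root to the end, then restore the heap on the shrunken prefix
    for i in range(n - 1, 0, -1):
        arr[0], arr[i] = arr[i], arr[0]
        heapify(arr, i, 0)

# Example usage (sample input)
arr = [12, 11, 13, 5, 6, 7]
heap_sort(arr)
print(arr)
Output:
[5, 6, 7, 11, 12, 13]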
Summary:
Key Terminology
Term Meaning
Types of Trees
Binary Tree: Each node has at most 2 children (left and right).
Binary Search Tree (BST): Binary tree with ordered property: left < root < right.
Balanced Trees: Height balanced for efficiency (e.g., AVL tree, Red-Black tree).
Basic Operations
class TreeNode:
def __init__(self, val):
self.val = val
self.left = None
self.right = None
root = TreeNode(1)
root.left = TreeNode(2)
root.right = TreeNode(3)
Applications of Trees
A Binary Tree is a hierarchical data structure in which each node has at most two children called:
Left child
Right child
Key Characteristics:
The height of a binary tree is the number of edges on the longest path from the root to a
leaf.
Complete Binary Tree: All levels completely filled except possibly the last, which is filled from left to right
Perfect Binary Tree: All internal nodes have 2 children and all leaves are at the same level
Balanced Binary Tree: Height difference between left and right subtree of any node is ≤ 1
Degenerate (Pathological) Tree: Each parent has only one child (like a linked list)
class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

root = Node(1)
root.left = Node(2)
root.right = Node(3)
root.left.left = Node(4)
root.left.right = Node(5)
Traversal Example (Inorder):
def inorder(root):
    if root:
        inorder(root.left)
        print(root.data, end=' ')
        inorder(root.right)

# Usage:
inorder(root)  # Output: 4 2 5 1 3
Tree traversal means visiting all the nodes in a tree in a specific order. Traversal is key to many tree
operations like searching, printing, or modifying the tree.
DFS explores as far as possible along each branch before backtracking. It has 3 common types:
# Preorder: Root → Left → Right
def preorder(root):
    if root:
        print(root.data, end=' ')
        preorder(root.left)
        preorder(root.right)

# Inorder: Left → Root → Right
def inorder(root):
    if root:
        inorder(root.left)
        print(root.data, end=' ')
        inorder(root.right)

# Postorder: Left → Right → Root
def postorder(root):
    if root:
        postorder(root.left)
        postorder(root.right)
        print(root.data, end=' ')
from collections import deque

# Level order (BFS): visit nodes level by level
def level_order(root):
    if not root:
        return
    queue = deque([root])
    while queue:
        node = queue.popleft()
        print(node.data, end=' ')
        if node.left:
            queue.append(node.left)
        if node.right:
            queue.append(node.right)
Summary Table
Traversal Type Order of Nodes Visited Data Structure Used Common Uses
Preorder (DFS) Root → Left → Right Recursion/Stack Copy tree, prefix notation
Level Order (BFS) Level by level (top to bottom) Queue Shortest path, tree levels
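The table lists "Recursion/Stack" for preorder traversal. As a sketch of the stack-based variant, the function below replaces recursion with an explicit list used as a stack. It returns the visited values instead of printing them, and the name `preorder_iterative` is an assumption for this example.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None


def preorder_iterative(root):
    """Return node values in Root -> Left -> Right order using a stack."""
    order = []
    stack = [root] if root else []
    while stack:
        node = stack.pop()
        order.append(node.data)
        # Push right first so the left subtree is processed first
        if node.right:
            stack.append(node.right)
        if node.left:
            stack.append(node.left)
    return order


root = Node(1)
root.left = Node(2)
root.right = Node(3)
root.left.left = Node(4)
root.left.right = Node(5)
print(preorder_iterative(root))  # [1, 2, 4, 5, 3]
```

An explicit stack avoids Python's recursion limit on very deep (for example degenerate) trees.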
Why Stacks?
Used for converting between notations and for evaluating postfix/prefix expressions.
Example: Evaluate 6 2 3 + - 3 8 2 / + *
Step-by-step:
Result = 7
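def evaluate_postfix(expression):
    stack = []
    for token in expression.split():
        if token.isdigit():
            stack.append(int(token))
        else:
            # The right operand is popped first
            b = stack.pop()
            a = stack.pop()
            if token == '+':
                stack.append(a + b)
            elif token == '-':
                stack.append(a - b)
            elif token == '*':
                stack.append(a * b)
            elif token == '/':
                stack.append(int(a / b))  # truncating division
    return stack.pop()

expr = "6 2 3 + - 3 8 2 / + *"
print("Result:", evaluate_postfix(expr))  # Result: 7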
2. Evaluating Infix Expression (Using Two Stacks)
o If operator, pop operators with higher or equal precedence and apply them.
Operator Precedence
Operator Precedence
( )   Highest
* /   2
+ -   1
Expression: (3 + 5) * 2
Push '('
Push 3
Push '+'
Push 5
On ')': pop '+', apply it to 3 and 5, and discard '('
Push 8
Push '*'
Push 2
Apply '*' → 16
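The two-stack procedure traced above can be sketched as follows. This is a minimal version that assumes a well-formed expression containing only non-negative integers, the four binary operators, and parentheses. The names `evaluate_infix`, `apply_top`, `PRECEDENCE`, and `OPS` are illustrative.

```python
import operator

PRECEDENCE = {'+': 1, '-': 1, '*': 2, '/': 2}
OPS = {'+': operator.add, '-': operator.sub,
       '*': operator.mul, '/': operator.truediv}


def apply_top(values, operators):
    """Pop one operator and two operands, then push the result."""
    op = operators.pop()
    b = values.pop()
    a = values.pop()
    values.append(OPS[op](a, b))


def evaluate_infix(expr):
    values, operators = [], []
    i = 0
    while i < len(expr):
        ch = expr[i]
        if ch.isdigit():
            # Read a (possibly multi-digit) number
            j = i
            while j < len(expr) and expr[j].isdigit():
                j += 1
            values.append(int(expr[i:j]))
            i = j
            continue
        if ch == '(':
            operators.append(ch)
        elif ch == ')':
            while operators[-1] != '(':
                apply_top(values, operators)
            operators.pop()  # discard '('
        elif ch in PRECEDENCE:
            # Apply pending operators of higher or equal precedence first
            while (operators and operators[-1] != '('
                   and PRECEDENCE[operators[-1]] >= PRECEDENCE[ch]):
                apply_top(values, operators)
            operators.append(ch)
        i += 1  # spaces and handled symbols are skipped here
    while operators:
        apply_top(values, operators)
    return values[0]


print(evaluate_infix("(3 + 5) * 2"))  # 16
```

Applying operators of equal precedence before pushing a new one is what makes `-` and `/` left-associative in this sketch.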
A Binary Search Tree (BST) is a special kind of binary tree with a sorted structure that allows
efficient search, insertion, and deletion operations.
Properties of BST
For each node:
o All values in its left subtree are less than the node’s value.
o All values in its right subtree are greater than the node’s value.
Why BST?
o Searching
o Range queries
Basic Operations
class BSTNode:
    def __init__(self, val):
        self.val = val
        self.left = None
        self.right = None
Insert Operation
def insert(root, key):
    if root is None:
        return BSTNode(key)
    if key < root.val:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root
Search Operation
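def search(root, key):
    # Walk down the tree until the key is found or a leaf is passed
    while root and root.val != key:
        root = root.left if key < root.val else root.right
    return root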
Delete Operation
2. Cases:
o Node has two children → replace node with inorder successor (smallest node in
right subtree) or inorder predecessor (largest in left subtree), then delete
successor/predecessor.
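def min_value_node(node):
    current = node
    while current.left:
        current = current.left
    return current

def delete(root, key):
    if not root:
        return root
    if key < root.val:
        root.left = delete(root.left, key)
    elif key > root.val:
        root.right = delete(root.right, key)
    else:
        # Zero or one child: splice the node out
        if not root.left:
            return root.right
        if not root.right:
            return root.left
        # Two children: copy the inorder successor, then delete it
        temp = min_value_node(root.right)
        root.val = temp.val
        root.right = delete(root.right, temp.val)
    return root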
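def inorder(root):
    if root:
        inorder(root.left)
        print(root.val, end=' ')
        inorder(root.right)
Example Usage
root = None
for key in [50, 30, 20, 40, 70, 60, 80]:
    root = insert(root, key)
inorder(root)  # Output: 20 30 40 50 60 70 80
# Search for 40
print(search(root, 40) is not None)  # True
# Delete 20
root = delete(root, 20)
inorder(root)  # Output: 30 40 50 60 70 80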
An AVL Tree is a self-balancing Binary Search Tree (BST) named after its inventors Adelson-Velsky
and Landis. It maintains a strict balance to ensure operations like search, insert, and delete run in
O(log n) time, even in the worst case.
Normal BST can become skewed (like a linked list), leading to O(n) operations.
AVL Tree keeps the tree balanced by enforcing a balance factor on every node.
Height of node
Rotations to Rebalance
class AVLNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1  # a new node starts as a leaf
Key Functions
1. Get Height
def height(node):
    if not node:
        return 0
    return node.height
2. Get Balance Factor
def get_balance(node):
    if not node:
        return 0
    return height(node.left) - height(node.right)
3. Right Rotation
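def right_rotate(y):
    x = y.left
    T2 = x.right
    # Perform rotation
    x.right = y
    y.left = T2
    # Update heights (y first, because it is now below x)
    y.height = 1 + max(height(y.left), height(y.right))
    x.height = 1 + max(height(x.left), height(x.right))
    return x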
4. Left Rotation
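def left_rotate(x):
    y = x.right
    T2 = y.left
    # Perform rotation
    y.left = x
    x.right = T2
    # Update heights (x first, because it is now below y)
    x.height = 1 + max(height(x.left), height(x.right))
    y.height = 1 + max(height(y.left), height(y.right))
    return y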
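def insert(node, key):  # assumes distinct keys
    if not node:
        return AVLNode(key)
    elif key < node.key:
        node.left = insert(node.left, key)
    else:
        node.right = insert(node.right, key)
    # Update height
    node.height = 1 + max(height(node.left), height(node.right))
    balance = get_balance(node)
    # Left-Left case
    if balance > 1 and key < node.left.key:
        return right_rotate(node)
    # Right-Right case
    if balance < -1 and key > node.right.key:
        return left_rotate(node)
    # Left-Right case
    if balance > 1 and key > node.left.key:
        node.left = left_rotate(node.left)
        return right_rotate(node)
    # Right-Left case
    if balance < -1 and key < node.right.key:
        node.right = right_rotate(node.right)
        return left_rotate(node)
    return node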
def inorder(node):
    if node:
        inorder(node.left)
        print(node.key, end=' ')
        inorder(node.right)
Example Usage
root = None
for key in [10, 20, 30, 40, 50, 25]:
    root = insert(root, key)
inorder(root)  # Output: 10 20 25 30 40 50
Summary
Search O(log n)
Insertion O(log n)
Deletion O(log n)
AVL trees maintain a balanced tree structure automatically after insertions/deletions by rotations,
ensuring logarithmic height.
A Red-Black Tree is a kind of self-balancing binary search tree that ensures the tree remains
approximately balanced during insertions and deletions, guaranteeing O(log n) time for search,
insert, and delete operations.
Like AVL trees, it maintains balance but with less strict rules, leading to faster insertion and
deletion in practice.
Used widely in system libraries (e.g., C++ STL map, Java TreeMap).
1. Every node is either red or black.
2. The root is black.
3. Every leaf (NIL) is black.
4. If a node is red, then both its children are black. (No two reds in a row.)
5. For each node, every path from the node to descendant leaves contains the same number
of black nodes. (Black-height property)
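Properties 4 and 5 can be checked mechanically, which makes them easier to internalize. The sketch below validates only the colouring rules (not the BST ordering). It treats `None` as a black NIL leaf and also checks the standard rule that the root is black. The names `black_height` and `satisfies_red_black_rules` and the string colour constants are assumptions for this example.

```python
RED, BLACK = "RED", "BLACK"


class RBNode:
    def __init__(self, key, color, left=None, right=None):
        self.key = key
        self.color = color
        self.left = left
        self.right = right


def black_height(node):
    """Return the black-height of the subtree, or -1 if a rule is broken.

    None stands for a black NIL leaf and counts as one black node.
    """
    if node is None:
        return 1
    # Property 4: a red node must not have a red child
    if node.color == RED:
        for child in (node.left, node.right):
            if child is not None and child.color == RED:
                return -1
    left = black_height(node.left)
    right = black_height(node.right)
    # Property 5: both sides must contain the same number of black nodes
    if left == -1 or right == -1 or left != right:
        return -1
    return left + (1 if node.color == BLACK else 0)


def satisfies_red_black_rules(root):
    # Standard red-black trees also require a black root
    return (root is None or root.color == BLACK) and black_height(root) != -1


valid = RBNode(10, BLACK, RBNode(5, RED), RBNode(15, RED))
red_red = RBNode(10, BLACK, RBNode(5, RED, RBNode(2, RED)))
uneven = RBNode(10, BLACK, RBNode(5, BLACK))
print(satisfies_red_black_rules(valid))    # True
print(satisfies_red_black_rules(red_red))  # False
print(satisfies_red_black_rules(uneven))   # False
```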
Terminology
Black-height: Number of black nodes from a node to leaves (excluding node itself)
They enforce balance by limiting the longest path to be no more than twice the shortest path,
keeping the tree approximately balanced.
Basic Operations
Rotations
Recoloring
The main fix involves ensuring no two consecutive red nodes and black-height consistency.
3. Parent and uncle are red: Recolor parent and uncle black, grandparent red, recurse up.
4. Parent is red, uncle is black (or NIL): Rotate and recolor depending on node’s position (left-
left, left-right, right-left, right-right).
class RBNode:
    def __init__(self, key, color="RED"):
        self.key = key
        self.color = color  # assumed field: new nodes are typically inserted as red
        self.left = None
        self.right = None
        self.parent = None
Feature                AVL Tree                  Red-Black Tree
Balancing strictness   More strict               Less strict
Insert/Delete speed    Slower (more rotations)   Faster (fewer rotations)
B-Tree — Overview
A B-Tree is a self-balancing multi-way search tree designed to work efficiently on disk storage or
large databases. Unlike binary trees, each node can have more than two children, allowing it to
store large blocks of sorted data and minimize disk reads/writes.
Why B-Tree?
Optimized for systems that read/write large blocks of data (like databases, file systems).
7. Child subtrees have keys that fall between keys in the parent node.
Terminology
Order (t): Minimum degree of the tree (defines the minimum and maximum number of
keys/children).
Operations
Search
Else, recurse into the child subtree that fits the key’s value.
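The search procedure can be sketched as below. This is a simplified illustration, not a full B-Tree: the node class here takes its keys and children directly, and `btree_search` is a hypothetical name. The sample keys in the leaves are made up for the example.

```python
from bisect import bisect_left


class BTreeNode:
    def __init__(self, keys, children=None):
        self.keys = keys                # sorted keys in this node
        self.children = children or []  # empty list for a leaf


def btree_search(node, key):
    """Return (node, index) where key is stored, or None if it is absent."""
    i = bisect_left(node.keys, key)     # first position with keys[i] >= key
    if i < len(node.keys) and node.keys[i] == key:
        return node, i
    if not node.children:               # reached a leaf without a match
        return None
    return btree_search(node.children[i], key)


root = BTreeNode([10, 20, 30], [
    BTreeNode([5, 7]),
    BTreeNode([12, 15]),
    BTreeNode([22, 25]),
    BTreeNode([35, 40]),
])
found = btree_search(root, 15)
print(found[0].keys, found[1])  # [12, 15] 1
print(btree_search(root, 8))    # None
```

Using binary search inside each node keeps the per-node cost logarithmic in the number of keys, which matters when nodes are sized to fill a disk block.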
Insertion
Deletion
More complex, involves merging and redistribution of keys to maintain minimum keys
property.
[10 | 20 | 30]
/ | | \
Keys are sorted inside each node, and children cover key ranges accordingly.
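class BTreeNode:
    def __init__(self, leaf=False):
        self.leaf = leaf      # True if the node has no children
        self.keys = []        # sorted keys stored in this node
        self.children = []    # child pointers (empty for a leaf)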
Advantages
Summary
Search O(log n)
B+ Tree — Overview
A B+ Tree is an extension of the B-Tree designed specifically for database and file systems to
optimize range queries and sequential access. It maintains all data in the leaf nodes and uses an
internal node structure primarily for indexing.
Feature          B-Tree                              B+ Tree
Data Storage     Keys and data stored in all nodes   Data stored only in leaf nodes
Internal Nodes   Store keys and data                 Store only keys (index values)
Range Queries    Less efficient                      Efficient due to linked leaves
Structure of B+ Tree
Leaf nodes contain keys and actual data (or pointers to data).
Leaf nodes are linked using a doubly linked list for fast ordered traversal.
Operations
Search
Insertion
Promote the smallest key of the new right sibling to parent node.
Deletion
Internal nodes:
      [10 | 20 | 30]
     /    |    |    \
Leaf nodes:
[5, 7, 9] <-> [12, 15, 18] <-> [22, 25, 28] <-> [35, 40]
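The linked leaves are what make range queries cheap. The sketch below models only the leaf level, using the leaf contents from the diagram. For brevity it follows forward links only and starts at the leftmost leaf instead of descending through internal nodes. `LeafNode`, `link_leaves`, and `range_scan` are hypothetical names.

```python
class LeafNode:
    def __init__(self, keys):
        self.keys = keys   # sorted keys stored in this leaf
        self.next = None   # link to the right sibling leaf


def link_leaves(leaves):
    """Chain the leaves left to right and return the first one."""
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return leaves[0]


def range_scan(start_leaf, low, high):
    """Collect keys in [low, high] by walking the leaf chain."""
    result = []
    leaf = start_leaf
    while leaf is not None:
        for key in leaf.keys:
            if key > high:
                return result   # keys are sorted, so nothing further can match
            if key >= low:
                result.append(key)
        leaf = leaf.next
    return result


first = link_leaves([LeafNode([5, 7, 9]), LeafNode([12, 15, 18]),
                     LeafNode([22, 25, 28]), LeafNode([35, 40])])
print(range_scan(first, 9, 25))  # [9, 12, 15, 18, 22, 25]
```

In a real B+ Tree the scan would begin at the leaf found by a normal search for `low`, so only the leaves overlapping the range are touched.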
Advantages of B+ Tree
Internal nodes are smaller (no data), so more keys fit per node.
Summary Comparison
A Segment Tree is a versatile data structure used for answering range queries and performing
updates efficiently on an array or sequence of elements.
Ideal for problems like range sum, range minimum/maximum, range gcd, etc.
Each node stores information about a segment (e.g., sum, min, max).
The children represent the left and right halves of the parent segment.
Structure
Operations
Build
Recursively check if current node segment is fully inside, outside, or partially overlaps [L,
R].
Update
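class SegmentTree:  # range-sum version
    def __init__(self, data):
        self.n = len(data)
        self.tree = [0] * (4 * self.n)
        self._build(data, 0, 0, self.n - 1)

    def _build(self, data, node, start, end):
        if start == end:
            self.tree[node] = data[start]
        else:
            mid = (start + end) // 2
            self._build(data, 2 * node + 1, start, mid)
            self._build(data, 2 * node + 2, mid + 1, end)
            self.tree[node] = self.tree[2 * node + 1] + self.tree[2 * node + 2]

    # Call as update(idx, val, 0, 0, self.n - 1)
    def update(self, idx, val, node, start, end):
        if start == end:
            self.tree[node] = val
        else:
            mid = (start + end) // 2
            if idx <= mid:
                self.update(idx, val, 2 * node + 1, start, mid)
            else:
                self.update(idx, val, 2 * node + 2, mid + 1, end)
            self.tree[node] = self.tree[2 * node + 1] + self.tree[2 * node + 2]

    # Call as query(L, R, 0, 0, self.n - 1)
    def query(self, L, R, node, start, end):
        if R < start or end < L:      # no overlap
            return 0
        if L <= start and end <= R:   # segment fully inside [L, R]
            return self.tree[node]
        mid = (start + end) // 2      # partial overlap: combine both halves
        return (self.query(L, R, 2 * node + 1, start, mid)
                + self.query(L, R, 2 * node + 2, mid + 1, end))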
Use Cases
Frequency counting.
Inversion count.
Summary
Build O(n)
Query O(log n)
Update O(log n)
Graphs — Overview
A graph is a fundamental data structure used to model pairwise relationships between objects. It
consists of vertices (nodes) and edges (connections).
Basic Terminology
Directed Graph (Digraph): Edges have direction (from one vertex to another).
Cycle: Path where first and last vertices are the same.
Graph Representation
1. Adjacency Matrix
2D matrix of size V x V.
Easy to implement.
Space: O(V²).
2. Adjacency List
Graph Types
Traversal
Depth-First Search (DFS): Explore as far as possible along a branch before backtracking.
Shortest Path
Cycle Detection
graph = {
    0: [1, 2],
    1: [2],
    2: [0, 3],
    3: [3]
}
Summary
Graphs model complex relationships and support many important algorithms for traversal,
optimization, and connectivity.
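To make the traversal and cycle-detection topics concrete, here is a sketch over the adjacency-list example, assuming that graph is directed. The function names `dfs` and `has_cycle` are illustrative. The cycle check uses the common three-state DFS colouring, which is one of several possible techniques.

```python
graph = {
    0: [1, 2],
    1: [2],
    2: [0, 3],
    3: [3],
}


def dfs(graph, start, visited=None):
    """Return vertices in the order a recursive DFS first reaches them."""
    if visited is None:
        visited = []
    visited.append(start)
    for neighbor in graph[start]:
        if neighbor not in visited:
            dfs(graph, neighbor, visited)
    return visited


def has_cycle(graph):
    """Detect a cycle in a directed graph with three-state DFS colouring."""
    WHITE, GRAY, BLACK = 0, 1, 2
    state = {v: WHITE for v in graph}

    def visit(v):
        state[v] = GRAY                 # v is on the current DFS path
        for w in graph[v]:
            if state[w] == GRAY:        # back edge to the current path
                return True
            if state[w] == WHITE and visit(w):
                return True
        state[v] = BLACK                # v is fully explored
        return False

    return any(state[v] == WHITE and visit(v) for v in graph)


print(dfs(graph, 2))     # [2, 0, 1, 3]
print(has_cycle(graph))  # True
```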
Hashing — Overview
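Hashing is a technique for mapping keys to positions in a table so that an object can be located quickly within a collection. It converts input (keys) into a fixed-size integer value called a hash code using a hash function. Distinct keys may produce the same hash code (a collision), so the table needs a collision-handling strategy.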
Key Concepts
Hash Function: Maps input data of arbitrary size to fixed-size hash values (indices).
Hash Table: Data structure that uses hash functions to index data for fast access.
Load Factor (α): Ratio of number of elements to table size (n / m), affects performance.
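Insertion O(1) on average, O(n) in the worst case
Search O(1) on average, O(n) in the worst case
Deletion O(1) on average, O(n) in the worst case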
1. Chaining:
o Each bucket stores a linked list (or another data structure) of elements.
2. Open Addressing:
o Types of probing:
Applications of Hashing
Caching data.
Database indexing.
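class HashTable:  # collision handling by chaining
    def __init__(self, size=10):
        self.size = size
        self.table = [[] for _ in range(size)]

    def _hash(self, key):
        return hash(key) % self.size

    def insert(self, key, value):
        idx = self._hash(key)
        # Update the value if the key already exists
        for i, (k, v) in enumerate(self.table[idx]):
            if k == key:
                self.table[idx][i] = (key, value)
                return
        # Else insert
        self.table[idx].append((key, value))

    def search(self, key):
        idx = self._hash(key)
        for k, v in self.table[idx]:
            if k == key:
                return v
        return None

    def delete(self, key):
        idx = self._hash(key)
        for i, (k, v) in enumerate(self.table[idx]):
            if k == key:
                del self.table[idx][i]
                return True
        return False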
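For comparison with chaining, open addressing can be sketched with linear probing. This is a minimal fixed-size version with no resizing, using tombstones for deletion. The class name `LinearProbingTable` and its internals are assumptions for illustration.

```python
class LinearProbingTable:
    _EMPTY = object()    # slot never used
    _DELETED = object()  # tombstone left behind by delete()

    def __init__(self, size=8):
        self.size = size
        self.slots = [self._EMPTY] * size

    def _probe(self, key):
        """Yield slot indices starting at the key's home position."""
        start = hash(key) % self.size
        for step in range(self.size):
            yield (start + step) % self.size

    def insert(self, key, value):
        first_free = None
        for idx in self._probe(key):
            slot = self.slots[idx]
            if slot is self._EMPTY:
                if first_free is None:
                    first_free = idx
                break                      # key cannot be further along
            if slot is self._DELETED:
                if first_free is None:
                    first_free = idx       # reusable, but keep looking for the key
            elif slot[0] == key:
                self.slots[idx] = (key, value)
                return
        if first_free is None:
            raise RuntimeError("table is full")
        self.slots[first_free] = (key, value)

    def search(self, key):
        for idx in self._probe(key):
            slot = self.slots[idx]
            if slot is self._EMPTY:
                return None
            if slot is not self._DELETED and slot[0] == key:
                return slot[1]
        return None

    def delete(self, key):
        for idx in self._probe(key):
            slot = self.slots[idx]
            if slot is self._EMPTY:
                return False
            if slot is not self._DELETED and slot[0] == key:
                self.slots[idx] = self._DELETED
                return True
        return False


t = LinearProbingTable(size=4)
t.insert(1, "a")
t.insert(5, "b")    # 5 % 4 == 1, so this collides and moves to the next slot
print(t.search(5))  # b
```

Tombstones keep later keys in a probe sequence reachable after a deletion; clearing the slot outright would cut the search short.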
The Two Pointer Approach is a popular technique used to solve problems involving arrays or linked
lists efficiently. It uses two pointers (indices) that traverse the data structure simultaneously to
reduce time complexity, often from quadratic to linear time.
When to Use?
When you need to find pairs, triplets, or subarrays matching certain conditions.
Key Idea
Common Patterns
Remove duplicates in sorted array Use one pointer to track unique items
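def two_sum_sorted(arr, target):
    # arr must be sorted in ascending order
    left, right = 0, len(arr) - 1
    while left < right:
        s = arr[left] + arr[right]
        if s == target:
            return (left, right)
        elif s < target:
            left += 1
        else:
            right -= 1
    return None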
def remove_duplicates(arr):
    if not arr:
        return 0
    write = 1
    for read in range(1, len(arr)):
        if arr[read] != arr[read-1]:
            arr[write] = arr[read]
            write += 1
    return write  # number of unique items, now at the front of arr
Benefits
Efficient: Reduces complexity from O(n²) to O(n) in many problems.
Simple implementation.
The Sliding Window technique is a powerful approach for solving problems involving subarrays or
substrings in arrays or strings, especially where you need to find a contiguous segment meeting
certain criteria.
When to Use?
Examples: max sum subarray of size k, longest substring without repeating characters.
Type Description
Fixed Size Window size is fixed (e.g., size k). Slide by one.
Basic Approach
Maintain two pointers: start and end defining the current window.
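def max_sum_subarray(arr, k):
    window_sum = sum(arr[:k])
    max_sum = window_sum
    for i in range(k, len(arr)):
        # Slide by one: add the new element, drop the oldest
        window_sum += arr[i] - arr[i - k]
        max_sum = max(max_sum, window_sum)
    return max_sum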
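def length_of_longest_substring(s):
    char_index = {}
    start = 0
    max_len = 0
    for end, char in enumerate(s):
        # Shrink the window if char repeats inside it
        if char in char_index and char_index[char] >= start:
            start = char_index[char] + 1
        char_index[char] = end
        max_len = max(max_len, end - start + 1)
    return max_len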
Benefits
Tries — Overview
A Trie (pronounced "try"), also called a prefix tree, is a specialized tree used to store a set of keys, typically strings, so that keys sharing a prefix share a path from the root. It is especially useful for efficient retrieval of keys in datasets like dictionaries or word lists.
Key Characteristics
Trie Structure
Common Operations
Search O(L)
Use Cases
Spell checking.
Word games.
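class TrieNode:
    def __init__(self):
        self.children = {}
        self.isEndOfWord = False

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for char in word:
            if char not in node.children:
                node.children[char] = TrieNode()
            node = node.children[char]
        node.isEndOfWord = True

    def search(self, word):
        node = self.root
        for char in word:
            if char not in node.children:
                return False
            node = node.children[char]
        return node.isEndOfWord

    def starts_with(self, prefix):
        node = self.root
        for char in prefix:
            if char not in node.children:
                return False
            node = node.children[char]
        return True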
A Suffix Tree is a compressed trie (prefix tree) containing all the suffixes of a given string. It’s a
powerful data structure for string processing tasks and allows many substring-related queries to be
answered efficiently.
Key Characteristics
No two edges starting from a node begin with the same character (compact
representation).
Allows fast pattern matching, substring checks, longest repeated substring, etc.
Properties
Construction can be done in O(n) time using advanced algorithms (Ukkonen’s algorithm).
Supports queries like substring search, pattern matching, longest repeated substring in
O(m) time, where m is pattern length.
Example
For the string "banana$" (where $ is a unique end symbol), the suffix tree contains all suffixes:
banana$
anana$
nana$
ana$
na$
a$
$
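The list of suffixes can be generated directly, which also gives a naive substring check: a pattern occurs in the text exactly when it is a prefix of some suffix. The sketch below illustrates what a suffix tree indexes. It is not a suffix tree itself and takes quadratic space, and the names `suffixes` and `contains` are illustrative.

```python
def suffixes(text):
    """Return every suffix of text, longest first."""
    return [text[i:] for i in range(len(text))]


def contains(text, pattern):
    """Naive check: pattern occurs in text iff it is a prefix of a suffix."""
    return any(s.startswith(pattern) for s in suffixes(text))


print(suffixes("banana$"))
# ['banana$', 'anana$', 'nana$', 'ana$', 'na$', 'a$', '$']
print(contains("banana$", "nan"))  # True
```

A suffix tree stores these same suffixes with shared prefixes merged, so the check above becomes a single walk from the root.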
Basic Structure
banana$
├─ b
│ └─ anana$
├─ a
│ ├─ na
│ │ └─ na$
│ └─ $
├─ n
└─ ana$
Applications
Related Structures
A Ternary Search Tree is a type of trie (prefix tree) where nodes are arranged like a binary search
tree but with three children:
Left child: Nodes with characters less than the current node's character.
Middle child: Nodes with the same character (continue the word).
Right child: Nodes with characters greater than the current node's character.
Key Characteristics
Efficient for storing strings and supporting fast search, insert, and prefix queries.
Useful when the alphabet is large or memory is constrained (compared to tries with large
arrays).
Benefits
Balances between trie speed and binary search tree space efficiency.
Node Structure
A character.
Search O(L)
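class TSTNode:
    def __init__(self, char):
        self.char = char
        self.left = None
        self.eq = None
        self.right = None
        self.isEnd = False

class TernarySearchTree:
    def __init__(self):
        self.root = None

    def insert(self, word):
        if word:
            self.root = self._insert(self.root, word, 0)

    def _insert(self, node, word, index):
        char = word[index]
        if not node:
            node = TSTNode(char)
        if char < node.char:
            node.left = self._insert(node.left, word, index)
        elif char > node.char:
            node.right = self._insert(node.right, word, index)
        else:
            if index + 1 == len(word):
                node.isEnd = True
            else:
                node.eq = self._insert(node.eq, word, index + 1)
        return node

    def search(self, word):
        return bool(word) and self._search(self.root, word, 0)

    def _search(self, node, word, index):
        if not node:
            return False
        char = word[index]
        if char < node.char:
            return self._search(node.left, word, index)
        elif char > node.char:
            return self._search(node.right, word, index)
        else:
            if index == len(word) - 1:
                return node.isEnd
            return self._search(node.eq, word, index + 1)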
Use Cases
When you want a compromise between space usage and fast prefix queries.