Advanced Algorithms
SECTION A
(b) What is a recurrence relation? How is a recurrence solved using the Master Theorem?
A recurrence relation describes the time complexity of a recursive algorithm in terms of smaller sub-
problems. For example, T(n) = 2T(n/2) + n² describes an algorithm that splits the problem into two sub-
problems of size n/2 and combines them with O(n²) effort.
The Master Theorem solves recurrences of the form T(n) = aT(n/b) + f(n), with a ≥ 1 and b > 1. It compares f(n) with n^log_b(a): if f(n) grows more slowly, T(n) = Θ(n^log_b(a)); if the two grow at the same rate, T(n) = Θ(n^log_b(a) · log n); and if f(n) grows faster (and satisfies a regularity condition), T(n) = Θ(f(n)).
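To make the case analysis concrete, here is a small Python sketch (not part of the original notes) that reports which case applies when f(n) is a pure power n^k; the function name master_theorem_case is illustrative and the regularity condition of Case 3 is ignored for simplicity.

import math

def master_theorem_case(a, b, k):
    # T(n) = a*T(n/b) + n^k: compare k with the critical exponent log_b(a)
    c = math.log(a, b)
    if k < c - 1e-9:
        return f"Case 1: T(n) = Theta(n^{c:.2f})"
    if abs(k - c) <= 1e-9:
        return f"Case 2: T(n) = Theta(n^{c:.2f} * log n)"
    return f"Case 3: T(n) = Theta(n^{k})"

# The recurrence from the text, T(n) = 2T(n/2) + n^2:
print(master_theorem_case(2, 2, 2))   # Case 3: T(n) = Theta(n^2)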
Properties of a Red-Black Tree:
• Binary Search Tree (BST): Follows the properties of a binary search tree.
• Red Node Rule: No two red nodes can be adjacent (a red node cannot have a red child).
• Black Height: The path from any node to its leaf nodes must contain the same number of black
nodes.
• Balanced: Red-black trees ensure balanced height, ensuring logarithmic time complexity for
search operations.
Key considerations when designing a parallel algorithm:
1. Decomposition: Efficiently breaking down the problem into smaller tasks that can be executed in parallel.
2. Synchronization: Ensuring that parallel tasks are executed in the correct order without conflicts or data corruption.
3. Load Balancing: Ensuring all processors are used effectively, avoiding idle times.
4. Scalability: Ensuring the algorithm performs well as the number of processors increases.
5. Fault Tolerance: The algorithm must handle processor or task failures without crashing the system.
Backtracking vs. Branch and Bound:
• Backtracking explores all possible solutions by trying one solution at a time and discarding non-promising paths as soon as a dead end is reached. It is typically used for problems like the N-Queens problem or Sudoku (a minimal N-Queens sketch is shown after this comparison).
• Branch and Bound also explores possible solutions but uses a systematic approach to prune branches that cannot lead to an optimal solution. It is often used for optimization problems, like the Traveling Salesman Problem, where the goal is to find the best solution.
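As a concrete illustration of the backtracking idea mentioned above, here is a minimal Python sketch (my own, not from the notes) for the N-Queens problem; it places one queen per row and abandons a branch as soon as a conflict appears.

def n_queens(n, row=0, cols=()):
    if row == n:
        return [cols]                                   # a complete, conflict-free placement
    solutions = []
    for c in range(n):
        # a column is safe if no earlier queen shares its column or a diagonal
        if all(c != pc and abs(c - pc) != row - pr for pr, pc in enumerate(cols)):
            solutions += n_queens(n, row + 1, cols + (c,))
    return solutions

print(len(n_queens(6)))   # 4 distinct solutions on a 6x6 board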
Properties of a B-Tree:
• Balanced: All leaf nodes are at the same level, ensuring logarithmic height.
• Self-balancing: Rebalances itself after insertions and deletions to maintain efficient search times.
• Multi-way: Each node can have more than two children, making it ideal for databases where
large blocks of data are read and written.
• Sorted: Data is kept in sorted order within the tree, allowing for fast search and retrieval
operations.
• Efficient: Due to its structure, B-trees provide efficient search, insert, and delete operations in
O(log n) time, even for large datasets.
SECTION B
o Here n^log_b(a) = n^log_2(47) ≈ n^5.55.
o Since f(n) = n² grows more slowly than n^5.55, the time complexity is dominated by the n^log_b(a) term, so T(n) = Θ(n^log_2(47)).
List: [14, 33, 21, 45, 67, 20, 40, 59, 12, 36]
• Merge Sort divides the list into halves, recursively sorts them, and then merges them.
o Step 1: Divide [14, 33, 21, 45, 67, 20, 40, 59, 12, 36] into two halves: [14, 33, 21, 45, 67] and [20, 40, 59, 12, 36].
o Step 2: Recursively sort each half: [14, 21, 33, 45, 67] and [12, 20, 36, 40, 59].
o Step 3: Merge the two sorted halves.
Sorted List: [12, 14, 20, 21, 33, 36, 40, 45, 59, 67]
• Time Complexity: O(n log n) due to the repeated splitting and merging steps.
OR
Merge sort is a divide-and-conquer algorithm that sorts a list by repeatedly dividing it into smaller
sublists, sorting each sublist, and then merging the sorted sublists back together.
Steps:
1. Divide: Split the list into two halves.
2. Conquer: Recursively sort each half.
3. Combine: Merge the two sorted halves back together to form a single sorted list.
Example:
Let's sort the list: [14, 33, 21, 45, 67, 20, 40, 59, 12, 36]
1. Divide:
o Divide the list into two halves: [14, 33, 21, 45, 67] and [20, 40, 59, 12, 36]
2. Conquer:
o Recursively sort each half:
▪ [14, 33, 21, 45, 67] becomes [14, 21, 33, 45, 67]
▪ [20, 40, 59, 12, 36] becomes [12, 20, 36, 40, 59]
3. Combine:
▪ Finally, merge [14, 21, 33, 45, 67] and [12, 20, 36, 40, 59] to get the sorted list: [12, 14, 20, 21, 33, 36, 40, 45, 59, 67]
Time Complexity:
Merge sort has a time complexity of O(n log n), which means it is efficient even for large lists.
Space Complexity:
Merge sort has a space complexity of O(n), which means it requires extra memory to store the sublists
during the merging process.
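A compact Python sketch of the procedure just described (the helper name merge_sort is my own; this is an illustration, not code from the original notes):

def merge_sort(lst):
    if len(lst) <= 1:                      # a list of 0 or 1 elements is already sorted
        return lst
    mid = len(lst) // 2
    left = merge_sort(lst[:mid])           # divide + conquer
    right = merge_sort(lst[mid:])
    merged, i, j = [], 0, 0                # combine: merge the two sorted halves
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]

print(merge_sort([14, 33, 21, 45, 67, 20, 40, 59, 12, 36]))
# [12, 14, 20, 21, 33, 36, 40, 45, 59, 67]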
(c) Discuss the various cases for insertion of a key in a red-black tree:
Inserting a key into a Red-Black Tree involves maintaining the tree’s properties, and there are a few
important cases to consider:
1. Case 1: If the inserted node is the root, simply recolor it black.
2. Case 2: If the parent node is black, the tree is still valid and no further action is needed.
3. Case 3: If the parent node is red, this violates the Red-Black property (two consecutive red
nodes). In this case:
o Recoloring: Change the parent and uncle to black and the grandparent to red.
o Rotation: If recoloring doesn’t solve the issue, perform rotations (left or right) to
rebalance the tree and restore the properties.
OR
A red-black tree is a binary search tree that satisfies the following properties:
1. Every node is either red or black.
2. The root is black.
3. Every leaf (NIL) node is black.
4. A red node cannot have a red child.
5. Every path from a given node to any of its descendant NIL nodes contains the same number of black nodes (the black-height).
Insertion in a red-black tree starts like a regular binary search tree insertion. The new node is always
inserted as a red node. This might violate the red-black tree properties, so we need to rebalance the
tree. The rebalancing involves rotations and recoloring.
Here's a breakdown of the cases, categorized by the "uncle" of the newly inserted node (z):
Case 1: The parent (p) and the uncle (u) are both red.
• Scenario: The new node (z)'s parent (p) and uncle (u) are both red.
• Solution:
1. Recolor: Change the colors of p, u, and z's grandparent (g) to their opposites (red
becomes black, black becomes red).
2. Move up: Treat g as the new z and repeat the process from Case 1 or Case 2/3, if
necessary.
• Example: recoloring flips the grandparent G from black to red, while the parent and uncle flip from red to black (the original diagram is omitted here).
Case 2: The uncle (u) is black (or NIL) and z is a right child.
• Scenario: The new node (z)'s parent (p) is red, the uncle (u) is black (or NIL), and z is the right child of p.
• Solution:
1. Left Rotation: Perform a left rotation on p. This makes z the parent and p the left child of z.
2. Continue as Case 3: The roles of z and p have now swapped, so apply the Case 3 fix to the new configuration.
Case 3: The uncle (u) is black (or NIL) and z is a left child.
• Scenario: The new node (z)'s parent (p) is red, the uncle (u) is black (or NIL), and z is the left child of p.
• Solution:
1. Recolor: Color p black and the grandparent (g) red.
2. Right Rotation: Perform a right rotation on g so that p takes g's place.
These steps ensure that the tree remains balanced and satisfies all Red-Black properties after each insertion.
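For reference, the cases above can be written out as code. The following is a minimal CLRS-style Python sketch written for this summary (class and method names such as RBTree and _fixup are my own); it uses a sentinel NIL node and is meant as an illustration rather than a production implementation.

RED, BLACK = "R", "B"

class Node:
    def __init__(self, key, color, nil=None):
        self.key, self.color = key, color
        self.left = self.right = self.parent = nil

class RBTree:
    def __init__(self):
        self.nil = Node(None, BLACK)          # sentinel leaf
        self.root = self.nil

    def _rotate_left(self, x):
        y = x.right
        x.right = y.left
        if y.left is not self.nil:
            y.left.parent = x
        y.parent = x.parent
        if x.parent is self.nil:
            self.root = y
        elif x is x.parent.left:
            x.parent.left = y
        else:
            x.parent.right = y
        y.left, x.parent = x, y

    def _rotate_right(self, x):
        y = x.left
        x.left = y.right
        if y.right is not self.nil:
            y.right.parent = x
        y.parent = x.parent
        if x.parent is self.nil:
            self.root = y
        elif x is x.parent.right:
            x.parent.right = y
        else:
            x.parent.left = y
        y.right, x.parent = x, y

    def insert(self, key):
        z = Node(key, RED, self.nil)          # new nodes start out red
        y, x = self.nil, self.root
        while x is not self.nil:              # ordinary BST descent
            y, x = x, (x.left if key < x.key else x.right)
        z.parent = y
        if y is self.nil:
            self.root = z
        elif key < y.key:
            y.left = z
        else:
            y.right = z
        self._fixup(z)

    def _fixup(self, z):
        while z.parent.color == RED:          # two reds in a row: restore the properties
            g = z.parent.parent
            if z.parent is g.left:
                u = g.right                   # uncle
                if u.color == RED:            # Case 1: recolor and move up
                    z.parent.color = u.color = BLACK
                    g.color = RED
                    z = g
                else:
                    if z is z.parent.right:   # Case 2: rotate into Case 3
                        z = z.parent
                        self._rotate_left(z)
                    z.parent.color = BLACK    # Case 3: recolor + rotate
                    g.color = RED
                    self._rotate_right(g)
            else:                              # mirror image of the above
                u = g.left
                if u.color == RED:
                    z.parent.color = u.color = BLACK
                    g.color = RED
                    z = g
                else:
                    if z is z.parent.left:
                        z = z.parent
                        self._rotate_right(z)
                    z.parent.color = BLACK
                    g.color = RED
                    self._rotate_left(g)
        self.root.color = BLACK               # the root is always black

t = RBTree()
for k in [10, 20, 30, 15, 25]:
    t.insert(k)                               # exercises recoloring and both rotations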
In the EREW (Exclusive Read, Exclusive Write) model, no two processors can read or write the same
memory location at the same time. This creates challenges when merging data.
• Exclusive Read: No two processors can read the same memory location simultaneously.
• Exclusive Write: No two processors can write to the same memory location simultaneously.
These restrictions make merging on EREW PRAM a bit more complex than on models with concurrent
access. We need to be careful about how we distribute the data and manage the writes to avoid
conflicts.
The basic idea is to divide the two sorted input arrays among the processors and have each processor
perform a portion of the merge. Since we can't have concurrent writes, we need a strategy to ensure
that each element is written to the correct location in the output array without conflicts.
Algorithm (Simplified):
1. Divide: Divide the two sorted input arrays (A and B) into roughly equal parts, one part for each
processor.
2. Local Merge: Each processor performs a local merge of its assigned portions of A and B. This
results in several small sorted sub-arrays.
3. Global Merge: The sorted sub-arrays from each processor need to be merged together. This is
the tricky part due to the EREW restriction. One common technique is to use a binary tree-like
reduction.
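A toy sequential simulation of this scheme, written for this summary (it ignores how a real EREW PRAM would compute non-overlapping output positions, e.g. via ranking, and simply reuses an ordinary two-way merge):

from itertools import zip_longest

def merge(x, y):
    out, i, j = [], 0, 0
    while i < len(x) and j < len(y):
        if x[i] <= y[j]:
            out.append(x[i]); i += 1
        else:
            out.append(y[j]); j += 1
    return out + x[i:] + y[j:]

def erew_style_merge(A, B, p=4):
    # Steps 1-2: give each "processor" one block of A and one block of B, merge locally
    def blocks(arr):
        k = max(1, -(-len(arr) // p))      # ceil(len/p) elements per block
        return [arr[i:i + k] for i in range(0, len(arr), k)] or [[]]
    runs = [merge(a, b) for a, b in zip_longest(blocks(A), blocks(B), fillvalue=[])]
    # Step 3: binary-tree reduction; each round writes into distinct runs,
    # which is what lets a real EREW machine avoid write conflicts
    while len(runs) > 1:
        runs = [merge(runs[i], runs[i + 1]) if i + 1 < len(runs) else runs[i]
                for i in range(0, len(runs), 2)]
    return runs[0]

print(erew_style_merge([2, 5, 7, 9], [1, 3, 4, 6, 8, 10]))
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]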
Example:
• A = [2, 5, 7, 9]
• B = [1, 3, 4, 6, 8, 10]
1. Divide:
o With four processors, P1 takes [2, 5] from A and [1, 3] from B, P2 takes [7, 9] from A and [4, 6] from B, and P3 and P4 share the remaining [8] and [10] from B.
2. Local Merge:
o P1 produces [1, 2, 3, 5], P2 produces [4, 6, 7, 9], and P3/P4 keep [8] and [10].
3. Global Merge (binary-tree reduction):
o Level 1:
▪ P1 and P2 "merge" their results: [1, 2, 3, 5] and [4, 6, 7, 9] -> conceptually [1, 2, 3, 4, 5, 6, 7, 9].
▪ P3 and P4 merge [8] and [10] -> [8, 10].
o Level 2:
▪ The results from P1/P2 and P3/P4 are now merged, giving the final sorted array [1, 2, 3, 4, 5, 6, 7, 8, 9, 10].
• Writing to the output array: We must ensure no two processors write to the same location. The
binary tree reduction approach naturally handles this because each merge step writes to a
distinct portion of the output.
• Data distribution: Distributing the data evenly among processors is important for load balancing.
• Handling uneven input sizes: The input arrays might not be perfectly divisible by the number of
processors. The algorithm needs to handle these cases gracefully.
DFS explores as far down a branch as possible before backtracking. It’s commonly used to traverse or
search tree or graph data structures.
Depth-First Search (DFS) is an algorithm for traversing or searching tree or graph data structures. It starts
at the root (or some arbitrary node) and explores as far as possible along each branch before
backtracking. Think of it like exploring a maze by always choosing one path and going as deep as you can
before hitting a dead end and then going back to try another path.
How it works:
1. Start: Begin at the root (or any chosen node) and mark it as visited.
2. Go deep: Move to one unvisited neighbour and repeat.
3. Explore adjacent nodes: For each unvisited node adjacent to the current node, apply DFS recursively; when no unvisited neighbours remain, backtrack.
Example:
    A
   / \
  B   C
 / \   \
D   E   F
A DFS starting at A visits the nodes in the order A, B, D, E, C, F.
Applications of DFS:
• Finding connected components: In a graph, DFS can be used to identify groups of connected
nodes.
• Topological sorting: For directed acyclic graphs (DAGs), DFS can be used to produce a topological
sort.
• Path finding: DFS can be adapted to find a path between two given nodes.
Key Points:
• Stack: The recursive calls use a stack (the call stack) to keep track of the nodes being explored.
You can also implement DFS iteratively using an explicit stack.
• Backtracking: The algorithm "backtracks" when it reaches a dead end (a node with no unvisited
neighbors).
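A short recursive Python sketch of DFS on the example tree above (the adjacency-list representation and names are chosen here for illustration):

def dfs(graph, node, visited=None):
    if visited is None:
        visited = set()
    visited.add(node)
    print(node, end=" ")                      # "visit" the node
    for neighbour in graph.get(node, []):
        if neighbour not in visited:
            dfs(graph, neighbour, visited)    # go deep; backtracking happens on return

tree = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"]}
dfs(tree, "A")                                # prints: A B D E C F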
SECTION C
(a) Write down the algorithm for Max Heapify in heap sort.
def MaxHeapify(arr, n, i):
    largest = i
    left = 2 * i + 1
    right = 2 * i + 2
    if left < n and arr[left] > arr[largest]:        # left child larger?
        largest = left
    if right < n and arr[right] > arr[largest]:      # right child larger?
        largest = right
    if largest != i:
        arr[i], arr[largest] = arr[largest], arr[i]  # swap with the larger child
        MaxHeapify(arr, n, largest)                  # continue sifting down
• Time Complexity: O(log n), as the algorithm may need to move down the tree (height of the tree
is log n).
A good algorithm, whether for a simple task or a complex problem, should possess several key properties
and fulfill certain requirements. These can be broadly categorized into factors relating to correctness,
efficiency, and clarity.
Correctness:
• Correctness: The most fundamental requirement. An algorithm must produce the correct output
for all valid inputs. It should adhere to the problem's specification and handle edge cases
gracefully.
• Finiteness: The algorithm must terminate after a finite number of steps. It shouldn't get stuck in
an infinite loop.
• Unambiguity: Each step of the algorithm must be precisely defined, with no room for
interpretation. The instructions should be clear and unambiguous.
• Completeness: For a given input, the algorithm should either produce a solution or indicate that
no solution exists. It shouldn't leave the user in doubt.
Efficiency:
• Time Efficiency: An algorithm should use a reasonable amount of time to execute. Time
complexity is often expressed using Big O notation (e.g., O(n), O(log n), O(n log n), O(n²)). Lower
time complexity is generally better.
• Space Efficiency: An algorithm should use a reasonable amount of memory (space) during its
execution. Space complexity is also often expressed using Big O notation. Minimizing space
usage is important, especially for large datasets or resource-constrained environments.
• Optimality (Ideal): Ideally, an algorithm should be the most efficient possible for the given
problem. However, achieving optimality isn't always feasible, especially for complex problems.
We often settle for algorithms that are "good enough" in terms of efficiency.
Clarity:
• Readability: An algorithm should be easy to understand and follow, both for the person who wrote it and for others who might need to implement or modify it later. Clear variable names, comments, and well-structured code are essential.
• Simplicity: An algorithm should be as simple as possible while still being effective. Avoid
unnecessary complexity. Simpler algorithms are often easier to understand, implement, and
debug.
• Modularity: Breaking down a complex algorithm into smaller, self-contained modules can
improve readability, maintainability, and reusability.
(7 marks each)
(a) What do you mean by parallel sorting networks? Also discuss the enumeration sort algorithm.
A parallel sorting network is a fixed arrangement of comparators (compare-and-swap units) connected by wires, in which many comparisons are performed at the same time. For example, in a Bitonic Sort or Odd-Even Merging Sort, multiple elements are compared and swapped simultaneously, making them suitable for parallel processing.
Enumeration Sort:
The Enumeration Sort algorithm is a comparison-based sorting algorithm where each element is placed
at its correct position by counting the number of elements smaller than it. The steps are:
1. For each element, count how many elements in the array are smaller than it.
2. Place the element at the position given by that count.
Time Complexity: The time complexity of Enumeration Sort is O(n²), making it inefficient for large datasets. However, because every comparison is independent of the others, it is easy to parallelize.
OR
Imagine you have a bunch of numbers that you need to sort, and you have multiple processors available
to help speed things up. A parallel sorting network is a way to arrange comparisons between these
numbers in a specific pattern so that the sorting can happen simultaneously.
Key Ideas
• Comparators: The basic building block of a sorting network is a comparator. A comparator takes
two numbers as input and outputs them in sorted order (the smaller one first, then the larger
one).
• Wires: Wires connect the comparators, carrying the numbers from one comparator to the next.
• Network Structure: The comparators are arranged in a network-like structure, where numbers
flow through the network, getting compared and swapped along the way.
• Parallelism: The magic of a sorting network is that many comparisons can happen at the same
time (in parallel). This is how it achieves speedup compared to traditional sorting algorithms that
do one comparison at a time.
Example
How it works:
1. The unsorted numbers are placed on the input wires on the left of the network.
2. The numbers flow along the wires from left to right.
3. Each comparator compares the two numbers on its input wires and swaps them if they are in the wrong order.
4. The numbers continue flowing through the network until they reach the output wires on the
right, now in sorted order.
Enumeration sort is a straightforward sorting algorithm that works by comparing each element in the list
with every other element. It counts how many elements are smaller than each element, and this count
determines the final position of the element in the sorted list.
Steps
1. Comparison: For each element in the list, compare it with all other elements.
2. Counting: For each element, count the number of elements smaller than it.
3. Placement: Place each element in its correct position based on the count.
Example: sort the list [3, 1, 4, 2] by counting, for each element, how many elements are smaller than it:
1. 3:
o 3 > 1 (count = 1)
o 3 < 4 (count = 1)
o 3 > 2 (count = 2)
2. 1:
o 1 < 3 (count = 0)
o 1 < 4 (count = 0)
o 1 < 2 (count = 0)
3. 4:
o 4 > 3 (count = 1)
o 4 > 1 (count = 2)
o 4 > 2 (count = 3)
4. 2:
o 2 < 3 (count = 0)
o 2 > 1 (count = 1)
o 2 < 4 (count = 1)
Placement: the counts 2, 0, 3, 1 give the final positions, so the sorted list is [1, 2, 3, 4].
Key Points
• Inefficient: It has a time complexity of O(n^2), making it inefficient for large lists.
• Parallelizable: The comparisons can be done independently, making it suitable for parallel
implementation
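A direct Python sketch of the enumeration sort described above (illustrative only); ties are broken by original index so that equal elements get distinct positions:

def enumeration_sort(arr):
    n = len(arr)
    out = [None] * n
    for i in range(n):
        rank = 0
        for j in range(n):
            # count elements that must come before arr[i]
            if arr[j] < arr[i] or (arr[j] == arr[i] and j < i):
                rank += 1
        out[rank] = arr[i]                    # place arr[i] at its counted position
    return out

print(enumeration_sort([3, 1, 4, 2]))         # [1, 2, 3, 4]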
(b) Explain Branch & Bound method. How 0/1 Knapsack problem can be solved using branch and
bound method?
Branch and Bound systematically explores a tree of partial solutions using three ideas:
• Branching: Splitting the current problem into smaller subproblems (e.g., include or exclude an item).
• Bounding: Calculating an upper or lower bound to compare and prune unpromising branches.
• Pruning: Discarding branches that cannot produce better results than already found solutions.
For the 0/1 Knapsack problem:
1. Branching: At each step, decide whether to include an item in the knapsack or not.
2. Bounding: Calculate an upper bound for the value that can be obtained from the current
subproblem. This could be the total value if all remaining items are added, respecting the weight
constraint.
3. Pruning: If the bound of a branch is less than the best solution found so far, prune that branch.
OR
Branch and Bound is like a smart way to try different combinations without checking every single one.
1. Make a Tree: Think of all the possible combinations as a tree. Each branch in the tree is like a
decision: "Do I take this item or not?"
2. Estimate (Bound): Before you explore a branch too far, you make a quick estimate. You pretend
you could take parts of items (like taking half a sandwich). This gives you a rough idea of the best
value you could get if you went down that branch. It's an optimistic estimate.
3. Cut Off Bad Branches (Prune): If your estimate for a branch is already worse than the best full
combination you've found so far, there's no point exploring that branch. You "cut it off" (prune
it). It's like saying, "I don't need to go down that path; I already know it won't be the best."
4. Explore: You keep going down the tree, making estimates and cutting off branches, until you've
explored enough to be sure you've found the absolute best combination.
• Backpack limit: 10 kg
1. Start: You haven't packed anything yet. You estimate you could take item 2 and a bit of item 3 for
a total value of around $70.
2. Try Item 1:
o Take Item 1: You estimate you could then take item 2 for a total value of $50.
o Don't take Item 1: You estimate you could still take item 2 and a bit of item 3, for a value
of around $70.
3. Keep Going: You keep exploring, but if you find a combination that's worth more than your
current estimate for a branch, you don't bother with that branch.
Basically, Branch and Bound helps you find the best solution by being smart about what you check. You
don't have to try every single combination; you can rule out many of them early on. It's like a shortcut to
the best answer!
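A small depth-first branch-and-bound sketch for 0/1 knapsack, written for this summary (the bound is the usual fractional-knapsack relaxation; the sample data at the end is hypothetical, not from the notes):

def knapsack_bb(values, weights, capacity):
    items = sorted(zip(values, weights), key=lambda vw: vw[0] / vw[1], reverse=True)
    best = 0

    def bound(i, value, room):
        # optimistic estimate: fill the rest greedily, allowing fractions of items
        while i < len(items) and items[i][1] <= room:
            value += items[i][0]; room -= items[i][1]; i += 1
        if i < len(items) and room > 0:
            value += items[i][0] * room / items[i][1]
        return value

    def dfs(i, value, room):
        nonlocal best
        best = max(best, value)
        if i == len(items) or bound(i, value, room) <= best:
            return                            # prune: this branch cannot beat 'best'
        v, w = items[i]
        if w <= room:
            dfs(i + 1, value + v, room - w)   # branch: take item i
        dfs(i + 1, value, room)               # branch: skip item i

    dfs(0, 0, capacity)
    return best

# Hypothetical instance: four items, capacity 10
print(knapsack_bb([30, 14, 16, 9], [6, 3, 4, 2], 10))   # 46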
(7 marks each)
(a) Find the optimal solution to the Knapsack instances n = 5, w = (5, 10, 20, 30, 40) v = (30, 20, 100, 90,
160) and W = 60, by using Greedy approach (Fractional Knapsack)
1. Calculate the Value-to-Weight Ratios:
For each item, calculate the ratio of its value to its weight (v/w). This tells us how much value we get per
unit of weight for each item.
• Item 1: 30/5 = 6
• Item 2: 20/10 = 2
• Item 3: 100/20 = 5
• Item 4: 90/30 = 3
• Item 5: 160/40 = 4
2. Sort by Ratio:
o Sorted order: Item 1 (6), Item 3 (5), Item 5 (4), Item 4 (3), Item 2 (2).
3. Fill the Knapsack (Greedy):
Start adding items to the knapsack, beginning with the item that has the highest value-to-weight ratio.
• Item 1: Add the entire item 1 (weight 5, value 30). Knapsack capacity remaining: 60 - 5 = 55.
• Item 3: Add the entire item 3 (weight 20, value 100). Knapsack capacity remaining: 55 - 20 = 35.
• Item 5: Add the entire item 5 (weight 40, value 160). Knapsack capacity remaining: 35 - 40 = -5.
Since we cannot add the whole item 5, we add a fraction of it.
• Fraction of Item 5: Calculate the fraction of item 5 that can fit in the remaining capacity: 35/40 =
0.875. Add this fraction of item 5 to the knapsack. The value added is 0.875 * 160 = 140.
The maximum value that can be carried in the knapsack is the sum of the values of the items (or fractions of items) added: 30 + 100 + 140 = 270.
Therefore, the optimal solution using the Greedy approach (Fractional Knapsack) yields a maximum value of 270.
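The same greedy computation as a Python sketch (the function name is chosen for illustration), run on the instance from the question:

def fractional_knapsack(weights, values, capacity):
    items = sorted(zip(weights, values), key=lambda wv: wv[1] / wv[0], reverse=True)
    total = 0.0
    for w, v in items:
        if capacity <= 0:
            break
        take = min(w, capacity)            # whole item if it fits, otherwise a fraction
        total += v * (take / w)
        capacity -= take
    return total

print(fractional_knapsack((5, 10, 20, 30, 40), (30, 20, 100, 90, 160), 60))   # 270.0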
(b) What is sum of subset problem? Draw a state space tree for Sum of subset problem using
backtracking? Let n = 6, m = 30, and w[1:6] = (5, 10, 12, 13, 15, 18)
The Sum of Subsets Problem is a classic combinatorial problem where, given a set of non-negative
integers (w) and a target sum (m), the goal is to find a subset (or subsets) of those integers whose
elements sum up to exactly m.
A state-space tree is a graphical representation of the possible solutions to a problem. Each node in the
tree represents a decision point, and the branches represent the different choices that can be made. In
the Sum of Subsets problem, each level of the tree corresponds to an element in the set, and the
branches represent whether to include that element in the subset or not.
Here's how the state-space tree is constructed for the given instance (n = 6, m = 30, w = (5, 10, 12, 13,
15, 18)) using backtracking:
1. Root Node: Represents the starting point with an empty subset and a sum of 0.
2. Level 1:
o Left Branch: Include the first element (5). The current sum is 5.
o Right Branch: Exclude the first element (5). The current sum is 0.
3. Level 2: Branch on the second element (10) in the same way: the left branch includes it, the right branch excludes it.
4. Continue: Repeat this process for each subsequent level, considering whether to include or exclude the corresponding element.
5. Pruning: At each node, check if the current sum exceeds the target sum (m = 30). If it does,
prune that branch (stop exploring it) as it cannot lead to a valid solution.
6. Solution Nodes: When a node is reached where the current sum equals the target sum (m = 30),
it represents a valid solution.
Backtracking:
The backtracking algorithm explores this tree in a depth-first manner. It starts at the root node and
explores each branch until it reaches a solution node or a dead end (where the sum exceeds m). If it
reaches a dead end, it backtracks (goes back up the tree) to the last decision point and explores the
other branch. This process continues until all possible paths have been explored.
Solutions:
By traversing the state-space tree and applying backtracking, we find the following subsets that sum to 30:
• {5, 10, 15}
• {5, 12, 13}
• {12, 18}
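A backtracking sketch in Python for this instance (written for this summary); the left branch includes the current element, the right branch excludes it, and a branch is pruned as soon as the partial sum exceeds m:

def subset_sums(w, m):
    solutions = []
    def backtrack(i, chosen, total):
        if total == m:                     # solution node
            solutions.append(chosen)
            return
        if i == len(w) or total > m:       # dead end: prune this branch
            return
        backtrack(i + 1, chosen + [w[i]], total + w[i])   # left branch: include w[i]
        backtrack(i + 1, chosen, total)                    # right branch: exclude w[i]
    backtrack(0, [], 0)
    return solutions

print(subset_sums((5, 10, 12, 13, 15, 18), 30))
# [[5, 10, 15], [5, 12, 13], [12, 18]]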
(7 marks each)
1. Speedup: The ratio of the time taken by the best sequential algorithm to the time taken by the
parallel algorithm.
2. Efficiency: How effectively the parallel resources are being utilized. It is the ratio of speedup to
the number of processors.
3. Scalability: The ability of the algorithm to handle increasing problem sizes or more processors
without a significant drop in performance.
4. Load Balancing: How evenly the work is distributed among processors. If one processor is idle,
the algorithm is inefficient.
5. Communication Overhead: The time spent on exchanging data between processors. Minimizing
communication is crucial for good parallel performance.
6. Memory Usage: The amount of memory used by the algorithm. More processors often mean
more memory is required.
(b) What is parallel searching and also explain the CREW searching?
Parallel Searching refers to techniques that search for an element in a dataset using multiple processors
simultaneously to reduce the time it takes to find the element. It divides the dataset into smaller parts
and searches them in parallel, which can lead to faster search times than sequential searching.
CREW Searching:
CREW (Concurrent Read, Exclusive Write) model allows multiple processors to read the same memory
location simultaneously but restricts multiple processors from writing to the same location at the same
time. This ensures data consistency while allowing efficient parallel searching.
Example:
• CREW Parallel Search: Multiple processors search different portions of the data in parallel,
reading elements concurrently. When they find a match, the result is written to a shared
variable, ensuring only one processor writes at a time.
(7 marks each)
Breadth First Search (BFS) is a graph traversal algorithm that explores all the vertices at the present
depth level before moving on to the next level. It uses a queue to keep track of nodes to visit next.
Algorithm:
1. Start at the source node: mark it as visited and enqueue it.
2. Dequeue a node, visit it, and enqueue its neighbors if they are unvisited.
3. Repeat until the queue is empty.
Example:
Consider the graph:
    A
   / \
  B   C
  |   |
  D   E
Starting from A, BFS visits the nodes level by level: A, B, C, D, E.
OR
Breadth-First Search (BFS) is an algorithm for traversing or searching tree or graph data structures. It starts at the root (or some arbitrary node) and explores all the neighbor nodes at the present depth prior to moving on to nodes at the next depth level. Imagine searching a building floor by floor, rather than going deep into one room and then another.
How it works:
1. Start at the source node, mark it as visited, and enqueue it.
2. Use a queue: repeatedly dequeue a node, visit it, and enqueue its unvisited neighbours.
3. Repeat until the queue is empty.
Example:
    A
   / \
  B   C
 / \   \
D   E   F
1. Start: Visit A and enqueue it. Queue: [A].
2. Dequeue A: Enqueue its neighbours B and C. Queue: [B, C].
3. Dequeue B: Enqueue D and E. Queue: [C, D, E].
4. Dequeue C: Enqueue F. Queue: [D, E, F].
5. Dequeue D: No unvisited neighbours remain. Continuing with E and F gives the BFS order A, B, C, D, E, F.
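An iterative Python sketch of BFS using an explicit queue (illustrative names), run on the example tree above:

from collections import deque

def bfs(graph, start):
    visited, order = {start}, []
    queue = deque([start])
    while queue:
        node = queue.popleft()             # dequeue and visit
        order.append(node)
        for neighbour in graph.get(node, []):
            if neighbour not in visited:   # enqueue unvisited neighbours
                visited.add(neighbour)
                queue.append(neighbour)
    return order

tree = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"]}
print(bfs(tree, "A"))   # ['A', 'B', 'C', 'D', 'E', 'F']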
(b) Explain P, NP, NP Complete, and NP hard problems with suitable examples.
• P: Problems that can be solved in polynomial time. Example: sorting an array.
• NP: Problems whose proposed solutions can be verified in polynomial time. Example: Subset Sum.
• NP-Complete: Problems that are both in NP and as hard as any problem in NP. Example: the decision version of the Traveling Salesman Problem.
• NP-Hard: A problem is NP-hard if it is at least as hard as the hardest problems in NP, but it might not be in NP. Example: Halting Problem.
A good algorithm should have the following basic characteristics:
o Correctness: The algorithm must give the correct output for all valid inputs.
o Efficiency: The algorithm should use minimal resources (time and space).
OR
Think of it like this: You follow an algorithm every time you bake a cake. The recipe is the algorithm, the
ingredients are the input, and the cake is the output.
A good algorithm, whether for a simple task or a complex problem, should possess several key
properties. These can be broadly categorized into factors relating to correctness, efficiency, clarity, and
other practical considerations.
Correctness:
• Correctness: This is the most crucial aspect. An algorithm must produce the correct output for
all valid inputs. It should adhere precisely to the problem's specification and handle edge cases
(unusual or boundary inputs) gracefully. A partially correct algorithm is often not very useful.
• Finiteness: The algorithm must terminate after a finite number of steps. It shouldn't get stuck in
an infinite loop. Even if the number of steps is large, it must eventually come to an end.
• Unambiguity (Precision): Each step of the algorithm must be precisely defined, with no room for
interpretation. The instructions should be clear, unambiguous, and deterministic. There should
be no guesswork involved.
• Completeness: For a given input, the algorithm should either produce a solution or indicate that
no solution exists. It shouldn't leave the user in doubt about whether a solution was found.
Efficiency:
• Time Efficiency: An algorithm should use a reasonable amount of time to execute. Time
complexity measures how the execution time grows as the input size increases. We often use Big
O notation (e.g., O(n), O(log n), O(n log n), O(n²)) to express time complexity. Lower time
complexity is generally better. We want algorithms that scale well with larger inputs.
• Space Efficiency: An algorithm should use a reasonable amount of memory (space) during its
execution. Space complexity measures how the memory usage grows with input size. Minimizing
space usage is important, especially for large datasets or resource-constrained environments
(like mobile devices).
• Optimality (Ideal): Ideally, an algorithm should be the most efficient possible for the given
problem. However, achieving true optimality isn't always feasible, especially for complex
problems. We often settle for algorithms that are "good enough" in terms of efficiency.
Clarity:
• Readability: An algorithm should be easy to understand and follow, both for the person who wrote it and for others who might need to implement or modify it later. Clear variable names, comments, and well-structured code are essential. A well-documented algorithm is much more useful.
• Simplicity: An algorithm should be as simple as possible while still being effective. Avoid
unnecessary complexity. Simpler algorithms are often easier to understand, implement, and
debug.
• Modularity: Breaking down a complex algorithm into smaller, self-contained modules (functions
or subroutines) can improve readability, maintainability, and reusability. This makes the
algorithm easier to understand and modify.
• Maintainability: An algorithm should be easy to modify or update if needed (e.g., to fix bugs,
improve performance, or adapt to new requirements). This is closely related to readability and
modularity.
• Answer: Big O notation is crucial in algorithm analysis because it provides a standardized way to describe how an algorithm's runtime scales with the size of its input. It gives a worst-case estimate of running time, which lets developers compare different algorithms and choose the most efficient one for a given problem, even as the input size grows significantly.
• Upper bound:
It represents the upper limit on an algorithm's execution time, signifying the worst possible scenario for
a given input size.
• Scalability analysis:
By looking at the Big O notation, developers can understand how an algorithm's performance will change
as the input size increases, which is critical for large datasets.
• Comparison tool:
Big O notation allows for direct comparison between different algorithms, even if implemented
differently, by focusing on the dominant growth rate of their time complexity.
Example:
• O(n):
An algorithm with a time complexity of O(n) means the execution time grows linearly with the input size
(n). This is considered relatively efficient for most applications.
• O(log n):
An algorithm with a time complexity of O(log n) is considered very efficient as the execution time
increases much slower than the input size, typically seen in algorithms like binary search.
• O(n²):
An algorithm with a time complexity of O(n²) is less efficient as the execution time increases
quadratically with the input size, making it less suitable for large datasets.
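Two tiny Python sketches that make the contrast concrete (illustrative code, not from the notes): linear search inspects up to n elements, while binary search on sorted data halves the range each step.

def linear_search(arr, key):               # O(n): may inspect every element
    for i, x in enumerate(arr):
        if x == key:
            return i
    return -1

def binary_search(arr, key):               # O(log n): halves the search range each step
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if arr[mid] == key:
            return mid
        if arr[mid] < key:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

data = list(range(1, 1001))                # sorted input
print(linear_search(data, 737), binary_search(data, 737))   # 736 736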
(a) Explain Merge Sort with an example and its time complexity.
Merge Sort is a popular sorting algorithm that follows the divide and conquer paradigm. It recursively
breaks down the input list into smaller sublists until each sublist contains only one element (which is
considered sorted). Then, it repeatedly merges these sorted sublists back together until you have a
single sorted list.
1. Divide:
o Split the unsorted list into two halves.
o Recursively repeat this process for each half until you have sublists of size 1.
2. Conquer:
o A sublist of size 1 is already sorted.
3. Combine:
o Repeatedly merge pairs of sorted sublists into larger sorted sublists until only one remains.
Example:
Let's sort the list [5, 2, 4, 6, 1, 3].
1. Divide:
o [5] and [2] and [4] and [6] and [1] and [3]
2. Conquer:
o Merge pairs of single-element lists into sorted pairs, then merge those pairs into larger sorted sublists.
3. Combine:
o [1, 2, 3, 4, 5, 6]
Time Complexity:
Merge Sort has a time complexity of O(n log n) in all cases (worst, average, and best). This is because:
• Divide: Repeatedly dividing the list in half gives log n levels of recursion.
• Combine: Merging all the sublists at each level takes linear time, O(n).
Since there are log n levels and each level costs O(n), the overall time complexity is O(n log n).
Key Points:
• Stable: It preserves the relative order of equal elements.
• Well-suited for linked lists: It can be efficiently implemented for linked lists.
• Not in-place: It requires additional memory to store the sublists during the merging process.
In summary, Merge Sort is a reliable and efficient sorting algorithm with a time complexity of O(n log n).
It's a good choice for sorting large datasets when stability is important.
(b) Explain Quick Sort with an example and its time complexity.
Quick Sort is a divide-and-conquer sorting algorithm that works by partitioning the array around a pivot element.
1. Choose a Pivot: Select an element from the array to be the pivot. There are various strategies for choosing a pivot (e.g., first element, last element, random element, median of three).
2. Partition: Rearrange the array so that all elements less than the pivot are placed before it, and all
elements greater than the pivot are placed after it. Elements equal to the pivot can go either
way. The pivot is now in its final sorted position.
3. Recurse: Recursively apply steps 1 and 2 to the sub-arrays created in the partitioning step.
Example:
1. Choose a Pivot: Suppose the pivot chosen is 5.
2. Partition:
o Initialize two pointers, left and right, at the beginning and end of the array (excluding the
pivot).
o Move the left pointer to the right until you find an element greater than or equal to the
pivot.
o Move the right pointer to the left until you find an element less than the pivot.
o If left and right haven't crossed, swap the elements at left and right.
o Swap the pivot with the element at the left pointer (which is now where the pivot
belongs).
After partitioning, the array becomes: [4, 1, 2, 3, 5, 9, 8] (5 is now in its sorted position).
3. Recurse:
The recursive calls will continue partitioning and sorting the sub-arrays until the entire array is sorted: [1,
2, 3, 4, 5, 8, 9].
Time Complexity:
• Best Case: O(n log n). This occurs when the pivot selection consistently divides the array into
roughly equal halves.
• Average Case: O(n log n). On average, with a good pivot selection strategy, Quick Sort performs
very well.
• Worst Case: O(n^2). This occurs when the pivot selection is consistently poor (e.g., always
choosing the smallest or largest element). In this scenario, one sub-array is always empty, and
the recursion depth becomes n.
Key Points:
• In-place: Requires minimal extra memory (although the recursive calls use some stack space).
Avoiding the worst case:
• Random pivot selection: Choosing a random pivot makes it less likely to encounter the worst-case scenario.
• Median-of-three pivot selection: Choosing the median of the first, middle, and last elements as the pivot often provides a better pivot.
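A compact Python sketch of quicksort for reference (my own illustration; it uses Lomuto partitioning with the last element as pivot, a simpler variant of the two-pointer scheme described above):

def quick_sort(arr):
    def partition(lo, hi):
        pivot = arr[hi]                    # last element as pivot (Lomuto scheme)
        i = lo - 1
        for j in range(lo, hi):
            if arr[j] <= pivot:
                i += 1
                arr[i], arr[j] = arr[j], arr[i]
        arr[i + 1], arr[hi] = arr[hi], arr[i + 1]
        return i + 1                       # pivot's final position
    def sort(lo, hi):
        if lo < hi:
            p = partition(lo, hi)
            sort(lo, p - 1)                # recurse on the two sub-arrays
            sort(p + 1, hi)
    sort(0, len(arr) - 1)
    return arr

print(quick_sort([4, 1, 2, 3, 5, 9, 8]))   # [1, 2, 3, 4, 5, 8, 9]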
(c) Explain Heap Sort with an example and its time complexity.
Answer: Heap Sort is a comparison-based sorting algorithm that uses a binary heap data
structure. It's an in-place algorithm, meaning it doesn't require significant extra memory. It
works by first building a max-heap (or min-heap) from the input array and then repeatedly
removing the root (maximum or minimum element) and placing it at the end of the array.
1. Build a Heap:
o Transform the input array into a max-heap (or min-heap). In a max-heap, the value of
each node is greater than or equal to the value of its children.
2. Sort:
o Repeatedly remove the root (which is the maximum element in a max-heap) from the
heap and place it at the end of the array.
o After each removal, heapify the remaining heap to maintain the heap property.
1. Build a Max-Heap:
o The array is transformed into a max-heap. The exact steps are a bit involved, but the
general idea is to start from the middle of the array and work your way up, "sifting
down" elements that are smaller than their children. After building the heap, the array
might conceptually be represented like this (though the actual array is still just [10, 5, 4,
3, 1]):
o        10
o       /  \
o      5    4
o     /  \
o    3    1
2. Sort:
o Step 1: Swap the root (10) with the last element (1). The array becomes [1, 5, 4, 3, 10].
o Step 2: Reduce the heap size by 1 (conceptually removing the 10). Heapify the remaining
part of the array (the first four elements) to restore the max-heap property. The array
becomes [5, 3, 4, 1, 10].
o        5
o       / \
o      3   4
o     /
o    1
o Step 3: Swap the new root (5) with the second-to-last element (1). The array becomes
[1, 3, 4, 5, 10].
o Step 4: Reduce the heap size and heapify. The array becomes [4, 3, 1, 5, 10].
o 4
o / \
o 3 1
o Continue this process until the entire array is sorted: [1, 3, 4, 5, 10].
Time Complexity:
• Building the heap: Constructing the initial max-heap from the array takes O(n) time.
• Sorting: Removing the root and heapifying takes O(log n) time for each element. Since we do this n times, this part takes O(n log n) time.
Therefore, the overall time complexity of Heap Sort is O(n log n) in all cases (worst, average, and
best).
• Not stable: Doesn't necessarily preserve the relative order of equal elements.
• Can be slower than other O(n log n) algorithms: While the time complexity is the same, the
constant factors can make it slightly slower than algorithms like Merge Sort in practice, especially
on some datasets. However, its in-place nature often makes it preferable when memory is
limited.
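Putting the two phases together, here is a short Python sketch of heap sort (illustrative; the sift-down is the iterative form of the MaxHeapify routine shown earlier in Section C):

def heap_sort(arr):
    n = len(arr)
    def sift_down(i, size):                # iterative MaxHeapify
        while True:
            largest, l, r = i, 2 * i + 1, 2 * i + 2
            if l < size and arr[l] > arr[largest]:
                largest = l
            if r < size and arr[r] > arr[largest]:
                largest = r
            if largest == i:
                return
            arr[i], arr[largest] = arr[largest], arr[i]
            i = largest
    for i in range(n // 2 - 1, -1, -1):    # phase 1: build the max-heap, O(n)
        sift_down(i, n)
    for end in range(n - 1, 0, -1):        # phase 2: move the max to the end, re-heapify
        arr[0], arr[end] = arr[end], arr[0]
        sift_down(0, end)
    return arr

print(heap_sort([10, 5, 4, 3, 1]))         # [1, 3, 4, 5, 10]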
• Answer: A Red-Black Tree is a self-balancing binary search tree with the following properties:
1. Every node is either red or black.
2. The root node is black.
3. Red nodes cannot have red children (no two consecutive red nodes).
4. Every path from a node to its descendant NULL nodes must have the same number of black nodes.
5. Insertions and deletions use recoloring and rotations so that the tree remains approximately balanced.
• Answer:
B-tree Explained
A B-tree is a self-balancing tree data structure that maintains sorted data and allows searches, sequential
access, insertions, and deletions in logarithmic time.
It's a generalization of a binary search tree, allowing nodes to have more than two children. This makes it
particularly efficient for working with large amounts of data, especially when data is stored on disk.
• Balanced: All leaf nodes are at the same level, ensuring consistent search times.
• Ordered: Keys within each node are stored in sorted order, facilitating efficient searching.
• Multiple keys per node: Each node can hold multiple keys and have multiple children, reducing
the tree's height and the number of disk accesses required for operations.
How it works:
1. Structure:
o The root node has at least two children (unless it's a leaf).
2. Operations:
o Search: Similar to a binary search, but each node is examined to find the key or
determine the appropriate child node to explore further.
o Insertion: New keys are inserted into leaf nodes. If a leaf node becomes full, it's split,
and the middle key is moved up to the parent node. This process may propagate up the
tree.
o Deletion: Keys are deleted from leaf nodes. If a node becomes too empty, it may be
merged with a sibling or borrow keys from a sibling to maintain the minimum number of
children.
Applications:
B-trees are widely used in databases and file systems, where keeping the tree shallow minimizes the number of disk accesses.
B+ tree Explained
A B+ tree is a variation of the B-tree with some key differences that make it even more suitable for
certain applications, especially databases.
1. Data storage:
o B-tree: Stores keys and data in both internal nodes and leaf nodes.
o B+ tree: Stores data only in leaf nodes. Internal nodes only store keys to guide the
search.
2. Leaf node links:
o B-tree: Leaf nodes are not linked to one another.
o B+ tree: Leaf nodes are linked together in a sequential order, forming a linked list. This allows for efficient range queries (retrieving all data within a certain range).
3. Redundancy:
o B-tree: Each key is stored only once in the tree.
o B+ tree: Keys are duplicated in the leaf nodes. This redundancy simplifies certain operations and makes sequential access more efficient.
Advantages of B+ trees:
• Efficient range queries: The linked list of leaf nodes in a B+ tree makes it very efficient to retrieve
data within a specific range.
• Simplified insertion and deletion: Since data is only stored in leaf nodes, insertion and deletion
operations are generally simpler in B+ trees.
• Better sequential access: The linked list of leaf nodes allows for easy sequential access to all data
in sorted order.
Applications:
B+ trees are the preferred choice for most database systems due to their efficiency in handling range
queries and sequential access.
In summary:
• B-trees are a general-purpose balanced tree data structure that stores keys and data in all
nodes.
• B+ trees are a specialized variation where data is only stored in leaf nodes, which are linked
together. This makes B+ trees more efficient for range queries and sequential access, making
them ideal for database systems.
Answer: Performance measures for parallel algorithms help us understand how well a parallel
algorithm utilizes resources and scales with increasing problem size and processor count. Here are
some key measures:
1. Speedup:
Definition: The ratio of the execution time of the best sequential algorithm for a problem to the
execution time of the parallel algorithm.
Ideal Speedup: Ideally, with p processors, we'd expect a speedup of p (linear speedup). In reality,
speedup is often less due to overheads.
Superlinear Speedup: Occasionally, speedup can be greater than p. This can happen due to cache
effects, or if the parallel algorithm explores the search space more efficiently than the sequential
one. It's often a sign that the sequential algorithm wasn't the absolute best.
2. Efficiency:
Definition: The ratio of the speedup to the number of processors used (E = S / p); it measures how effectively the processors are utilized.
Practical Efficiency: Efficiency is usually less than 1 due to communication overheads, idle processors, and other factors.
3. Scalability:
Definition: How well the parallel algorithm performs as the problem size or the number of
processors increases. There are two main types:
Strong Scaling: How the execution time changes when the number of processors increases
while keeping the problem size constant. Ideally, execution time should decrease linearly
with the number of processors.
Weak Scaling: How the execution time changes when both the number of processors and
the problem size increase proportionally. Ideally, the execution time should remain constant.
Metrics: Scalability is often measured by how efficiency changes as the number of processors or
problem size changes.
4. Overhead:
Definition: The extra time spent by the parallel algorithm that is not spent by the sequential
algorithm. This includes communication, synchronization, and idle time.
5. Cost:
Definition: The product of the execution time and the number of processors used.
6. Amdahl's Law:
Definition: A law that states the maximum speedup achievable by parallelizing a program is limited
by the portion of the program that cannot be parallelized (the sequential portion).
Implication: Even with an infinite number of processors, the speedup is limited by 1/(1-f), where 'f' is
the fraction of the program that can be parallelized.
7. Gustafson's Law:
Definition: A law that states that the problem size that can be solved in a fixed amount of time grows
linearly with the number of processors.
Implication: Focuses on how increasing the number of processors allows us to solve larger problems,
rather than just reducing the execution time of a fixed-size problem.
8. Communication Overhead:
Definition: The time spent by processors communicating with each other. This is a major source of
overhead in parallel algorithms.
Factors: Communication overhead depends on the communication network, the amount of data
being communicated, and the communication patterns.
9. Synchronization Overhead:
Definition: The time spent by processors waiting for each other to reach a certain point in the
computation. This can occur due to barriers, locks, or other synchronization mechanisms.
10. Load Balancing:
Definition: When some processors have more work to do than others. This can lead to idle
processors and reduce overall efficiency.
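Tiny helper functions (with hypothetical timings) that tie speedup, efficiency, and Amdahl's Law together:

def speedup(t_sequential, t_parallel):
    return t_sequential / t_parallel

def efficiency(t_sequential, t_parallel, p):
    return speedup(t_sequential, t_parallel) / p

def amdahl_speedup(f, p):
    # maximum speedup when a fraction f of the work is parallelizable over p processors
    return 1.0 / ((1 - f) + f / p)

print(speedup(100.0, 30.0))         # ~3.33 with, say, 4 processors
print(efficiency(100.0, 30.0, 4))   # ~0.83
print(amdahl_speedup(0.9, 4))       # ~3.08; the limit as p grows is 1 / (1 - 0.9) = 10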
Answer: Parallel searching is a technique that leverages multiple processors or cores to speed up
the process of finding a specific element within a data set. Instead of sequentially examining
each element, the data is divided among the available processors, and each processor searches a
portion of the data concurrently. This can significantly reduce the overall search time, especially
for large datasets.
How it works:
1. Partition: Divide the data set into roughly equal parts.
2. Assign: Give each partition to a different processor.
3. Search: All processors perform the search operation simultaneously on their assigned partitions.
4. Combine: If any processor finds the target element, it reports the result. Otherwise, the search
continues until all partitions have been examined.
CREW (Concurrent Read Exclusive Write) is a parallel computing model where multiple processors can
read from the same memory location simultaneously (Concurrent Read), but only one processor can
write to a specific memory location at any given time (Exclusive Write).
1. Data Distribution: The data set is stored in the shared memory, accessible by all processors.
2. Partitioning: The data is divided into equal-sized partitions, with each partition assigned to a
different processor.
3. Search: Each processor performs a local search on its assigned partition. This can be a linear
search or any other suitable search algorithm.
4. Result Reporting: A shared "result" variable is used to store the outcome of the search. If a
processor finds the target element, it writes its index or position to the "result" variable. Since
CREW allows concurrent reads, all processors can monitor the "result" variable to check if the
element has been found.
5. Termination: The search terminates when either the element is found (i.e., the "result" variable
is updated) or all processors have completed their search without finding the element.
Example:
Let's say we have an array of 10 elements and 5 processors. We want to search for the element '7'.
1. The array is divided into 5 partitions of 2 elements each.
2. Each partition is assigned to one processor.
3. All 5 processors scan their partitions concurrently (concurrent reads are allowed).
4. If any processor finds '7', it writes its index to the shared "result" variable.
5. The search terminates when '7' is found or all partitions are searched.
Advantages:
• Faster search time: By dividing the work among multiple processors, the search time can be
significantly reduced.
• Scalability: Parallel searching can be easily scaled by adding more processors to handle larger
datasets.
Considerations:
• Overhead: There is some overhead associated with dividing the data, assigning partitions, and
coordinating the processors.
• Data structure: The data structure should be suitable for parallel access.
• Algorithm: The search algorithm used by each processor should be efficient for the given data
and partition size.
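A rough simulation of CREW searching in Python using a thread pool (the names crew_search and worker are invented for this sketch, and the concurrency here only models the idea): every worker may read the shared list, and the single combining step at the end stands in for the exclusive write.

from concurrent.futures import ThreadPoolExecutor

def crew_search(data, key, p=4):
    chunk = max(1, -(-len(data) // p))     # ceil(len/p) elements per partition
    def worker(start):                     # each worker reads the shared list (concurrent read)
        for i in range(start, min(start + chunk, len(data))):
            if data[i] == key:
                return i
        return -1
    with ThreadPoolExecutor(max_workers=p) as pool:
        hits = list(pool.map(worker, range(0, len(data), chunk)))
    found = [i for i in hits if i != -1]   # one combining step plays the exclusive-write role
    return min(found) if found else -1

print(crew_search([14, 33, 21, 45, 67, 20, 40, 59, 12, 36], 59))   # 7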
Answer: A greedy algorithm is a simple, intuitive approach to problem-solving where you make the
best choice at each step, based on the information currently available, without regard for the overall,
long-term consequences. It's like always taking the "biggest" or "best" option you see right now,
hoping that by doing so repeatedly, you'll end up with the best overall solution.
Locally Optimal Choices: They make locally optimal choices at each step, meaning they pick the best
option available at that moment.
No Backtracking: Once a choice is made, it's never reconsidered or undone. Greedy algorithms don't
look back.
Not Always Optimal: The biggest drawback is that they don't guarantee finding the globally optimal
solution. They can get stuck in "local optima," which are good solutions but not the absolute best.
Imagine you're a cashier, and you need to give someone a certain amount of change using the
fewest number of coins possible. Let's say you have coins of denominations 1, 5, 10, and 25, and you
need to give 63 cents in change.
Start with the largest denomination: Choose as many 25-cent coins as possible without exceeding
the total amount. You can use two 25-cent coins (50 cents total).
Move to the next largest: Now you have 13 cents remaining. Choose as many 10-cent coins as
possible. You can use one 10-cent coin (60 cents total).
Continue: You have 3 cents remaining. Choose as many 1-cent coins as possible. You'll need three 1-
cent coins.
So, the greedy algorithm would give you two 25-cent coins, one 10-cent coin, and three 1-cent coins,
for a total of six coins.
In this particular case, the greedy approach does happen to find the optimal solution (the fewest
number of coins).
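The cashier example as a few lines of Python (a sketch; greedy change-making is only optimal for "canonical" coin systems like this one):

def greedy_change(amount, coins=(25, 10, 5, 1)):
    change = {}
    for c in coins:                        # coins must be listed in descending order
        change[c], amount = divmod(amount, c)
    return change

print(greedy_change(63))                   # {25: 2, 10: 1, 5: 0, 1: 3} -> six coins in total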
Suppose you have a knapsack with a weight limit, and you have several items, each with a weight
and a value. You can take fractions of items. The goal is to maximize the total value of the items you
put in the knapsack.
Calculate value-to-weight ratio: For each item, divide its value by its weight. This tells you how much
"value" you get per unit of weight.
Sort by ratio: Sort the items in descending order of their value-to-weight ratio.
Fill the knapsack: Start adding items to the knapsack, starting with the item with the highest ratio. If
an item doesn't fit completely, take a fraction of it to fill the remaining space.
In this case, the greedy algorithm guarantees finding the optimal solution.
When is a greedy approach useful?
Optimization problems: When you're trying to find the best solution, even if it's not guaranteed to
be perfect.
Approximation algorithms: When finding the absolute best solution is too computationally
expensive, a greedy approach can give you a "good enough" solution quickly.
Heuristics: Greedy algorithms can be used as heuristics (rules of thumb) to guide the search for a
solution.
Simple problems: When the problem has a simple structure, and the locally optimal choice is likely
to be globally optimal.
Typical problems where a greedy strategy is applied (the hand-drawn graphs and images from the original notes are summarized here):
1. Dijkstra's shortest paths: Starting from the source, repeatedly pick the unvisited node with the smallest known distance and relax the edges leaving it. In the original worked example, the shortest paths found were A to B = 4, A to C = 2, and A to D = 9.
2. Prim's minimum spanning tree: Grow the tree from one node by always adding the cheapest edge that connects a new vertex. The original example selected the edges A-C (5), B-C (4), and C-D (7) for a total cost of 16.
3. Kruskal's minimum spanning tree: Sort all edges by cost and add them one at a time, skipping any edge that would create a cycle.
4. Travelling Salesman (nearest-neighbour heuristic): From the current city, always move to the closest unvisited city. The original example produced the route A-B-D-C-A with a total distance of 85.
5. Knapsack Problem: The 0/1 version is not solved optimally by a greedy choice, but the fractional version is, by sorting items by value-to-weight ratio and filling the knapsack greedily.
Limitations:
Local Optima: The biggest limitation is that greedy algorithms can get stuck in local optima. Imagine
you're climbing a mountain, and you always take the steepest path upwards. You might reach a peak,
but it might not be the highest peak on the mountain.
• Answer: Dynamic Programming (DP) is used for optimization problems by solving overlapping
subproblems and storing results to avoid redundant calculations.
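A one-screen illustration of that idea, using memoization via Python's functools.lru_cache (the Fibonacci example is mine, not from the notes):

from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):                                # each subproblem is solved once and cached
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(40))                             # 102334155, in linear rather than exponential time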
6. Graph Algorithms:
Depth-First Search (DFS) is an algorithm for traversing or searching tree or graph data structures. It starts
at the root (or some arbitrary node) and explores as far as possible along each branch before
backtracking. Think of it like exploring a maze by choosing one path and going as deep as you can until
you hit a dead end, then going back and trying another path.
How it works:
1. Start: Begin at the root (or any chosen node) and mark it as visited.
2. Go deep: Move to one unvisited neighbour and repeat.
3. Explore adjacent nodes: For each unvisited node adjacent to the current node, apply DFS recursively; when no unvisited neighbours remain, backtrack.
Example:
    A
   / \
  B   C
 / \   \
D   E   F
Breadth-First Search (BFS) is a graph traversal algorithm that explores a graph level by level. It starts at a
given node and visits all of its neighbors before moving on to the neighbors of those neighbors, and so
on. Think of it like exploring a building floor by floor, rather than going deep into one room and then
another.
How it works:
1. Start at the source node: mark it as visited and enqueue it.
2. Use a queue: Maintain a queue (a First-In, First-Out data structure) to keep track of the nodes to visit.
3. Repeatedly dequeue a node, visit it, and enqueue its unvisited neighbours until the queue is empty.
Example:
    A
   / \
  B   C
 / \   \
D   E   F
1. Start: Visit A and enqueue it. Queue: [A].
2. Dequeue A: Enqueue its neighbours B and C. Queue: [B, C].
3. Dequeue B: Enqueue D and E. Queue: [C, D, E].
4. Dequeue C: Enqueue F. Queue: [D, E, F].
5. Dequeue D: No unvisited neighbours remain; continuing with E and F gives the BFS order A, B, C, D, E, F.
7. NP-Complete Problems:
Answer: The complexity classes P, NP, NP-complete, and NP-hard are described below, with examples:
1. P (Polynomial Time):
• Definition: Problems that can be solved by an algorithm whose running time is bounded by a polynomial in the input size.
• Example:
o Sorting: Sorting an array of numbers (e.g., merge sort - O(n log n)).
o Adding two numbers: Even if the numbers are huge, the number of operations grows
only linearly with the number of digits.
2. NP (Nondeterministic Polynomial Time):
• Definition: Problems whose solutions can be verified in polynomial time. This means that if someone gives you a potential solution, you can quickly (in polynomial time) check if it's correct. Finding the solution itself might be hard, but checking a proposed solution is easy.
• Example:
o The Traveling Salesperson Problem (TSP): Given a list of cities and the distances
between them, find the shortest route that visits each city exactly once and returns to
the starting city. If someone gives you a proposed route, you can easily check if it's valid
(visits each city once) and calculate its total distance.
o The Subset Sum Problem: Given a set of integers, is there a subset whose elements sum
to a given target value? If someone gives you a subset, you can quickly add up the
numbers and see if it equals the target.
o Graph Coloring: Can the vertices of a graph be colored with a given number of colors
such that no two adjacent vertices have the same color? If someone gives you a
coloring, you can quickly check if any adjacent vertices have the same color.
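The "easy to verify" part can be shown in a couple of lines of Python (an illustration using the Subset Sum instance from Section C earlier in these notes):

def verify_subset_sum(numbers, target, candidate_indices):
    # checking a proposed certificate is O(n), even though finding one may be hard
    return sum(numbers[i] for i in candidate_indices) == target

print(verify_subset_sum([5, 10, 12, 13, 15, 18], 30, [2, 5]))   # 12 + 18 = 30 -> True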
3. NP-complete:
• Definition: Problems that are both in NP and NP-hard. They are the "hardest" problems in NP. If
you could solve any NP-complete problem in polynomial time, you could solve all problems in NP
in polynomial time (meaning P = NP). No one has ever proven that P = NP, and most computer
scientists believe they are not equal.
• Key Idea: NP-complete problems are all "inter-reducible." If you can transform one NP-complete
problem into another NP-complete problem in polynomial time, you can solve one if and only if
you can solve the other.
• Examples:
o The Clique Problem: In a graph, is there a complete subgraph (a clique) of a given size?
o The Vertex Cover Problem: In a graph, is there a set of vertices such that every edge has
at least one endpoint in this set?
o The Knapsack Problem (Decision version): Given a set of items, each with a weight and
a value, and a maximum weight capacity, can you select a subset of items whose total
value is at least a given target value?
4. NP-hard:
• Definition: Problems that are at least as hard as the hardest problems in NP. This means that all
problems in NP can be reduced to them in polynomial time. However, NP-hard problems don't
have to be in NP. In fact, some NP-hard problems are known to be outside of NP.
• Key Idea: If you could solve an NP-hard problem in polynomial time, you could solve all problems
in NP in polynomial time.
• Examples:
o The Halting Problem: Given a computer program and an input, will the program
eventually halt (stop running), or will it run forever? This problem is undecidable
(outside of NP). It's NP-hard because if you could solve it, you could solve any problem in
NP.
o Unsolvable Problems: Problems for which no algorithm can ever be constructed to solve
them. These are also NP-hard.
In summary:
• P: Problems that can be solved quickly (in polynomial time).
• NP: Problems whose proposed solutions can be checked quickly.
• NP-complete: The hardest problems in NP. If you can solve one quickly, you can solve them all quickly (and P = NP).
• NP-hard: At least as hard as NP-complete problems, but they don't have to be in NP. Some are even unsolvable.
Since finding exact solutions to NP-complete problems is believed to be very difficult, here are some
common approaches:
Approximation algorithms: Find a solution that is "good enough," even if it's not the absolute best.
Heuristics: Use rules of thumb or educated guesses to find a solution that is likely to be good.
Special cases: If the problem has some special constraints, it might become solvable in polynomial time.
Randomized algorithms: Use randomness to find a solution that is correct with high probability.