Advanced Algorithms

SECTION A

1. Attempt all questions in brief.


(2 marks each)

(a) What is asymptotic notation? Explain Big Oh notation.


Asymptotic notation describes the efficiency of an algorithm as a function of its input size, focusing on behavior as the input grows large. It lets us classify algorithms by their growth rate rather than by machine-dependent timings.
Big Oh (O-notation) gives an upper bound on an algorithm's time complexity: it bounds the worst-case growth rate, i.e. the maximum time the algorithm can take on an input of a given size.
For example, O(n) implies that as the input size increases, the algorithm’s execution time increases
linearly.

(b) What is recurrence relation? How is a recurrence solved using the master’s theorem?
A recurrence relation describes the time complexity of a recursive algorithm in terms of smaller sub-
problems. For example, T(n) = 2T(n/2) + n² describes an algorithm that splits the problem into two sub-
problems of size n/2 and combines them with O(n²) effort.
The Master Theorem solves recurrences of the form T(n) = aT(n/b) + f(n). It compares the growth rates
of n^log_b(a) and f(n) to determine the overall time complexity. Depending on whether f(n) is larger or
smaller than n^log_b(a), we can directly conclude the time complexity.

(c) Write down the properties of Red-Black Tree.

• Binary Search Tree (BST): Follows the properties of a binary search tree.

• Colors: Each node is either red or black.

• Root is Black: The root node is always black.

• Red Node Rule: No two red nodes can be adjacent (a red node cannot have a red child).

• Leaf Nodes: Every leaf (NULL node) is black.

• Black Height: The path from any node to its leaf nodes must contain the same number of black
nodes.

• Balanced: Red-black trees ensure balanced height, ensuring logarithmic time complexity for
search operations.

(d) Write down design issues of parallel algorithms.


Designing parallel algorithms involves several challenges:

1. Decomposition: Efficiently breaking down the problem into smaller tasks that can be executed in
parallel.

2. Communication: Minimizing data sharing between processors, as excessive communication can reduce parallel efficiency.

3. Synchronization: Ensuring that parallel tasks are executed in the correct order without conflicts
or data corruption.

4. Load Balancing: Ensuring all processors are used effectively, avoiding idle times.

5. Scalability: Ensuring the algorithm performs well as the number of processors increases.

6. Fault Tolerance: The algorithm must handle processor or task failures without crashing the
system.

(e) (i) Discuss Greedy approach with a suitable example.


A Greedy approach makes the locally optimal choice at each stage with the hope of finding the global
optimum. The idea is to choose the best option available at each step and not revisit the decision.
Example: Coin Change Problem — Given coins of denominations {1, 5, 10, 25}, if you need to make
change for 30 units, the greedy approach will select 25 first, then 5, and reach the total. This is optimal
for this specific case but may not always work for all problems.
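
A minimal Python sketch of this greedy choice (the function name and the default denominations are illustrative, not part of the original question):

def make_change_greedy(amount, denominations=(25, 10, 5, 1)):
    # Greedy choice: always take the largest coin that still fits.
    coins = []
    for coin in sorted(denominations, reverse=True):
        while amount >= coin:
            amount -= coin
            coins.append(coin)
    return coins

print(make_change_greedy(30))   # -> [25, 5], as in the example above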

(ii) Differentiate Backtracking algorithm with Branch and Bound algorithm.

• Backtracking explores all possible solutions by trying one solution at a time and discarding non-
promising paths as soon as a dead end is reached. It’s typically used for problems like the N-
Queens problem or Sudoku.

• Branch and Bound also explores possible solutions but uses a systematic approach to prune
branches that cannot lead to an optimal solution. It’s often used for optimization problems, like
the Traveling Salesman Problem, where the goal is to find the best solution.

(g) Write down the characteristics of B-tree.

• Balanced: All leaf nodes are at the same level, ensuring logarithmic height.

• Self-balancing: Rebalances itself after insertions and deletions to maintain efficient search times.

• Multi-way: Each node can have more than two children, making it ideal for databases where
large blocks of data are read and written.

• Sorted: Data is kept in sorted order within the tree, allowing for fast search and retrieval
operations.

• Efficient: Due to its structure, B-trees provide efficient search, insert, and delete operations in
O(log n) time, even for large datasets.

SECTION B

2. Attempt any three of the following:


(7 marks each)

(a) Solve the following recurrence relation:

• (i) T(n) = 47T(n/2) + n²


To solve using the Master Theorem:

o Here, a = 47, b = 2, and f(n) = n².

o We need to compare n^log_b(a) with f(n).

o n^log_2(47) ≈ n^5.55.

o Since f(n) = n² grows more slowly than n^5.55, Case 1 of the Master Theorem applies and the n^log_b(a) term dominates.

o Therefore, T(n) = Θ(n^log_2 47) ≈ O(n^5.55).

• (ii) T(n) = 2T(√n) + log n


This recurrence does not directly match the Master Theorem form, so we use a change of variable. Let m = log n, so that n = 2^m and √n = 2^(m/2); then T(2^m) = 2T(2^(m/2)) + m. Writing S(m) = T(2^m) gives S(m) = 2S(m/2) + m, which by Case 2 of the Master Theorem is S(m) = Θ(m log m). Substituting back m = log n, the time complexity is T(n) = Θ(log n · log log n).

(b) Sort the following list using merge sort:

List: [14, 33, 21, 45, 67, 20, 40, 59, 12, 36]

• Merge Sort divides the list into halves, recursively sorts them, and then merges them.

o Step 1: Divide [14, 33, 21, 45, 67, 20, 40, 59, 12, 36] into two halves: [14, 33, 21, 45, 67]
and [20, 40, 59, 12, 36].

o Step 2: Keep dividing each sub-array until we have individual elements.

o Step 3: Merge sorted sub-arrays back together.

Sorted List: [12, 14, 20, 21, 33, 36, 40, 45, 59, 67]

• Time Complexity: O(n log n) due to the repeated splitting and merging steps.

OR

Merge sort is a divide-and-conquer algorithm that sorts a list by repeatedly dividing it into smaller
sublists, sorting each sublist, and then merging the sorted sublists back together.

Steps:

1. Divide: Divide the list into two halves.

2. Conquer: Recursively sort each half using merge sort.

3. Combine: Merge the two sorted halves back together to form a single sorted list.

Example:

Let's sort the list: [14, 33, 21, 45, 67, 20, 40, 59, 12, 36]

1. Divide:

o Divide the list into two halves: [14, 33, 21, 45, 67] and [20, 40, 59, 12, 36]

o Recursively divide each half until you have sublists of size 1.

2. Conquer:

o Sort each sublist of size 1 (they are already sorted).

3. Combine:

o Merge the sorted sublists back together:

▪ [14, 33, 21, 45, 67] becomes [14, 21, 33, 45, 67]

▪ [20, 40, 59, 12, 36] becomes [12, 20, 36, 40, 59]

▪ Finally, merge [14, 21, 33, 45, 67] and [12, 20, 36, 40, 59] to get the sorted list:
[12, 14, 20, 21, 33, 36, 40, 45, 59, 67]

Time Complexity:

Merge sort has a time complexity of O(n log n), which means it is efficient even for large lists.

Space Complexity:

Merge sort has a space complexity of O(n), which means it requires extra memory to store the sublists
during the merging process.
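
A compact Python sketch of the divide/conquer/combine steps described above (function names are illustrative):

def merge(left, right):
    # Combine two already-sorted lists into one sorted list.
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]

def merge_sort(arr):
    if len(arr) <= 1:               # base case: already sorted
        return arr
    mid = len(arr) // 2
    return merge(merge_sort(arr[:mid]), merge_sort(arr[mid:]))

print(merge_sort([14, 33, 21, 45, 67, 20, 40, 59, 12, 36]))
# -> [12, 14, 20, 21, 33, 36, 40, 45, 59, 67]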

(c) Discuss the various cases for insertion of a key in red-black tree:

Inserting a key into a Red-Black Tree involves maintaining the tree’s properties, and there are a few
important cases to consider:

1. Case 1: The inserted node is the root, which must be black.

2. Case 2: If the parent node is black, the tree is still valid and no further action is needed.

3. Case 3: If the parent node is red, this violates the Red-Black property (two consecutive red
nodes). In this case:

o Recoloring: Change the parent and uncle to black and the grandparent to red.

o Rotation: If recoloring doesn’t solve the issue, perform rotations (left or right) to
rebalance the tree and restore the properties.

OR

It's important to remember the red-black tree properties:

1. Every node is either red or black.

2. The root is always black.

3. Every NIL node (leaf) is considered black.

4. If a node is red, both its children must be black.

5. Every path from a given node to any of its descendant NIL nodes contains the same number of
black nodes (black-height).

Insertion in a red-black tree starts like a regular binary search tree insertion. The new node is always
inserted as a red node. This might violate the red-black tree properties, so we need to rebalance the
tree. The rebalancing involves rotations and recoloring.

Here's a breakdown of the cases, categorized by the "uncle" of the newly inserted node (z):

Case 1: Uncle is Red

• Scenario: The new node (z)'s parent (p) and uncle (u) are both red.

• Solution:

1. Recolor: Change the colors of p, u, and z's grandparent (g) to their opposites (red
becomes black, black becomes red).

2. Move up: Treat g as the new z and repeat the process from Case 1 or Case 2/3, if
necessary.

• Example:

Before recoloring:                After recoloring:

        G (Black)                      G (Red)
       /        \                     /       \
   P (Red)    U (Red)            P (Black)   U (Black)
     /                              /
  Z (Red)                        Z (Red)

Case 2: Uncle is Black (or NIL) and z is the Right Child



• Scenario: The new node (z)'s parent (p) is red, the uncle (u) is black (or NIL), and z is the right
child of p.

• Solution:

1. Left Rotation: Perform a left rotation on p. This makes z the parent and p the left child of
z.

2. Now, we've transformed this into Case 3.

Case 3: Uncle is Black (or NIL) and z is the Left Child

• Scenario: The new node (z)'s parent (p) is red, the uncle (u) is black (or NIL), and z is the left child
of p.

• Solution:

1. Recolor: Swap the colors of p and z's grandparent (g).

2. Right Rotation: Perform a right rotation on g.

Cases 2 and 3 Illustrated:

Case 2 (z is right child)       Case 3 (z is left child)        After recolor + right rotation on G

       G (Black)                      G (Black)                        P (Black)
      /        \                     /        \                       /        \
  P (Red)   U (Black)    -->     P (Red)   U (Black)     -->      Z (Red)    G (Red)
      \                            /                                              \
     Z (Red)                   Z (Red)                                         U (Black)

These steps ensure that the tree remains balanced and satisfies all Red-Black properties after each insertion.
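
A hedged Python sketch of the insertion fix-up loop for these cases, in the CLRS style (the node fields color/parent/left/right, the NIL sentinel, and the left_rotate/right_rotate helpers are assumed to exist in the surrounding red-black tree implementation):

def insert_fixup(tree, z):
    # z has just been inserted and coloured red; repair violations bottom-up.
    while z.parent.color == "RED":
        if z.parent == z.parent.parent.left:
            u = z.parent.parent.right                  # uncle
            if u.color == "RED":                       # Case 1: recolor and move up
                z.parent.color = u.color = "BLACK"
                z.parent.parent.color = "RED"
                z = z.parent.parent
            else:
                if z == z.parent.right:                # Case 2: rotate into Case 3
                    z = z.parent
                    left_rotate(tree, z)
                z.parent.color = "BLACK"               # Case 3: recolor, then rotate
                z.parent.parent.color = "RED"
                right_rotate(tree, z.parent.parent)
        else:                                          # mirror image: parent is a right child
            u = z.parent.parent.left
            if u.color == "RED":
                z.parent.color = u.color = "BLACK"
                z.parent.parent.color = "RED"
                z = z.parent.parent
            else:
                if z == z.parent.left:
                    z = z.parent
                    right_rotate(tree, z)
                z.parent.color = "BLACK"
                z.parent.parent.color = "RED"
                left_rotate(tree, z.parent.parent)
    tree.root.color = "BLACK"                          # the root must remain black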

(d) Discuss merging on EREW model with a suitable example:

In the EREW (Exclusive Read, Exclusive Write) model, no two processors can read or write the same
memory location at the same time. This creates challenges when merging data.

EREW PRAM Model:

• Exclusive Read: No two processors can read the same memory location simultaneously.

• Exclusive Write: No two processors can write to the same memory location simultaneously.

These restrictions make merging on EREW PRAM a bit more complex than on models with concurrent
access. We need to be careful about how we distribute the data and manage the writes to avoid
conflicts.

Merging on EREW PRAM:

The basic idea is to divide the two sorted input arrays among the processors and have each processor
perform a portion of the merge. Since we can't have concurrent writes, we need a strategy to ensure
that each element is written to the correct location in the output array without conflicts.

Algorithm (Simplified):

1. Divide: Divide the two sorted input arrays (A and B) into roughly equal parts, one part for each
processor.

2. Local Merge: Each processor performs a local merge of its assigned portions of A and B. This
results in several small sorted sub-arrays.

3. Global Merge: The sorted sub-arrays from each processor need to be merged together. This is
the tricky part due to the EREW restriction. One common technique is to use a binary tree-like
reduction.

Example:

Let's say we have two sorted arrays:

• A = [2, 5, 7, 9]

• B = [1, 3, 4, 6, 8, 10]

And we have 4 processors (P1, P2, P3, P4).

1. Divide:

o P1: A[0:1] = [2, 5], B[0:1] = [1, 3]

o P2: A[2:3] = [7, 9], B[2:3] = [4, 6]

o P3: B[4:5] = [8, 10] (A is exhausted, so P3 only gets part of B)

o P4: (No data assigned in this example)

2. Local Merge:

o P1: [2, 5] and [1, 3] merge to [1, 2, 3, 5]

o P2: [7, 9] and [4, 6] merge to [4, 6, 7, 9]

o P3: [8, 10] merges to [8, 10]

3. Global Merge (Binary Tree Reduction):

o Level 1:

▪ P1 and P2 "merge" their results: [1, 2, 3, 5] and [4, 6, 7, 9] -> conceptually [1, 2,
3, 4, 5, 6, 7, 9]

▪ P3 and P4 (which has no data) merge to [8, 10]

o Level 2:

▪ The result from P1/P2 and P3/P4 are now merged. This would conceptually be
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Challenges and Solutions for EREW:

• Writing to the output array: We must ensure no two processors write to the same location. The
binary tree reduction approach naturally handles this because each merge step writes to a
distinct portion of the output.

• Data distribution: Distributing the data evenly among processors is important for load balancing.

• Handling uneven input sizes: The input arrays might not be perfectly divisible by the number of
processors. The algorithm needs to handle these cases gracefully.
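
A sequential Python sketch that mimics this scheme: per-processor local merges followed by a binary-tree (pairwise) reduction. It only simulates the processor assignment; the chunking and helper names are illustrative:

def merge(a, b):
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    return out + a[i:] + b[j:]

def erew_merge_simulation(A, B, p=4):
    # Steps 1-2: each "processor" locally merges its slice of A and B.
    def chunks(xs):
        k = -(-len(xs) // p)                         # ceiling division
        return [xs[i * k:(i + 1) * k] for i in range(p)]
    runs = [merge(a, b) for a, b in zip(chunks(A), chunks(B))]

    # Step 3: pairwise reduction; every round halves the number of runs,
    # and each merge writes to its own distinct output area (no write conflicts).
    while len(runs) > 1:
        runs = [merge(runs[i], runs[i + 1]) if i + 1 < len(runs) else runs[i]
                for i in range(0, len(runs), 2)]
    return runs[0]

print(erew_merge_simulation([2, 5, 7, 9], [1, 3, 4, 6, 8, 10]))
# -> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]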

(e) Explain Depth First Search with a suitable example:

DFS explores as far down a branch as possible before backtracking. It’s commonly used to traverse or
search tree or graph data structures.

Depth-First Search (DFS) is an algorithm for traversing or searching tree or graph data structures. It starts
at the root (or some arbitrary node) and explores as far as possible along each branch before
backtracking. Think of it like exploring a maze by always choosing one path and going as deep as you can
before hitting a dead end and then going back to try another path.

How it works:

1. Start at a node: Pick any node in the graph to start.

2. Visit and mark: Mark the current node as visited.

3. Explore adjacent nodes: For each unvisited node adjacent to the current node:

o Recursively call DFS on that node.

Example:

Let's consider the following graph:

        A
       / \
      B   C
     / \   \
    D   E   F
    |
    G

We'll start at node A and perform a DFS:

1. Start at A: Mark A as visited.

2. Explore B: B is adjacent to A and unvisited. Call DFS on B.

3. Explore D: D is adjacent to B and unvisited. Call DFS on D.

4. Explore G: G is adjacent to D and unvisited. Call DFS on G.

5. G has no unvisited neighbors: Return to D.

6. Explore (nothing): D has no other unvisited neighbors. Return to B.

7. Explore E: E is adjacent to B and unvisited. Call DFS on E.

8. E has no unvisited neighbors: Return to B.

9. B has no other unvisited neighbors: Return to A.

10. Explore C: C is adjacent to A and unvisited. Call DFS on C.

11. Explore F: F is adjacent to C and unvisited. Call DFS on F.

12. F has no unvisited neighbors: Return to C.

13. C has no other unvisited neighbors: Return to A.

14. A has no other unvisited neighbors: The DFS is complete.

Order of visited nodes: A, B, D, G, E, C, F

Applications of DFS:

• Finding connected components: In a graph, DFS can be used to identify groups of connected
nodes.

• Cycle detection: DFS can detect cycles in a directed graph.

• Topological sorting: For directed acyclic graphs (DAGs), DFS can be used to produce a topological
sort.

• Path finding: DFS can be adapted to find a path between two given nodes.

• Maze solving: DFS is a natural algorithm for exploring a maze.

Key Points:

• Recursion: DFS is often implemented recursively.



• Stack: The recursive calls use a stack (the call stack) to keep track of the nodes being explored.
You can also implement DFS iteratively using an explicit stack.

• Backtracking: The algorithm "backtracks" when it reaches a dead end (a node with no unvisited
neighbors).
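
A recursive Python sketch of DFS on an adjacency-list version of the example graph above (the dictionary literal and function name are illustrative):

def dfs(graph, node, visited=None, order=None):
    # Visit a node, then recurse into each unvisited neighbour.
    if visited is None:
        visited, order = set(), []
    visited.add(node)
    order.append(node)
    for neighbour in graph[node]:
        if neighbour not in visited:
            dfs(graph, neighbour, visited, order)
    return order

graph = {
    "A": ["B", "C"], "B": ["D", "E"], "C": ["F"],
    "D": ["G"], "E": [], "F": [], "G": [],
}
print(dfs(graph, "A"))   # -> ['A', 'B', 'D', 'G', 'E', 'C', 'F']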

SECTION C

3. Attempt any one part of the following:


(7 marks each)

(a) Write down the algorithm for Max Heapify in heap sort.

def MaxHeapify(arr, n, i):
    # Sift arr[i] down until the subtree rooted at i satisfies the max-heap property.
    largest = i
    left = 2 * i + 1        # index of left child
    right = 2 * i + 2       # index of right child
    if left < n and arr[left] > arr[largest]:
        largest = left
    if right < n and arr[right] > arr[largest]:
        largest = right
    if largest != i:
        arr[i], arr[largest] = arr[largest], arr[i]   # swap with the larger child
        MaxHeapify(arr, n, largest)                   # continue sifting down

• Time Complexity: O(log n), as the algorithm may need to move down the tree (height of the tree
is log n).
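
For context, a hedged sketch of how MaxHeapify is used by the full heap sort (the BuildMaxHeap and HeapSort wrappers are standard usage, not part of the question itself):

def BuildMaxHeap(arr):
    n = len(arr)
    # Heapify every internal node, bottom-up; this takes O(n) overall.
    for i in range(n // 2 - 1, -1, -1):
        MaxHeapify(arr, n, i)

def HeapSort(arr):
    BuildMaxHeap(arr)
    # Repeatedly move the current maximum to the end and shrink the heap.
    for end in range(len(arr) - 1, 0, -1):
        arr[0], arr[end] = arr[end], arr[0]
        MaxHeapify(arr, end, 0)
    return arr

print(HeapSort([4, 10, 3, 5, 1]))   # -> [1, 3, 4, 5, 10]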

(b) Describe the properties and requirements of a good algorithm.

A good algorithm, whether for a simple task or a complex problem, should possess several key properties
and fulfill certain requirements. These can be broadly categorized into factors relating to correctness,
efficiency, and clarity.

Correctness:

• Correctness: The most fundamental requirement. An algorithm must produce the correct output
for all valid inputs. It should adhere to the problem's specification and handle edge cases
gracefully.

• Finiteness: The algorithm must terminate after a finite number of steps. It shouldn't get stuck in
an infinite loop.

• Unambiguity: Each step of the algorithm must be precisely defined, with no room for
interpretation. The instructions should be clear and unambiguous.

• Completeness: For a given input, the algorithm should either produce a solution or indicate that
no solution exists. It shouldn't leave the user in doubt.

Efficiency:

• Time Efficiency: An algorithm should use a reasonable amount of time to execute. Time
complexity is often expressed using Big O notation (e.g., O(n), O(log n), O(n log n), O(n²)). Lower
time complexity is generally better.

• Space Efficiency: An algorithm should use a reasonable amount of memory (space) during its
execution. Space complexity is also often expressed using Big O notation. Minimizing space
usage is important, especially for large datasets or resource-constrained environments.

• Optimality (Ideal): Ideally, an algorithm should be the most efficient possible for the given
problem. However, achieving optimality isn't always feasible, especially for complex problems.
We often settle for algorithms that are "good enough" in terms of efficiency.

Clarity and Other Factors:

• Readability: An algorithm should be easy to understand and follow, both for the person who
wrote it and for others who might need to implement or modify it later. Clear variable names,
comments, and a well-structured code are essential.

• Simplicity: An algorithm should be as simple as possible while still being effective. Avoid
unnecessary complexity. Simpler algorithms are often easier to understand, implement, and
debug.

• Robustness: An algorithm should be able to handle unexpected or invalid inputs without crashing or producing incorrect results. It should be designed to be resilient to errors.

• Modularity: Breaking down a complex algorithm into smaller, self-contained modules can
improve readability, maintainability, and reusability.

• Maintainability: An algorithm should be easy to modify or update if needed. This is closely related to readability and modularity.

• Implementability: The algorithm should be practical to implement in a chosen programming language and on a target platform.

Attempt any one part of the following:

(7 marks each)

(a) What do you mean by parallel sorting networks? Also discuss the enumeration sort algorithm.

Parallel Sorting Networks:


A parallel sorting network is a type of sorting algorithm where the sorting process involves multiple
parallel operations (comparisons and swaps). The key idea is to use a fixed sequence of comparisons to
sort the data, regardless of the input order. These networks are designed for parallel computation,
making them efficient in hardware and massively parallel systems.

For example, in a Bitonic Sort or Odd-Even Merging Sort, multiple elements are compared and swapped
simultaneously, making them suitable for parallel processing.

Enumeration Sort:
The Enumeration Sort algorithm is a comparison-based sorting algorithm where each element is placed
at its correct position by counting the number of elements smaller than it. The steps are:

1. For each element, count how many elements in the array are smaller than it.

2. Place the element at the position corresponding to that count.

3. This is repeated for all elements in the list.

Time Complexity: The time complexity of Enumeration Sort is O(n^2), making it inefficient as a sequential algorithm for large datasets. However, since every comparison is independent of the others, it parallelizes very well, which is why it is studied alongside parallel sorting networks.

OR

Parallel Sorting Networks

Imagine you have a bunch of numbers that you need to sort, and you have multiple processors available
to help speed things up. A parallel sorting network is a way to arrange comparisons between these
numbers in a specific pattern so that the sorting can happen simultaneously.

Key Ideas

• Comparators: The basic building block of a sorting network is a comparator. A comparator takes
two numbers as input and outputs them in sorted order (the smaller one first, then the larger
one).

• Wires: Wires connect the comparators, carrying the numbers from one comparator to the next.

• Network Structure: The comparators are arranged in a network-like structure, where numbers
flow through the network, getting compared and swapped along the way.

• Parallelism: The magic of a sorting network is that many comparisons can happen at the same
time (in parallel). This is how it achieves speedup compared to traditional sorting algorithms that
do one comparison at a time.

Example

Here's a simple sorting network for 4 numbers (the original figure has not survived; a code sketch of an equivalent comparator network follows the steps below):

How it works:

1. The numbers enter the network from the left.

2. They travel along the wires and encounter comparators.

3. Each comparator compares the two numbers on its input wires and swaps them if they are in the
wrong order.

4. The numbers continue flowing through the network until they reach the output wires on the
right, now in sorted order.
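
A minimal Python sketch of one valid 4-input sorting network. This particular five-comparator arrangement is a standard network and is assumed here, since the original figure did not survive:

# Each pair (i, j) is a comparator: it puts the values at positions i and j in order.
# Comparators in the same stage touch disjoint wires, so they can run in parallel.
STAGES = [
    [(0, 1), (2, 3)],   # stage 1: two comparators operate simultaneously
    [(0, 2), (1, 3)],   # stage 2
    [(1, 2)],           # stage 3
]

def sorting_network_4(values):
    v = list(values)
    for stage in STAGES:
        for i, j in stage:            # conceptually simultaneous within a stage
            if v[i] > v[j]:
                v[i], v[j] = v[j], v[i]
    return v

print(sorting_network_4([3, 1, 4, 2]))   # -> [1, 2, 3, 4]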

Enumeration Sort Algorithm

Enumeration sort is a straightforward sorting algorithm that works by comparing each element in the list
with every other element. It counts how many elements are smaller than each element, and this count
determines the final position of the element in the sorted list.

Steps

1. Comparison: For each element in the list, compare it with all other elements.

2. Counting: For each element, count the number of elements smaller than it.

3. Placement: Place each element in its correct position based on the count.

Example

Let's sort the list: [3, 1, 4, 2]

1. 3:

o 3 > 1 (count = 1)

o 3 < 4 (count = 1)

o 3 > 2 (count = 2)

o Final position of 3 is index 2.

2. 1:

o 1 < 3 (count = 0)

o 1 < 4 (count = 0)

o 1 < 2 (count = 0)

o Final position of 1 is index 0.

3. 4:

o 4 > 3 (count = 1)

o 4 > 1 (count = 2)

o 4 > 2 (count = 3)

o Final position of 4 is index 3.

4. 2:

o 2 < 3 (count = 0)

o 2 > 1 (count = 1)

o 2 < 4 (count = 1)

o Final position of 2 is index 1.

Sorted list: [1, 2, 3, 4]

Key Points

• Simple: Enumeration sort is easy to understand and implement.



• Inefficient: It has a time complexity of O(n^2), making it inefficient for large lists.

• Parallelizable: The comparisons can be done independently, making it suitable for parallel implementation.
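
A short Python sketch of enumeration (rank) sort as described above (the function name is illustrative):

def enumeration_sort(arr):
    n = len(arr)
    out = [None] * n
    for i in range(n):
        # Rank of arr[i] = number of smaller elements, plus the number of
        # equal elements appearing earlier (this keeps duplicates distinct).
        rank = 0
        for j in range(n):
            if arr[j] < arr[i] or (arr[j] == arr[i] and j < i):
                rank += 1
        out[rank] = arr[i]
    return out

print(enumeration_sort([3, 1, 4, 2]))   # -> [1, 2, 3, 4]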

(b) Explain Branch & Bound method. How 0/1 Knapsack problem can be solved using branch and
bound method?

Branch and Bound:


Branch and Bound is an algorithmic technique used for solving optimization problems, particularly useful
when there is a need to explore all possible solutions but in a more efficient manner than brute force. It
systematically explores the solution space, using bounds to eliminate large sections of the search tree
that cannot lead to an optimal solution.

• Branching: Dividing the problem into subproblems (branches).

• Bounding: Calculating an upper or lower bound to compare and prune unpromising branches.

• Pruning: Discarding branches that cannot produce better results than already found solutions.

0/1 Knapsack Problem using Branch and Bound:


For the 0/1 Knapsack problem, where the goal is to maximize the total value of items placed in a
knapsack with a weight constraint, the branch and bound method works as follows:

1. Branching: At each step, decide whether to include an item in the knapsack or not.

2. Bounding: Calculate an upper bound for the value that can be obtained from the current
subproblem. This could be the total value if all remaining items are added, respecting the weight
constraint.

3. Pruning: If the bound of a branch is less than the best solution found so far, prune that branch.

4. Continue branching and bounding until all possibilities are explored.

OR

Branch and Bound is like a smart way to try different combinations without checking every single one.

1. Make a Tree: Think of all the possible combinations as a tree. Each branch in the tree is like a
decision: "Do I take this item or not?"

2. Estimate (Bound): Before you explore a branch too far, you make a quick estimate. You pretend
you could take parts of items (like taking half a sandwich). This gives you a rough idea of the best
value you could get if you went down that branch. It's an optimistic estimate.

3. Cut Off Bad Branches (Prune): If your estimate for a branch is already worse than the best full
combination you've found so far, there's no point exploring that branch. You "cut it off" (prune
it). It's like saying, "I don't need to go down that path; I already know it won't be the best."

4. Explore: You keep going down the tree, making estimates and cutting off branches, until you've
explored enough to be sure you've found the absolute best combination.

Knapsack Example (Simplified):

• Backpack limit: 10 kg

• Item 1: 5 kg, worth $10

• Item 2: 4 kg, worth $40

• Item 3: 6 kg, worth $30

1. Start: You haven't packed anything yet. You estimate you could take item 2 and a bit of item 3 for
a total value of around $70.

2. Try Item 1:

o Take Item 1: You estimate you could then take item 2 for a total value of $50.

o Don't take Item 1: You estimate you could still take item 2 and a bit of item 3, for a value
of around $70.

3. Keep Going: You keep exploring, but whenever the optimistic estimate for a branch is no better than the best complete combination found so far, you don't bother with that branch.

Basically, Branch and Bound helps you find the best solution by being smart about what you check. You
don't have to try every single combination; you can rule out many of them early on. It's like a shortcut to
the best answer!
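
A hedged Python sketch of 0/1 knapsack by branch and bound, using the fractional (greedy) bound described above to prune; the names and depth-first structure are one possible implementation, not the only one:

def knapsack_branch_and_bound(weights, values, capacity):
    # Consider items in decreasing value/weight order so the bound is tight.
    items = sorted(zip(weights, values), key=lambda wv: wv[1] / wv[0], reverse=True)

    def bound(i, weight, value):
        # Optimistic estimate: fill the remaining capacity greedily,
        # allowing a fraction of the first item that does not fully fit.
        for w, v in items[i:]:
            if weight + w <= capacity:
                weight, value = weight + w, value + v
            else:
                return value + v * (capacity - weight) / w
        return value

    best = 0
    def explore(i, weight, value):
        nonlocal best
        best = max(best, value)
        if i == len(items) or bound(i, weight, value) <= best:
            return                          # prune: cannot beat the best known value
        w, v = items[i]
        if weight + w <= capacity:          # branch 1: take item i
            explore(i + 1, weight + w, value + v)
        explore(i + 1, weight, value)       # branch 2: skip item i

    explore(0, 0, 0)
    return best

# The simplified example above: 10 kg limit, items (5 kg, $10), (4 kg, $40), (6 kg, $30).
print(knapsack_branch_and_bound([5, 4, 6], [10, 40, 30], 10))   # -> 70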

Attempt any one part of the following:

(7 marks each)

(a) Find the optimal solution to the Knapsack instances n = 5, w = (5, 10, 20, 30, 40) v = (30, 20, 100, 90,
160) and W = 60, by using Greedy approach (Fractional Knapsack)

Calculate the Value-to-Weight Ratio:

For each item, calculate the ratio of its value to its weight (v/w). This tells us how much value we get per
unit of weight for each item.

• Item 1: 30/5 = 6

• Item 2: 20/10 = 2

• Item 3: 100/20 = 5

• Item 4: 90/30 = 3

• Item 5: 160/40 = 4

2. Sort by Ratio:

Arrange the items in descending order of their value-to-weight ratio.

• Item 1: (6, 5, 30) - Ratio, Weight, Value

• Item 3: (5, 20, 100)

• Item 5: (4, 40, 160)

• Item 4: (3, 30, 90)

• Item 2: (2, 10, 20)

3. Fill the Knapsack:

Start adding items to the knapsack, beginning with the item that has the highest value-to-weight ratio.

• Item 1: Add the entire item 1 (weight 5, value 30). Knapsack capacity remaining: 60 - 5 = 55.

• Item 3: Add the entire item 3 (weight 20, value 100). Knapsack capacity remaining: 55 - 20 = 35.

• Item 5: Item 5 weighs 40, but only 35 units of capacity remain, so we cannot add the whole item and must take a fraction of it.

• Fraction of Item 5: The fraction of item 5 that fits in the remaining capacity is 35/40 = 0.875. Adding this fraction fills the knapsack and contributes a value of 0.875 * 160 = 140.

4. Calculate Total Value:

The maximum value that can be carried in the knapsack is the sum of the values of the items (or
fractions of items) added.

Total Value = 30 + 100 + 140 = 270

Therefore, the optimal solution using the Greedy approach (Fractional Knapsack) yields a maximum
value of 270.
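
A brief Python sketch of the greedy fractional-knapsack computation above (the function name is illustrative):

def fractional_knapsack(weights, values, capacity):
    # Sort items by value-to-weight ratio, highest ratio first.
    items = sorted(zip(weights, values), key=lambda wv: wv[1] / wv[0], reverse=True)
    total = 0.0
    for w, v in items:
        if capacity >= w:            # the whole item fits
            capacity -= w
            total += v
        else:                        # take only the fraction that fits, then stop
            total += v * capacity / w
            break
    return total

print(fractional_knapsack([5, 10, 20, 30, 40], [30, 20, 100, 90, 160], 60))   # -> 270.0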

(b) What is sum of subset problem? Draw a state space tree for Sum of subset problem using
backtracking? Let n = 6, m = 30, and w[1:6] = (5, 10, 12, 13, 15, 18)

The Sum of Subsets Problem is a classic combinatorial problem where, given a set of non-negative
integers (w) and a target sum (m), the goal is to find a subset (or subsets) of those integers whose
elements sum up to exactly m.

State Space Tree using Backtracking

A state-space tree is a graphical representation of the possible solutions to a problem. Each node in the
tree represents a decision point, and the branches represent the different choices that can be made. In
the Sum of Subsets problem, each level of the tree corresponds to an element in the set, and the
branches represent whether to include that element in the subset or not.

Here's how the state-space tree is constructed for the given instance (n = 6, m = 30, w = (5, 10, 12, 13,
15, 18)) using backtracking:

1. Root Node: Represents the starting point with an empty subset and a sum of 0.

2. Level 1:

o Left Branch: Include the first element (5). The current sum is 5.

o Right Branch: Exclude the first element (5). The current sum is 0.

3. Level 2: For each node in Level 1, we have two branches:

o Left Branch: Include the second element (10).

o Right Branch: Exclude the second element (10).

4. Continue: Repeat this process for each subsequent level, considering whether to include or
exclude the corresponding element.

5. Pruning: At each node, check if the current sum exceeds the target sum (m = 30). If it does,
prune that branch (stop exploring it) as it cannot lead to a valid solution.

6. Solution Nodes: When a node is reached where the current sum equals the target sum (m = 30),
it represents a valid solution.

Backtracking:

The backtracking algorithm explores this tree in a depth-first manner. It starts at the root node and
explores each branch until it reaches a solution node or a dead end (where the sum exceeds m). If it
reaches a dead end, it backtracks (goes back up the tree) to the last decision point and explores the
other branch. This process continues until all possible paths have been explored.

Solutions:

By traversing the state-space tree and applying backtracking, we find the following subsets that sum to
30:

• {5, 10, 15}

• {5, 12, 13}

• {12, 18}
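
A small Python backtracking sketch that explores the same state-space tree and collects the subsets summing to m (names are illustrative):

def sum_of_subsets(w, m):
    w = sorted(w)                     # sorted order makes the pruning checks valid
    solutions = []

    def backtrack(i, chosen, current, remaining):
        if current == m:              # found a subset that sums to m
            solutions.append(list(chosen))
            return
        # Prune: no elements left, or even taking everything left cannot reach m.
        if i == len(w) or current + remaining < m:
            return
        if current + w[i] <= m:       # left branch: include w[i]
            chosen.append(w[i])
            backtrack(i + 1, chosen, current + w[i], remaining - w[i])
            chosen.pop()
        backtrack(i + 1, chosen, current, remaining - w[i])   # right branch: exclude w[i]

    backtrack(0, [], 0, sum(w))
    return solutions

print(sum_of_subsets([5, 10, 12, 13, 15, 18], 30))
# -> [[5, 10, 15], [5, 12, 13], [12, 18]]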

Attempt any one part of the following:

(7 marks each)

(a) What are the factors performance measures of Parallel Algorithm?

The performance of parallel algorithms is measured by several factors:

1. Speedup: The ratio of the time taken by the best sequential algorithm to the time taken by the
parallel algorithm.

o Speedup = T(sequential) / T(parallel)

2. Efficiency: How effectively the parallel resources are being utilized. It is the ratio of speedup to
the number of processors.

o Efficiency = Speedup / Number of processors

3. Scalability: The ability of the algorithm to handle increasing problem sizes or more processors
without a significant drop in performance.

4. Load Balancing: How evenly the work is distributed among processors. If one processor is idle,
the algorithm is inefficient.

5. Communication Overhead: The time spent on exchanging data between processors. Minimizing
communication is crucial for good parallel performance.

6. Memory Usage: The amount of memory used by the algorithm. More processors often mean
more memory is required.

(b) What is parallel searching and also explain the CREW searching?

Parallel Searching refers to techniques that search for an element in a dataset using multiple processors
simultaneously to reduce the time it takes to find the element. It divides the dataset into smaller parts
and searches them in parallel, which can lead to faster search times than sequential searching.

CREW Searching:
CREW (Concurrent Read, Exclusive Write) model allows multiple processors to read the same memory
location simultaneously but restricts multiple processors from writing to the same location at the same
time. This ensures data consistency while allowing efficient parallel searching.

Example:

• CREW Parallel Search: Multiple processors search different portions of the data in parallel,
reading elements concurrently. When they find a match, the result is written to a shared
variable, ensuring only one processor writes at a time.

Attempt any one part of the following:

(7 marks each)

(a) Describe Breadth First Search algorithm with a suitable example.

Breadth First Search (BFS) is a graph traversal algorithm that explores all the vertices at the present
depth level before moving on to the next level. It uses a queue to keep track of nodes to visit next.

Algorithm:

1. Initialize a queue and enqueue the starting node.

2. Dequeue a node, visit it, and enqueue its neighbors if they are unvisited.

3. Repeat the process until the queue is empty.



Example:
Consider the graph:

        A
       / \
      B   C
      |   |
      D   E

Starting from A, BFS visits nodes in this order: A → B → C → D → E.

OR

Breadth-First Search (BFS) is an algorithm for traversing or searching tree or graph data structures. It starts at the root (or some arbitrary node) and explores all the neighbor nodes at the present depth before moving on to nodes at the next depth level. Imagine searching a building floor by floor, rather than going deep into one room and then another.

How it works:

1. Start at a node: Pick any node in the graph to start.

2. Use a queue: Maintain a queue to keep track of nodes to visit.

3. Visit and enqueue:

o Mark the starting node as visited.

o Enqueue the starting node.

4. Dequeue and explore:

o While the queue is not empty:

▪ Dequeue a node from the queue.

▪ For each unvisited neighbor of the dequeued node:

▪ Mark the neighbor as visited.

▪ Enqueue the neighbor.

Example:

Let's consider the following graph:

        A
       / \
      B   C
     / \   \
    D   E   F
    |
    G

We'll start at node A and perform a BFS:

1. Start at A: Mark A as visited, enqueue A. Queue: [A]

2. Dequeue A:

o Mark B (neighbor of A) as visited, enqueue B. Queue: [B]

o Mark C (neighbor of A) as visited, enqueue C. Queue: [B, C]

3. Dequeue B:

o Mark D (neighbor of B) as visited, enqueue D. Queue: [C, D]

o Mark E (neighbor of B) as visited, enqueue E. Queue: [C, D, E]

4. Dequeue C:

o Mark F (neighbor of C) as visited, enqueue F. Queue: [D, E, F]

5. Dequeue D:

o Mark G (neighbor of D) as visited, enqueue G. Queue: [E, F, G]

6. Dequeue E: (No unvisited neighbors)

7. Dequeue F: (No unvisited neighbors)

8. Dequeue G: (No unvisited neighbors)

9. Queue is empty: The BFS is complete.

Order of visited nodes: A, B, C, D, E, F, G
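
A queue-based Python sketch of BFS on an adjacency-list version of the same example graph (the dictionary literal and names are illustrative):

from collections import deque

def bfs(graph, start):
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()            # dequeue the next node
        order.append(node)
        for neighbour in graph[node]:     # enqueue all unvisited neighbours
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)
    return order

graph = {
    "A": ["B", "C"], "B": ["D", "E"], "C": ["F"],
    "D": ["G"], "E": [], "F": [], "G": [],
}
print(bfs(graph, "A"))   # -> ['A', 'B', 'C', 'D', 'E', 'F', 'G']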

(b) Explain P, NP, NP Complete, and NP hard problems with suitable examples.

• P (Polynomial Time): A problem is in P if it can be solved in polynomial time. Example: Sorting.

• NP (Nondeterministic Polynomial Time): A problem is in NP if a proposed solution can be verified in polynomial time. Example: Sudoku (checking a completed grid is easy).

• NP-Complete: A problem is NP-complete if it is both in NP and as hard as any problem in NP. Example: the decision version of the Traveling Salesman Problem.

• NP-Hard: A problem is NP-hard if it is at least as hard as the hardest problems in NP, but it might
not be in NP. Example: Halting Problem.

UNIT I: Algorithm Fundamentals

(a) What is an algorithm? Explain the characteristics of a good algorithm.

• Answer: An algorithm is a step-by-step procedure or set of rules to be followed in calculations or problem-solving. It must be clear, unambiguous, and produce a result.

Characteristics of a good algorithm:

o Correctness: The algorithm must give the correct output for all valid inputs.

o Efficiency: The algorithm should use minimal resources (time and space).

o Finiteness: The algorithm must terminate after a finite number of steps.

o Clarity: Each step must be well-defined and unambiguous.

OR

An algorithm is a well-defined, step-by-step procedure or set of rules for solving a problem or accomplishing a specific task. It's like a recipe for a computer, telling it exactly what to do, in what order, to get a desired result. Algorithms are fundamental to computer science and are used in everything from simple calculations to complex artificial intelligence systems.

Think of it like this: You follow an algorithm every time you bake a cake. The recipe is the algorithm, the
ingredients are the input, and the cake is the output.

Characteristics of a Good Algorithm:

A good algorithm, whether for a simple task or a complex problem, should possess several key
properties. These can be broadly categorized into factors relating to correctness, efficiency, clarity, and
other practical considerations.

Correctness:

• Correctness: This is the most crucial aspect. An algorithm must produce the correct output for
all valid inputs. It should adhere precisely to the problem's specification and handle edge cases
(unusual or boundary inputs) gracefully. A partially correct algorithm is often not very useful.

• Finiteness: The algorithm must terminate after a finite number of steps. It shouldn't get stuck in
an infinite loop. Even if the number of steps is large, it must eventually come to an end.

• Unambiguity (Precision): Each step of the algorithm must be precisely defined, with no room for
interpretation. The instructions should be clear, unambiguous, and deterministic. There should
be no guesswork involved.

• Completeness: For a given input, the algorithm should either produce a solution or indicate that
no solution exists. It shouldn't leave the user in doubt about whether a solution was found.

Efficiency:

• Time Efficiency: An algorithm should use a reasonable amount of time to execute. Time complexity measures how the execution time grows as the input size increases. We often use Big O notation (e.g., O(n), O(log n), O(n log n), O(n²)) to express time complexity. Lower time complexity is generally better. We want algorithms that scale well with larger inputs.

• Space Efficiency: An algorithm should use a reasonable amount of memory (space) during its
execution. Space complexity measures how the memory usage grows with input size. Minimizing
space usage is important, especially for large datasets or resource-constrained environments
(like mobile devices).

• Optimality (Ideal): Ideally, an algorithm should be the most efficient possible for the given
problem. However, achieving true optimality isn't always feasible, especially for complex
problems. We often settle for algorithms that are "good enough" in terms of efficiency.

Clarity and Other Factors:

• Readability: An algorithm should be easy to understand and follow, both for the person who
wrote it and for others who might need to implement or modify it later. Clear variable names,
comments, and a well-structured code are essential. A well-documented algorithm is much more
useful.

• Simplicity: An algorithm should be as simple as possible while still being effective. Avoid
unnecessary complexity. Simpler algorithms are often easier to understand, implement, and
debug.

• Robustness: An algorithm should be able to handle unexpected or invalid inputs without crashing or producing incorrect results. It should be designed to be resilient to errors. Error handling is important.

• Modularity: Breaking down a complex algorithm into smaller, self-contained modules (functions
or subroutines) can improve readability, maintainability, and reusability. This makes the
algorithm easier to understand and modify.

• Maintainability: An algorithm should be easy to modify or update if needed (e.g., to fix bugs,
improve performance, or adapt to new requirements). This is closely related to readability and
modularity.

• Implementability: The algorithm should be practical to implement in a chosen programming language and on a target platform. Some algorithms might be theoretically efficient but very difficult to implement in practice.

(b) What is the significance of Big O notation in analyzing algorithms?

• Answer: Big O notation is crucial in algorithm analysis because it provides a standardized way to
describe how an algorithm's runtime scales with the size of its input data, essentially giving a
"worst-case scenario" estimate of how long an algorithm will take to run, allowing developers to
compare different algorithms and choose the most efficient one for a given problem, even as the
input size grows significantly.

Key points about Big O notation:



• Upper bound:

It represents the upper limit on an algorithm's execution time, signifying the worst possible scenario for
a given input size.

• Scalability analysis:

By looking at the Big O notation, developers can understand how an algorithm's performance will change
as the input size increases, which is critical for large datasets.

• Comparison tool:

Big O notation allows for direct comparison between different algorithms, even if implemented
differently, by focusing on the dominant growth rate of their time complexity.

Example:

• O(n):

An algorithm with a time complexity of O(n) means the execution time grows linearly with the input size
(n). This is considered relatively efficient for most applications.

• O(log n):

An algorithm with a time complexity of O(log n) is considered very efficient as the execution time
increases much slower than the input size, typically seen in algorithms like binary search.

• O(n²):

An algorithm with a time complexity of O(n²) is less efficient as the execution time increases
quadratically with the input size, making it less suitable for large datasets.

(c) Explain Master's Theorem with an example.

Answer: Master's Theorem is used to analyze the time complexity of divide-and-conquer algorithms. It works with recurrence relations of the form T(n) = aT(n/b) + f(n). The theorem has three cases, based on the relationship between f(n) and n^(log_b a).

o Example: T(n) = 2T(n/2) + O(n). Here a = 2, b = 2, and f(n) = O(n), so n^(log_b a) = n^(log_2 2) = n. Since f(n) matches n^(log_b a), Case 2 of the theorem applies and the complexity is O(n log n).

UNIT II: Sorting Algorithms

(a) Explain Merge Sort with an example and its time complexity.

Merge Sort is a popular sorting algorithm that follows the divide and conquer paradigm. It recursively
breaks down the input list into smaller sublists until each sublist contains only one element (which is
considered sorted). Then, it repeatedly merges these sorted sublists back together until you have a
single sorted list.

Here's a breakdown of the steps involved:

1. Divide:

o Split the unsorted list into two halves.

o Recursively repeat this process for each half until you have sublists of size 1.

2. Conquer:

o Sublists of size 1 are already sorted.

3. Combine:

o Repeatedly merge pairs of sorted sublists to produce new sorted sublists.

o Continue merging until you have a single sorted list.

Example:

Let's say we have the following unsorted list: [5, 2, 4, 6, 1, 3]

1. Divide:

o [5, 2, 4] and [6, 1, 3]

o [5, 2] and [4] and [6, 1] and [3]

o [5] and [2] and [4] and [6] and [1] and [3]

2. Conquer:

o All sublists are now of size 1 and considered sorted.

3. Combine:

o [2, 5] and [4] and [1, 6] and [3]

o [2, 4, 5] and [1, 3, 6]

o [1, 2, 3, 4, 5, 6]

Time Complexity:

Merge Sort has a time complexity of O(n log n) in all cases (worst, average, and best). This is because:

• Divide: Repeatedly halving the list produces O(log n) levels of division.

• Combine: Merging all the sublists at any one level takes O(n) time in total.

With O(log n) levels and O(n) work per level, the overall time complexity is O(n log n).

Advantages of Merge Sort:

• Stable: It preserves the relative order of equal elements.

• Efficient: It has a consistent O(n log n) time complexity.

• Well-suited for linked lists: It can be efficiently implemented for linked lists.

Disadvantages of Merge Sort:

• Not in-place: It requires additional memory to store the sublists during the merging process.

In summary, Merge Sort is a reliable and efficient sorting algorithm with a time complexity of O(n log n).
It's a good choice for sorting large datasets when stability is important.

(b) Explain Quick Sort with an example and its time complexity.

• Answer: Quick Sort is a divide-and-conquer sorting algorithm. It works by selecting a "pivot" element from the array and partitioning the other elements into two sub-arrays, according to whether they are less than or greater than the pivot. The sub-arrays are then recursively sorted.

Here's a breakdown of the steps:

1. Choose a Pivot: Select an element from the array to be the pivot. There are various strategies for
choosing a pivot (e.g., first element, last element, random element, median of three).

2. Partition: Rearrange the array so that all elements less than the pivot are placed before it, and all
elements greater than the pivot are placed after it. Elements equal to the pivot can go either
way. The pivot is now in its final sorted position.

3. Recurse: Recursively apply steps 1 and 2 to the sub-arrays created in the partitioning step.

Example (using the first element as the pivot):

Let's sort the array: [5, 1, 8, 3, 2, 9, 4]

1. Choose Pivot: The pivot is 5.

2. Partition:

o Initialize two pointers, left and right, at the beginning and end of the array (excluding the
pivot).

o Move the left pointer to the right until you find an element greater than or equal to the
pivot.

o Move the right pointer to the left until you find an element less than the pivot.

o If left and right haven't crossed, swap the elements at left and right.

o Continue this process until left and right cross.

o Swap the pivot with the element at the right pointer (the last element smaller than the pivot); the pivot is now in its final sorted position.

After partitioning, the array becomes: [2, 1, 4, 3, 5, 9, 8] (5 is now in its sorted position; the exact arrangement of the smaller elements depends on the swaps performed).

3. Recurse:

o Recursively sort the left sub-array: [2, 1, 4, 3]

o Recursively sort the right sub-array: [9, 8]

The recursive calls will continue partitioning and sorting the sub-arrays until the entire array is sorted: [1,
2, 3, 4, 5, 8, 9].

Time Complexity:

• Best Case: O(n log n). This occurs when the pivot selection consistently divides the array into
roughly equal halves.

• Average Case: O(n log n). On average, with a good pivot selection strategy, Quick Sort performs
very well.

• Worst Case: O(n^2). This occurs when the pivot selection is consistently poor (e.g., always
choosing the smallest or largest element). In this scenario, one sub-array is always empty, and
the recursion depth becomes n.

Advantages of Quick Sort:

• Generally fast: Its average-case performance is excellent.

• In-place: Requires minimal extra memory (although the recursive calls use some stack space).

Disadvantages of Quick Sort:

• Worst-case performance: Can be significantly slower in the worst case.

• Not stable: Doesn't preserve the relative order of equal elements.

Key improvements to mitigate the worst-case scenario:

• Random pivot selection: Choosing a random pivot makes it less likely to encounter the worst-
case scenario.

• Median-of-three pivot selection: Choosing the median of the first, middle, and last elements as
the pivot often provides a better pivot.

(c) Explain Heap Sort with an example and its time complexity.

Answer: Heap Sort is a comparison-based sorting algorithm that uses a binary heap data
structure. It's an in-place algorithm, meaning it doesn't require significant extra memory. It
works by first building a max-heap (or min-heap) from the input array and then repeatedly
removing the root (maximum or minimum element) and placing it at the end of the array.

Here's a breakdown of the steps:

1. Build a Heap:

o Transform the input array into a max-heap (or min-heap). In a max-heap, the value of
each node is greater than or equal to the value of its children.

2. Sort:

o Repeatedly remove the root (which is the maximum element in a max-heap) from the
heap and place it at the end of the array.

o After each removal, heapify the remaining heap to maintain the heap property.

Example (using a max-heap):

Let's sort the array: [4, 10, 3, 5, 1]

1. Build a Max-Heap:

o The array is transformed into a max-heap. The general idea is to start from the last internal node and work back towards the root, "sifting down" any element that is smaller than one of its children. One valid max-heap arrangement of these elements is [10, 5, 4, 3, 1], which corresponds to the tree:

          10
         /  \
        5    4
       / \
      3   1

2. Sort:

o Step 1: Swap the root (10) with the last element (1). The array becomes [1, 5, 4, 3, 10].

o Step 2: Reduce the heap size by 1 (conceptually removing the 10). Heapify the remaining
part of the array (the first four elements) to restore the max-heap property. The array
becomes [5, 3, 4, 1, 10].

          5
         / \
        3   4
       /
      1

o Step 3: Swap the new root (5) with the second-to-last element (1). The array becomes
[1, 3, 4, 5, 10].

o Step 4: Reduce the heap size and heapify. The array becomes [4, 3, 1, 5, 10].

          4
         / \
        3   1

o Continue this process until the entire array is sorted: [1, 3, 4, 5, 10].

Time Complexity:

• Building the heap: Takes O(n) time.

• Sorting: Removing the root and heapifying takes O(log n) time for each element. Since we do
this n times, this part takes O(n log n) time.

Therefore, the overall time complexity of Heap Sort is O(n log n) in all cases (worst, average, and
best).

Advantages of Heap Sort:

• In-place: Requires minimal extra memory.

• Efficient: O(n log n) time complexity.

Disadvantages of Heap Sort:

• Not stable: Doesn't necessarily preserve the relative order of equal elements.

• Can be slower than other O(n log n) algorithms: While the time complexity is the same, the
constant factors can make it slightly slower than algorithms like Merge Sort in practice, especially
on some datasets. However, its in-place nature often makes it preferable when memory is
limited.

UNIT III: Advanced Data Structures

(a) What are Red-Black Trees? Explain their properties.

• Answer: A Red-Black Tree is a self-balancing binary search tree with the following properties:

1. Each node is either red or black.

2. The root is always black.

3. Red nodes cannot have red children (no two consecutive red nodes).

4. Every path from a node to its descendant NULL nodes must have the same number of
black nodes.

5. Operations like insertions and deletions are balanced to ensure the tree remains
approximately balanced.

(b) What is a B-tree? How does a B+ tree differ from a B-tree?

• Answer:

B-tree Explained

A B-tree is a self-balancing tree data structure that maintains sorted data and allows searches, sequential
access, insertions, and deletions in logarithmic time.

It's a generalization of a binary search tree, allowing nodes to have more than two children. This makes it
particularly efficient for working with large amounts of data, especially when data is stored on disk.

Key features of a B-tree:

• Balanced: All leaf nodes are at the same level, ensuring consistent search times.

• Ordered: Keys within each node are stored in sorted order, facilitating efficient searching.

• Multiple keys per node: Each node can hold multiple keys and have multiple children, reducing
the tree's height and the number of disk accesses required for operations.

How it works:

1. Structure: A B-tree of order 'm' has the following properties:

o Each node has at most 'm' children.

o Each node (except the root) has at least 'm/2' children.

o The root node has at least two children (unless it's a leaf).

o All leaf nodes are at the same level.

o A non-leaf node with 'k' children contains 'k-1' keys.

2. Operations:

o Search: Similar to a binary search, but each node is examined to find the key or
determine the appropriate child node to explore further.

o Insertion: New keys are inserted into leaf nodes. If a leaf node becomes full, it's split,
and the middle key is moved up to the parent node. This process may propagate up the
tree.

o Deletion: Keys are deleted from leaf nodes. If a node becomes too empty, it may be
merged with a sibling or borrow keys from a sibling to maintain the minimum number of
children.

Applications:

B-trees are widely used in:

• Databases: To index data for efficient retrieval.

• File systems: To organize and manage files.

B+ tree Explained

A B+ tree is a variation of the B-tree with some key differences that make it even more suitable for
certain applications, especially databases.

Key differences between B+ tree and B-tree:

1. Data storage:

o B-tree: Stores keys and data in both internal nodes and leaf nodes.

o B+ tree: Stores data only in leaf nodes. Internal nodes only store keys to guide the
search.

2. Leaf node structure:

o B-tree: Leaf nodes are independent.

o B+ tree: Leaf nodes are linked together in a sequential order, forming a linked list. This
allows for efficient range queries (retrieving all data within a certain range).

3. Redundancy:

o B-tree: Keys are stored only once in the tree.

o B+ tree: Keys are duplicated in the leaf nodes. This redundancy simplifies certain
operations and makes sequential access more efficient.

Advantages of B+ trees over B-trees:

• Efficient range queries: The linked list of leaf nodes in a B+ tree makes it very efficient to retrieve
data within a specific range.

• Simplified insertion and deletion: Since data is only stored in leaf nodes, insertion and deletion
operations are generally simpler in B+ trees.

• Better sequential access: The linked list of leaf nodes allows for easy sequential access to all data
in sorted order.

Applications:

B+ trees are the preferred choice for most database systems due to their efficiency in handling range
queries and sequential access.

In summary:

• B-trees are a general-purpose balanced tree data structure that stores keys and data in all
nodes.

• B+ trees are a specialized variation where data is only stored in leaf nodes, which are linked
together. This makes B+ trees more efficient for range queries and sequential access, making
them ideal for database systems.


UNIT IV: Parallel Algorithms

(a) What are performance measures of parallel algorithms?

Answer: Performance measures for parallel algorithms help us understand how well a parallel algorithm utilizes resources and scales with increasing problem size and processor count. Here are some key measures (a small numeric sketch follows the list):

1. Speedup:

Definition: The ratio of the execution time of the best sequential algorithm for a problem to the
execution time of the parallel algorithm.

Formula: Speedup = T(sequential) / T(parallel)

Ideal Speedup: Ideally, with p processors, we'd expect a speedup of p (linear speedup). In reality,
speedup is often less due to overheads.

Superlinear Speedup: Occasionally, speedup can be greater than p. This can happen due to cache
effects, or if the parallel algorithm explores the search space more efficiently than the sequential
one. It's often a sign that the sequential algorithm wasn't the absolute best.

2. Efficiency:

Definition: Measures how well the processors are being utilized.

Formula: Efficiency = Speedup / p (where p is the number of processors)

Ideal Efficiency: 1 (or 100%) indicates perfect processor utilization.

Practical Efficiency: Efficiency is usually less than 1 due to communication overheads, idle processors,
and other factors.
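
A tiny sketch applying these two formulas (the timing numbers are made up for illustration):

def speedup(t_seq, t_par):
    return t_seq / t_par

def efficiency(t_seq, t_par, p):
    return speedup(t_seq, t_par) / p

# Example: a job takes 100 s sequentially and 30 s on 4 processors.
print(speedup(100, 30))        # about 3.33
print(efficiency(100, 30, 4))  # about 0.83, i.e. 83% utilization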

3. Scalability:

Definition: How well the parallel algorithm performs as the problem size or the number of
processors increases. There are two main types:

Strong Scaling: How the execution time changes when the number of processors increases
while keeping the problem size constant. Ideally, execution time should decrease linearly
with the number of processors.

Weak Scaling: How the execution time changes when both the number of processors and
the problem size increase proportionally. Ideally, the execution time should remain constant.

Metrics: Scalability is often measured by how efficiency changes as the number of processors or
problem size changes.

4. Overhead:

Definition: The extra time spent by the parallel algorithm that is not spent by the sequential
algorithm. This includes communication, synchronization, and idle time.

Formula: Overhead = (p * T(parallel)) - T(sequential)

Minimizing Overhead: A good parallel algorithm aims to minimize overhead.

5. Cost:

Definition: The product of the execution time and the number of processors used.

Formula: Cost = p * T(parallel)

Cost-Optimal: A parallel algorithm is considered cost-optimal if its cost is proportional to the
execution time of the best sequential algorithm. This means that the total work done by the parallel
algorithm is roughly the same as the work done by the sequential algorithm.

6. Amdahl's Law:

Definition: A law that states the maximum speedup achievable by parallelizing a program is limited
by the portion of the program that cannot be parallelized (the sequential portion).

Implication: Even with an infinite number of processors, the speedup is limited by 1/(1-f), where 'f' is
the fraction of the program that can be parallelized.
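
A short sketch of Amdahl's law, where f is the parallelizable fraction (the values are chosen only for illustration):

def amdahl_speedup(f, p):
    """Maximum speedup with parallel fraction f on p processors."""
    return 1.0 / ((1 - f) + f / p)

print(amdahl_speedup(0.9, 10))      # about 5.26
print(amdahl_speedup(0.9, 10**9))   # approaches 1/(1 - 0.9) = 10, no matter how many processors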

7. Gustafson's Law:

Definition: A law that states that the problem size that can be solved in a fixed amount of time grows
linearly with the number of processors.

Implication: Focuses on how increasing the number of processors allows us to solve larger problems,
rather than just reducing the execution time of a fixed-size problem.

8. Communication Overhead:

Definition: The time spent by processors communicating with each other. This is a major source of
overhead in parallel algorithms.

Factors: Communication overhead depends on the communication network, the amount of data
being communicated, and the communication patterns.

9. Synchronization Overhead:

Definition: The time spent by processors waiting for each other to reach a certain point in the
computation. This can occur due to barriers, locks, or other synchronization mechanisms.

10. Load Imbalance:

Definition: When some processors have more work to do than others. This can lead to idle
processors and reduce overall efficiency.

(b) What is Parallel Searching and how is it implemented on CREW?

Answer: Parallel searching is a technique that leverages multiple processors or cores to speed up
the process of finding a specific element within a data set. Instead of sequentially examining

each element, the data is divided among the available processors, and each processor searches a
portion of the data concurrently. This can significantly reduce the overall search time, especially
for large datasets.

How it works:

1. Divide: The data set is divided into smaller chunks or partitions.

2. Assign: Each processor is assigned one or more of these partitions.

3. Search: All processors perform the search operation simultaneously on their assigned partitions.

4. Combine: If any processor finds the target element, it reports the result. Otherwise, the search
continues until all partitions have been examined.

Implementation on CREW (Concurrent Read Exclusive Write) PRAM:

CREW (Concurrent Read Exclusive Write) is a parallel computing model where multiple processors can
read from the same memory location simultaneously (Concurrent Read), but only one processor can
write to a specific memory location at any given time (Exclusive Write).

Here's how parallel searching can be implemented on a CREW PRAM:

1. Data Distribution: The data set is stored in the shared memory, accessible by all processors.

2. Partitioning: The data is divided into equal-sized partitions, with each partition assigned to a
different processor.

3. Search: Each processor performs a local search on its assigned partition. This can be a linear
search or any other suitable search algorithm.

4. Result Reporting: A shared "result" variable is used to store the outcome of the search. If a
processor finds the target element, it writes its index or position to the "result" variable. Since
CREW allows concurrent reads, all processors can monitor the "result" variable to check if the
element has been found.

5. Termination: The search terminates when either the element is found (i.e., the "result" variable
is updated) or all processors have completed their search without finding the element.
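
A minimal simulation of this scheme using Python threads. It only mimics CREW behaviour (every thread may read the shared result, and writes are serialized so at most one outcome is recorded); the data, target, and partitioning are made up for illustration.

import threading

data = [12, 7, 3, 9, 7, 15, 1, 8, 4, 6]   # shared, read-only array
target = 7
result = [-1]                              # shared "result" cell
lock = threading.Lock()                    # serializes writes (exclusive write)

def search_partition(start, end):
    for i in range(start, end):
        if result[0] != -1:                # concurrent read: another processor already found it
            return
        if data[i] == target:
            with lock:                     # exclusive write of the found index
                if result[0] == -1:
                    result[0] = i
            return

p = 5                                       # number of simulated "processors"
chunk = len(data) // p
threads = [threading.Thread(target=search_partition,
                            args=(k * chunk, (k + 1) * chunk)) for k in range(p)]
for t in threads: t.start()
for t in threads: t.join()
print("found at index", result[0])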

Example:

Let's say we have an array of 10 elements and 5 processors. We want to search for the element '7'.

1. The array is divided into 5 partitions of 2 elements each.

2. Each processor is assigned one partition.

3. All processors simultaneously search their respective partitions.

4. If any processor finds '7', it writes its index to the shared "result" variable.

5. The search terminates when '7' is found or all partitions are searched.

Advantages of Parallel Searching:



• Faster search time: By dividing the work among multiple processors, the search time can be
significantly reduced.

• Scalability: Parallel searching can be easily scaled by adding more processors to handle larger
datasets.

Considerations:

• Overhead: There is some overhead associated with dividing the data, assigning partitions, and
coordinating the processors.

• Data structure: The data structure should be suitable for parallel access.

• Algorithm: The search algorithm used by each processor should be efficient for the given data
and partition size.

UNIT .V Advanced Design and Analysis Techniques:

(a) Explain Greedy Algorithms with an example.

Answer: A greedy algorithm is a simple, intuitive approach to problem-solving where you make the
best choice at each step, based on the information currently available, without regard for the overall,
long-term consequences. It's like always taking the "biggest" or "best" option you see right now,
hoping that by doing so repeatedly, you'll end up with the best overall solution.

Key characteristics of Greedy Algorithms:

Locally Optimal Choices: They make locally optimal choices at each step, meaning they pick the best
option available at that moment.

No Backtracking: Once a choice is made, it's never reconsidered or undone. Greedy algorithms don't
look back.

Easy to Implement: They are often simple to understand and implement.

Not Always Optimal: The biggest drawback is that they don't guarantee finding the globally optimal
solution. They can get stuck in "local optima," which are good solutions but not the absolute best.

Example: The Coin Change Problem

Imagine you're a cashier, and you need to give someone a certain amount of change using the
fewest number of coins possible. Let's say you have coins of denominations 1, 5, 10, and 25, and you
need to give 63 cents in change.

A greedy approach would work like this:

Start with the largest denomination: Choose as many 25-cent coins as possible without exceeding
the total amount. You can use two 25-cent coins (50 cents total).

Move to the next largest: Now you have 13 cents remaining. Choose as many 10-cent coins as
possible. You can use one 10-cent coin (60 cents total).

Continue: You have 3 cents remaining. Choose as many 1-cent coins as possible. You'll need three 1-
cent coins.

So, the greedy algorithm would give you two 25-cent coins, one 10-cent coin, and three 1-cent coins,
for a total of six coins.

In this particular case, the greedy approach does happen to find the optimal solution (the fewest
number of coins).
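
A sketch of this greedy strategy in Python, using the denominations and amount from the example above:

def greedy_change(amount, denominations=(25, 10, 5, 1)):
    """Return a dict {coin: count} using the largest-coin-first greedy rule."""
    change = {}
    for coin in denominations:              # denominations must be listed largest first
        count, amount = divmod(amount, coin)
        if count:
            change[coin] = count
    return change

print(greedy_change(63))   # {25: 2, 10: 1, 1: 3} -> six coins, as in the walkthrough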

Another Example: Fractional Knapsack Problem

Suppose you have a knapsack with a weight limit, and you have several items, each with a weight
and a value. You can take fractions of items. The goal is to maximize the total value of the items you
put in the knapsack.

A greedy approach would be:



Calculate value-to-weight ratio: For each item, divide its value by its weight. This tells you how much
"value" you get per unit of weight.

Sort by ratio: Sort the items in descending order of their value-to-weight ratio.

Fill the knapsack: Start adding items to the knapsack, starting with the item with the highest ratio. If
an item doesn't fit completely, take a fraction of it to fill the remaining space.

In this case, the greedy algorithm guarantees finding the optimal solution.
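
A sketch of the fractional-knapsack greedy; the item values, weights, and capacity below are placeholders for illustration.

def fractional_knapsack(items, capacity):
    """items: list of (value, weight). Returns the maximum achievable value."""
    # Sort by value-to-weight ratio, best first.
    items = sorted(items, key=lambda vw: vw[0] / vw[1], reverse=True)
    total = 0.0
    for value, weight in items:
        if capacity <= 0:
            break
        take = min(weight, capacity)        # take the whole item, or the fraction that still fits
        total += value * (take / weight)
        capacity -= take
    return total

print(fractional_knapsack([(10, 5), (40, 4), (30, 6)], 10))   # 70.0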

When are Greedy Algorithms useful?

Optimization problems: When you're trying to find the best solution, even if it's not guaranteed to
be perfect.

Approximation algorithms: When finding the absolute best solution is too computationally
expensive, a greedy approach can give you a "good enough" solution quickly.

Heuristics: Greedy algorithms can be used as heuristics (rules of thumb) to guide the search for a
solution.

Simple problems: When the problem has a simple structure, and the locally optimal choice is likely
to be globally optimal.

Well-known algorithms and problems commonly tackled with the greedy approach:

1. Dijkstra's Algorithm (Shortest Path)

Imagine this map:

Edges and weights (the steps below treat each edge as one-way, from the first vertex to the second):
A-B = 4, A-C = 2, B-C = 1, B-D = 5, C-D = 8

• Dijkstra's steps:

1. Start at A: Distance to A = 0. [Image: A highlighted, distances to others marked infinity]

2. From A:

▪ Reach B (distance 4)

▪ Reach C (distance 2) [Image: Arrows from A to B and C, distances marked]



3. C is closer: Go to C. [Image: C highlighted]

4. From C:

▪ Reach D (distance 2 + 8 = 10) [Image: Arrow from C to D, distance marked]

5. B is next closest: Go to B. [Image: B highlighted]

6. From B:

▪ Reach C (distance 4 + 1 = 5) - but we already have a shorter path to C (2), so ignore this.

▪ Reach D (distance 4 + 5 = 9) - this is shorter than the current path to D (10), so update it. [Image: Arrow from B to D, distance 9 marked]

• Result: Shortest paths: A to B (4), A to C (2), A to D (9). [Image: Final paths highlighted, distances
marked]
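
A compact sketch of Dijkstra's algorithm with a priority queue. It uses the edge list above, treated as directed, so the output reproduces the distances in the walkthrough; the adjacency-list layout is an illustrative choice.

import heapq

def dijkstra(graph, source):
    """graph: {node: [(neighbor, weight), ...]}. Returns shortest distances from source."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                         # stale queue entry
        for v, w in graph.get(u, []):
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w              # greedy relaxation of edge (u, v)
                heapq.heappush(heap, (dist[v], v))
    return dist

graph = {"A": [("B", 4), ("C", 2)], "B": [("C", 1), ("D", 5)],
         "C": [("D", 8)], "D": []}
print(dijkstra(graph, "A"))   # {'A': 0, 'B': 4, 'C': 2, 'D': 9}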

To create the images:

1. Draw the graph: Represent cities as circles (nodes) and roads as lines (edges).

2. Mark distances: Write the distance on each edge.

3. Highlight steps: In each image, highlight the current node being visited and the paths being
considered.

4. Update distances: Show how distances are updated as the algorithm progresses.

2. Prim's Algorithm (Minimum Spanning Tree)

Imagine these islands:

Bridge costs: A-B = 10, A-C = 5, B-C = 4, C-D = 7, plus a costlier B-D bridge whose exact cost is not given (it only matters for the cycle check in Kruskal's example below).

• Prim's steps:

1. Start at A: [Image: A highlighted]

2. Cheapest from A: A to C (5). [Image: A-C edge highlighted]

3. Cheapest from A or C: B to C (4). [Image: B-C edge highlighted]



4. Cheapest to D: C to D (7). [Image: C-D edge highlighted]

• Result: Bridges: A-C (5), B-C (4), C-D (7). Total cost: 16. [Image: Final MST highlighted]
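
A sketch of Prim's algorithm with a priority queue, run on the bridge costs above (the B-D bridge is omitted since its cost is not given; including it would not change the result):

import heapq

def prim(graph, start):
    """graph: {node: [(neighbor, cost), ...]} (undirected). Returns the MST edges."""
    visited = {start}
    heap = [(cost, start, v) for v, cost in graph[start]]
    heapq.heapify(heap)
    mst = []
    while heap and len(visited) < len(graph):
        cost, u, v = heapq.heappop(heap)
        if v in visited:
            continue                          # would create a cycle
        visited.add(v)
        mst.append((u, v, cost))              # cheapest edge leaving the current tree
        for w, c in graph[v]:
            if w not in visited:
                heapq.heappush(heap, (c, v, w))
    return mst

graph = {"A": [("B", 10), ("C", 5)], "B": [("A", 10), ("C", 4)],
         "C": [("A", 5), ("B", 4), ("D", 7)], "D": [("C", 7)]}
print(prim(graph, "A"))   # [('A', 'C', 5), ('C', 'B', 4), ('C', 'D', 7)] -> total cost 16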

To create the images:

1. Draw the graph: Represent islands as circles and bridges as lines.

2. Mark costs: Write the cost on each edge.

3. Highlight steps: In each image, highlight the edge being added to the MST.

3. Kruskal's Algorithm (Minimum Spanning Tree)

Same islands as Prim's.

• Kruskal's steps:

1. List and sort: [Image: List of edges and costs]

2. Add cheapest: B-C (4). [Image: B-C edge highlighted]

3. Next cheapest: A-C (5). [Image: A-C edge highlighted]

4. Next cheapest: C-D (7). [Image: C-D edge highlighted]

5. Skip B-D: (creates a cycle).

6. Skip A-B: (creates a cycle).

• Result: Same as Prim's. [Image: Final MST highlighted]
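
A sketch of Kruskal's algorithm with a small union-find for cycle detection. The B-D edge is included so the cycle-skipping step is visible; its cost of 8 is an assumed placeholder (the original figure does not give it, only that it is skipped after C-D and before A-B).

def kruskal(nodes, edges):
    """edges: list of (cost, u, v). Returns the MST edges, skipping cycle-forming ones."""
    parent = {n: n for n in nodes}

    def find(x):                      # union-find with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for cost, u, v in sorted(edges):  # consider edges in increasing order of cost
        ru, rv = find(u), find(v)
        if ru == rv:
            continue                  # u and v are already connected: this edge would form a cycle
        parent[ru] = rv
        mst.append((u, v, cost))
    return mst

edges = [(10, "A", "B"), (5, "A", "C"), (4, "B", "C"), (7, "C", "D"),
         (8, "B", "D")]               # 8 is a placeholder; the original figure omits this cost
print(kruskal("ABCD", edges))         # [('B', 'C', 4), ('A', 'C', 5), ('C', 'D', 7)]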

To create the images:

1. Draw the graph and list edges: Show all edges and their costs.

2. Highlight steps: In each image, highlight the edge being added to the MST.

3. Show cycle detection: If an edge is skipped, illustrate why it would create a cycle.

4. Traveling Salesperson Problem (TSP)

Imagine these cities:

Distances: A-B = 10, A-C = 20, A-D = 15, B-C = 35, B-D = 25, C-D = 30

• Nearest Neighbor:

1. Start at A. [Image: A highlighted]

2. Nearest to A: B (10). [Image: A-B edge highlighted]

3. Nearest to B: D (25). [Image: B-D edge highlighted]

4. Nearest to D: C (30). [Image: D-C edge highlighted]

5. Return to A (20). [Image: C-A edge highlighted]

• Result: Route A-B-D-C-A. Total distance: 85. [Image: Complete route highlighted]
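
A sketch of the nearest-neighbour heuristic on the distance table above; storing the symmetric distances in a dict is an illustrative choice.

dist = {("A", "B"): 10, ("A", "C"): 20, ("A", "D"): 15,
        ("B", "C"): 35, ("B", "D"): 25, ("C", "D"): 30}

def d(u, v):
    return dist.get((u, v)) or dist[(v, u)]   # distances are symmetric

def nearest_neighbor(cities, start):
    route, current = [start], start
    unvisited = set(cities) - {start}
    while unvisited:
        nxt = min(unvisited, key=lambda c: d(current, c))   # greedily pick the closest unvisited city
        route.append(nxt)
        unvisited.remove(nxt)
        current = nxt
    route.append(start)                                     # return to the starting city
    return route, sum(d(route[i], route[i + 1]) for i in range(len(route) - 1))

print(nearest_neighbor("ABCD", "A"))   # (['A', 'B', 'D', 'C', 'A'], 85)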

To create the images:

1. Draw the graph: Represent cities as circles and roads as lines.

2. Mark distances: Write the distance on each edge.

3. Highlight steps: In each image, highlight the edge being added to the route.

5. Knapsack Problem

Imagine you have a knapsack that can hold 10 kg.

• Items:

o Item 1: 5 kg, $10

o Item 2: 4 kg, $40

o Item 3: 6 kg, $30

• 0/1 Knapsack:

o Try different combinations:

▪ Item 1 only: 5 kg, $10

▪ Item 2 only: 4 kg, $40

▪ Item 3 only: 6 kg, $30

▪ Item 1 + Item 2: 9 kg, $50

▪ Item 2 + Item 3: 10 kg, $70 (best; see the brute-force sketch after this list)

▪ ...

• Fractional Knapsack:

1. Value/weight: Item 1 (2), Item 2 (10), Item 3 (5).

2. Take Item 2 (4 kg, $40).

3. Take Item 3 (6 kg, $30).
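
For the 0/1 case above, a brute-force sketch that simply tries every combination of whole items (item data taken from the example):

from itertools import combinations

items = [(5, 10), (4, 40), (6, 30)]   # (weight kg, value $): Item 1, Item 2, Item 3
capacity = 10

best_value, best_subset = 0, ()
for r in range(len(items) + 1):
    for subset in combinations(items, r):           # every possible combination of whole items
        weight = sum(w for w, _ in subset)
        value = sum(v for _, v in subset)
        if weight <= capacity and value > best_value:
            best_value, best_subset = value, subset

print(best_value, best_subset)   # 70 ((4, 40), (6, 30)) -> Item 2 + Item 3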

To create the images:



1. Draw a knapsack: Represent the knapsack as a rectangle.

2. Draw items: Represent items as boxes with their weights and values.

3. Show combinations: In each image, show the items being considered for inclusion in the
knapsack.

Limitations:

Local Optima: The biggest limitation is that greedy algorithms can get stuck in local optima. Imagine
you're climbing a mountain, and you always take the steepest path upwards. You might reach a peak,
but it might not be the highest peak on the mountain.

(b) Explain Dynamic Programming with an example.

• Answer: Dynamic Programming (DP) is used for optimization problems by solving overlapping
subproblems and storing results to avoid redundant calculations.

o Example: Fibonacci Sequence: Fib(n) = Fib(n-1) + Fib(n-2). Using DP, store intermediate
values to avoid recalculating them, improving efficiency from exponential time to O(n).
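
A minimal sketch of both DP styles for Fibonacci:

from functools import lru_cache

@lru_cache(maxsize=None)           # memoization (top-down): each Fib(k) is computed once
def fib_memo(n):
    if n < 2:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)

def fib_bottom_up(n):              # tabulation (bottom-up): build from the smallest subproblem up
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

print(fib_memo(30), fib_bottom_up(30))   # 832040 832040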

6. Graph Algorithms:

(a) Explain DFS (Depth-First Search) with an example.

Depth-First Search (DFS) is an algorithm for traversing or searching tree or graph data structures. It starts
at the root (or some arbitrary node) and explores as far as possible along each branch before
backtracking. Think of it like exploring a maze by choosing one path and going as deep as you can until
you hit a dead end, then going back and trying another path.

How it works:

1. Start at a node: Pick any node in the graph to start.

2. Visit and mark: Mark the current node as visited.

3. Explore adjacent nodes: For each unvisited node adjacent to the current node:

o Recursively call DFS on that node.

Example:

Let's consider the following graph:

      A
     / \
    B   C
   / \   \
  D   E   F
  |
  G

We'll start at node A and perform a DFS:

1. Start at A: Mark A as visited.

2. Explore B: B is adjacent to A and unvisited. Call DFS on B.

3. Explore D: D is adjacent to B and unvisited. Call DFS on D.

4. Explore G: G is adjacent to D and unvisited. Call DFS on G.

5. G has no unvisited neighbors: Return to D.

6. Explore (nothing): D has no other unvisited neighbors. Return to B.

7. Explore E: E is adjacent to B and unvisited. Call DFS on E.

8. E has no unvisited neighbors: Return to B.

9. B has no other unvisited neighbors: Return to A.

10. Explore C: C is adjacent to A and unvisited. Call DFS on C.

11. Explore F: F is adjacent to C and unvisited. Call DFS on F.

12. F has no unvisited neighbors: Return to C.

13. C has no other unvisited neighbors: Return to A.

14. A has no other unvisited neighbors: The DFS is complete.

Order of visited nodes: A, B, D, G, E, C, F
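
A recursive DFS sketch on the adjacency list of the graph above:

graph = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"],
         "D": ["G"], "E": [], "F": [], "G": []}

def dfs(node, visited=None, order=None):
    if visited is None:
        visited, order = set(), []
    visited.add(node)                  # mark the current node as visited
    order.append(node)
    for neighbor in graph[node]:       # explore each unvisited neighbor as deeply as possible
        if neighbor not in visited:
            dfs(neighbor, visited, order)
    return order

print(dfs("A"))   # ['A', 'B', 'D', 'G', 'E', 'C', 'F']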

(b) Explain BFS (Breadth-First Search) with an example.

Breadth-First Search (BFS) is a graph traversal algorithm that explores a graph level by level. It starts at a
given node and visits all of its neighbors before moving on to the neighbors of those neighbors, and so
on. Think of it like exploring a building floor by floor, rather than going deep into one room and then
another.

How it works:

1. Start at a node: Pick any node in the graph to start.

2. Use a queue: Maintain a queue (a First-In, First-Out data structure) to keep track of the nodes to
visit.

3. Visit and enqueue:

o Mark the starting node as visited.

o Enqueue the starting node.

4. Dequeue and explore:

o While the queue is not empty:

▪ Dequeue a node from the queue.

▪ For each unvisited neighbor of the dequeued node:

▪ Mark the neighbor as visited.

▪ Enqueue the neighbor.

Example:

Let's consider the following graph:

      A
     / \
    B   C
   / \   \
  D   E   F
  |
  G

We'll start at node A and perform a BFS:

1. Start at A: Mark A as visited, enqueue A. Queue: [A]

2. Dequeue A:

o Mark B (neighbor of A) as visited, enqueue B. Queue: [B]

o Mark C (neighbor of A) as visited, enqueue C. Queue: [B, C]

3. Dequeue B:

o Mark D (neighbor of B) as visited, enqueue D. Queue: [C, D]

o Mark E (neighbor of B) as visited, enqueue E. Queue: [C, D, E]

4. Dequeue C:

o Mark F (neighbor of C) as visited, enqueue F. Queue: [D, E, F]

5. Dequeue D:

o Mark G (neighbor of D) as visited, enqueue G. Queue: [E, F, G]

6. Dequeue E: (No unvisited neighbors)

7. Dequeue F: (No unvisited neighbors)

8. Dequeue G: (No unvisited neighbors)

9. Queue is empty: The BFS is complete.

Order of visited nodes: A, B, C, D, E, F, G
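
A BFS sketch using a queue on the same graph:

from collections import deque

graph = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"],
         "D": ["G"], "E": [], "F": [], "G": []}

def bfs(start):
    visited, order = {start}, []
    queue = deque([start])             # FIFO queue of nodes waiting to be visited
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph[node]:   # visit all neighbours on this level before going deeper
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return order

print(bfs("A"))   # ['A', 'B', 'C', 'D', 'E', 'F', 'G']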

7. NP-Complete Problems:

(a) What are NP-Complete problems? Explain with an example.

Answer: NP-complete problems are easiest to understand alongside the related complexity classes P, NP, and NP-hard, explained below with examples:

1. P (Polynomial Time):

• Definition: Problems that can be solved by a deterministic algorithm in polynomial time.
Polynomial time means the number of steps the algorithm takes grows as a polynomial function
of the input size (e.g., n, n², n³, but not 2ⁿ or n!). These are considered "easy" problems for
computers.

• Example:

o Searching: Finding an element in an unsorted array (linear search - O(n)).

o Sorting: Sorting an array of numbers (e.g., merge sort - O(n log n)).

o Adding two numbers: Even if the numbers are huge, the number of operations grows
only linearly with the number of digits.

2. NP (Nondeterministic Polynomial Time):

• Definition: Problems whose solutions can be verified in polynomial time. This means that if
someone gives you a potential solution, you can quickly (in polynomial time) check if it's correct.
Finding the solution itself might be hard, but checking a proposed solution is easy.

• Example:

o The Traveling Salesperson Problem (TSP): Given a list of cities and the distances
between them, find the shortest route that visits each city exactly once and returns to
the starting city. If someone gives you a proposed route, you can easily check if it's valid
(visits each city once) and calculate its total distance.

o The Subset Sum Problem: Given a set of integers, is there a subset whose elements sum
to a given target value? If someone gives you a subset, you can quickly add up the
numbers and see if it equals the target (a small verification sketch follows this list).

o Graph Coloring: Can the vertices of a graph be colored with a given number of colors
such that no two adjacent vertices have the same color? If someone gives you a
coloring, you can quickly check if any adjacent vertices have the same color.
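
To make the "easy to verify" idea concrete, here is a minimal polynomial-time checker for the Subset Sum problem mentioned above; the numbers and target are placeholders for illustration.

def verify_subset_sum(numbers, candidate_subset, target):
    """Polynomial-time check of a proposed certificate for Subset Sum."""
    # The candidate must only use elements of the original multiset...
    remaining = list(numbers)
    for x in candidate_subset:
        if x not in remaining:
            return False
        remaining.remove(x)
    # ...and its elements must sum to the target.
    return sum(candidate_subset) == target

print(verify_subset_sum([3, 34, 4, 12, 5, 2], [4, 12, 5], 21))   # True
print(verify_subset_sum([3, 34, 4, 12, 5, 2], [3, 34], 21))      # False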

3. NP-complete:

• Definition: Problems that are both in NP and NP-hard. They are the "hardest" problems in NP. If
you could solve any NP-complete problem in polynomial time, you could solve all problems in NP
in polynomial time (meaning P = NP). No one has ever proven that P = NP, and most computer
scientists believe they are not equal.

• Key Idea: NP-complete problems are all "inter-reducible." If you can transform one NP-complete
problem into another NP-complete problem in polynomial time, you can solve one if and only if
you can solve the other.

• Examples:

o The Traveling Salesperson Problem (TSP) (as mentioned above).

o The Boolean Satisfiability Problem (SAT): Given a Boolean formula, is there an
assignment of truth values to the variables that makes the formula true?

o The Clique Problem: In a graph, is there a complete subgraph (a clique) of a given size?

o The Vertex Cover Problem: In a graph, is there a set of vertices such that every edge has
at least one endpoint in this set?

o The Knapsack Problem (Decision version): Given a set of items, each with a weight and
a value, and a maximum weight capacity, can you select a subset of items whose total
value is at least a given target value?

4. NP-hard:

• Definition: Problems that are at least as hard as the hardest problems in NP. This means that all
problems in NP can be reduced to them in polynomial time. However, NP-hard problems don't
have to be in NP. In fact, some NP-hard problems are known to be outside of NP.

• Key Idea: If you could solve an NP-hard problem in polynomial time, you could solve all problems
in NP in polynomial time.

• Examples:

o The Halting Problem: Given a computer program and an input, will the program
eventually halt (stop running), or will it run forever? This problem is undecidable
(outside of NP). It's NP-hard because if you could solve it, you could solve any problem in
NP.

o Optimization versions of NP-complete problems: For example, finding the absolute
shortest route in the TSP (rather than just determining if there's a route shorter than a
certain length). Optimization problems are often harder than their decision versions.

o Unsolvable (undecidable) problems: Problems for which no algorithm can ever be constructed to solve
them. Some of these, like the halting problem above, are also NP-hard.

In summary:

• P: Problems you can solve quickly.

• NP: Problems whose solutions you can check quickly.

• NP-complete: The hardest problems in NP. If you can solve one quickly, you can solve them all
quickly (and P = NP).

• NP-hard: At least as hard as NP-complete problems, but they don't have to be in NP. Some are
even unsolvable.

The relationship is often visualized as nested sets: P lies inside NP, the NP-complete problems form the "hardest" part of NP, and NP-hard extends beyond NP to include problems that are not in NP at all.

What to do when you encounter an NP-complete problem:

Since finding exact solutions to NP-complete problems is believed to be very difficult, here are some
common approaches:

Approximation algorithms: Find a solution that is "good enough," even if it's not the absolute best.

Heuristics: Use rules of thumb or educated guesses to find a solution that is likely to be good.

Special cases: If the problem has some special constraints, it might become solvable in polynomial time.

Randomized algorithms: Use randomness to find a solution that is correct with high probability.
