CACS201 Unit 8 - Sorting
Sorting
Contents
• Introduction
• Internal and External Sort
• Insertion and Selection Sort
• Exchange Sort
• Bubble and Quick Sort
• Merge and Radix Sort
• Shell Sort
• Binary Sort
• Heap Sort as Priority Queue
• Efficiency of Sorting
• Big ‘O’ Notation
Sorting
• Sorting refers to the process of arranging the data elements of an array in a specified
order, that is, either in ascending or descending order.
• For example, it will be practically impossible for us to find a name in the telephone
directory if the names in it are not in alphabetical order.
• The same is true for dictionaries, book indexes, bank accounts, and
so on.
• Hence, the convenience of having sorted data is unquestionable.
• Retrieval of information becomes much easier when the data is stored in some specified
order.
• Therefore, sorting is a very important application in computer science.
• Let us take an array which is declared and initialized as:
int array[] = {10, 25, 17, 8, 30, 3} ;
• Then, the array after applying the sorting technique is:
array[] = {3, 8, 10, 17, 25, 30} ;
Sorting Algorithm
• A sorting algorithm can be defined as an algorithm which puts the
data elements of an array/list in a certain order, that is, either
numerical order or any predefined order.
• Many sorting algorithms are available, and each one is suited to a
different environment, for example depending on where the data to be
sorted is stored.
• The two basic categories of sorting methods are:
• Internal Sorting
• External Sorting
Internal and External Sort
• Internal sorting refers to the process of arranging elements in an array only
when they are present in the computer’s primary or main memory.
• External sorting, on the other hand, is the process of sorting elements from
an external file by reading them from secondary memory.
• In internal sorting, all data to be sorted is stored in main memory at all
times while sorting is in process.
• In external sorting, all data to be sorted is stored outside the main memory
and only loaded into memory in small chunks as per requirement.
• External sorting is usually applied in cases when data can’t fit into the main
memory entirely.
• Internal sorting is generally faster than external sorting, because it avoids
repeated reads from and writes to the slower secondary memory.
Insertion Sort
• Insertion sort is a very simple sorting algorithm which works just as its name suggests,
that is, it inserts each element into its proper position in the final sorted list.
• To limit the wastage of memory or, we can say, to save memory, most implementations
of an insertion sort work by moving the current element past the already sorted
elements and repeatedly swapping or interchanging it with the preceding element until it
is placed in its correct position.
• Insertion Sort Technique
• Pass 1 – Initially the sorted part of the list holds only the first element, which is already sorted.
Hence, we proceed to the next step.
• Pass 2 – During the first iteration, the second element of the list is compared with the first. The
smaller value occupies the first position of the list.
• Pass 3 – During the second iteration, the third element is inserted among the first two, so that the
first three elements are in order: the smallest of them occupies the first position, the second
smallest the second position, and so on.
This procedure is repeated for all the elements of the array, for a total of (n-1) iterations.
Algorithm for an Insertion Sort
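A minimal C sketch of the technique described above, assuming an int array sorted in ascending order. The function name insertion_sort is illustrative. This version shifts the larger elements one place to the right instead of swapping pairs, which gives the same result with fewer writes.

```c
#include <stddef.h>

/* Sorts arr[0..n-1] in ascending order using insertion sort. */
void insertion_sort(int arr[], size_t n)
{
    for (size_t i = 1; i < n; i++) {
        int key = arr[i];   /* element to insert into the sorted part arr[0..i-1] */
        size_t j = i;

        /* Shift every sorted element larger than key one place to the right. */
        while (j > 0 && arr[j - 1] > key) {
            arr[j] = arr[j - 1];
            j--;
        }
        arr[j] = key;       /* place key in its correct position */
    }
}
```

For example, calling insertion_sort(array, 6) on {10, 25, 17, 8, 30, 3} leaves the array as {3, 8, 10, 17, 25, 30}. On an already sorted array the inner loop never runs, which is where the O(n) best case comes from.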
Insertion Sort Example
Complexity of an Insertion Sort
• In an insertion sort, the best case will happen when the array is
already sorted, and in that case the running time of the algorithm is
O(n) (i.e., linear running time).
• Obviously, the worst case will happen when the array is sorted in the
reverse order.
• Thus, in that case the running time of the algorithm is O(n^2) (i.e.,
quadratic running time).
Selection Sort
• Selection sort is a sorting technique that works by finding the smallest
value in the array and placing it in the first position.
• After that, it then finds the second smallest value and places it in the
second position.
• This process is repeated until the whole array is sorted.
• Thus, the selection sort works by finding the smallest unsorted element
remaining in the entire array and then swapping it with the element in the
next position to be filled.
• It is a very simple technique and it is also easier to implement than other
sorting techniques.
• Selection sort is useful for sorting files with large records, because it performs
at most (n-1) swaps.
Selection Sort
• Let us take an array ARR with N elements in it.
• Now, the selection sort technique works as follows:
• First of all, we will find the smallest value in the entire array, and we will place that value in the
first position of the array.
• Then, we will find the second smallest value in the array and we will place it in the second
position of the array.
• Now, we will repeat this process until the whole array is sorted.
• Pass 1 – Find the position POS of the smallest value in the array of N elements and interchange
ARR[POS] with ARR[0]. Hence, ARR[0] is sorted.
• Pass 2 – Find the position POS of the smallest value in the remaining N-1 elements and interchange
ARR[POS] with ARR[1]. Hence, ARR[0] and ARR[1] are sorted.
.
.
• Pass N-1 – Find the position POS of the smaller of the elements ARR[N-2] and ARR[N-1] and
interchange ARR[POS] with ARR[N-2]. Hence, ARR[0], ARR[1], . . . ARR[N-1] are sorted.
Algorithm for a Selection Sort
• Consider an array ARR having N elements from ARR[0] to ARR[N-1]. I
and J are the looping variables, and POS holds the position of the smallest
element found in the current pass.
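A minimal C sketch of selection sort using the variables named above (written in lower case as arr, n, i, j and pos). The function name selection_sort is illustrative, and ascending order on an int array is assumed.

```c
#include <stddef.h>

/* Sorts arr[0..n-1] in ascending order using selection sort. */
void selection_sort(int arr[], size_t n)
{
    for (size_t i = 0; i + 1 < n; i++) {
        size_t pos = i;                 /* position of the smallest value seen so far */

        /* Find the smallest value in the unsorted part arr[i..n-1]. */
        for (size_t j = i + 1; j < n; j++) {
            if (arr[j] < arr[pos])
                pos = j;
        }

        /* Interchange arr[pos] with arr[i], so that arr[0..i] is sorted. */
        if (pos != i) {
            int temp = arr[i];
            arr[i] = arr[pos];
            arr[pos] = temp;
        }
    }
}
```

Each pass makes at most one swap, so the whole sort makes at most n-1 swaps.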
Selection Sort Example
Complexity of the Selection Sort Algorithm
• Selection sort is a simple sorting technique.
• In this method, if there are n elements in the array, then (n-1) passes are
made, and the passes need (n-1), (n-2), . . ., 1 comparisons, that is,
n(n-1)/2 comparisons in total.
• Thus, the selection sort technique has a complexity of O(n^2).
Exchange Sort
• The exchange sort compares the first element with each following element
of the array, making any necessary swaps.
• When the first pass through the array is complete, the exchange sort then
takes the second element and compares it with each following element of
the array, swapping elements that are out of order.
• This sorting process continues until the entire array is ordered.
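The description above can be sketched in C as follows, assuming an int array and ascending order. The function name exchange_sort is illustrative.

```c
#include <stddef.h>

/* Sorts arr[0..n-1] in ascending order using exchange sort. */
void exchange_sort(int arr[], size_t n)
{
    for (size_t i = 0; i + 1 < n; i++) {
        /* Compare arr[i] with each following element. */
        for (size_t j = i + 1; j < n; j++) {
            if (arr[j] < arr[i]) {      /* out of order: swap immediately */
                int temp = arr[i];
                arr[i] = arr[j];
                arr[j] = temp;
            }
        }
        /* arr[i] now holds the smallest value of arr[i..n-1]. */
    }
}
```

Unlike selection sort, which remembers a position and swaps once per pass, this version swaps every time it meets a smaller element.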
Exchange Sort
• The exchange sort, in some situations, is slightly more efficient than
the bubble sort.
• It is not necessary for the exchange sort to make that final complete
pass needed by the bubble sort to determine that it is finished.
Bubble Sort
• Bubble sort is a very simple sorting method.
• It works by repeatedly moving the largest element to the highest position of the
array.
• In bubble sort, we are comparing two elements at a time, and swapping is done if
they are wrongly placed.
• If the element at lower index or position is greater than the element at a higher
index, then in that case both the elements are interchanged so that the smaller
element is placed before the bigger one.
• This process is repeated until the list becomes sorted.
• Bubble sort gets its name from the way that the smaller elements “bubble” to the
top of the array.
• This sorting technique only uses comparisons to operate on the elements.
• Hence, we can also call it a comparison sort.
Bubble Sort
• The basic idea of bubble sort is as follows: if an array ARR contains n
elements, then at most (n – 1) iterations (passes) are required to sort the array.
• Pass 1 – During the first iteration, the largest value in the array is placed at the last
position.
• Pass 2 – During the second iteration, the second largest value of the array is placed in the
second last position.
• Pass 3 – During the third iteration, the third largest value of the array is placed in the
third last position and so on.
• This procedure is repeated until all the elements in the array are scanned and are placed
in their correct position, which means that the array is sorted.
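A minimal C sketch of the passes described above, assuming an int array and ascending order. The function name bubble_sort and the swapped flag are illustrative additions. The flag stops the sort after a pass that makes no swaps, which is what allows an already sorted array to finish in a single pass.

```c
#include <stdbool.h>
#include <stddef.h>

/* Sorts arr[0..n-1] in ascending order using bubble sort. */
void bubble_sort(int arr[], size_t n)
{
    for (size_t pass = 0; pass + 1 < n; pass++) {
        bool swapped = false;

        /* After this loop the largest unsorted value sits at index n-1-pass. */
        for (size_t j = 0; j + 1 < n - pass; j++) {
            if (arr[j] > arr[j + 1]) {
                int temp = arr[j];
                arr[j] = arr[j + 1];
                arr[j + 1] = temp;
                swapped = true;
            }
        }

        if (!swapped)       /* no swaps in this pass: the array is sorted */
            break;
    }
}
```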
Bubble Sort Example
Complexity of the Bubble Sort
• The bubble sort is one of the least efficient sorting algorithms, and hence it
is not commonly used.
• In the best case, the running time of the bubble sort is O(n), that is,
when the array is already sorted and the sort stops after a pass that makes no swaps.
• Otherwise, its complexity in the average and worst cases is
poor, that is, O(n^2).
Quick Sort
• Quick sort, also known as partition exchange sort and developed by C. A. R. Hoare, is a
widely used sorting algorithm which uses the divide and conquer approach, as
merge sort does.
• Here, we divide a single unsorted array into two smaller sub-arrays.
• The divide and conquer method means dividing the bigger problem into two smaller
problems, and then those two smaller problems into smaller problems, and so on.
• In practice, quick sort is often faster than the other sorting algorithms which have time
complexity O(n log n).
• Working of Quick Sort
1. An element called pivot is selected from the array elements.
2. After choosing the pivot element, all the elements of the array are rearranged such that all the
elements less than the pivot element will be on left side, and all the elements greater than the
pivot element will be placed on the right side of the pivot element. After rearranging all the
elements, the pivot is now placed in its final position. Thus, this process is known as
partitioning.
3. Now, the two sub-arrays obtained are sorted recursively.
Quick Sort
• Quick Sort Technique
1. Initially set the index of the first element to LEFT and POS. Similarly, set the index of the last
element to RIGHT. Now, LEFT = 0, POS = 0, RIGHT = N – 1 (assuming N elements in the array).
2. We will start with the last element which is pointed to by RIGHT, and we will traverse each
element in the array from right to left, comparing each element with the pivot element pointed
to by POS. ARR[POS] should always be less than or equal to ARR[RIGHT].
• If ARR[POS] is less than or equal to ARR[RIGHT], then decrement RIGHT and continue comparing until RIGHT = POS.
• If RIGHT = POS then it means that pivot is placed in its correct position.
• If ARR[RIGHT] < ARR[POS], then swap the two values, set POS = RIGHT, and go to the next step.
3. We will start from the first element which is pointed to by LEFT, and we will traverse every
element in the array from left to right, comparing each element with the element pointed to by
POS. ARR[POS] should always be greater than or equal to ARR[LEFT].
• If ARR[POS] is greater than or equal to ARR[LEFT], then increment LEFT and continue comparing until LEFT = POS.
If LEFT = POS then it means that pivot is placed in its correct position.
• If ARR[LEFT] > ARR[POS], then swap the two values, set POS = LEFT, and go back to step 2.
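The LEFT, RIGHT and POS scheme can be sketched in C as follows. This is one possible reading of the steps above, assuming an int array, ascending order, and the first element of each sub-array as the pivot. The function names partition, quick_sort and swap_ints are illustrative.

```c
/* Exchanges the two integers pointed to by a and b. */
static void swap_ints(int *a, int *b)
{
    int temp = *a;
    *a = *b;
    *b = temp;
}

/* Places the pivot arr[left] in its final position within arr[left..right]
   and returns that position. */
static int partition(int arr[], int left, int right)
{
    int pos = left;                     /* the pivot is always at arr[pos] */

    for (;;) {
        /* Scan from right to left for a value smaller than the pivot. */
        while (pos != right && arr[pos] <= arr[right])
            right--;
        if (pos == right)
            return pos;                 /* pivot is in its correct position */
        swap_ints(&arr[pos], &arr[right]);
        pos = right;

        /* Scan from left to right for a value larger than the pivot. */
        while (pos != left && arr[left] <= arr[pos])
            left++;
        if (pos == left)
            return pos;
        swap_ints(&arr[pos], &arr[left]);
        pos = left;
    }
}

/* Sorts arr[left..right] (inclusive bounds) in ascending order. */
void quick_sort(int arr[], int left, int right)
{
    if (left < right) {
        int p = partition(arr, left, right);
        quick_sort(arr, left, p - 1);   /* elements smaller than the pivot */
        quick_sort(arr, p + 1, right);  /* elements larger than the pivot */
    }
}
```

An array of N elements is sorted with quick_sort(arr, 0, N - 1). Because the pivot is always the leftmost element, an already sorted input gives the most unbalanced partitions.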
Complexity of Quick Sort
• The running time of quick sort is O(n log n) in the average
and the best case.
• However, the worst case will happen if the array is already sorted and
the leftmost element is selected as the pivot element.
• In the worst case, its efficiency is O(n^2).
Merge Sort
• Merge sort is a sorting method which follows the divide and conquer approach.
• In the divide and conquer approach, divide means partitioning the
array having n elements into two sub-arrays of n/2 elements each.
• However, if there are no elements present in the list/array or if an array contains only one
element, then it is already sorted.
• If an array has more elements, then it is divided into two sub-arrays of equal
(or nearly equal) size.
• Conquer is the process of sorting the two sub-arrays recursively using merge sort.
• Finally, the two sub-arrays are merged into one single sorted array.
• Merge Sort Technique
1. If the array has zero or one element in it, then there is no need to sort that array as it is already sorted.
2. Otherwise, if there are more elements in the array, then divide the array into two sub-arrays of
equal (or nearly equal) size.
3. Each sub-array is now sorted recursively using merge sort.
4. Finally, the two sub-arrays are merged into a single sorted array.
Sort the following array using merge sort.
int array[] = { 40, 10, 86, 44, 93, 26, 69, 17 }
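A minimal C sketch of merge sort that could be applied to this array, assuming ascending order. The names merge, merge_sort_range and merge_sort are illustrative, and the temporary buffer allocated with malloc is an implementation choice of this sketch.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Merges the sorted halves arr[lo..mid-1] and arr[mid..hi-1] through tmp. */
static void merge(int arr[], int tmp[], size_t lo, size_t mid, size_t hi)
{
    size_t i = lo, j = mid, k = lo;

    while (i < mid && j < hi)           /* take the smaller front element */
        tmp[k++] = (arr[i] <= arr[j]) ? arr[i++] : arr[j++];
    while (i < mid)                     /* copy what is left of the first half */
        tmp[k++] = arr[i++];
    while (j < hi)                      /* copy what is left of the second half */
        tmp[k++] = arr[j++];

    memcpy(arr + lo, tmp + lo, (hi - lo) * sizeof arr[0]);
}

/* Recursively sorts arr[lo..hi-1]. */
static void merge_sort_range(int arr[], int tmp[], size_t lo, size_t hi)
{
    if (hi - lo < 2)                    /* zero or one element: already sorted */
        return;
    size_t mid = lo + (hi - lo) / 2;    /* divide */
    merge_sort_range(arr, tmp, lo, mid);    /* conquer the first half */
    merge_sort_range(arr, tmp, mid, hi);    /* conquer the second half */
    merge(arr, tmp, lo, mid, hi);       /* combine */
}

/* Sorts arr[0..n-1]; returns 0 on success, -1 if memory cannot be allocated. */
int merge_sort(int arr[], size_t n)
{
    if (n < 2)
        return 0;
    int *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    merge_sort_range(arr, tmp, 0, n);
    free(tmp);
    return 0;
}
```

Calling merge_sort(array, 8) on the array above leaves it as {10, 17, 26, 40, 44, 69, 86, 93}.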
Complexity of Merge Sort
• The running time of the merge sort algorithm is O(n log n).
• This runtime remains the same in the average as well as in the worst
case of the merge sort algorithm.
• The standard algorithm also takes O(n log n) time in the best case; only
variants such as natural merge sort reach O(n) on an already sorted array.
Radix Sort
• Radix sort is a linear-time sorting algorithm that is used for integers.
• In radix sort, sorting is performed digit by digit, starting from the least significant digit and moving to the
most significant digit.
• The process of radix sort is similar to sorting students' names in alphabetical order.
• In this case, the radix is 26, because there are 26 letters in the English alphabet.
• In the first pass, the names of students are grouped according to the ascending order of the first letter of
their names.
• After that, in the second pass, their names are grouped according to the ascending order of the second
letter of their name.
• And the process continues until we find the sorted list.
• Algorithm:
radixSort(arr)
    max = largest element in the given array
    d = number of digits in the largest element (or, max)
    Now, create 10 buckets, one for each digit 0 - 9
    for i -> 0 to d - 1
        sort the array elements using counting sort (or any stable sort) according to the digits at the ith place
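The algorithm above can be sketched in C as follows. This is a minimal version assuming non-negative int values and decimal digits; the names counting_sort_by_digit and radix_sort are illustrative, and the stable sort used for each digit is counting sort.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stable counting sort of arr[0..n-1] on the decimal digit selected by
   exp (1 = units, 10 = tens, 100 = hundreds, ...). */
static void counting_sort_by_digit(int arr[], int out[], size_t n, long long exp)
{
    size_t count[10] = {0};

    for (size_t i = 0; i < n; i++)              /* count each digit value */
        count[(arr[i] / exp) % 10]++;
    for (int d = 1; d < 10; d++)                /* prefix sums give end positions */
        count[d] += count[d - 1];
    for (size_t i = n; i-- > 0; ) {             /* walk backwards to stay stable */
        int d = (int)((arr[i] / exp) % 10);
        out[--count[d]] = arr[i];
    }
    memcpy(arr, out, n * sizeof arr[0]);
}

/* Sorts non-negative integers in arr[0..n-1]; returns 0 on success,
   -1 if memory cannot be allocated. */
int radix_sort(int arr[], size_t n)
{
    if (n < 2)
        return 0;
    int *out = malloc(n * sizeof *out);
    if (out == NULL)
        return -1;

    int max = arr[0];                           /* largest element fixes the pass count */
    for (size_t i = 1; i < n; i++)
        if (arr[i] > max)
            max = arr[i];

    /* One pass per digit of max, least significant digit first. */
    for (long long exp = 1; max / exp > 0; exp *= 10)
        counting_sort_by_digit(arr, out, n, exp);

    free(out);
    return 0;
}
```

The digit selector exp is a long long so that multiplying it by 10 cannot overflow for any int value of max.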
Working of Radix sort Algorithm
• First, we have to find the largest element (suppose max) from the given array.
• Let 'x' be the number of digits in max. 'x' is calculated because we
need to go through the significant places of all elements.
• After that, go through each significant place one by one.
• Here, we have to use any stable sorting algorithm to sort the digits of each
significant place.
• In the given array, the largest element is 736, which has 3 digits in it. So, the loop
will run three times (i.e., up to the hundreds place). That means three passes
are required to sort the array.
• Now, first sort the elements on the basis of the units place digits (i.e., the 0th place). Here,
we are using the counting sort algorithm to sort the elements.
• Pass 1:
• In the first pass, the list is sorted on the basis of the digits at the units place.
• After the first pass, the array elements are -
Working of Radix sort Algorithm
• Pass 2:
• In this pass, the list is sorted on the basis of the next
significant digits (i.e., digits at the tens place).
• After the second pass, the array elements are –
• Pass 3:
• In this pass, the list is sorted on the basis of the next
significant digits (i.e., digits at the hundreds place).
• After the third pass, the array elements are -
Radix Sort Complexity
• Best Case Complexity
• It occurs when there is no sorting required, i.e. the array is already sorted.
• The best-case time complexity of Radix sort is O(n+k).
• Average Case Complexity
• It occurs when the array elements are in jumbled order that is not properly
ascending and not properly descending.
• The average case time complexity of Radix sort is O(nk).
• Worst Case Complexity
• It occurs when the array elements are required to be sorted in reverse order.
• That means suppose you have to sort the array elements in ascending order, but its
elements are in descending order.
• The worst-case time complexity of Radix sort is O(nk).
Shell Sort
• Shell sort is a highly efficient sorting algorithm and is based on the insertion sort algorithm.
• This algorithm avoids the large shifts that occur in insertion sort when a smaller value is at the
far right and has to be moved to the far left.
• This algorithm first uses insertion sort on widely spaced elements and
then sorts the less widely spaced elements.
• This spacing is termed the interval.
• This interval is calculated based on Knuth's formula as −
Knuth's Formula
h=h*3+1
where h is interval with initial value 1
• This algorithm is quite efficient for medium-sized data sets. Its average and worst-case
time complexity depends on the gap sequence; with Knuth's sequence the worst case is O(n^(3/2)), where
n is the number of items. Shell sort works in place, so it needs only O(1) extra space.
Working of Shell Sort
• For our example and ease of understanding, we take the interval of 4.
• Make a virtual sub-list of all values located at the interval of 4 positions.
• Here these values are {35, 14}, {33, 19}, {42, 27} and {10, 44}
• We compare values in each sub-list and swap them (if necessary) in the original array. After this step,
the new array should look like this − 14, 19, 27, 10, 35, 33, 42, 44
• Then, we take interval of 2 and this gap generates two sub-lists - {14, 27, 35, 42}, {19, 10, 33, 44}
• Finally, we sort the rest of the array using interval of value 1. Shell sort uses insertion sort to sort the
array.
Working of Shell Sort
• Algorithm
• Step 1 − Initialize the value of h
• Step 2 − Divide the list into smaller sub-lists of
equal interval h
• Step 3 − Sort these sub-lists using insertion sort
• Step 4 − Reduce h and repeat until the complete list is sorted
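A minimal C sketch of these steps, assuming an int array, ascending order, and intervals taken from Knuth's formula h = h*3 + 1. The function name shell_sort is illustrative. With this formula an array of 8 elements is sorted with the intervals 4 and then 1.

```c
#include <stddef.h>

/* Sorts arr[0..n-1] in ascending order using Shell sort. */
void shell_sort(int arr[], size_t n)
{
    /* Step 1: grow h with Knuth's formula (1, 4, 13, 40, ...). */
    size_t h = 1;
    while (h < n / 3)
        h = h * 3 + 1;

    /* Steps 2 to 4: insertion-sort the sub-lists of interval h, then reduce h. */
    for (; h >= 1; h = (h - 1) / 3) {
        for (size_t i = h; i < n; i++) {
            int key = arr[i];
            size_t j = i;

            /* Shift larger elements of the same sub-list h places to the right. */
            while (j >= h && arr[j - h] > key) {
                arr[j] = arr[j - h];
                j -= h;
            }
            arr[j] = key;
        }
    }
}
```

The last round always uses h = 1, which is an ordinary insertion sort on an array that is already nearly sorted.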
Binary Sort
• Binary sort is a comparison type sorting algorithm.
• It is a modification of the insertion sort algorithm.
• In this algorithm, we also maintain one sorted and one unsorted subarray.
• The only difference is that we find the correct position of an element using binary search
instead of linear search.
• It helps to speed up the sorting algorithm by reducing the number of comparisons required.
• Binary Sort Algorithm
Let us assume that we have an unsorted array A[] containing n elements. The first
element, A[0], is already sorted and in the sorted subarray.
• Mark the first element from the unsorted subarray A[1] as the key.
• Use binary search to find the correct position p of A[1] inside the sorted subarray.
• Shift the elements from position p onwards one step to the right and insert A[1] in its correct position.
• Repeat the above steps for all the elements in the unsorted subarray.
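The steps above can be sketched in C as follows, assuming an int array and ascending order. The function name binary_insertion_sort is illustrative. The binary search looks for the first element greater than the key, so equal elements keep their original order.

```c
#include <stddef.h>

/* Sorts a[0..n-1] in ascending order using binary insertion sort. */
void binary_insertion_sort(int a[], size_t n)
{
    for (size_t i = 1; i < n; i++) {
        int key = a[i];                 /* first element of the unsorted subarray */

        /* Binary search in the sorted subarray a[0..i-1] for position p. */
        size_t lo = 0, hi = i;
        while (lo < hi) {
            size_t mid = lo + (hi - lo) / 2;
            if (a[mid] <= key)
                lo = mid + 1;
            else
                hi = mid;
        }

        /* Shift a[lo..i-1] one step to the right and insert the key at p = lo. */
        for (size_t j = i; j > lo; j--)
            a[j] = a[j - 1];
        a[lo] = key;
    }
}
```

The search needs about log i comparisons for the i-th element, but the shifting loop can still move up to i elements.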
Binary Sort Example
Binary Sort Complexity
Time Complexity
• Average Case
• Binary search has logarithmic complexity log n compared to the linear complexity n of the linear search used in
insertion sort.
• Using binary search for each of the n elements gives O(n log n) comparisons.
• However, the elements must still be shifted to make room for each key, so the overall time complexity is of the
order of O(n^2).
• Worst Case
• The worst case occurs when the array is sorted in reverse order, and the maximum number of shifts is required.
• The worst-case time complexity is O(n^2).
• Best Case
• The best case occurs when the array is already sorted, and no shifting of elements is required.
• The best-case time complexity is O(n log n), because the binary search is still performed for every element.
Space Complexity
• Space complexity for the binary sort algorithm is O(1) because no extra memory other than a
temporary variable is required.
Heap Sort
• Heap sort is a non-linear sort.
• Heap sort is based on a heap structure which is a special
type of binary tree.
• Heap sort consists of two phases:
• Creation of heap
• Processing of heap
• A heap of size ‘n’ is a binary tree of ‘n’ nodes that satisfies
the following two constraints:
• The binary tree is an almost complete binary tree.
• Keys in nodes are arranged such that the value of each
node is less than or equal to the value of its father or parent
node, which means that for each node info[i] <= info[j], where j is
the father of node i. Such a heap is called a max heap.
• Here, in heap, each level of binary tree is filled from left to
right and a new node is not placed on a new level until the
preceding level is full.
Creating a Heap
• The unsorted keys are taken sequentially one after the other and added into a heap. The size of
heap grows with the addition of each key.
• The ith key, that is, k_i, is added into the present heap of size (i − 1), and a heap of size i is generated.
• Initially, the new node is placed in the heap of size (i − 1) so that the almost complete constraint is
fulfilled. The value of the key k_i is then compared to the value of its parent key. If k_i is greater, the
contents of the newly added node and its parent node are interchanged.
• This process continues until either k_i is at the root node or the parent's key value is not less than k_i.
• The final tree is a heap of size i.
• Here, the resulting heap is stored in the array level by level, from left to right. The root is stored in
heap[0] and the last node in heap[maxnodes-1], where maxnodes is the number of nodes
in the heap.
• We can note that, with the root at heap[0], the two children of any node heap[i] reside in heap[2*i+1] and
heap[2*i+2].
• If we want to see the parent of any node heap[k] other than the root, we can get it at heap[(k-1)/2].
Algorithm of Creating a Heap
1. Start
2. s = i ;
3. Store the new key in the i th node and find the index of its parent node in the array as,
key[s] = newkey ;
parent = (s - 1) / 2 ;
4. while ( s != 0 && key[parent] < key[s] )
{
/* Exchange the parent and the child nodes */
temp = key[parent] ;
key[parent] = key[s] ;
key[s] = temp ;
/* Advance the new node one level up in the tree */
s = parent ;
parent = (s - 1) / 2 ;
}
5. Stop.
Example of Creating a Heap of keys 10, 20, 9, 4, 15, 17
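One way to trace this example is with a small C sketch of the heap creation procedure. It assumes a max heap stored from index 0, so the parent of node s is at index (s - 1) / 2. The function names heap_insert and build_heap are illustrative.

```c
#include <stddef.h>

/* Adds newkey as node i (0-based) of the max heap held in key[0..i-1]. */
void heap_insert(int key[], size_t i, int newkey)
{
    size_t s = i;
    key[s] = newkey;

    /* Move the new key up while it is greater than its parent. */
    while (s != 0) {
        size_t parent = (s - 1) / 2;
        if (key[parent] >= key[s])
            break;                      /* parent is not less: heap property holds */
        int temp = key[parent];
        key[parent] = key[s];
        key[s] = temp;
        s = parent;                     /* advance one level up in the tree */
    }
}

/* Turns key[0..n-1] into a max heap by adding the keys one after the other. */
void build_heap(int key[], size_t n)
{
    for (size_t i = 1; i < n; i++)
        heap_insert(key, i, key[i]);
}
```

Applied to the keys 10, 20, 9, 4, 15, 17 in that order, build_heap leaves the array as {20, 15, 17, 4, 10, 9}: 20 rises above 10, 15 rises above 10, and 17 rises above 9.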
Processing a Heap
• We may note that the resulting heap depends on the initial ordering
of the unsorted list. For different order of the input list, the heap
would be different.
• Now, we have to process the heap in order to generate a sorted list of
keys.
• We know that the largest element is at the top of the heap, which is
stored in the array at position heap[0].
• We interchange heap[0] with the last element in the heap array, heap[maxnodes-1],
so that the largest element is in its proper place.
• We then adjust the array to be a heap of size (n – 1).
• Again interchange heap[0] with heap[n – 2], adjusting the array to
be a heap of size (n – 2) and so on.
• At the end, we get an array that contains the keys in sorted order.
Algorithm of Processing a Heap
Step 1: Start.
Step 2: Interchange the root node with the last node in the heap array.
Step 3: At present, heap[maxnodes-1] is in its correct position, so it is no longer treated as part of the heap.
Step 4: Now, compare the new root value with its left child value.
Step 5: If the new root value is smaller than its left child, then compare the left child with its right sibling. Otherwise,
go to Step 7.
Step 6: If the left child value is larger than its right sibling, then swap root with the left child. Otherwise, swap
root with its right child.
Step 7: If the root value is larger than its left child, then compare the root value with its right child value.
Step 8: If the root value is smaller than its right child, then swap root with the right child. Otherwise, stop the
process.
Step 9: Repeat the same comparisons one level down until the old root value is fixed at its exact position.
Step 10: Repeat Steps 2 to 9 for the new root till the heap contains one element.
Step 11: Stop.
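Both phases of heap sort can be sketched in C as follows, assuming an int array, ascending order, and a max heap stored from index 0. The names sift_down and heap_sort are illustrative. To keep the sketch short, the heap is built bottom-up with sift_down rather than by adding the keys one after the other; both approaches produce a valid max heap.

```c
#include <stddef.h>

/* Restores the max heap property for the subtree rooted at root,
   looking only at heap[0..size-1]. */
static void sift_down(int heap[], size_t root, size_t size)
{
    for (;;) {
        size_t largest = root;
        size_t left = 2 * root + 1;
        size_t right = left + 1;

        if (left < size && heap[left] > heap[largest])
            largest = left;
        if (right < size && heap[right] > heap[largest])
            largest = right;
        if (largest == root)
            return;                     /* root is not smaller than its children */

        int temp = heap[root];          /* swap root with its larger child */
        heap[root] = heap[largest];
        heap[largest] = temp;
        root = largest;                 /* continue one level down */
    }
}

/* Sorts heap[0..n-1] in ascending order using heap sort. */
void heap_sort(int heap[], size_t n)
{
    /* Phase 1: creation of the heap. */
    for (size_t i = n / 2; i-- > 0; )
        sift_down(heap, i, n);

    /* Phase 2: processing of the heap. */
    for (size_t size = n; size > 1; size--) {
        int temp = heap[0];             /* move the largest key to the end */
        heap[0] = heap[size - 1];
        heap[size - 1] = temp;
        sift_down(heap, 0, size - 1);   /* re-adjust a heap of one element less */
    }
}
```

Calling heap_sort on the keys 10, 20, 9, 4, 15, 17 leaves the array as {4, 9, 10, 15, 17, 20}, and no memory beyond a few local variables is used.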
Complexity of Heap Sort
• Time complexity of heap sort:
• Best case = Ω (n log n)
• Average case = Θ (n log n)
• Worst case = O (n log n)
• Space complexity of heap sort:
• O (1) is the space complexity of heap sort.