Improved Selection Sort Algorithm


See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/272621833


Article  in  International Journal of Computer Applications · January 2015


DOI: 10.5120/19314-0774


4 authors, including:

James Ben Hayfron-Acquah, Kwame Nkrumah University Of Science and Technology
Obed Appiah, University of Energy and Natural Resources

All content following this page was uploaded by Obed Appiah on 22 February 2015.



International Journal of Computer Applications (0975 – 8887)
Volume 110 – No. 5, January 2015

Improved Selection Sort Algorithm


J. B. Hayfron-Acquah, Ph.D., Department of Computer Science, Kwame Nkrumah University of Science and Technology, Kumasi, Ghana
Obed Appiah, University of Energy and Natural Resources, Sunyani, Ghana
K. Riverson, Ph.D., CSIR, Accra, Ghana

ABSTRACT
One of the basic problems of Computer Science is sorting a list of items, that is, arranging numerical, alphabetical or character data in some specified order. Bubble, Insertion, Selection, Merge and Quick sort are the most common sorting algorithms, and their performance differs with the size of the list to be sorted. As the size of a list increases, some sorting algorithms tend to perform better than others, and in most cases programmers select algorithms that perform well even as the size of the input data increases. As the size of a dataset increases, there is always the chance of duplication or some form of redundancy occurring in the list. For example, a list of the ages of students on a university campus is likely to have the majority of its values repeating. A new algorithm is proposed which can sort faster than most sorting algorithms in such cases. The improved selection sort algorithm is a modification of the existing selection sort, but here the number of passes needed to sort the list is based not solely on the size of the list but on the number of distinct values in the dataset. This offers far better performance than the old selection sort where there are redundancies in the list.

General Terms
Algorithms, Sorting Algorithms

Keywords
Algorithms, sorting algorithms, selection sort, improved selection sort, redundancies in dataset

1. INTRODUCTION
One of the basic problems of Computer Science is sorting a list of items. This is the arrangement of a set of items in either increasing or decreasing order. The formal definition of the sorting problem is as follows:

Input: A sequence of n numbers in some random order (a1, a2, a3, ..., an)

Output: A permutation (a'1, a'2, a'3, ..., a'n) of the input sequence such that a'1 ≤ a'2 ≤ a'3 ≤ ... ≤ a'n

For instance, if the given input of numbers is 59, 41, 31, 41, 26, 58, then the output sequence returned by a sorting algorithm will be 26, 31, 41, 41, 58, 59 [1].

Sorting is considered a fundamental operation in Computer Science because it is used as an intermediate step in many programs. For example, the binary search algorithm (one of the fastest search algorithms) requires the data to be sorted before the search can be done correctly. Data is generally sorted to facilitate searching. As a result of its key role in computing, several techniques for sorting have been proposed. The bubble, insertion, selection, merge, heap and quick sort are some of the common sorting algorithms. Because of the high number of sorting algorithms available, the best one for a particular application depends on various factors, which were summarised by Jadoon et al. (2011) as:

• The size of the list (number of elements to be sorted).
• The extent to which the given input sequence is already sorted.
• The probable constraints on the given input values.
• The system architecture on which the sorting operation will be performed.
• The type of storage devices to be used: main memory or disks [4].

Almost all the available sorting algorithms can be categorized into two groups based on their complexity. The complexity of an algorithm and its relative effectiveness are directly correlated [5]. A standardized notation, Big-O, is used to describe the complexity of an algorithm. In this notation, O represents the complexity of the algorithm and n represents the size of the input. The two groups of sorting algorithms are O(n^2), which includes the bubble, insertion and selection sort, and O(n log n), which includes the merge, heap and (on average) quick sort.

1.1 Selection Sort Algorithm
The concept of the existing selection sort (SS) algorithm is simple, and it is easier to implement than others such as merge or quick sort. The algorithm does not need extra memory space in order to perform the sorting. The SS simply partitions the list into two logical parts, the sorted part and the unsorted part. Each iteration picks a value from the unsorted part and places it in the sorted part, so the sorted partition grows while the unsorted partition shrinks with each iteration. When adding to the sorted part, the algorithm makes sure that the value is placed at the right position so that the sorted partition stays in order. The process terminates when the size of the unsorted partition is one (1). The procedure that selects the value to be moved returns the minimum or the maximum value in the unsorted partition, which is then swapped into its correct position.

1.2 Algorithm: Selection Sort (a[], n)
Here a is the unsorted input list and n is the size of the list, that is, the number of items in it. After completion of the algorithm the array will be sorted. Variable max keeps the location of the maximum value in each iteration.

1. k ← n-1
2. Repeat steps 3 to 6 while k ≥ 1
3. Set max = 0


4. Repeat for count = 1 to k
      If (a[count] > a[max])
         Set max = count
      End if
5. Interchange data at location k and max
6. Set k ← k - 1

Table 1.0 shows the time complexity of the algorithm in three different situations of the input list.

Table 1.0: Time Complexity of Selection Sort Algorithm

Best case    Worst case    Average case
O(n^2)       O(n^2)        O(n^2)

Various improved selection sort algorithms have been proposed, and all perform better than the Selection Sort Algorithm. The Optimized Selection Sort Algorithm (OSSA) starts sorting the array from both ends. In a single iteration, the smallest and largest elements in the unsorted part of the array are searched for and swapped [2]. The array is logically partitioned into three parts: lower-sorted, unsorted and upper-sorted. The search for the maximum and minimum is done in the unsorted partition; the minimum is moved to the lower-sorted part and the maximum to the upper-sorted part. All values in the upper-sorted part are greater than or equal to the values in the lower-sorted part. The process continues until the whole list or array is sorted [3]. The algorithm roughly halves the run time of the selection sort, O(n^2)/2, which is better but still a time complexity of O(n^2).

The concept of the Enhanced Selection Sort Algorithm (ESSA) is to memorize the location of the previous maximum and start searching from that point in the subsequent iteration [2]. This spares the algorithm from searching for the maximum value from the beginning of the unsorted partition to the end. The technique limits the number of comparisons the algorithm performs during each iteration, so it performs better than the existing selection sort algorithm. The arrangement of the elements of the list influences the run time greatly: the same set of data may take different times to sort depending on its arrangement. The average case of the algorithm is, however, O(n^2).

The Hybrid Selection Sort Algorithm (HSSA) uses a technique that prevents the algorithm from performing unnecessary iterations by checking the unsorted partition for an ordered sequence so as to terminate quickly. When the list is fully or partially sorted, its run time is better than that of the existing selection sort algorithm. The modified selection sort algorithm uses a single Boolean variable 'FLAG' to signal the termination of execution based on the order of the list, a[i-1] >= a[i] >= a[i+1] [6]. The best scenario is when the list is already ordered; here the algorithm terminates during the first pass and hence has a run time of O(n). This also means that, when the data is not ordered, the algorithm generally behaves like the old selection sort algorithm.

2. CONCEPT OF IMPROVED SELECTION SORT ALGORITHM
Generally, a large data sample will contain repetitions. For example, a list of the ages of the citizens of a country with a population of about 10 million will contain a lot of repetitions. If ages range from 0 to 100, then each age value could have a frequency of about 100,000 (10,000,000/100). In terms of population, more than half will be below the age of fifty (50). The existing selection sort will sort such a list in the order of O(n^2) in the worst case, but the proposed algorithm can do better. The main concept of the proposed algorithm is to evaluate the data in the list and keep track of the distinct values in it. This makes it possible to perform multiple swaps in each pass, unlike the existing selection sort which performs at most one swap per pass, hence reducing the run time for sorting the list. The technique used is simple: a queue is maintained to keep the locations of all the values that are equal to the value currently held as the minimum or maximum. When the end of the list is reached, all the locations on the queue are swapped into their respective positions. The point where the subsequent search begins is computed as i ← i + x, where i points to the start of the unsorted partition and x is the number of items that were dequeued. The worst case happens when there is no repetition in the list, but the algorithm achieves its best case run time of O(n) when all the values in the list are the same or the number of distinct values is relatively small.

2.1 Improved Selection Sort Algorithm

1. Initialise i to 1.
2. Repeat steps 3-5 until i equals n.
3. Search from the beginning of the unsorted part of the list to the end.
4. Enqueue the locations of all values that are equal to the minimum (or maximum) value found.
5. Use the indices on the queue to perform swapping.

Example of Improved Selection Sort (Ascending Order)

List – A[n]
Queue – Q[n]

Initial List
A  2 2 1 5 2 5 4 4 5 5
Q  (empty)

1st Pass
A  2 2 1 5 2 5 4 4 5 5
Q  2
Index 2 added to the queue.

2nd Pass
A  1 2 2 5 2 5 4 4 5 5
Q  1 2 4
Indices 1, 2 and 4 added to the queue because they all store the same value as the minimum value during the second (2nd) Pass.


3rd Pass
A  1 2 2 2 5 5 4 4 5 5
Q  6 7
Indices 6 and 7 added to the queue because they both store the same value as the minimum value during the third (3rd) Pass.

4th Pass
A  1 2 2 2 4 4 5 5 5 5
Q  6 7 8 9
Indices 6, 7, 8 and 9 added to the queue because they all store the same value as the minimum value during the fourth (4th) Pass.

Sorted List
A  1 2 2 2 4 4 5 5 5 5

The list is sorted at the end of the fourth iteration or pass. The existing selection sort will take more time to sort the same list.

Another example with original list as follows

Original List
A=[2, 2, 4, 2, 1, 2, 2, 3, 3, 4, 4, 2, 3, 4, 1, 2, 3, 4, 4, 2]
-----------------------------------------------------------------
1st Pass
A [2, 2, 4, 2, 1, 2, 2, 3, 3, 4, 4, 2, 3, 4, 1, 2, 3, 4, 4, 2]
Q [4, 14, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Indices 4, 14 are added to the queue
After swapping
A [1, 1, 4, 2, 2, 2, 2, 3, 3, 4, 4, 2, 3, 4, 2, 2, 3, 4, 4, 2]
-----------------------------------------------------------------
2nd Pass
A [1, 1, 4, 2, 2, 2, 2, 3, 3, 4, 4, 2, 3, 4, 2, 2, 3, 4, 4, 2]
Q [3, 4, 5, 6, 11, 14, 15, 19, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Indices 3, 4, 5, 6, 11, 14, 15, 19 are added to the queue
After swapping
A [1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 4, 4, 3, 4, 3, 3, 3, 4, 4, 4]
-----------------------------------------------------------------
3rd Pass
A [1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 4, 4, 3, 4, 3, 3, 3, 4, 4, 4]
Q [12, 14, 15, 16, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Indices 12, 14, 15, 16 are added to the queue
After swapping
A [1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4]
-----------------------------------------------------------------
4th Pass
A [1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4]
Q [14, 15, 16, 17, 18, 19, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Indices 14, 15, 16, 17, 18, 19 are added to the queue
After swapping
A [1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4]

The Improved Selection Sort Algorithm (ISSA) is content sensitive, in that the distribution of the data in the list greatly influences the run time of the algorithm. The run time of the ISSA depends on the number of distinct values found in the list to be sorted. If the number of distinct values is large or equal to n, then the run time of the algorithm can be approximated as O(n^2). However, if the number is very small, the algorithm completes the sorting in the order of O(n).

Pseudocode

A[n]
Queue[n]    // Same size as the array
i ← 0
while i < (n-1)
    Rear ← 0
    Min ← A[i]
    Queue[Rear] ← i
    j ← i+1
    while j < n
        if A[j] < Min
            Min ← A[j]
            Rear ← -1
        if A[j] = Min
            Rear ← Rear + 1
            Queue[Rear] ← j
        j ← j + 1
    // Perform the swapping of values
    Front ← 0
    while (Front <= Rear)
        Temp ← A[Queue[Front]]
        A[Queue[Front]] ← A[i]
        A[i] ← Temp
        i ← i+1
        Front ← Front + 1

3. ANALYSES OF ISSA
The Improved Selection Sort Algorithm is very simple to analyse, because the time complexity or run time of the algorithm depends on two main factors:

1. Size of the list (n)
2. Number of distinct values in the list (dV)

Run Time = O(n · dV)
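
The pseudocode and the O(n · dV) bound can be illustrated with a minimal Python sketch. It assumes ascending order, so it tracks the minimum of the unsorted partition as the worked examples do, and it uses a Python list in place of the fixed-size Queue array. The function name improved_selection_sort and the returned pass count are illustrative choices, not part of the paper.

```python
def improved_selection_sort(a):
    """Sort list `a` in place (ascending) and return the number of passes made.

    Sketch of the ISSA idea: each pass scans the unsorted partition once,
    queues every index holding the current minimum, then swaps all of them
    to the front of the unsorted partition.
    """
    n = len(a)
    i = 0          # start of the unsorted partition
    passes = 0
    while i < n - 1:
        smallest = a[i]
        queue = [i]                 # indices that hold the current minimum
        for j in range(i + 1, n):
            if a[j] < smallest:
                smallest = a[j]
                queue = [j]         # smaller value found: restart the queue
            elif a[j] == smallest:
                queue.append(j)
        # Queued indices are in increasing order, so each swap moves one
        # copy of the minimum into position i without disturbing the others.
        for q in queue:
            a[q], a[i] = a[i], a[q]
            i += 1
        passes += 1
    return passes


first = [2, 2, 1, 5, 2, 5, 4, 4, 5, 5]
second = [2, 2, 4, 2, 1, 2, 2, 3, 3, 4, 4, 2, 3, 4, 1, 2, 3, 4, 4, 2]
print(improved_selection_sort(first), first)
print(improved_selection_sort(second), second)
```

On the two example lists used earlier, the sketch needs four passes each, one per distinct value. Since every pass scans at most n items and there are at most dV passes, the total work is bounded by n · dV.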


Table 1 shows the run time for a set of n values with different numbers of distinct values.

Table 1: Run time of Improved Selection Sort Algorithm (ISSA)

Number of Distinct Values    Run Time        Big-O
1                            T = n           O(n)
2                            T = 2n          O(n)
3                            T = 3n          O(n)
...                          ...             ...
n-2                          T = (n-2)n      O(n^2)
n                            T = n^2         O(n^2)

Figure 1 illustrates the relationship between distinct values in the list and the run time of the ISSA.

Fig 1: Run time of list of size n and number of distinct values using ISSA.

The performance of the Improved Selection Sort could also be enhanced by introducing the FLAG concept of the HSSA to terminate sorting when the list is already sorted.

[Chart: "Number of distinct values and worst case run time of ISSA"; categories n, n/2, n/4, n/8, n/16 and n/32 distinct values; vertical axis from 0 to 25000]

Figure 2: Number of distinct values and run time complexity of ISSA.

Fig 2 illustrates the relationship between the number of distinct values in a list and the time needed to sort it. The number is expressed as a ratio of the size of the list (n). If the number of distinct values is half the size of the list, then the algorithm will take about half the time the old selection sort algorithm takes. From figure 2, as the number of distinct values decreases, the run time for the sorting also decreases.

Decreasing distinct values: n/1, n/2, n/3, n/4, ..., n/(n-2), n/(n-1), 1

4. ANALYSIS OF SS, OSSA, ESSA, HSSA AND ISSA WITH A SAMPLE DATASET
A set of data of size 1000 was finally used to analyse the performance of the various selection sort algorithms, including the Improved Selection Sort Algorithm (ISSA). The redundancy in the set was quantified as a percentage, and 11 different sets of values were used to test the algorithms. The data redundancies in sets 1 through 11 were 0%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% and 100%. Table 2 shows the run times of the various algorithms on the various categories of the dataset.

Table 2: Estimated run times of various Selection Sort algorithms when input datasets were not sorted

Red.    SS        OSSA      ESSA      HSSA      ISSA
0%      499500    250000    371580    498501    499500
10%     499500    250000    377050    498501    449550
20%     499500    250000    375967    498501    399600
30%     499500    250000    378348    498501    349650
40%     499500    250000    383873    498501    299700
50%     499500    250000    398155    498498    249750
60%     499500    250000    399608    498501    199800
70%     499500    250000    418296    498500    149850
80%     499500    250000    433374    498495    99900
90%     499500    250000    463134    498465    49950
100%    499500    250000    49950     998       1000
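
Some of the figures in Table 2 can be sanity-checked with simple arithmetic. The sketch below is only a reading of the reported numbers and rests on two assumptions: that the SS column counts comparisons, which for classic selection sort on n items is n(n-1)/2, and that the ISSA estimates for 0% to 90% redundancy scale that count by the non-redundant fraction. The paper does not state how the estimates were derived, and the names used here are illustrative.

```python
def selection_sort_comparisons(n):
    """Comparisons made by classic selection sort: (n-1) + (n-2) + ... + 1."""
    return n * (n - 1) // 2


N = 1000
SS = selection_sort_comparisons(N)

# ISSA column of Table 2, keyed by redundancy percentage (0% to 90% only;
# the 100% row reports 1000 and does not follow the same scaling).
ISSA_TABLE = {0: 499500, 10: 449550, 20: 399600, 30: 349650, 40: 299700,
              50: 249750, 60: 199800, 70: 149850, 80: 99900, 90: 49950}


def scaled_estimate(redundancy_percent):
    """Assumed model: SS comparison count scaled by the non-redundant share."""
    return SS * (100 - redundancy_percent) // 100


matches = all(scaled_estimate(r) == v for r, v in ISSA_TABLE.items())
print(SS, matches)  # prints: 499500 True
```

Under these assumptions the SS column equals n(n-1)/2 for n = 1000, and the ISSA entries drop below the constant OSSA figure of 250000 from 50% redundancy onwards.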


[Chart: run times of SS, OSSA, ESSA, HSSA and ISSA plotted against redundancy from 0% to 100%; vertical axis from 0 to 600000]

Figure 3: Estimated run times of various Selection Sort algorithms when input datasets were not sorted

The actual comparison of these selection sort algorithms can be made when the dataset is randomized. Here the improved selection sort algorithm (ISSA) recorded the best performance when the percentage of redundancy exceeded 50%.

5. CONCLUSION
This paper proposed a new selection sort algorithm which performs better than the existing selection sort algorithm and in most cases may have a run time in the order of O(n), which is ideal for sorting relatively large sets of data. The strength of the algorithm depends on the distinct values in the list: where there are more redundancies or repetitions in the list, it performs better than the existing selection sort algorithm and also a couple of the optimized selection sorts. In situations where the number of distinct values is very small, the algorithm may perform better than even the quick sort and merge sort algorithms, which have run time O(n log n).

6. REFERENCES
[1] Cormen, T. H., Leiserson, C. E., Rivest, R. L., and Stein, C. 2001. Introduction to Algorithms. 2nd edition. MIT Press, Cambridge, MA.

[2] Jadoon, S., Solehria, S. F., Qayum, M., “Optimized Selection Sort Algorithm is faster than Insertion Sort Algorithm: a Comparative Study”. International Journal of Electrical & Computer Sciences IJECS-IJENS Vol. 11, No. 02, 2011.

[3] Jadoon, S., Faiz S., Rehman S., Jan H., “Design & Analysis of Optimized Selection Sort Algorithm”. IJEC-IJENS Volume 11, Issue 01, 2011.

[4] Khairullah, M., “Enhancing Worst Sorting Algorithms”. International Journal of Advanced Science and Technology Vol. 56, July 2013.

[5] Kapur, E., Kumar, P. and Gupta, S., “Proposal of a two way sorting algorithm and performance comparison with existing algorithms”. International Journal of Computer Science, Engineering and Applications (IJCSEA) Vol. 2, No. 3, June 2012.

[6] “Design and Analysis of Hybrid Selection Sort Algorithm”. International Journal of Applied Research and Studies (iJARS) ISSN: 2278-9480, Volume 2, Issue 7 (July 2013). www.ijars.in

