Table of contents
Sorting helps by organizing data in a specific order, making it easier to search, analyze, and process. Let's explore the variety of sorting !
What is sorting?
Sorting refers to arranging data in a particular format: either ascending or descending.
Practical use of sorting :
Microsoft Excel: Built in functionality to sort data (there is a sort icon)
Online shopping: Online shopping websites generally have option to sort by price, review, rating etc.
Types of sorting
Depending on the space, ability - sorting algorithms can be divided into two categories.
Sorting:
Space used
In place: Sorting algos which does not require any extra space for sorting (example: Bubble sort)
Out of place: Sorting algos which requires extra space for sorting (example: Merge sort)
Stability
Stable: If a sorting algorithm keeps the sequence of similar items unchanged after sorting, it is called stable sorting (example: Insertion sort)
Unstable: If a sorting algorithm changes the order of similar items after sorting, it is called unstable sorting (example: quick sort)
Sorting Terminologies
These terms are very important, specially at the time of interviews.
Increasing order
If successive element is greater than the previous one
Example: 1,3,5,7,9,11
Decreasing order
If successive element is less than the previous one
Example: 11,9,6,4,3,1
Non increasing order
If successive element is less than or equal to its previous element in the sequence.
Example: 11, 9, 7, 5, 5, 1, 1
Non decreasing order
If successive element is greater than or equal to its previous element in the sequence.
Example: 1, 2, 3, 3, 4, 7, 8, 8
Bubble Sort
Bubble sort is also referred as Sinking sort
We repeatedly compare each pair of neighboring items and swap them if they are in the wrong order.
Note: We have to repeatedly continue comparing and swapping process until all elements are fully sorted
def bubbleSort(customList):
for i in range(len(customList) -1): #O(n)
for j in range(len(customList)-i-1): #O(n)
if customList[j] > customList[j+1] #O(1)
customList[j], customList[j+1] = customList[j+1],customList[j]
cList = [2,1,7,6,9,3,10,19]
bubbleSort(cList)
Time complexity: O(N^2), Space complexity: O(1)
When to use?
When the input is already sorted
Space is a concern
Easy to implement
When to avoid?
- Average time complexity is poor
Selection Sort
In selection sort, we repeatedly find the minimum element and move it to the sorted part of the array, making the unsorted part sorted.
First, make two parts: sorted and unsorted. Find the minimum element in the unsorted part, then swap it with the first element. Move the minimum value to sorted part. Repeat until all elements are sorted.
def selectionSort(customList):
for i in range(len(customList)): #O(n)
min_index = i
for j in range(i+1, len(customList)): #O(n)
if customList[min_index] > customList[j]: #O(1)
min_index = j
customList[i], customList[min_index] = customList[min_index], customList[i]
print(customList)
cList = [2,1,7,6,9,3,10,19]
selectionSort(cList)
Time complexity: O(N^2), Space complexity: O(1)
When to use?
When we have insufficient memory
Easy to implement
When to avoid?
- When time is a concern
Insertion Sort
Divide the given array into two part
Take 1st element from unsorted array and find its correct position in sorted array
Repeat until unsorted array is empty
def insertionSort(customList):
for i in range(1, len(customList)): #O(n)
key = customList[i]
j = i-1
while j >= 0 and key < customList[j]: #O(n)
customList[j+i] = customList
j -= 1
customList[j+1] = key
print(customList)
cList = [2,1,7,6,9,3,10,19]
insertionSort(cList)
Time complexity: O(N^2), Space complexity: O(1)
When to use?
When we have insufficient memory
Easy to implement
When we have continuous inflow of numbers and we want to keep them sorted
When to avoid?
- When time is a concern
Bucket Sort
Create buckets and distribute elements of array into buckets
Sort buckets individually
Merge buckets after sorting
number of buckets = round(Sqrt(number of elements))
Appropriate buckets = ceil(Value * number of buckets/ maxValue)
import math
def insertionSort(customList):
for i in range(1, len(customList)): #O(n)
key = customList[i]
j = i-1
while j >= 0 and key < customList[j]: #O(n)
customList[j+i] = customList
j -= 1
customList[j+1] = key
return customList #CHANGE MADE IN HERE
def bucketSort(customList):
numberOfBuckets = round(math.sqrt(len(customList))
maxValue = max(customList)
arr = [] # CREATING TEMPORARY/ ADDITIONAL SPACE
for i in range(numberOfBuckets ):
arr.append([])
for j in customList:
index_b = math.ceil(j * numberOfBuckets/ maxValue)
arr[index_b - 1].append(j)
for i in range(numberOfBuckets):
arr[i] = insertionSort(arr[i]) #O(N^2)
k = 0
for i in range(numberOfBuckets):
for j in range(len(arr[i])):
customList[k] = arr[i][j]
k += 1
return customList
cList = [2,1,7,6,9,3,10,19]
print(bucketSort(cList))
Time complexity: O(N^2), Space complexity: O(n)
When to use?
When input uniformly distributed over range
1, 2, 4, 5, 3, 8, 7, 9 (uniform distribution)
1, 2, 4, 91, 93, 96 (not an uniform distribution)
When to avoid?
- When space is a concern
Bucket Sort with negative numbers
def bucketSort(customList):
numberofBuckets = round(math.sqrt(len(customList)))
minValue = min(customList)
maxValue = max(customList)
rangeVal = (maxValue - minValue) / numberofBuckets
buckets = [[] for _ in range(numberofBuckets)]
for j in customList:
if j == maxValue:
buckets[-1].append(j)
else:
index_b = math.floor((j - minValue) / rangeVal)
buckets[index_b].append(j)
sorted_array = []
for i in range(numberofBuckets):
buckets[i] = insertionSort(buckets[i])
sorted_array.extend(buckets[i])
return sorted_array
Code Explanation
def bucketSort(customList):
- Function Definition: Begins the definition of the
bucketSort
function, which takescustomList
as an argument.
numberofBuckets = round(math.sqrt(len(customList)))
- Number of Buckets: Determines the number of buckets to use in the sorting process, typically the square root of the list's length.
minValue = min(customList)
maxValue = max(customList)
- Min and Max Values: Finds the minimum and maximum values in
customList
.
rangeVal = (maxValue - minValue) / numberofBuckets
- Range Calculation: Calculates the range of each bucket.
buckets = [[] for _ in range(numberofBuckets)]
- Initialize Buckets: Creates a list of empty lists (buckets) for sorting.
for j in customList:
- Iterate Over List: Iterates through each element in
customList
.
if j == maxValue:
buckets[-1].append(j)
- Handle Max Value: Places the maximum value in the last bucket to avoid index out of range.
else:
index_b = math.floor((j - minValue) / rangeVal)
buckets[index_b].append(j)
- Bucket Assignment: Calculates the appropriate bucket index for each element and appends it to that bucket.
sorted_array = []
- Initialize Sorted Array: Creates an empty list to store the sorted elements.
for i in range(numberofBuckets):
buckets[i] = insertionSort(buckets[i])# Assuming insertionSort is defined
sorted_array.extend(buckets[i])
- Sort and Merge Buckets: Sorts each bucket individually and then merges them into a single sorted array.
return sorted_array
- Return Result: Returns the sorted array.
Time Complexity
The time complexity of bucket sort varies depending on the distribution of the input data and the sorting algorithm used for sorting each bucket.
In the best case and average case (uniform distribution), assuming the elements are uniformly distributed across the buckets, the time complexity is O(N + N (K/N) log(K/N)), where N is the number of elements and K is the number of buckets. This simplifies to O(N + N * log(N/K)), or approximately O(N) if K is close to sqrt(N).
However, in the worst case (all elements fall into a single bucket), the time complexity becomes O(N^2) due to the insertion sort on a single bucket containing all elements.
Space Complexity
- The space complexity is O(N+K), where N is the number of elements and K is the number of buckets. This is because, in addition to the space for the input list, additional space is needed for the buckets used during the sorting process.
Note: The actual performance will also depend on how the insertionSort
function is implemented.
[ Source: Udemy ]
Merge Sort
Merge sort is a divide and conquer algorithm
Divide the input array in two halves and we keep them halving recursively until they become too small that cannot be broken further
Merge halves by sorting them
def merge(customList, l, m, r):
n1 = m - l + 1
n2 = r - m
#temporary array
L = [0] * (n1) #left subtree
R = [0] * (n2) #right subtree
for i in range(0, n1):
L[i] = customList[l +1]
for j in range(0, n2):
R[j] = customList[m +1 + j]
i = 0
j = 0
k = l
while i < n1 and j < n2:
if L[i] <= R[j] :
customList[k] = L[i]
i += 1
else:
customList[k] = R[j]
j += 1
k += 1
while i < n1:
customList[k] = L[i]
i += 1
k += 1
while j < n2:
customList[k] = R[j]
j += 1
k += 1
def mergeSort(customList, l, r):
if l < r:
m = (l+(r-1)) // 2 #O(1)
mergeSort(customList, l, m) #T(n/2)
# } O(nLogn)
mergeSort(customList, m+1, r) #T(n/2)
merge(customList, l, m, r) #O(n)
return customList
We have divided our custom list into two sub arrays. Based on this division, we have combined them in sorted order.
Time complexity: O(NLogN), Space complexity: O(N)
When to use?
When you need stable sort
When average expected time is O(NlogN)
When to avoid?
- When space is a concern
Pivot Function Implementation
The pivot function in sorting, especially in quicksort, selects an element (the pivot) and partitions the array around it. Elements smaller than the pivot go to the left, and larger ones go to the right. This process is repeated recursively, sorting the array efficiently. The choice of pivot can impact the algorithm's performance.
def swap(my_list, index1, index2):
my_list[index1], my_list[index2] = my_list[index2],my_list[index1]
def pivot(my_list, pivot_index, end_index):
swap_index = pivot_index
for i in range(pivot_index + 1, end_index +1):
if my_list[i] < my_list[pivot_index]:
swap_index += 1
swap(my_list, swap_index, i)
swap(my_list, pivot_index, swap_index)
return swap_index
my_list = [3,5,0,9,6,2]
print(pivot(my_list, 0, 6))
print(my_list)
QuickSort Algorithm Implementation
def quicksort_helper(my_list, left, right):
if left < right:
pivot_index = pivot(my_list, left, right)
quicksort_helper(my_list, left, pivot_index - 1)
quicksort_helper(my_list, pivot_index +1, right )
return my_list
def quicksort(my_list):
return quicksort_helper(my_list, 0, len(my_list)-1)
my_list = [3,5,0,9,6,2]
print(quicksort(my_list))
Heap Sort
The heap sort algorithm uses binary heap tree.
Step 1: Insert data to binary heap tree
Step 2: Extract data from binary heap tree
It is best suited with array. It does not work with linked list.
def heapify(customList, n, i):
smallest = i
l = 2*i + 1
r = 2*i + 2
if l < n and customList[l] < customList[smallest]:
smallest = l
if r < n and customList[r] < customList[smallest]:
smallest = r
if smallest != i:
customList[i], customList[smallest] = customList[smallest], customList[i]
heapify(customList, n, smallest)
def heapSort(customList):
n = len(customList)
for i in range(int(n/2)-1, -1, -1):
heapify(customList, n, i)
for i in range(n-1, 0, -1):
customList[i], customList[0] = customList[0], customList[i]
heapify(customList, i, 0)
customList.reverse()
cList = [2,1,7,6,5,3,4,9,8]
heapSort(cList)
print(cList)
Time complexity: O(NLogN), Space complexity: O(1)
Comparison of Sorting Algorithm
Time complexity | Space complexity | Stable | |
Bubble sort | O(n^2) | O(1) | Yes |
Selection sort | O(n^2) | O(1) | No |
Insertion sort | O(n^2) | O(1) | Yes |
Bucket sort | O(n logn) | O(n) | Yes |
Merge sort | O(n logn) | O(n) | Yes |
Quick sort | O(n logn) | O(n) | No |
Heap sort | O(n logn) | O(1) | No |