## Table of contents

Sorting helps by organizing data in a specific order, making it easier to search, analyze, and process. Let's explore the variety of sorting !

*What is sorting?*

Sorting refers to arranging data in a particular format: either ascending or descending.

Practical use of sorting :

Microsoft Excel: Built in functionality to sort data (there is a sort icon)

Online shopping: Online shopping websites generally have option to sort by price, review, rating etc.

*Types of sorting*

Depending on the space, ability - sorting algorithms can be divided into two categories.

Sorting:

Space used

**In place**: Sorting algos which does not require any extra space for sorting (example: Bubble sort)**Out of place**: Sorting algos which requires extra space for sorting (example: Merge sort)

Stability

**Stable**: If a sorting algorithm keeps the sequence of similar items unchanged after sorting, it is called stable sorting (example: Insertion sort)**Unstable**: If a sorting algorithm changes the order of similar items after sorting, it is called unstable sorting (example: quick sort)

*Sorting Terminologies*

These terms are very important, specially at the time of interviews.

Increasing order

If successive element is greater than the previous one

Example: 1,3,5,7,9,11

Decreasing order

If successive element is less than the previous one

Example: 11,9,6,4,3,1

Non increasing order

If successive element is less than or equal to its previous element in the sequence.

Example: 11, 9, 7, 5, 5, 1, 1

Non decreasing order

If successive element is greater than or equal to its previous element in the sequence.

Example: 1, 2, 3, 3, 4, 7, 8, 8

*Bubble Sort*

Bubble sort is also referred as Sinking sort

We repeatedly compare each pair of neighboring items and swap them if they are in the wrong order.

**Note:** We have to repeatedly continue comparing and swapping process until all elements are fully sorted

```
def bubbleSort(customList):
for i in range(len(customList) -1): #O(n)
for j in range(len(customList)-i-1): #O(n)
if customList[j] > customList[j+1] #O(1)
customList[j], customList[j+1] = customList[j+1],customList[j]
cList = [2,1,7,6,9,3,10,19]
bubbleSort(cList)
```

Time complexity: O(N^2), Space complexity: O(1)

When to use?

When the input is already sorted

Space is a concern

Easy to implement

When to avoid?

- Average time complexity is poor

*Selection Sort*

In selection sort, we repeatedly find the minimum element and move it to the sorted part of the array, making the unsorted part sorted.

First, make two parts: sorted and unsorted. Find the minimum element in the unsorted part, then swap it with the first element. Move the minimum value to sorted part. Repeat until all elements are sorted.

```
def selectionSort(customList):
for i in range(len(customList)): #O(n)
min_index = i
for j in range(i+1, len(customList)): #O(n)
if customList[min_index] > customList[j]: #O(1)
min_index = j
customList[i], customList[min_index] = customList[min_index], customList[i]
print(customList)
cList = [2,1,7,6,9,3,10,19]
selectionSort(cList)
```

Time complexity: O(N^2), Space complexity: O(1)

When to use?

When we have insufficient memory

Easy to implement

When to avoid?

- When time is a concern

*Insertion Sort*

Divide the given array into two part

Take 1st element from unsorted array and find its correct position in sorted array

Repeat until unsorted array is empty

```
def insertionSort(customList):
for i in range(1, len(customList)): #O(n)
key = customList[i]
j = i-1
while j >= 0 and key < customList[j]: #O(n)
customList[j+i] = customList
j -= 1
customList[j+1] = key
print(customList)
cList = [2,1,7,6,9,3,10,19]
insertionSort(cList)
```

Time complexity: O(N^2), Space complexity: O(1)

When to use?

When we have insufficient memory

Easy to implement

When we have continuous inflow of numbers and we want to keep them sorted

When to avoid?

- When time is a concern

*Bucket Sort*

Create buckets and distribute elements of array into buckets

Sort buckets individually

Merge buckets after sorting

number of buckets = round(Sqrt(number of elements))

Appropriate buckets = ceil(Value * number of buckets/ maxValue)

```
import math
def insertionSort(customList):
for i in range(1, len(customList)): #O(n)
key = customList[i]
j = i-1
while j >= 0 and key < customList[j]: #O(n)
customList[j+i] = customList
j -= 1
customList[j+1] = key
return customList #CHANGE MADE IN HERE
def bucketSort(customList):
numberOfBuckets = round(math.sqrt(len(customList))
maxValue = max(customList)
arr = [] # CREATING TEMPORARY/ ADDITIONAL SPACE
for i in range(numberOfBuckets ):
arr.append([])
for j in customList:
index_b = math.ceil(j * numberOfBuckets/ maxValue)
arr[index_b - 1].append(j)
for i in range(numberOfBuckets):
arr[i] = insertionSort(arr[i]) #O(N^2)
k = 0
for i in range(numberOfBuckets):
for j in range(len(arr[i])):
customList[k] = arr[i][j]
k += 1
return customList
cList = [2,1,7,6,9,3,10,19]
print(bucketSort(cList))
```

Time complexity: O(N^2), Space complexity: O(n)

When to use?

When input uniformly distributed over range

1, 2, 4, 5, 3, 8, 7, 9 (uniform distribution)

1, 2, 4, 91, 93, 96 (not an uniform distribution)

When to avoid?

- When space is a concern

*Bucket Sort with negative numbers*

```
def bucketSort(customList):
numberofBuckets = round(math.sqrt(len(customList)))
minValue = min(customList)
maxValue = max(customList)
rangeVal = (maxValue - minValue) / numberofBuckets
buckets = [[] for _ in range(numberofBuckets)]
for j in customList:
if j == maxValue:
buckets[-1].append(j)
else:
index_b = math.floor((j - minValue) / rangeVal)
buckets[index_b].append(j)
sorted_array = []
for i in range(numberofBuckets):
buckets[i] = insertionSort(buckets[i])
sorted_array.extend(buckets[i])
return sorted_array
```

**Code Explanation**

```
def bucketSort(customList):
```

**Function Definition**: Begins the definition of the`bucketSort`

function, which takes`customList`

as an argument.

```
numberofBuckets = round(math.sqrt(len(customList)))
```

**Number of Buckets**: Determines the number of buckets to use in the sorting process, typically the square root of the list's length.

```
minValue = min(customList)
maxValue = max(customList)
```

**Min and Max Values**: Finds the minimum and maximum values in`customList`

.

```
rangeVal = (maxValue - minValue) / numberofBuckets
```

**Range Calculation**: Calculates the range of each bucket.

```
buckets = [[] for _ in range(numberofBuckets)]
```

**Initialize Buckets**: Creates a list of empty lists (buckets) for sorting.

```
for j in customList:
```

**Iterate Over List**: Iterates through each element in`customList`

.

```
if j == maxValue:
buckets[-1].append(j)
```

**Handle Max Value**: Places the maximum value in the last bucket to avoid index out of range.

```
else:
index_b = math.floor((j - minValue) / rangeVal)
buckets[index_b].append(j)
```

**Bucket Assignment**: Calculates the appropriate bucket index for each element and appends it to that bucket.

```
sorted_array = []
```

**Initialize Sorted Array**: Creates an empty list to store the sorted elements.

```
for i in range(numberofBuckets):
buckets[i] = insertionSort(buckets[i])# Assuming insertionSort is defined
sorted_array.extend(buckets[i])
```

**Sort and Merge Buckets**: Sorts each bucket individually and then merges them into a single sorted array.

```
return sorted_array
```

**Return Result**: Returns the sorted array.

#### Time Complexity

The time complexity of bucket sort varies depending on the distribution of the input data and the sorting algorithm used for sorting each bucket.

In the best case and average case (uniform distribution), assuming the elements are uniformly distributed across the buckets, the time complexity is O(N + N

*(K/N)*log(K/N)), where N is the number of elements and K is the number of buckets. This simplifies to O(N + N * log(N/K)), or approximately O(N) if K is close to sqrt(N).However, in the worst case (all elements fall into a single bucket), the time complexity becomes O(N^2) due to the insertion sort on a single bucket containing all elements.

#### Space Complexity

- The space complexity is O(N+K), where N is the number of elements and K is the number of buckets. This is because, in addition to the space for the input list, additional space is needed for the buckets used during the sorting process.

Note: The actual performance will also depend on how the `insertionSort`

function is implemented.

[ Source: Udemy ]

*Merge Sort*

Merge sort is a divide and conquer algorithm

Divide the input array in two halves and we keep them halving recursively until they become too small that cannot be broken further

Merge halves by sorting them

```
def merge(customList, l, m, r):
n1 = m - l + 1
n2 = r - m
#temporary array
L = [0] * (n1) #left subtree
R = [0] * (n2) #right subtree
for i in range(0, n1):
L[i] = customList[l +1]
for j in range(0, n2):
R[j] = customList[m +1 + j]
i = 0
j = 0
k = l
while i < n1 and j < n2:
if L[i] <= R[j] :
customList[k] = L[i]
i += 1
else:
customList[k] = R[j]
j += 1
k += 1
while i < n1:
customList[k] = L[i]
i += 1
k += 1
while j < n2:
customList[k] = R[j]
j += 1
k += 1
def mergeSort(customList, l, r):
if l < r:
m = (l+(r-1)) // 2 #O(1)
mergeSort(customList, l, m) #T(n/2)
# } O(nLogn)
mergeSort(customList, m+1, r) #T(n/2)
merge(customList, l, m, r) #O(n)
return customList
```

We have divided our custom list into two sub arrays. Based on this division, we have combined them in sorted order.

Time complexity: O(NLogN), Space complexity: O(N)

When to use?

When you need stable sort

When average expected time is O(NlogN)

When to avoid?

- When space is a concern

*Pivot Function Implementation*

The pivot function in sorting, especially in quicksort, selects an element (the pivot) and partitions the array around it. Elements smaller than the pivot go to the left, and larger ones go to the right. This process is repeated recursively, sorting the array efficiently. The choice of pivot can impact the algorithm's performance.

```
def swap(my_list, index1, index2):
my_list[index1], my_list[index2] = my_list[index2],my_list[index1]
def pivot(my_list, pivot_index, end_index):
swap_index = pivot_index
for i in range(pivot_index + 1, end_index +1):
if my_list[i] < my_list[pivot_index]:
swap_index += 1
swap(my_list, swap_index, i)
swap(my_list, pivot_index, swap_index)
return swap_index
my_list = [3,5,0,9,6,2]
print(pivot(my_list, 0, 6))
print(my_list)
```

*QuickSort Algorithm Implementation*

```
def quicksort_helper(my_list, left, right):
if left < right:
pivot_index = pivot(my_list, left, right)
quicksort_helper(my_list, left, pivot_index - 1)
quicksort_helper(my_list, pivot_index +1, right )
return my_list
def quicksort(my_list):
return quicksort_helper(my_list, 0, len(my_list)-1)
my_list = [3,5,0,9,6,2]
print(quicksort(my_list))
```

*Heap Sort*

The heap sort algorithm uses binary heap tree.

Step 1: Insert data to binary heap tree

Step 2: Extract data from binary heap tree

It is best suited with array. It does not work with linked list.

```
def heapify(customList, n, i):
smallest = i
l = 2*i + 1
r = 2*i + 2
if l < n and customList[l] < customList[smallest]:
smallest = l
if r < n and customList[r] < customList[smallest]:
smallest = r
if smallest != i:
customList[i], customList[smallest] = customList[smallest], customList[i]
heapify(customList, n, smallest)
def heapSort(customList):
n = len(customList)
for i in range(int(n/2)-1, -1, -1):
heapify(customList, n, i)
for i in range(n-1, 0, -1):
customList[i], customList[0] = customList[0], customList[i]
heapify(customList, i, 0)
customList.reverse()
cList = [2,1,7,6,5,3,4,9,8]
heapSort(cList)
print(cList)
```

Time complexity: O(NLogN), Space complexity: O(1)

*Comparison of Sorting Algorithm*

Time complexity | Space complexity | Stable | |

Bubble sort | O(n^2) | O(1) | Yes |

Selection sort | O(n^2) | O(1) | No |

Insertion sort | O(n^2) | O(1) | Yes |

Bucket sort | O(n logn) | O(n) | Yes |

Merge sort | O(n logn) | O(n) | Yes |

Quick sort | O(n logn) | O(n) | No |

Heap sort | O(n logn) | O(1) | No |