**Introduction**

A comparison-based sorting algorithm is a sorting algorithm whose final order is determined by comparisons between the values of its input elements. The algorithm reads the list of elements through a single abstract comparison operation that determines which of two elements should occur first in the final sorted list. Two general categories of comparison algorithms, as described by Big-O notation, are in wide use today: quadratic and linearithmic. Three very popular quadratic algorithms are Bubble sort, Insertion sort and Selection sort; three popular linearithmic algorithms are Heapsort, Merge sort and Quicksort. Quicksort has a quadratic worst case, but since it is usually implemented in ways that make that very unlikely, it has earned its place in the linearithmic category. Timsort and Introsort also deserve mention, since they are very popular and are variations on the previously mentioned algorithms.

I am primarily a Java developer, and have been since Java 1.0, so the examples will be in Java and I will conclude with a short discussion on the usage of these algorithms within the Java JDK/JRE.

### Terms

- **Stable**: Does not change the relative order of elements with equal keys.
- **In-place**: Requires only a constant O(1) amount of additional memory space.
- **Online**: Can sort a list as it receives it, piece by piece, in a serial fashion.
- **Adaptive**: Takes advantage of existing order in its input.
- **Lower bound**: A mathematical argument that you can't hope to go faster than a certain amount.
- **Inversions**: Pairs of elements that are out of order within a dataset.
- **Locality**: The tendency for the same data, or nearby storage locations, to be accessed frequently.
- **Big-O notation**: The approximate time it takes an algorithm to complete as a function of the size of its dataset; also described as time complexity.
- **Time complexity**: A measure of the number of operations given an input size.
- **Asymptotic**: Refers to a limiting behavior based on a single variable and a desired measure.[1]
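The first of these terms, stability, is worth a concrete illustration. Java's `Arrays.sort` for objects is guaranteed to be stable, so entries with equal keys keep their input order; the class and data below are mine, purely for demonstration:

```java
import java.util.Arrays;
import java.util.Comparator;

public class StabilityDemo {
    public static void main(String[] args) {
        // Two entries share the key 'b'; a stable sort must keep
        // "b1" before "b2" because that is their input order.
        String[] data = {"b1", "a1", "b2", "a2"};
        // Compare only by the first character (the "key"), ignoring
        // the digit, so "b1" and "b2" compare as equal.
        Arrays.sort(data, Comparator.comparing((String s) -> s.charAt(0)));
        // Object sorts in Java are stable, so equal keys keep order.
        System.out.println(Arrays.toString(data)); // [a1, a2, b1, b2]
    }
}
```

An unstable sort would be free to emit `b2` before `b1`, which matters when the objects carry more state than the sort key.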

**Quadratic Algorithms O(n²)**

The O(n²) family of algorithms are conceptually the simplest, and in some cases very fast, but their quadratic time complexity limits their scalability. The swap operation is fundamental to both Bubble sort and Selection sort. Insertion sort works by taking each element and inserting it into its proper position among the already-sorted elements, shifting higher values right to make room. A common characteristic of quadratic algorithms is nested looping, where each item in the list is compared to every other item in the list: (n * n), or n².

```java
for (int i = 0; i < N; i++) {
    for (int j = 0; j < N; j++) {
        if (i == j) c[i][j] = 1.0;
        else c[i][j] = 0.0;
    }
}
```

**Bubble sort**

The algorithm starts at the beginning of the dataset. It compares the first two elements, and if the first is greater than the second, it swaps them. It continues doing this for each pair of adjacent elements to the end of the dataset. It then starts again with the first two elements, repeating until no swaps have occurred on a full pass. If the list is already sorted the algorithm performs with a time complexity of O(n). Large values at the beginning of the list are known as rabbits and small values at the end of the list are known as turtles, because of the speed at which they move toward their final positions: rabbits bubble quickly toward the end, while turtles crawl only one position per pass. Rabbits and turtles can have a large effect on the performance of bubble sort.

| Big-O Time Complexity | |
| --- | --- |
| Worst Case | O(n²) |
| Avg Case | O(n²) |
| Best Case | O(n) |

```java
int[] data = {41, 67, 6, 3, 23};
bubbleSort(data);

public static void bubbleSort(int[] data) {
    boolean swapped = true; // initialize to begin first pass
    int temp;               // temp swap variable
    while (swapped) {
        // initialize to false for each iteration
        swapped = false;
        // Note the nested loop, common with quadratic algorithms
        for (int i = 0; i < data.length - 1; i++) {
            // If left value is greater than its right sibling -> swap
            if (data[i] > data[i + 1]) {
                // data swap
                temp = data[i];
                data[i] = data[i + 1];
                data[i + 1] = temp;
                // flags the outer loop that we need to continue
                swapped = true;
            }
        }
    }
}
```

**Runtime Profile: Dataset {41, 67, 6, 3, 23}**

| [0] | [1] | [2] | [3] | [4] | Index (i) |
| --- | --- | --- | --- | --- | --- |
| 41 | 67 | 6 | 3 | 23 | 41 < 67 -> no swap |
| 41 | 67 | 6 | 3 | 23 | 67 > 6 -> swap(1, 2) |
| 41 | 6 | 67 | 3 | 23 | 67 > 3 -> swap(2, 3) |
| 41 | 6 | 3 | 67 | 23 | 67 > 23 -> swap(3, 4) |
| 41 | 6 | 3 | 23 | 67 | 41 > 6 -> swap(0, 1) |
| 6 | 41 | 3 | 23 | 67 | 41 > 3 -> swap(1, 2) |
| 6 | 3 | 41 | 23 | 67 | 41 > 23 -> swap(2, 3) |
| 6 | 3 | 23 | 41 | 67 | 41 < 67 -> no swap |
| 6 | 3 | 23 | 41 | 67 | 6 > 3 -> swap(0, 1) |
| 3 | 6 | 23 | 41 | 67 | Sorted |

Comparison Count: 7 Write Count: 7

**Selection sort**

Selection sort is an in-place, unstable sorting algorithm that generally performs worse than the similar insertion sort. The algorithm divides the input array into two parts: the sorted part, which begins at index [0] and grows to the right as the data is sorted, and the unsorted items occupying the remainder of the array. Initially the sorted part is empty and the unsorted part occupies the full array. The algorithm finds the smallest element in the array and exchanges it with the element in the first position, then finds the second smallest element and exchanges it with the element in the second position, and continues in this way until the entire array is sorted.

| Big-O Time Complexity | |
| --- | --- |
| Worst Case | O(n²) |
| Avg Case | O(n²) |
| Best Case | O(n²) |

```java
int[] data = {41, 67, 6, 3, 23};
selectionSort(data);

public static void selectionSort(int[] data) {
    // Iterate through the array, placing the next-smallest
    // remaining value at index outer_i on each pass
    for (int outer_i = 0; outer_i < data.length - 1; outer_i++) {
        int smallestValue = Integer.MAX_VALUE;
        int smallest_i = outer_i + 1;
        // Note the nested loop, common with quadratic algorithms
        // Scan every item in the unsorted remainder of the array,
        // finding the smallest value
        for (int inner_i = outer_i; inner_i < data.length; inner_i++) {
            if (data[inner_i] < smallestValue) {
                smallest_i = inner_i;
                smallestValue = data[inner_i];
            }
        }
        // Once the smallest value is found, swap smallest_i & outer_i
        if (outer_i != smallest_i) {
            int temp = data[outer_i];
            data[outer_i] = data[smallest_i];
            data[smallest_i] = temp;
        }
    }
}
```

**Runtime Profile: Dataset {41, 67, 6, 3, 23}**

| [0] | [1] | [2] | [3] | [4] | Index (outer_i) |
| --- | --- | --- | --- | --- | --- |
| 41 | 67 | 6 | 3 | 23 | [0] 41 > 3 (lowest value) -> swap(0, 3) |
| 3 | 67 | 6 | 41 | 23 | [1] 67 > 6 (next lowest value) -> swap(1, 2) |
| 3 | 6 | 67 | 41 | 23 | [2] 67 > 23 (next lowest value) -> swap(2, 4) |
| 3 | 6 | 23 | 41 | 67 | [3] 41 < 67 (only remaining to evaluate) -> sorted |

Comparison Count: 9 Write Count: 3

**Insertion sort**

Insertion sort is an in-place, stable, online sorting algorithm that is relatively efficient for small datasets and mostly sorted sets. It works by taking elements from the set one by one and inserting them in their correct position into a new sorted set. In arrays, the new set and the remaining elements can share the array's space, but insertion is expensive, requiring all following elements to be shifted over by one. Like Selection sort, the algorithm divides the input array into two parts: the sorted part, which begins at index [0] and grows right as the data is sorted, and the unsorted items occupying the remainder of the array. Unlike Selection sort, it places each successive element into the sorted part by shifting larger sorted values to the right, rather than by scanning the remainder for a minimum. While insertion sort typically makes fewer comparisons than selection sort, it requires more writes because the inner loop can require shifting large sections of the sorted portion of the array. In general, insertion sort will write to the array O(n²) times, whereas selection sort will write only O(n) times. Insertion sort is also adaptive, taking advantage of datasets that are already substantially sorted: the time complexity is O(n + d), where d is the number of inversions.

| Big-O Time Complexity | |
| --- | --- |
| Worst Case | O(n²) |
| Avg Case | O(n²) |
| Best Case | O(n) |

```java
int[] data = {41, 67, 6, 3, 23};
insertionSort(data);

public static void insertionSort(int[] data) {
    int outer_i, inner_i, value;
    for (outer_i = 1; outer_i < data.length; outer_i++) {
        value = data[outer_i];
        inner_i = outer_i;
        // Note the nested loop, common with quadratic algorithms
        // Iterate right to left, comparing 'value' with the
        // element to its left
        while (inner_i > 0 && value < data[inner_i - 1]) {
            // Move the element right one index per iteration
            // to make room for 'value' as it moves left
            data[inner_i] = data[inner_i - 1];
            inner_i--;
        }
        data[inner_i] = value;
    }
}
```

**Runtime Profile: Dataset {41, 67, 6, 3, 23}**

| [0] | [1] | [2] | [3] | [4] | Index (outer_i) |
| --- | --- | --- | --- | --- | --- |
| 41 | 67 | 6 | 3 | 23 | [1] 67 > 41 -> no action; begins at index [1], compares to (index - 1) |
| 41 | 67 | 6 | 3 | 23 | [2] 6 < 67 -> move 6 left |
| 41 | | 67 | 3 | 23 | 6 < 67 -> shift 67 right |
| | 41 | 67 | 3 | 23 | 6 < 41 -> shift 41 right |
| 6 | 41 | 67 | 3 | 23 | insert 6 -> index [0] |
| 6 | 41 | 67 | 3 | 23 | [3] 3 < 67 -> move 3 left |
| 6 | 41 | | 67 | 23 | 3 < 67 -> shift 67 right |
| 6 | | 41 | 67 | 23 | 3 < 41 -> shift 41 right |
| | 6 | 41 | 67 | 23 | 3 < 6 -> shift 6 right |
| 3 | 6 | 41 | 67 | 23 | insert 3 -> index [0] |
| 3 | 6 | 41 | 67 | 23 | [4] 23 < 67 -> move 23 left |
| 3 | 6 | 41 | | 67 | 23 < 67 -> shift 67 right |
| 3 | 6 | | 41 | 67 | 23 < 41 -> shift 41 right |
| 3 | 6 | 23 | 41 | 67 | insert 23 -> index [2] |
| 3 | 6 | 23 | 41 | 67 | Sorted |

Comparison Count: 7 Write Count: 10

**Quadratic Summary**

Among simple O(n²) algorithms, selection sort almost always outperforms bubble sort. Insertion sort is very similar to selection sort but has an advantage: it only scans as many elements as needed to determine the correct location of the (k+1)st element, while selection sort must scan all remaining elements to find the absolute smallest element. Insertion sort will usually perform about half as many comparisons as selection sort, and if the input data is already sorted it performs as few as n-1 comparisons, making insertion sort more efficient on sorted, or nearly sorted, datasets. While insertion sort typically makes fewer comparisons, it requires more writes because its inner loop can shift large sections of the sorted portion of the dataset. In general, insertion sort performs O(n²) writes, whereas selection sort performs only O(n) writes. For this reason selection sort may be preferable in cases where writing to memory is significantly more expensive than reading.

**Linearithmic Algorithms O(n log n)**

There are fundamental limits on the performance of comparison sorts: any comparison sort requires Ω(n log n) comparison operations in the worst case.[2] In this sense, Merge sort, Heapsort, and Introsort are asymptotically optimal, performing on the order of n log n comparisons. These algorithms often leverage divide-and-conquer techniques to recursively separate a problem into one or more smaller subproblems until these become simple enough to be solved directly. The solutions to the subproblems are then combined to give a solution to the original problem. Merge sort and Quicksort both employ divide-and-conquer techniques, which are crucial to their O(n log n) time complexity. Heapsort achieves O(n log n) by first organizing the dataset into a heap data structure. Because of their time complexity, linearithmic algorithms are capable of scaling to large datasets.

**Heapsort**

Heapsort is an in-place, non-recursive, unstable algorithm, considered by many to be the de facto sorting algorithm when guaranteed O(n log n) time complexity is required. Heapsort is essentially the O(n log n) version of selection sort: it also works by repeatedly determining the largest (or smallest) element of the dataset and placing it at the end (or beginning) of the set, then continuing with the rest of the dataset, but it leverages a heap data structure to improve time complexity. Once the dataset has been made into a heap, the root node is guaranteed to be the largest (or smallest) element. When it is removed and placed at the end of the dataset, the heap is rearranged so the largest remaining element moves to the root. Using the heap, finding the next largest element takes O(log n) time instead of the O(n) required for a linear scan as in simple selection sort. This allows Heapsort to run in O(n log n) time.

Heapsort is generally appreciated because it is trustworthy: There aren’t any “pathological” cases which would cause it to be unacceptably slow.

I haven't included the heapify source below, but I have documented it in another post.

| Big-O Time Complexity | |
| --- | --- |
| Worst Case | O(n log n) |
| Avg Case | O(n log n) |
| Best Case | O(n log n) |

```java
int[] data = {41, 67, 6, 3, 23};
heapsort(data);

public static void heapsort(int[] data) {
    int count = data.length;
    // first place in max-heap order (largest value at index 0)
    heapify(data, count);
    int end = count - 1;
    while (end > 0) {
        // swap the root (max value) with the last element
        int tmp = data[end];
        data[end] = data[0];
        data[0] = tmp;
        // put the heap back in max-heap order
        // (largest value at index 0)
        siftDown(data, 0, end - 1);
        // decrement the size of the heap so that the previous
        // max value stays in its proper place at the end of the array
        end--;
    }
}
```
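Since `heapify` and `siftDown` aren't shown here, a minimal sketch that matches the calls above might look like the following. Treat it as one possible binary max-heap implementation, not the canonical one; `siftDown` operates on the inclusive index range `start..end`:

```java
// Build a max-heap over data[0..count-1] by sifting down every
// non-leaf node, starting from the last parent.
public static void heapify(int[] data, int count) {
    for (int start = (count - 2) / 2; start >= 0; start--) {
        siftDown(data, start, count - 1);
    }
}

// Restore max-heap order between the inclusive indices start and end.
public static void siftDown(int[] data, int start, int end) {
    int root = start;
    while (root * 2 + 1 <= end) {        // while root has a left child
        int child = root * 2 + 1;        // left child index
        // pick the larger of the two children
        if (child + 1 <= end && data[child] < data[child + 1]) {
            child++;
        }
        if (data[root] < data[child]) {  // child is larger -> swap down
            int tmp = data[root];
            data[root] = data[child];
            data[child] = tmp;
            root = child;
        } else {
            return;                      // heap order restored
        }
    }
}
```

With these two helpers in place, the `heapsort` example above runs end to end.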

**Runtime Profile: Dataset {41, 67, 6, 3, 23}**

| [0] | [1] | [2] | [3] | [4] | Index |
| --- | --- | --- | --- | --- | --- |
| 41 | 67 | 6 | 3 | 23 | heapify -> move max value (67) to root [0] |
| 67 | 41 | 6 | 3 | 23 | swap root [0] 67 with end [4] 23 |
| 23 | 41 | 6 | 3 | 67 | sift down: re-heap -> move next-largest value (41) to root [0] |
| 41 | 23 | 6 | 3 | 67 | swap root [0] 41 with end [3] 3 |
| 3 | 23 | 6 | 41 | 67 | sift down: re-heap -> move next-largest value (23) to root [0] |
| 23 | 3 | 6 | 41 | 67 | swap root [0] 23 with end [2] 6 |
| 6 | 3 | 23 | 41 | 67 | sift down: re-heap -> largest value (6) is already at root [0] |
| 6 | 3 | 23 | 41 | 67 | swap root [0] 6 with end [1] 3 |
| 3 | 6 | 23 | 41 | 67 | Sorted |

**Merge sort**

Merge sort is a comparison-based, stable, divide-and-conquer algorithm that requires additional memory; each pass through the data doubles the size of the sorted subsections. The algorithm starts by comparing every two elements (i.e., 1 with 2, then 3 with 4 ...) and swapping them if the first should come after the second. It then merges each of the resulting datasets of two into sets of four, then merges those sets of four, and so on, until at last two sets are merged into the final sorted dataset. Due to the required copying of the collection, Merge sort is on average slower than Quicksort.[3]

| Big-O Time Complexity | |
| --- | --- |
| Worst Case | O(n log n) |
| Avg Case | O(n log n) |
| Best Case | O(n log n) |

```java
int[] data = {3, 67, 6, 41, 23};
mergesort(data, 0, data.length);

public static void mergesort(int[] data, int start_i, int end_i) {
    // zero or one element is already sorted
    if (end_i - start_i <= 1) {
        return;
    }
    // middle index of the array
    int mid_i = start_i + (end_i - start_i) / 2;
    // left half
    mergesort(data, start_i, mid_i);
    // right half
    mergesort(data, mid_i, end_i);
    // merge
    merge(data, start_i, mid_i, end_i);
}

public static void merge(int[] data, int start_i, int middle_i, int end_i) {
    int[] tmp_data = new int[data.length];
    int left_i = start_i;
    int right_i = middle_i;
    int tmp_i = start_i;
    // Reorder data: use tmp_data to store the re-ordered data
    while (left_i < middle_i || right_i < end_i) {
        if (left_i == middle_i) {
            // Left half exhausted, copy data[right_i]
            tmp_data[tmp_i] = data[right_i];
            right_i++;
        } else if (right_i == end_i) {
            // Right half exhausted, copy data[left_i]
            tmp_data[tmp_i] = data[left_i];
            left_i++;
        } else if (data[left_i] < data[right_i]) {
            // Copy the lower of the left and right values
            tmp_data[tmp_i] = data[left_i];
            left_i++;
        } else {
            // Copy the lower of the left and right values
            tmp_data[tmp_i] = data[right_i];
            right_i++;
        }
        tmp_i++;
    }
    // Update the data array with the sorted tmp array values
    for (tmp_i = start_i; tmp_i < end_i; tmp_i++) {
        data[tmp_i] = tmp_data[tmp_i];
    }
}
```

**Runtime Profile: Dataset {3, 67, 6, 41, 23}**

| [0] | [1] | [2] | [3] | [4] | Index |
| --- | --- | --- | --- | --- | --- |
| 3 | 67 | 6 | 41 | 23 | Unsorted |
| 3 | 67 | | | | merge sets {3} & {67}; note: when sets are merged they are ordered |
| | | | 23 | 41 | merge sets {41} & {23} |
| | | 6 | 23 | 41 | merge sets {6} & {23, 41} |
| 3 | 6 | 23 | 41 | 67 | merge sets {3, 67} & {6, 23, 41} |
| 3 | 6 | 23 | 41 | 67 | Sorted |

**Quicksort**

Quicksort is an in-place, unstable, divide-and-conquer algorithm. (Quicksort can be implemented as a stable sort, but at a performance penalty.) Quicksort relies on a partition element, called a pivot, to divide the data into two parts. The objective is to select the median value of the dataset as the pivot (which can't actually be known until after the data is sorted). All elements with values smaller than the pivot are moved before it and all greater elements are moved after it. This is done in linear time, O(n), and in-place. The lesser and greater subsets are then recursively sorted, with each recursion selecting a new pivot. The most complex issue in Quicksort is choosing a good pivot element: consistently poor choices of pivots can result in drastically slower O(n²) performance, while if the median is chosen at each step the algorithm runs in O(n log n). Finding the true median, however, is an O(n) operation on unsorted datasets and therefore exacts its own penalty. A very good partition splits a dataset into two equal subsets; a bad partition splits a dataset into two sets of very different sizes. The worst partition puts only one element in one subset and all other elements in the other, preventing divide-and-conquer from paying off. If the partitioning is balanced, Quicksort runs asymptotically as fast as Merge sort; if it is unbalanced, Quicksort runs asymptotically as slow as insertion sort.

| Big-O Time Complexity | |
| --- | --- |
| Worst Case | O(n²) |
| Avg Case | O(n log n) |
| Best Case | O(n log n) |

```java
int[] data = {3, 67, 6, 41, 23};
quicksort(data, 0, data.length - 1);

public static void quicksort(int[] data, int left_i, int right_i) {
    // Returns the index of the pivot
    int pivot_i = partition(data, left_i, right_i);
    // Recurse on the left side of the pivot
    if (left_i < pivot_i - 1) {
        quicksort(data, left_i, pivot_i - 1);
    }
    // Recurse on the right side of the pivot
    if (pivot_i < right_i) {
        quicksort(data, pivot_i, right_i);
    }
}

// Reorder the list so that all elements with values less than
// the pivot come before the pivot, while all elements with
// values greater than the pivot come after it.
public static int partition(int[] data, int left_i, int right_i) {
    int l = left_i, r = right_i;
    int tmpValue;
    // Hope to select a good pivot. If you don't select a value
    // that is close to the median then you cannot benefit from
    // divide-and-conquer and will not achieve O(n log n).
    int pivotValue = data[(left_i + right_i) / 2];
    // Start at each end of the array and work inward toward
    // the center, swapping values
    while (l <= r) {
        // find a value on the 'left' side that is greater than
        // the pivot; we'll want to swap it to the 'right' side
        while (data[l] < pivotValue) {
            l++;
        }
        // find a value on the 'right' side that is less than
        // the pivot; we'll want to swap it to the 'left' side
        while (data[r] > pivotValue) {
            r--;
        }
        // Relative to the pivot, swap the lower value on the
        // 'right' side with the higher value on the 'left' side
        if (l <= r) {
            tmpValue = data[l];
            data[l] = data[r];
            data[r] = tmpValue;
            l++; // increment 'left' index
            r--; // decrement 'right' index
        }
    }
    // Return the pivot index. If the pivot is repeatedly
    // chosen poorly, the sort degrades to O(n^2).
    return l;
}
```

**Runtime Profile: Dataset {3, 67, 6, 41, 23, 2}**

| [0] | [1] | [2] | [3] | [4] | [5] | Index |
| --- | --- | --- | --- | --- | --- | --- |
| 3 | 67 | 6 | 41 | 23 | 2 | 67 > 2 -> swap(1, 5) |
| 3 | 2 | 6 | 41 | 23 | 67 | 3 > 2 -> swap(0, 1) |
| 2 | 3 | 6 | 41 | 23 | 67 | 41 > 23 -> swap(3, 4) |
| 2 | 3 | 6 | 23 | 41 | 67 | Sorted |
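One common way to harden pivot selection beyond the middle-element choice used above is a median-of-three: take the median of the first, middle and last values, which defeats the already-sorted worst case. The helper below is an illustrative sketch (the name `medianOfThree` is mine, not from any source cited here):

```java
// Return the median of the first, middle and last elements of
// data[left_i..right_i]; using this as the pivot value avoids the
// classic already-sorted worst case of a fixed pivot position.
public static int medianOfThree(int[] data, int left_i, int right_i) {
    int mid_i = left_i + (right_i - left_i) / 2;
    int a = data[left_i], b = data[mid_i], c = data[right_i];
    if ((a <= b && b <= c) || (c <= b && b <= a)) return b; // b is middle
    if ((b <= a && a <= c) || (c <= a && a <= b)) return a; // a is middle
    return c;                                               // c is middle
}
```

`partition` could then use this value in place of `data[(left_i + right_i) / 2]`.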

**Introsort**

Introsort, or introspective sort, is a sorting algorithm that begins with Quicksort but switches to Heapsort when the recursion depth exceeds a level based on the logarithm of the number of elements being sorted, and uses Insertion sort for small cases because of its good locality of reference, i.e. when the data most likely resides in cache and is easily referenced.

| Big-O Time Complexity | |
| --- | --- |
| Worst Case | O(n log n) |
| Avg Case | O(n log n) |
| Best Case | O(n log n) |
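No Introsort source is shown above, so here is a self-contained sketch of the switching logic just described. The cutoff of 16 and the 2·log₂(n) depth limit are common choices rather than mandated values, and the sub-sorts are compact stand-ins for the real ones:

```java
public class IntrosortSketch {
    static final int SMALL = 16; // assumed small-case cutoff

    public static void introsort(int[] a) {
        // depth limit proportional to log2(n), per the introsort idea
        int depthLimit = 2 * (31 - Integer.numberOfLeadingZeros(Math.max(a.length, 1)));
        sort(a, 0, a.length - 1, depthLimit);
    }

    private static void sort(int[] a, int lo, int hi, int depth) {
        if (hi - lo < SMALL) { insertion(a, lo, hi); return; } // small case
        if (depth == 0) { heap(a, lo, hi); return; }           // too deep -> heapsort
        int p = partition(a, lo, hi);                          // quicksort step
        sort(a, lo, p - 1, depth - 1);
        sort(a, p, hi, depth - 1);
    }

    private static void insertion(int[] a, int lo, int hi) {
        for (int i = lo + 1; i <= hi; i++) {
            int v = a[i], j = i;
            while (j > lo && a[j - 1] > v) { a[j] = a[j - 1]; j--; }
            a[j] = v;
        }
    }

    private static void heap(int[] a, int lo, int hi) {
        int n = hi - lo + 1;
        for (int s = n / 2 - 1; s >= 0; s--) sift(a, lo, s, n);
        for (int end = n - 1; end > 0; end--) {
            int t = a[lo]; a[lo] = a[lo + end]; a[lo + end] = t;
            sift(a, lo, 0, end);
        }
    }

    private static void sift(int[] a, int lo, int root, int n) {
        while (2 * root + 1 < n) {
            int c = 2 * root + 1;
            if (c + 1 < n && a[lo + c] < a[lo + c + 1]) c++;
            if (a[lo + root] >= a[lo + c]) return;
            int t = a[lo + root]; a[lo + root] = a[lo + c]; a[lo + c] = t;
            root = c;
        }
    }

    private static int partition(int[] a, int lo, int hi) {
        int pivot = a[lo + (hi - lo) / 2], l = lo, r = hi;
        while (l <= r) {
            while (a[l] < pivot) l++;
            while (a[r] > pivot) r--;
            if (l <= r) { int t = a[l]; a[l] = a[r]; a[r] = t; l++; r--; }
        }
        return l;
    }
}
```

The point of the sketch is the three-way dispatch in `sort`: quicksort drives the recursion, heapsort caps the worst case, and insertion sort handles the small sub-arrays.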

**Timsort**

Timsort is a stable, adaptive, hybrid sorting algorithm derived from Merge sort and Insertion sort. The algorithm finds subsets of the data that are already ordered, and uses those subsets to sort the data more efficiently. This is done by merging each identified subset, called a run, with existing runs until certain criteria are fulfilled. Timsort was designed to take advantage of the partial orderings that already exist in most real-world data. Timsort is now included as part of the Java JDK/JRE 1.7 for sorting objects. Timsort was originally written by Tim Peters; Joshua Bloch has written a well-documented Java version. The code is complex, so instead of documenting it here I have provided a link to the OpenJDK version.

Wikipedia also has a good breakdown of the inner workings of the Timsort algorithm.

| Big-O Time Complexity | |
| --- | --- |
| Worst Case | O(n log n) |
| Avg Case | O(n log n) |
| Best Case | O(n) |
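The run-detection idea can be sketched in a few lines. This is a simplified illustration modeled on the `countRunAndMakeAscending` step in the OpenJDK TimSort source (the real version is generic, and the full algorithm also extends short runs to a minimum length, which is omitted here):

```java
// Hypothetical simplified helper: return the index one past the end
// of the run starting at 'lo'. Ascending runs are kept as-is;
// strictly descending runs are reversed in place so every run
// ends up ascending, mirroring Timsort's run detection.
public static int countRunAndMakeAscending(int[] a, int lo) {
    int hi = lo + 1;
    if (hi == a.length) return hi;        // single-element run
    if (a[hi] < a[lo]) {                  // strictly descending run
        while (hi + 1 < a.length && a[hi + 1] < a[hi]) hi++;
        // reverse a[lo..hi] so the run becomes ascending
        for (int i = lo, j = hi; i < j; i++, j--) {
            int t = a[i]; a[i] = a[j]; a[j] = t;
        }
    } else {                              // ascending run
        while (hi + 1 < a.length && a[hi + 1] >= a[hi]) hi++;
    }
    return hi + 1;
}
```

On partially ordered input the runs are long, so far fewer merge passes are needed than Merge sort's fixed doubling; this is where the O(n) best case comes from.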

**Linearithmic Summary**

Linearithmic algorithms are important because they scale to very large datasets. Quicksort is generally considered the fastest, Heapsort the most trustworthy for its guaranteed worst case, and Merge sort is appreciated for its stability when sorting objects. Introsort and Timsort are more complex hybrids, of Quicksort, Heapsort and Insertion sort and of Merge sort and Insertion sort respectively, that minimize or eliminate quadratic worst-case time complexity. It's interesting to see how the Java JDK/JRE has used these algorithms in real-world implementations.

**Java JDK / JRE**

Prior to Java 1.7 the JDK/JRE used Merge sort for sorting objects and a "tuned" Quicksort for sorting primitives. Merge sort provides a stable sort, which is often crucial for sorting objects, and guarantees O(n log n), which means it is scalable. The "tuned" Quicksort was modified to reduce the likelihood of quadratic O(n²) time by improving the selection of the partition pivot, using what is called Bentley-McIlroy adaptive pivot selection. The approach works by using a ternary comparison (less-than, equal-to, greater-than) to enable a "fat pivot," and insertion sort for small sub-arrays. This Quicksort also uses a fast swap that adapts to the size of the data elements. The adaptive pivot selection uses the middle element for small arrays, the median of the first, middle and last elements for medium-sized arrays, and the median of nine equally spaced elements for larger arrays.[4]
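The median-of-nine step, often called the "ninther," can be sketched as follows. This illustrates the idea from the Bentley-McIlroy paper rather than reproducing the JDK source, and the helper names are mine:

```java
// Sketch of the "ninther": the median of the medians of three
// triples of (roughly) equally spaced elements in a[lo..hi].
public static int ninther(int[] a, int lo, int hi) {
    int n = hi - lo + 1, mid = lo + n / 2, s = n / 8;
    int m1 = median3(a[lo], a[lo + s], a[lo + 2 * s]);        // left triple
    int m2 = median3(a[mid - s], a[mid], a[mid + s]);         // middle triple
    int m3 = median3(a[hi - 2 * s], a[hi - s], a[hi]);        // right triple
    return median3(m1, m2, m3);
}

// Branch-light median of three values.
public static int median3(int x, int y, int z) {
    return Math.max(Math.min(x, y), Math.min(Math.max(x, y), z));
}
```

Sampling nine elements gives a pivot estimate close enough to the true median that unbalanced partitions become rare, at a cost of only a handful of comparisons.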

Beginning with Java 1.7 the JDK/JRE uses Timsort for object sorts and a three-way-partition (dual-pivot) Quicksort for sorting primitives. Bentley and McIlroy developed one of the first implementations of a three-way partition in 1993.

The Java JDK three-way partition has been implemented by Vladimir Yaroslavskiy with assistance from Jon Bentley and Josh Bloch. The source code can be found here.

Vladimir Yaroslavskiy briefly describes the algorithm as…

The invariant of classical Quicksort is:

[ <= p | >= p ]

The new Dual-Pivot Quicksort uses *two* pivot elements in this manner:

1. Pick two elements P1, P2, called pivots, from the array.

2. Assume that P1 <= P2, otherwise swap them.

3. Reorder the array into three parts: those less than the smaller pivot, those larger than the larger pivot, and in between those elements between (or equal to) the two pivots.

4. Recursively sort the sub-arrays.

The invariant of the Dual-Pivot Quicksort is:

[ < P1 | P1 <= & <= P2 | > P2 ] [5]

One issue that is improved by a three-way partition is duplicate values. Quicksort is well documented to run in quadratic time on datasets containing many duplicate values. A three-way partition can significantly increase performance by grouping the duplicate values into the partition that equals the pivot. This allows the other two partitions to still benefit from the divide-and-conquer nature of the algorithm and produce the O(n log n) behavior we desire.
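The grouping idea can be illustrated with a single-pivot three-way partition, the Dutch national flag scheme attributed to Dijkstra. This is a sketch of the concept, not the JDK's dual-pivot code:

```java
// Three-way partition around a single pivot (Dutch national flag).
// Returns {lt, gt} such that a[lo..lt-1] < pivot,
// a[lt..gt] == pivot, and a[gt+1..hi] > pivot. All duplicates of
// the pivot end up in the middle band, so they can be excluded
// from further recursion.
public static int[] partition3(int[] a, int lo, int hi) {
    int pivot = a[lo + (hi - lo) / 2];
    int lt = lo, i = lo, gt = hi;
    while (i <= gt) {
        if (a[i] < pivot) {
            int t = a[lt]; a[lt] = a[i]; a[i] = t;  // grow the < band
            lt++; i++;
        } else if (a[i] > pivot) {
            int t = a[gt]; a[gt] = a[i]; a[i] = t;  // grow the > band
            gt--;
        } else {
            i++;                                     // equal: leave in place
        }
    }
    return new int[]{lt, gt};
}
```

A quicksort built on this helper recurses only on `[lo, lt - 1]` and `[gt + 1, hi]`, so an array of mostly equal keys collapses to linear work instead of quadratic.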

### References

(1) Julienne Walker, “Asymptotic Notation”

(2) Avrim Blum, “Comparison-based Lower Bounds for Sorting”

(3) Lars Vogel, “Mergesort in Java”

(4) Jon L. Bentley and M. Douglas McIlroy, “Engineering a Sort Function”

(5) Vladimir Yaroslavskiy, “Dual-Pivot Quicksort”
