median of three – Physics, Python, and Programming

Last time, we have studied heap sort — a fast, in-place sorting algorithm. Today, we take a look at quick sort. As the name gives away, this is also a very fast sorting algorithm. While the worst case scales as $O(N^2)$ with the size $N$ of the input data, the expected scaling is just $O(N \log N)$ , just as good as heap sort. We will implement it in C++; the code, as always, can be found at the bottom of this post.

Quick Sort

The algorithm works as follows:

Choose a pivot element.
Loop through the array and ensure that all the values lower than the pivot are to the left of it, while the larger values are to the right of it.
Recursively iterate on the left and right subarray.

Let’s address each point in turn. First, we need to choose the pivot element. This choice can be as simple as always choosing the last element of the (sub)array. However, what would happen if the original array is largely sorted? In that case, the chances are high the last element is already the largest. We then use it as the pivot, moving all smaller elements to the left of it — which they already are. The problem ensues as we split the problem for the recursion. The right subarray contains no elements, as the pivot is already the largest, and all other values now lie to the left of it. The left subarray is just one element shorter.

Now, consider what this means for the recursion. As each recursive call has a high chance of processing a subarray just one element shorter than the last, the number of recursive calls is linearly proportional to the number of elements. In the worst case, the pivot is always the largest value at every level, and thus every single recursion works with data only one element shorter. Hence there will be a total of $N$ recursion levels. As the workload is linear in the number of elements $N$ , this results in the worst case running time of $O(N^2)$ .

There are several ways to ameliorate this. One option is to choose a pivot at random. This drastically lowers the chance of choosing the largest element for pivoting even if the array is nearly sorted. Another solution would be to choose more than one element, find their median, and use that value as the pivot. Again, the chance that the median of several values is equal to the lowest value in the array is tiny for most intents and purposes.

Next, we adress the second point. How do we actually pivot the elements? Before we start, we move the pivot to the end of the array. The reason for this becomes apparent shortly. We introduce two indices. The first, i, will loop through the whole array from the left. The second, j, marks the position of the right-most element lower than the pivot. Whenever i encounters a value that is smaller than the pivot, the index j is increased (now pointing to the left-most value higher than the pivot), and the values at indices i and j are swapped. Once i reaches the end of the array, the index j is, logically, pointing to the last value that was swapped. Since we put the pivot at the end of the array, j is the index of the pivot, which we can return. We will call this procedure partition(). Maybe it will be easier to understand if its operation is illustrated on a practical example:

9 is larger than the pivot (8, indicated by the arrow), and thus nothing happens.

11 is again larger than the pivot, we move on.

7 is smaller than the pivot, thus gets colored dark grey. We advance the index j by one, and swap the elements in positions i and j.

4 is smaller than the pivot, thus we color it dark grey, advance index j, and swap.

9 is larger than the pivot, we continue…

… and finally, we advance index j one last time, and swap the pivot in its designated pocition.

Finally, as per the third step, we use the index of the pivot from the partition() procedure to split the array into two subarrays, and recursively apply our quick_sort() to both of them.

Implementation

We will implement quick_sort() like this:

template <class T>
void quick_sort(std::vector<T> & vec, int left, int right)
{
    if (left >= right)
    {
        return;
    }

    T pivot = select_pivot(vec, left, right);
    int q = partition(vec, pivot, left, right);
    
    quick_sort(vec, left, q - 1);
    quick_sort(vec, q + 1, right);
}

We first check whether our current (sub)array consists of just one element. In that case there’s nothing more to do and we can return. If there’s more than just one element, we select a value to pivot around using the function select_pivot(). However we choose to make the selection, it will be implemented here.

Next, we partition the array as described above and find the index of the pivot. Finally, we recursively sort the left and the right subarray. Pretty straightforward so far.

The procedure partition() can be implemented as:

template <class T>
void partition(std::vector<T> & vec, const T pivot, int left, int right)
{
    int j = left - 1;
    for (int i = left; i <= right; ++i)
    {
        if (vec[i] <= pivot)
        {
            ++j;
            std::swap(vec[i], vec[j]);
        }
    }
    return j;
}

We set the index j pointing to the left of the left-most array index. This looks dangerous, but keep in mind that in the following, we first increment j before using it to access any array elements. As discussed previously, we loop from left to right using index i. If the current value is larger than the pivot, j remains pointing at the right-most element lower than the pivot. Whenever we encounter a value lower than or equal to the pivot, however, index j is incremented to point to the first value larger than the pivot, and we swap the values at indices i and j. This way, j is now pointing to this smaller value, which is now located just to the right of the previously right-most value — in other words, all is well. At the very end of the loop, the pivot itself gets swapped, and the index j remains pointing to it. Hence we return j as the index of the pivot.

What about the choice of the pivot? The easiest way is to choose the last element:

template <class T>
T select_pivot_last(std::vector<T> & vec, int left, int right)
{
    return vec[right];
}

However, if the array is nearly sorted, this might lead to very unfavorable splitting of the array into subarrays. Too many of those, and our algorithm slows down considerably.

Hence, we may select a random element to serve as pivot:

#include <cstdlib>

template <class T>
T select_pivot_random(std::vector<T> & vec, int left, int right)
{
    int random = left + std::rand() * (right - left) / RAND_MAX;
    std::swap(vec[random], vec[right]);
    return vec[right];
}

This ensures that, while it’s still possible to get a really bad partitioning now and then, the overall partitions will generally be acceptable.

The last discussed choice could look like this:

template <class T>
T select_pivot_median(std::vector<T> & vec, int left, int right)
{
    int middle = (left + right) / 2;
    if (vec[left] > vec[middle])
        std::swap(vec[left], vec[middle]);
    if (vec[middle] > vec[right])
        std::swap(vec[middle], vec[right]);
    if (vec[left] > vec[right])
        std::swap(vec[left], vec[right]);

    std::swap(vec[middle], vec[right]); 
    return vec[right];
}

Here, we select the median of three values: the left-most, the middle, and the right-most elements of the array. We first sort these three values, then choose the middle one. Finally, we move it to the right-most position. This pivot selection is often called the median-of-three method. If there are $k$ smallest values present in an array of length $N$ , the chance of selecting one of them as pivot is $\dfrac{k}{N}$ for the select_pivot_last() and select_pivot_random() procedures, but only $\dfrac{k(k-1)}{N(N-1)} \sim \dfrac{k^2}{N^2}$ in the case of select_pivot_median().

Finally, we may want to define a wrapper function that would sort an array without the need of supplying the starting and final index. We can simply overload the quick_sort() function with a single argument:

template <class T>
void quick_sort(std::vector<T> & vec)
{
    int left = 0;
    int right = vec.size() - 1;
    quick_sort(vec, left, right);
}

In conclusion, quick sort, while quite slow in the worst case scenario, is rather fast in the general case. Investing a few instructions in better selecting the pivot can speed up the algorithm in pathological cases. Just like heap sort, quick sort sorts the array in place, meaning no extra memory, nor its costly allocation, is required.

For the sake of convenience, the header file with the quick sort implementation is reproduced below:

#include <cstdlib>
#include <algorithm>
#include <vector>

template <class T>
T select_pivot_last(std::vector<T> & vec, const int left, const int right)
{
    return vec[right];
}

template <class T>
T select_pivot_median(std::vector<T> & vec, const int left, const int right)
{
    int middle = (left + right) / 2;
    if (vec[left] > vec[middle])
        std::swap(vec[left], vec[middle]);
    if (vec[middle] > vec[right])
        std::swap(vec[middle], vec[right]);
    if (vec[left] > vec[right])
        std::swap(vec[left], vec[right]);

    std::swap(vec[middle], vec[right]);
    return vec[right];
}

// make sure the random number generator is initialized first
template <class T>
T select_pivot_random(std::vector<T> & vec, const int left, const int right)
{
    int random = left + std::rand() * (right - left) / RAND_MAX;
    std::swap(vec[random], vec[right]);
    return vec[right];
}

template <class T>
int partition(std::vector<T> & vec, const T pivot, const int left, const int right)
{
    int j = left - 1;
    for (int i = left; i <= right; ++i)
    {
        if (vec[i] <= pivot)
        {
            ++j;
            std::swap(vec[i], vec[j]);
        }
    }
    return j;
}

template <class T>
void quick_sort(std::vector<T> & vec, const int left, const int right)
{
    if (left >= right)
    {
        return;
    }

    T pivot = select_pivot_median(vec, left, right);
    //        select_pivot_random(vec, left, right);
    //        select_pivot_last(vec, left, right);
    int q = partition(vec, pivot, left, right);
    
    quick_sort(vec, left, q - 1);
    quick_sort(vec, q + 1, right);
}

template <class T>
void quick_sort(std::vector<T> & vec)
{
    int left = 0;
    int right = vec.size() - 1;
    quick_sort(vec, left, right);
}

Tag: median of three

An Assortment of Sorting Algorithms, Part 2

Quick Sort

Implementation