Indian Computing Olympiad

Training Material


Sorting

Quicksort

Quicksort is a another divide-and-conquer algorithm. Its worst case complexity is not good, but it avoids mergesort's problem of needing an extra array to store the final sorted output. Also, though the worst case complexity is bad, it works well on the average.

Here is how quicksort works.

  1. Call A[0], the first element of the given array A the PIVOT element or the SPLITTER.
  2. We regroup A[1] to A[N-1] into two parts. The first part consists of elements less than the splitter A[0] while the second part consists of elements greater than A[0]. (If there are duplicate entries in the array, we may have elements equal to A[0]. We can place these, for instance, in the first partition.)
  3. After rearranging the elements, we move the splitter element in between the two partitions and recursively sort the lower and upper parts of the array (leaving out the splitter).

Before we describe how to do this rearrangement efficiently, we observe that this procedure will not necessarily give us subproblems of size N/2 as we had assumed in the earlier analysis. The sizes of the two partitions depend on the value of A[0]. In the worst case, A[0] may be the smallest or largest value in A, resulting in one of the two partitions being empty. However, such extreme cases are rare and it can be proved mathematically that on an average Quicksort runs in time proportional to N*log N. More importantly, Quicksort works very well in practice and it is usually the algorithm used to implement built-in sorting routines.

To achieve the rearrangement, we scan the array from A[1] to A[n-1]. At an intermediate point in the scan, we are examining A[i]. At this point, we should have partitioned A[1] to A[i-1] into two groups, those smaller than A[0] and those bigger than A[0]. We thus have two indices into the array, b and i. The index b specifies the boundary between the lower and upper partitions within A[1] to A[i-1]. We can fix b to be either the last position in the lower partition or the first position in the upper partition. Suppose we choose the latter.

We then get an invariant as follows. After each iteration of the loop where we scan through A,

  • i marks the first element of the part of the array that is yet to be scanned
  • b marks the beginning of the upper partition within A[1] to A[i-1]

Thus, at an intermediate point, the array looks like the following.

	        0 1          b         i         n-1
	        -------------------------------------
	       | | <= A[0]  | > A[0]  | Yet to scan  |
	        -------------------------------------
      

When we scan A[i], we check whether it is smaller than A[0]. If so, it should move to the smaller partition, so we swap it with the entry at A[b] and increment both b and i. If A[i] is not smaller than A[0], it is already in the correct partition so we leave b unchanged and just increment i.

         i = 1; b = 1;
         while (i < n){
           if (A[i] <= A[0]){
             swap(A[b],A[i]);
             b++;
           }
           i++;
         }
      

Of course, at the end we should also insert the splitter between the two partitions, so we can add, after the loop;

        swap(A[0],A[b-1]);  /* Insert splitter before upper partition */
      

We then recursively sort A[0]..A[b-2] and A[b]..A[n-1]. As we mentioned earlier, in the worst case, one of these two intervals will have length n-1 while the other is empty, but on an average the array will split into approximately equal parts.

In general, quicksort will be a recursive function which will be passed the array and the left and right endpoints of the interval to be sorted. We use the convention that the left endpoint is the first index in the interval while the right endpoint is one position beyond the last index in the interval. Thus, the initial call would be to quicksort(A,0,n), where n is the size of A. In general, the function is

	int quicksort(int *A, int l, int r){

	  if (r - l <= 1)){ return 0; }  /* Trivial array, do nothing */

	  /* Rearrange with respect to splitter, a[l] */
	  b = l+1;
	  for (i = l+1; i < r; i++){
	    if (A[i] <= A[l]){
	      swap(A[b],A[i]);
	      b++;
	    }
	  }
	  swap(A[l],A[b-1]);  /* Insert splitter before upper partition */

	  /* Recursive calls */
	  quicksort(a,l,b-1);
	  quicksort(a,b,r);
	}
      

Quicksort in real life

If we choose the pivot element to be first (or last) element in the array, the worst case input for quicksort is when the array is already sorted. We argued that this will rarely happen and that in the average case, quicksort works well in time N log N.

In certain practical situations, the worst case can and does happen with unexpected frequency. Consider an application like a spreadsheet where you can display and manipulate a table of data. One option typically provided to the user is to select a column and sort the entire table based on the values in that column. It is often the case that the user first sorts the column in the wrong order (ascending or descending), realizes the mistake and then repeats the sort in the opposite direction. If the column sorting algorithm is quicksort, the second sort provides a worst case input!

One solution is to randomize the choice of pivot. You can generate random numbers in C/C++ as follows:

  	#include <stdlib.h>

  	int seed;
  	...
  	srandom(seed);  /* Initialize the random number generator*/

  	...

  	j = random();   /* Returns a integer random number */
      

If we initialize srandom() with the same value, the subsequent sequence of random numbers will be the same. In IOI, if you use randomization in your program, it is required that your random number generator always yields the same sequence for any input. Thus, you should use a fixed value as a seed, not something "random" like the current system time etc.

To generate a random number in the range 1..k, we can write

        j = random()%k;
      

For quicksort, we need to choose a random pivot element between positions l and r-1. We can do this by writing:

        pivot = l+ random()%(r-l);
        swap(A[l],A[pivot]);
      

The second statement moves the pivot to the beginning of the array. We can then use the earlier implementation of quicksort that assumes that the pivot is at the beginning of the array.

Video material

Video lectures on Quicksort, NPTEL online course on Design and Analysis of Algorithms