Programming Assignment #2: Shellsort

Due: Tuesday, Nov. 16.

This is a modification of exercise 8.25 in your textbook. In this assignment you will be executing an implementation of Shellsort many times in order to calculate the average performance of the algorithm for inputs of different sizes.

Part I:

Determine the average amount of time it takes for Shellsort to sort an array of 100,000 integers. (You can obtain the code for ShellSort from the book here.) To do this, you will have to run the algorithm on 30 different, randomly generated arrays. Your program should generate the arrays first, then effect the sort 100 times on that array (so that you can calculate the average run-time on that array.) Here is some pseudo-code:

for i from 0 to 29
  generate "random" input array, A[], using seed[i]
  get startTime
  for j from 0 to 100
    array B = copy of A
    shellSort(B)
  get stopTime
  runningTime = stopTime - startTime
  print runningTime/100, seed #

You should run your program for each of the interval sequences defined in exercise 8.23 (and given at the bottom of this page.). To ensure fairness, your program should use the same input arrays for each run. To that end, you'll have to use the same seed for the random number generating functions each time.

Pseudo-Random Number Generators in Java

You will want to use the class java.util.Random. One of the constructors takes a long as a seed:

import java.util.*;
...
Random r = new Random(1001L);  // r is a random number generator. (The 'L' connotes a long integer)
Now whenever you want a random number between, say, 0 and x-1, you use:
int newRndNum = r.nextInt(x);  // Generates a random number from 0 to x-1

Here is a very small program that has two random number generators generate the same sequence of "random" numbers by using the same initial seed for both.

Displaying the Data

After you've gathered the data for all the interval sequences, use a graphing program, like Excel, to plot the average run-times by Shellsort for each interval sequence.

Part II:

Rather than measuring run-time, count the number of comparisons made during execution of Shellsort. Measure the number of comparisons required by ShellSort for each of the interval sequences, and for different values of n. For each value of n and interval sequence, report the average number of comparisons across 30 randomly generated sequences. Again, you should use the same seeds for the random number generator to ensure you are using the same 30 arrays for each of the interval sequences. You should measure for values of n = {1000, 10000, 20000, 40000, 80000, 160000, 320000}.

Displaying the Data

After you've gathered the data for all the interval sequences, use a graphing program, like Excel, to plot the average number of comparison required to solve problems of the different values of n, for each of the interval sequences. The horizontal axis should be n, and the vertical axis should be the number of comparisons. Because there are six different interval sequences, there should be six curves, or data sets shown on the graph. Use a logarithmically scaled x-axis (for values of n). You may want to experiment with a logarithmic y-axis, as well. (What might that tell you about the shape of the curves?)

The Interval Sequences:

  1. Shell's original sequence (repeatedly divide by 2)
  2. Shell's original sequence, adding 1 if the result is nonzero but even.
  3. Gonnet's sequence (shown in the text) with repeated division by 2.2.
  4. Hibbard's increments: 1, 3, 7, ..., 2k-1.
  5. Knuth's increments: 1, 14, 13, ..., (3k-1)/2.
  6. Sedgewick's increments: 1, 5, 19, 41, 109, ..., with each term having the form of either 9*4k - 9*2k + 1 or 4k - 3*2k + 1

To Hand in:

  1. Hardcopy of your code (I particularly want to see your code for each of the interval sequences.)
  2. A printed list of the raw data you got from each run.
  3. A printed graph plotting this data, as described above.
  4. Electronic submission of your code.