Performance calculation for COSC 311 -- an alternative explanation and practice problems

There are two facets of understanding runtime analysis to the level needed for COSC 311 (Data Structures and Algorithms): (1) understanding Big-Oh, including the meaning of the definitions of Big-Oh, Big-Omega, and Big-Theta, and (2) being able to count. Since understanding the mathematical meaning of the various characterizations of run-time is the easier of the two, we'll start with that.

It should be understood clearly by the reader that this paper is intended as a second (or even third) explanation of these topics as presented in the course text. Working through different styles of explanation is a common and powerful technique for gaining understanding of an obscure topic. Second, it should be understood that these explanations involve no formal proofs. Because the mathematical prerequisites for COSC 311 are fairly low, I'll be calling on your common sense and intuition rather than on your knowledge of derivatives. On the other hand, just because there are no proofs does not mean the concepts are trivial. They are not trivial; on the contrary, they are quite subtle.

Definitions

We'll start by forcing some concrete interpretations on the symbols. f(n) is the function that corresponds to the time it takes to execute the algorithm on a dataset of size n. If f(n) were integer-valued (rather than real-valued), then we could say that f(n) corresponds to the number of operations it takes to execute the algorithm on a dataset of size n. Intuitively, they're the "same thing," but since we take f(n) to be real-valued, we can talk about execution time in seconds.
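For example, here is a toy function (my own illustration, not something from the course text) with a rough operation count in the comments; under this idealized accounting, f(n) = 2n + 1.

    def total(values):       # let n = len(values)
        s = 0                # 1 operation
        for v in values:     # the body runs n times...
            s = s + v        # ...at 2 operations each (add, assign)
        return s             # total: roughly f(n) = 2n + 1 operations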

Second, g(n) is of value to our understanding only if we presume it is "simpler" in some way than f(n). It would not be useful, for example, to say that the function n^3 is O(n^3 + 2 n + 25). This fits in with our intuitive understanding that to guess a Big-Oh for a particular f(n), we throw away all but the highest-order term and replace its coefficient with 1.

Now, notice an important thing about the inequality f(n) ≤ c g(n), for n ≥ n0. There are two unknowns, c and n0. One inequality with two unknowns means the constraint is underspecified. The consequence of this is that you must arbitrarily specify one of the unknowns (as long as it's a real number greater than 0, as indicated in the definition), then calculate the second. If you can solve for the second unknown, then the definition is satisfied. If you cannot, then that particular choice fails and you must try another; f(n) fails to be O(g(n)) only if no choice works.
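To make the "pick one unknown, solve for the other" procedure concrete, here is a small Python sketch (my own illustration; the function name and the cutoff are mine). It only tests n up to a cutoff, so it provides numeric evidence rather than a proof:

    def smallest_n0(f, g, c, n_max=10**6):
        """Search for the smallest n0 such that f(n) <= c * g(n)
        holds at every tested n from n0 through n_max.
        Numeric evidence only -- not a proof."""
        n0 = None
        for n in range(1, n_max + 1):
            if f(n) <= c * g(n):
                if n0 is None:
                    n0 = n       # candidate start of the "good" region
            else:
                n0 = None        # inequality failed; restart the search
        return n0                # None means no n0 found up to n_max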

Examples.

  • f(n) = 25 n^2 + 2 n + 10. Is it O(n^2)?
    Let's compare f(n) with c * g(n), where g(n) = n^2, for several values of c.

    Compare f(n) with 1 * g(n). Clearly, f(n) will be greater than 1 * n^2 for all nonnegative n, so c = 1 looks hopeless.
    Now compare f(n) with 26 * g(n). When n ≥ 5, then 26 * g(n) ≥ f(n). So, for c = 26, n0 = 5.
    Now compare f(n) with 28 * g(n). When n ≥ 3, then 28 * g(n) ≥ f(n). So, for c = 28, n0 = 3.

    In fact, here's a table that shows several c and n0 pairs, worked out the same way.
    c     n0
    1     none!
    25    none!
    26    5
    27    3
    28    3
    29    2
    30    2
    31    2
    32    2
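    Rather than checking each row by hand, the smallest_n0 sketch from the Definitions section can regenerate the table (a brute-force illustration, with the same not-a-proof caveat):

        f = lambda n: 25*n*n + 2*n + 10
        g = lambda n: n*n
        for c in [1, 25, 26, 27, 28, 29, 30, 31, 32]:
            print(c, smallest_n0(f, g, c))   # None corresponds to "none!"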

    Whether by hand or by brute force, producing all those values is pretty tedious. What happens if we use some algebra?
    Choose c = 30, then calculate n0 so that
    25 * n0^2 + 2 * n0 + 10 ≤ 30 * n0^2.
    The solutions are n0 = 1.6282856... and n0 = -1.2282856...
    Since n and n0 are nonnegative, that eliminates the second solution. And the definition doesn't require the smallest such integer (that's just what the table above listed for each c): any integer greater than 1.6282856... will satisfy the inequality. For example, for c = 30, you could pick n0 = 2. Or you could pick n0 = 20432. There are an infinite number of choices: as long as you pick an integer n0 ≥ 2, you'll satisfy the definition of Big-Oh for this f(n) and this g(n).
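    The root-finding step can be automated as well. The sketch below (again my own illustration) applies the quadratic formula to (c - 25) * n^2 - 2 * n - 10 = 0 and rounds the positive root up to the next integer; a negative discriminant means the roots are complex, which is exactly what happens in the c = 1 case discussed next.

        import math

        def n0_for_c(c):
            """Smallest integer n0 with 25*n0^2 + 2*n0 + 10 <= c*n0^2."""
            a, b, k = c - 25, -2, -10    # coefficients of (c-25)n^2 - 2n - 10
            if a == 0:
                return None              # c = 25: reduces to 2n + 10 <= 0, impossible
            disc = b*b - 4*a*k
            if disc < 0:
                return None              # complex roots: no real crossover point
            root = (-b + math.sqrt(disc)) / (2*a)
            return max(1, math.ceil(root)) if root > 0 else None

        print(n0_for_c(30))   # 2, matching the algebra above
        print(n0_for_c(1))    # None -- see the c = 1 discussion below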

    What happens when you pick c = 1? The solutions to the inequality 25 * n0^2 + 2 * n0 + 10 ≤ 1 * n0^2
    are approximately -0.0416666... ± i * 0.6441510... That is, the solutions are complex numbers -- there are no real solutions.
    Since there is no real number you can find for n0 when you've picked c = 1, you're out of luck with that choice. Notice, this does not mean that f(n) is not O(n^2). The definition does not say that for every c there must exist an n0. It says that you can find (at least) one c, n0 pair such that the definition's criterion is satisfied.

    You probably should be asking yourself, "yes, but once I've got a c and n0 that appear to work, how will I know that the inequality will hold for all n ≥ n0?"
    Intuitively, sketch the graph so you can observe the growth behavior. More precisely, using derivatives: d f(n)/dn = 50 n + 2 while d(30 n^2)/dn = 60 n, and d^2 f(n)/dn^2 = 50 while d^2 (30 n^2)/dn^2 = 60. So for n ≥ 1 the slope of 30 n^2 is always larger: 30 n^2 grows faster than 25 n^2 + 2 n + 10, and once it gets above f(n) it stays above.
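    If you'd rather not lean on calculus, a numeric spot check (illustration only, not a proof) shows the gap 30 * n^2 - f(n) = 5 * n^2 - 2 * n - 10 widening as n grows, so once the inequality holds it never flips back:

        f = lambda n: 25*n*n + 2*n + 10
        for n in [2, 5, 10, 100, 1000]:
            print(n, 30*n*n - f(n))   # 6, 105, 470, 49790, 4997990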

  • f(n) = 25 n^2 + 2 n + 10. Is it O(n)?
    That is, can we find c and n0 such that
    25 n^2 + 2 n + 10 ≤ c * n for all n ≥ n0?
    No. Divide both sides by n (for n > 0): we would need 25 n + 2 + 10/n ≤ c for all n ≥ n0. The left-hand side grows without bound as n grows, so it eventually exceeds any constant c we might pick. No c, n0 pair exists, and f(n) is not O(n).
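    A quick numeric look (same caveats as before) makes the failure visible: f(n)/n grows without bound, so it eventually passes any constant c you might pick.

        f = lambda n: 25*n*n + 2*n + 10
        for n in [1, 10, 100, 1000]:
            print(n, f(n) / n)   # 37.0, 253.0, 2502.1, 25002.01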