Although the primary objective of this module is not the discussion of the time complexity of algorithms, it helps to provide a little bit of context before we start the discussion of something dry and abstract.
The time complexity of an algorithm refers to the amount of time an algorithm takes to complete, usually as a function of the size of the problem. Let us consider something that is really trivial. Listing 1 is an algorithm to compute the sum of all elements in an array with \(N\) items.
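Listing 1 itself is not reproduced in this excerpt. The following is a minimal C sketch, written so that its line numbers match the references below; the function name \texttt{sum} and the exact layout are assumptions, not the module's actual listing.

\begin{verbatim}
 1  /* Sketch: sum the N elements of array a */
 2  int sum(int a[], int n)
 3  {
 4      int total = 0;            /* executed once     */
 5      int i = 0;                /* executed once     */
 6      while (i < n)             /* checked N+1 times */
 7      {
 8          total = total + a[i]; /* executed N times  */
 9          i = i + 1;            /* executed N times  */
10      }
11      return total;
12  }
\end{verbatim}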
How much time does it need? Obviously, that depends on the number of elements in array “a” (\(N\)). Lines 4 and 5 are executed only once (not dependent on \(N\)); let us say that it takes \(t_1\) for these two lines to complete.
Line 6 executes \(N+1\) times because variable i ranges from 0 to \(N\) (not \(N-1\)). Lines 8 and 9 execute \(N\) times. Let \(t_2\) be the amount of time line 6 takes each time it executes, and \(t_3\) the amount of time lines 8 and 9 take in each iteration.
Then, the total amount of time for the algorithm to execute is exactly \(f(N) = t_1 + t_2 + N(t_2 + t_3)\). At first glance, it would appear that the discussion of time complexity could stop here.
The algorithm in Listing 1 is really simple, but there are more complicated algorithms. For example, bubble sort is relatively complex. The time complexity of a bubble sort algorithm boils down to \(f(N) = t_1 + N t_2 + N^2 t_3\) for some constants \(t_1\), \(t_2\) and \(t_3\) (not related to the constants used in the complexity of the array-sum algorithm).
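As a concrete illustration, here is a minimal C sketch of bubble sort (not one of the module's listings): the setup contributes the constant \(t_1\) term, the outer loop runs on the order of \(N\) times, and the inner loop body runs on the order of \(N^2\) times, which is where the \(N^2 t_3\) term comes from.

\begin{verbatim}
/* Bubble sort sketch: repeatedly swap adjacent out-of-order
   elements.  The inner loop body executes N(N-1)/2 times in
   total, which produces the N^2 term. */
void bubble_sort(int a[], int n)
{
    for (int i = 0; i < n - 1; i++) {          /* about N iterations */
        for (int j = 0; j < n - 1 - i; j++) {  /* about N^2 in total */
            if (a[j] > a[j + 1]) {
                int tmp = a[j];                /* swap neighbours */
                a[j] = a[j + 1];
                a[j + 1] = tmp;
            }
        }
    }
}
\end{verbatim}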
It is tedious to compute the exact amount of time that an implementation of an algorithm takes to execute. The effort is usually not worth it either, as we only worry about complexity when \(N\) gets large, and when \(N\) is large, only the \(t_3\) term matters anyway.
In other words, in many cases, we are quite happy with just a simple term that approximates the actual time complexity when N is large.
As it turns out, bubble sort, selection sort and insertion sort all have a time complexity of \(f(N) = t_1 + N t_2 + N^2 t_3\). Each implementation of each algorithm has a different set of \((t_1, t_2, t_3)\) constants. Nonetheless, they are lumped together as one type of sorting algorithm.
Another type of sorting algorithm, such as merge sort (recursive or iterative), has a time complexity of \(t_1 + N t_2 + N \log(N) t_3\), although, again, each implementation of each algorithm has its own set of constants \((t_1, t_2, t_3)\).
When we compare algorithms of the same type, the constant of the fastest-growing term, \(t_3\), is important. For example, selection sort has a relatively small \(t_3\) compared to insertion sort. However, when we compare algorithms of different types, even \(t_3\) is not important.
For example, let us assume that \(t_3 = 10000\) for merge sort, whereas \(t_3 = 1\) for selection sort. We can always find a value of \(N\) beyond which merge sort takes less time than selection sort. Essentially, we are trying to solve for \(N\) such that \(N^2 \ge 10000 N \log(N)\).
Although it is cumbersome to solve for the smallest \(N\) that satisfies the constraint, we can easily find that any \(N \ge 120000\) will make the condition true (taking \(\log\) to be the natural logarithm).
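To sanity-check that claim, the short C program below (an illustration, using the hypothetical constants from this example) searches for the crossover point numerically.

\begin{verbatim}
#include <stdio.h>
#include <math.h>

/* Find the smallest N with N^2 >= 10000 * N * log(N), i.e.,
   N >= 10000 * ln(N), for the hypothetical constants t3 = 10000
   (merge sort) and t3 = 1 (selection sort). */
int main(void)
{
    for (long n = 2; n < 1000000; n++) {
        if ((double)n >= 10000.0 * log((double)n)) {
            printf("crossover at N = %ld\n", n);
            break;
        }
    }
    return 0;
}
\end{verbatim}

The reported crossover is a little below 120000, and the condition stays true afterwards because \(N\) grows faster than \(10000 \ln(N)\).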
Because the constants are not useful when we compare algorithms of different types, we would like to ignore the constants altogether when we analyze the complexity of an algorithm.
In many cases, we only care about how the time complexity of an algorithm grows as the size of the input increases. To make our lives easier, we use approximation functions that are much simpler than the actual time complexity to characterize the complexity of algorithms.
For example, with bubble sort, insertion sort and selection sort, we say the time complexity is “\(N^2\)”, whereas merge sort has a time complexity of “\(N \log (N)\)”. What does that really mean?
Let \(f(N)\) be the actual time complexity of an algorithm, with respect to the size of the input, \(N\). Let \(g(N)\) be a (usually simpler) approximation of the time complexity of an algorithm. We want to find out how \(f(N)\) relates to \(g(N)\).
Let \(A\) be a partially ordered set. This means that we can define a relation \(\le\) on \(A\) (a subset of \(A \times A\)) that is reflexive, antisymmetric and transitive. The “less than or equal to” relation for numeric values is a partial order.
Next, let \(X \subseteq A\), which means \(X\) contains some of the elements of \(A\). Let us define the predicate \(P(y,X) = \forall x \in X(x \le y)\). The predicate \(P(y,X)\) means that \(y\) is an upper bound of \(X\). Now, let us define \(s = \sup X \Leftrightarrow (P(s,X) \wedge (\forall s'(P(s',X) \Rightarrow s \le s')))\). This basically means that \(s\) is the least upper bound of \(X\).
Note that \(\sup(X)\) may be different from \(\max(X)\). This is because the supremum need not be an element of \(X\), whereas the maximum must be an element of \(X\).
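For example, if \(X = \{x \in \mathbb{R} \mid 0 \le x < 1\}\), then \(\sup X = 1\) even though \(1 \notin X\), and \(\max X\) does not exist.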
The definitions of the infimum mirror those of the supremum. Let us reuse the notations \(A\) and \(X\).
This time, let us define a predicate \(Q(y,X) = \forall x \in X(y \le x)\). This predicate means \(y\) is a lower bound of \(X\). Now, we define \(i = \inf X \Leftrightarrow (Q(i,X) \wedge (\forall i'(Q(i',X) \Rightarrow i' \le i)))\). This means that \(i\) is the greatest lower bound of \(X\).
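For example, if \(X = \{1/n \mid n \in \mathbb{N}, n \ge 1\}\), then \(\inf X = 0\) even though \(0 \notin X\) (and \(\min X\) does not exist).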
What is the limit of a sequence of numbers? In other words, if I have an infinite sequence \((x_0, x_1, \ldots )\), what does \(\lim _{i\rightarrow \infty }x_i = k\) mean?
It means that \(k\) is a unique value that satisfies the requirement that \(\forall \epsilon > 0 \in \mathbb{R}(\exists m \in \mathbb{N}(\forall j \ge m \in \mathbb{N}(\left|k - x_j\right| < \epsilon)))\). In English, it means that “for every real number \(\epsilon\) greater than 0, there exists a natural number \(m\) such that the differences between \(k\) and \(x_m, x_{m+1}, x_{m+2}, \ldots\) are all less than \(\epsilon\).” Or, even more casually, “once we get to \(x_m\), all later values of the \(x\) sequence are within \(\epsilon\) of \(k\).”
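For example, the sequence \(x_i = 1/(i+1)\) has \(\lim_{i \rightarrow \infty} x_i = 0\): given any \(\epsilon > 0\), choose a natural number \(m > 1/\epsilon\); then for every \(j \ge m\), \(\left|0 - x_j\right| = 1/(j+1) < 1/m < \epsilon\).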
Note that such a limit may not exist. Consider the sequence defined as \(x_i =(-1)^{i}\). In this case, \(\lim _{i \rightarrow \infty }x_i\) does not exist.
The limit of a sequence can be infinity. For example, if the sequence is defined as \(x_i = i\), then \(\lim _{i \rightarrow \infty }x_i = \infty \). In this case, the definition of the limit of a sequence is slightly different because the difference of infinity and any finite number is infinity. As a result, the definition needs to be modified as follows.
\(\lim_{i \rightarrow \infty} x_i = \infty\) means that \(\forall y \in \mathbb{R}(\exists m \in \mathbb{N}(\forall j \ge m \in \mathbb{N}(x_j > y)))\). In plain English, it means that for any real number \(y\), we can find a natural number \(m\) such that \(x_m, x_{m+1}, x_{m+2}, \ldots\) are all greater than \(y\).
We can then define the “limit superior” (\(\limsup\)) of a sequence of numbers, \(x_n\), as follows:
\begin{align*} \limsup_{n \rightarrow \infty} (x_n) &= \inf\{\sup\{x_k \mid k \ge n\} \mid n \ge 0\} \\ &= \inf\{\sup\{x_k \mid k \ge 0\}, \sup\{x_k \mid k \ge 1\}, \sup\{x_k \mid k \ge 2\}, \ldots\} \end{align*}
Note that even though \(\lim _{i \rightarrow \infty }(-1)^{i}\) does not exist,
\begin{equation} \limsup_{i \rightarrow \infty} (-1)^{i} = 1 \end{equation}
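To see why, note that \(\sup\{(-1)^k \mid k \ge n\} = 1\) for every \(n\) (each tail contains an even index), so the infimum of these suprema is also 1.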
This is why limit superior is useful.
This term can be loosely interpreted as the “greatest lower bound of the least upper bounds of the tails of an infinite sequence of numbers”.
We can then define the “limit inferior” (\(\liminf\)) of a sequence of numbers, \(x_n\), as follows:
\begin{equation} \liminf_{n \rightarrow \infty} (x_n) = \sup\{\inf\{x_k \mid k \ge n\} \mid n \ge 0\} \end{equation}
Just to follow up on the example that we have been using,
\begin{equation} \liminf_{i \rightarrow \infty} (-1)^{i} = -1 \end{equation}
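Here, \(\inf\{(-1)^k \mid k \ge n\} = -1\) for every \(n\) (each tail contains an odd index), so the supremum of these infima is \(-1\).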
This term can be loosely interpreted as the “least upper bound of the greatest lower bounds of the tails of an infinite sequence of numbers”.
Let us start with two functions, \(f(n)\) and \(g(n)\). The definition of \(g(n)\) being an upper bound of \(f(n)\) is as follows:
\begin{equation} f(n) \in O(g(n)) \Leftrightarrow \limsup_{n \rightarrow \infty} \left|\frac{f(n)}{g(n)}\right| < \infty \end{equation}
Note the use of the “element of” (or “in”) notation \(\in\). This means \(O(g(n))\) denotes the set of functions for which \(g(n)\) is an upper bound.
The English translation is that \(g(n)\) is an upper bound of \(f(n)\). It essentially means that \(\exists k > 0(\exists m(\forall n \ge m(f(n) \le k \cdot g(n))))\): beyond some point, a constant multiple of \(g(n)\) stays above \(f(n)\).
This is, by far, the most important notation used in algorithm complexity analysis, because the estimate differs from the actual time complexity by at most a constant factor as the size of the problem increases.
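For example, if \(f(N) = t_1 + t_2 + N(t_2 + t_3)\) is the time complexity of the array-sum algorithm, then \(f(N) \in O(N)\), because \(\left|f(N)/N\right|\) approaches \(t_2 + t_3\), which is finite.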
Note that it is still possible that \(g(n)\) is a gross over-estimate of \(f(n)\).
This is denoted by the little-oh notation:
\begin{equation} f(n) \in o(g(n)) \Leftrightarrow \lim_{n \rightarrow \infty} \frac{f(n)}{g(n)} = 0 \end{equation}
This means \(g(n)\) is confirmed to be a “gross” over-estimate of \(f(n)\). For example, if \(f(n)\) is the time complexity of the merge sort algorithm, then \(f(n) \in o(n^2)\).
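Indeed, with \(f(n) = t_1 + n t_2 + n \log(n) t_3\), we get \(f(n)/n^2 = t_1/n^2 + t_2/n + t_3 \log(n)/n\), and every term approaches 0 as \(n\) grows.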
The complementary notion, that \(g(n)\) is a lower bound of \(f(n)\), is denoted by big-omega:
\begin{equation} f(n) \in \Omega(g(n)) \Leftrightarrow \liminf_{n \rightarrow \infty} \left|\frac{f(n)}{g(n)}\right| > 0 \end{equation}
This means that \(g(n)\) is not a gross over-estimate of \(f(n)\). However, it is possible that \(g(n)\) is a gross under-estimate of \(f(n)\). For example, if \(f(n)\) is the time complexity of bubble sort, then we can say that \(f(n) \in \Omega(n)\).
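Indeed, with \(f(n) = t_1 + n t_2 + n^2 t_3\), the ratio \(f(n)/n = t_1/n + t_2 + n t_3\) never drops below \(t_2\), so its limit inferior is positive; at the same time, \(n\) grossly under-estimates \(f(n)\).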
This is denoted by little-omega:
\begin{equation} f(n) \in \omega(g(n)) \Leftrightarrow \lim_{n \rightarrow \infty} \frac{f(n)}{g(n)} = \infty \end{equation}
This means that \(g(n)\) is confirmed to be a “gross” under-estimate of \(f(n)\). If \(f(n)\) is the actual time complexity of bubble sort, then \(f(n) \in \omega(n)\).
The best case, in which \(g(n)\) is simultaneously an upper bound and a lower bound of \(f(n)\), is denoted by big-theta:
\begin{equation} f(n) \in \Theta(g(n)) \Leftrightarrow (f(n) \in O(g(n))) \wedge (f(n) \in \Omega(g(n))) \end{equation}
Life does not get any better than this (for a computer scientist researching algorithm time complexity). This means that we can find two positive constants, \(k_1\) and \(k_2\), and a threshold \(m\), such that \(\forall n \ge m((f(n) \ge k_1 g(n)) \wedge (f(n) \le k_2 g(n)))\).
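For example, the time complexity of bubble sort, \(f(n) = t_1 + n t_2 + n^2 t_3\), is in \(\Theta(n^2)\): the ratio \(f(n)/n^2\) approaches \(t_3\), so its limit superior is finite and its limit inferior is positive.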
If \(f(n)\) is the time complexity of an algorithm,