Linear search has a time complexity of \(T(n) \in \Theta (n)\), where \(T(n)\) is the actual amount of time needed to search for an item in an array of \(n\) items. This part is proven in module 0085.
However, to make this result more useful, note that any algorithm whose running time satisfies \(T(n) = k_1 + T(n-1)\) also has a time complexity of \(T(n) \in \Theta (n)\). This is a recursive definition of the running time, which implies that it is mainly useful for recursive algorithms.
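To see why, we can unroll the recurrence (a sketch, assuming a base case \(T(1) = k\) for some constant \(k\)):

\[ T(n) = k_1 + T(n-1) = 2k_1 + T(n-2) = \cdots = (n-1)k_1 + T(1) = (n-1)k_1 + k, \]

which is a linear function of \(n\), and therefore in \(\Theta (n)\).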
It is fairly easy to turn the iterative linear search algorithm into a recursive one. Listing 1 shows one way to do so:
define sub linsearch
  by reference a : array of valueType
  by value b : integer
  by value v : valueType
  return type : boolean

  local r : boolean
  r := false
  if (b < |a|) then
    r := (a[b] = v) OR linsearch(a, b+1, v)
  end if
  return r
end define sub
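For comparison, a minimal Python sketch of the same recursive linear search may look as follows (the function and parameter names are illustrative, not part of the original listing):

def linsearch(a, b, v):
    # Base case: index b is past the end of the array, so v was not found.
    if b >= len(a):
        return False
    # Either the element at index b matches, or v is somewhere in the rest of the array.
    return a[b] == v or linsearch(a, b + 1, v)

Calling linsearch(a, 0, v) searches the entire array a for the value v.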
Selection sort, bubble sort and insertion sort can all be done recursively without changing the time complexity. For example, selection sort can be implemented as in listing 2.
define sub selection
  by reference a : array
  by value b : integer

  if (|a| - 1 > b) then
    local m : integer
    local i : integer
    m := b
    i := b + 1
    while i < |a| do
      if a[i] < a[m] then
        m := i
      end if
      i := i + 1
    end while
    swap(a[b], a[m])
    selection(a, b+1)
  end if
end define sub
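Again, a minimal Python sketch of this recursive selection sort may look as follows (names are illustrative):

def selection(a, b=0):
    # Only recurse while there is more than one element left to sort.
    if len(a) - 1 > b:
        # Find the index m of the smallest element from index b to the end.
        m = b
        for i in range(b + 1, len(a)):
            if a[i] < a[m]:
                m = i
        # Place the smallest remaining element at index b, then sort the rest.
        a[b], a[m] = a[m], a[b]
        selection(a, b + 1)

Calling selection(a) sorts the entire array a in place.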
This makes the analysis of the time complexity much easier. Let \(T(n)\) denote the time required to sort \(n\) items. Then, we know that \(T(1) = k\) for some constant \(k\). This is because the loop and recursion only occur when \(|a| - 1 > b\), that is, when there is more than one element left to sort.
In general, \(\forall n > 1: T(n) = k_1 + k_2n + T(n-1)\). This is because the loop looks for the index of the smallest element from index \(b\) to the end of the array, which requires time proportional to \(n\) (remember, \(n\) is not the size of the array, but the number of items left to sort). In addition, once \(a[b]\) is in its sorted position, we call the subroutine recursively to sort the portion of the array starting at index \(b+1\).
Let us get rid of \(k_1\) because it is a constant, and we already know the term \(k_2n\) is asymptotically dominant with respect to \(k_1\). Then we have the general recursive definition \(T(n) = k_2n + T(n-1)\) with \(T(1) = k\). For estimation purposes, it is okay to let \(k = k_2\) to simplify the analysis.
I claim that \(T(n)\) has the closed form \(T(n) = k\frac {n(n+1)}{2}\). This can be proven by induction. The base case is \(n=1\), where \(T(1) = k\frac {1(1+1)}{2} = k\). The induction step states that if we assume the closed form holds for the case \(n=j\), \(j \ge 1\), then it also holds for the case \(n=j+1\). Here is the proof:
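A sketch of the induction step, using the recurrence \(T(j+1) = k(j+1) + T(j)\) and the induction hypothesis \(T(j) = k\frac {j(j+1)}{2}\):

\[ T(j+1) = k(j+1) + T(j) = k(j+1) + k\frac {j(j+1)}{2} = k(j+1)\left (1 + \frac {j}{2}\right ) = k\frac {(j+1)(j+2)}{2}, \]

which is exactly the claimed closed form with \(n = j+1\).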
Now that we know that \(T(n) = k\frac {n(n+1)}{2}\), we can proceed to show that \(T(n) \in O(n^2)\) and \(T(n) \in \Omega (n^2)\). Because \(T(n)\) does not oscillate, we can prove both at the same time:
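One way to do this is with a limit argument, sketched here:

\[ \lim _{n\to \infty } \frac {T(n)}{n^2} = \lim _{n\to \infty } \frac {k\frac {n(n+1)}{2}}{n^2} = \lim _{n\to \infty } \frac {k}{2}\left (1 + \frac {1}{n}\right ) = \frac {k}{2}. \]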
Because \(\frac {k}{2} > 0\) and \(\frac {k}{2} < \infty \), we know that \(T(n) \in \Theta (n^2)\).
What is significant about this proof is that any algorithm that fits the profile \(T(n) = k_1 + k_2n + T(n-1)\) has a time complexity of \(\Theta (n^2)\).
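As an illustration, a recursive insertion sort fits this profile as well: sorting \(n\) items means recursively sorting the first \(n-1\) items and then inserting one more item, which takes at most a number of steps proportional to \(n\). A minimal Python sketch (names are illustrative):

def insertion(a, n=None):
    # n is the number of items (a prefix of a) to sort; default to the whole array.
    if n is None:
        n = len(a)
    # Base case: zero or one item is already sorted.
    if n <= 1:
        return
    # Recursively sort the first n-1 items: the T(n-1) term.
    insertion(a, n - 1)
    # Insert a[n-1] into the sorted prefix: at most n-1 shifts, the k2*n term.
    v = a[n - 1]
    i = n - 2
    while i >= 0 and a[i] > v:
        a[i + 1] = a[i]
        i -= 1
    a[i + 1] = v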
We explored binary search in module 0085. We already know that its time complexity is \(\Theta (\log (n))\), where \(n\) is the number of elements in the array being searched.
The pattern of the time complexity of binary search is \(T(n) = k_1 + T(n/2)\). This is because it reduces the number of candidates by half with each call. As it turns out, any algorithm that fits the pattern of \(T(n) = k_1 + T(r\cdot n)\) (\(r < 1\)) has a time complexity of \(\Theta (\log (n))\).
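A minimal recursive binary search in Python that fits this pattern may look as follows (names are illustrative; lo and hi delimit the range of remaining candidates):

def binsearch(a, v, lo=0, hi=None):
    # a is assumed to be sorted in ascending order.
    if hi is None:
        hi = len(a)
    # Base case: the candidate range is empty, so v is not present.
    if lo >= hi:
        return False
    mid = (lo + hi) // 2
    if a[mid] == v:
        return True
    elif a[mid] < v:
        # Recurse on the upper half: roughly T(n/2) work remains.
        return binsearch(a, v, mid + 1, hi)
    else:
        # Recurse on the lower half.
        return binsearch(a, v, lo, mid)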
Merge sort and the average case of quick sort have a complexity of \(T(n) \in \Theta (n\log (n))\), where \(n\) is the number of elements in the array being sorted. We proved this in module 0079 for the iterative version of merge sort. We also proved this in module 0087 for the recursive version of quicksort.
The general pattern of algorithms with this kind of time complexity is \(T(n) = k_1 + k_2n + T(rn) + T((1-r)n)\) (\(0.5 \le r < 1\)).
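For example, a recursive merge sort fits this pattern with \(r = 0.5\): the \(k_2n\) term corresponds to the merge step, and the two recursive calls correspond to \(T(rn)\) and \(T((1-r)n)\). A minimal Python sketch (names are illustrative):

def mergesort(a):
    # Base case: zero or one element is already sorted.
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    # Two recursive calls on the halves: the T(rn) + T((1-r)n) terms with r = 0.5.
    left = mergesort(a[:mid])
    right = mergesort(a[mid:])
    # Merge step: proportional to n, the k2*n term.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged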
We can prove this informally as follows. If we look at the invocation tree, each level needs to handle a total of at most \(n\) items. Consequently, the amount of work at each level is proportional to \(n\). The number of levels required, \(L\), depends on \(r\): \(L = \lceil \frac {\log (n)}{\log (1/r)} \rceil \). As a result, the time complexity of the entire invocation tree is proportional to \(\frac {1}{\log (1/r)}n \log (n)\). Because \(\log (1/r)\) is a term that is independent of \(n\), it is absorbed into the overall constant.
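To see where the expression for \(L\) comes from, note that the largest part at level \(L+1\) has about \(r^Ln\) items, and the recursion stops once this is reduced to a single item. Solving \(r^Ln = 1\) for \(L\):

\[ r^Ln = 1 \iff \left (\frac {1}{r}\right )^L = n \iff L\log \left (\frac {1}{r}\right ) = \log (n) \iff L = \frac {\log (n)}{\log (1/r)}. \]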
More formally, we can show that the first level has \(n\) items. The second level has two parts, \(rn\) and \((1-r)n\). The third level has four parts, \(r^2n\), \((1-r)rn\), \(r(1-r)n\) and \((1-r)^2n\). The fourth level has eight parts, \(r^3n\), \(r^2(1-r)n\), \(r(1-r)rn, (1-r)r^2n, \ldots , (1-r)^3n\). In general, at level \(i+1\), each part can be expressed as \(n\) multiplied by \(i\) coefficients, where each coefficient is either \(r\) or \((1-r)\). Each coefficient represents a split point in the invocation tree that splits the number of elements into \(r\) and \(1-r\) proportions. In terms of combinations, this means there are \(i \choose i\) terms with the \(r^i\) coefficient, \(i \choose {i-1}\) terms with the \(r^{i-1}(1-r)\) coefficient, and so on.
As an exercise, draw the invocation tree, and indicate the number of operations (excluding those from recursive calls) of each node of the tree. It is helpful to at least show up to level 3, or a part of level 4.
If we group by coefficients of the form \(r^j(1-r)^{i-j}\), there are \(i \choose j\) of those parts. Consequently, the total of all the parts at level \(i+1\) is \(xn\), where \(x = \sum _{j=0}^{i} {i \choose j}r^{j}(1-r)^{i-j}\). We know that \(x = 1\) because the formula is the same as the sum of all the probabilities of a binomial distribution with a success ratio of \(r\) and \(i\) trials! You can refer to module 0062 for a more detailed discussion of the binomial distribution.
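Alternatively, the same fact follows directly from the binomial theorem:

\[ \sum _{j=0}^{i} {i \choose j}r^{j}(1-r)^{i-j} = \big (r + (1-r)\big )^{i} = 1^{i} = 1. \]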
This proves that the total number of items handled per level of the invocation tree is \(n\), so the number of operations per level is proportional to \(n\). The tree only needs to go as deep as \(L = \frac {\log (n)}{\log (1/r)}\) levels because \(T(1)\) is the base case. As a result, the total number of operations of the entire invocation tree is proportional to \(nL = n\frac {\log (n)}{\log ({1/r})} = \frac {1}{\log (1/r)}n\log (n)\).