Module 0085: Search algorithms and their complexities

Tak Auyeung, Ph.D.

November 28, 2018

Contents

1 About this module
2 Linear search
 2.1 Applications
 2.2 Algorithm
 2.3 Complexity
3 Binary search
 3.1 Applications
 3.2 Algorithm
 3.3 Complexity

1 About this module

2 Linear search

2.1 Applications

Although linear search is far inferior to binary search in terms of efficiency, there are times when only linear search is applicable.

Searching a linked list, which lacks random access, can only be done using linear search. Likewise, searching an unsorted array can only be done using linear search.

2.2 Algorithm

The algorithm of linear search is in listing 1.


Listing 1: linear search
 1  define sub linsearch
 2    by value x : sequence
 3    by value v : valueType
 4    return type boolean
 5    local i : iterator
 6    i ← x.begin()
 7    while (i < x.end()) ∧ (*i ≠ v) do
 8      i++
 9    end while
10    if (i == x.end()) then
11      return false
12    else
13      return true
14    end if
15  end define sub

A brief explanation of this pseudocode: the iterator i starts at the beginning of the sequence and advances one item at a time until it either reaches the end or points to an item equal to v. The search succeeds exactly when the loop stops before reaching the end.
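As a concrete illustration, here is a minimal Python sketch of the same algorithm (not part of the original listing; the name linsearch mirrors the pseudocode, with an index playing the role of the iterator):

```python
def linsearch(x, v):
    """Return True if value v occurs in sequence x, scanning left to right."""
    i = 0                              # corresponds to i <- x.begin()
    while i < len(x) and x[i] != v:    # the conjunction from line 7
        i += 1                         # i++
    return i != len(x)                 # reaching the end means v was not found
```

For example, linsearch([3, 1, 4], 4) returns True, while linsearch([3, 1, 4], 9) returns False.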

2.3 Complexity

Let us perform the worst case analysis. In the worst case, the value v does not match any item in the sequence. Line 6 and lines 10 to 14 execute only once, and none of these lines involve repetitive actions. We’ll lump the execution time of all these lines as t1.

How many times do we execute line 7? Assuming v is not found in the sequence, the right condition of the conjunction is always true, which means we can only count on the left condition becoming false to exit the loop.

The iterator i begins at the beginning of the sequence, and it is incremented only once per iteration of the loop. It therefore requires |x| increments to reach the end of the sequence (|x| denotes the number of values in the sequence). Consequently, line 7 executes |x| + 1 times, while line 8 executes |x| times. Assuming line 7 requires t1 for each execution... rather, assuming line 7 requires t2 for each execution and line 8 requires t3 for each execution, the total amount of time required is f(|x|) = t1 + (|x| + 1)t2 + |x|t3. Letting n = |x|, k1 = t1 + t2, and k2 = t2 + t3, we get f(n) = k1 + n·k2.
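These counts can be checked empirically. The sketch below (hypothetical instrumentation, not in the original listing) tallies how many times the loop condition (line 7) and the increment (line 8) execute:

```python
def linsearch_counted(x, v):
    """Linear search that also reports how often lines 7 and 8 execute."""
    cond_checks = 0   # executions of line 7 (the loop condition)
    increments = 0    # executions of line 8 (i++)
    i = 0
    while True:
        cond_checks += 1
        if not (i < len(x) and x[i] != v):
            break
        i += 1
        increments += 1
    return i != len(x), cond_checks, increments

# Worst case: v is absent, so line 7 runs |x| + 1 times and line 8 runs |x| times.
found, checks, incs = linsearch_counted([2, 4, 6, 8], 5)   # (False, 5, 4)
```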

The upper bound of f(n) can be many functions, such as O(n²). However, the tight bound of f(n) can only be Θ(n). Let us revisit the definitions of asymptotic limits in the context of f(n)/n.

liminf_{i→∞} f(i)/i = lim_{i→∞} (k1 + i·k2)/i      (1)
                    = lim_{i→∞} (k2 + k1/i)        (2)
                    = k2 + k1 · lim_{i→∞} (1/i)    (3)
                    = k2 > 0                       (4)

Here, we can use the usual limit in place of the limit inferior because the sequence is monotonic. The interpretation is that f(n) ∈ Ω(n).

Next, applying the same techniques, we can show that limsup_{i→∞} f(i)/i = k2 < ∞. This means that f(n) ∈ O(n).

Since f(n) ∈ Ω(n) and f(n) ∈ O(n), f(n) ∈ Θ(n).

3 Binary search

3.1 Applications

Binary search is useful only when values are stored in a randomly accessible way and are sorted. This limits the use of binary search to arrays in memory, or randomly accessible files of equal-size records (or files with an index).

3.2 Algorithm

Listing 2 is the pseudocode of binary search as a recursive algorithm.


Listing 2: binary search
 1  define sub binsearch
 2    by value a : randomaccess
 3    by value b : integer
 4    by value e : integer
 5    by value v : valueType
 6    local m : integer
 7    return type : boolean
 8    if (b > e) then
 9      return false
10    else
11      m ← ⌊(b+e)/2⌋
12      if (a[m] < v) then
13        return binsearch(a, m+1, e, v)
14      else if (a[m] > v) then
15        return binsearch(a, b, m−1, v)
16      else
17        return true
18      end if
19    end if
20  end define sub
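A runnable Python version of the recursive pseudocode (a sketch, with parameter names following listing 2; b and e are inclusive bounds):

```python
def binsearch(a, b, e, v):
    """Recursive binary search for v in sorted list a over the inclusive range [b, e]."""
    if b > e:
        return False                      # empty range: v cannot be present
    m = (b + e) // 2                      # middle index
    if a[m] < v:
        return binsearch(a, m + 1, e, v)  # v, if present, lies to the right of m
    elif a[m] > v:
        return binsearch(a, b, m - 1, v)  # v, if present, lies to the left of m
    else:
        return True                       # a[m] == v
```

For example, binsearch([1, 3, 5, 7, 9], 0, 4, 5) returns True, while searching for 4 in the same range returns False.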

3.3 Complexity

Let us now analyze the complexity of binary search. Again, we are only interested in the worst case complexity, which means the value v cannot be found in the random access ADT a.

Looking through the recursive algorithm, only a few lines do not take constant time to execute. Lines 13 and 15 invoke the subroutine again. This means that the execution time of an invocation depends solely on the execution time of the recursive calls.

Furthermore, the only changes of parameters are to parameters b and e. Let us define x = e − b + 1, which is the number of elements to be considered by a particular invocation.

How does the x of one invocation relate to the x′ of the recursive invocation? Let us consider the possible cases for x′, the number of elements to consider in the recursive call. With m = ⌊(b+e)/2⌋, the right recursive call (line 13) considers e − m = ⌊x/2⌋ elements, while the left recursive call (line 15) considers m − b = ⌊(x−1)/2⌋ elements.

The worst case is therefore x′ = ⌊x/2⌋.
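As a quick sanity check (not in the original module), the following Python snippet verifies over many ranges that each recursive call receives at most ⌊x/2⌋ elements, where x = e − b + 1:

```python
# Verify that each recursive call of binsearch gets at most floor(x/2) elements,
# where x = e - b + 1 is the size of the current inclusive range [b, e].
def max_child_size(b, e):
    m = (b + e) // 2
    right = e - m        # size of the range passed to binsearch(a, m+1, e, v)
    left = m - b         # size of the range passed to binsearch(a, b, m-1, v)
    return max(left, right)

assert all(max_child_size(b, e) <= (e - b + 1) // 2
           for b in range(20) for e in range(b, 40))
```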

Let us define f(x) to be the amount of time required to perform binary search on x items. This definition can be recursive. The base case is f(0) = t1, in which t1 is an upper bound on the total execution time of the algorithm, not counting the recursive invocations. The recursive step is f(x) = t1 + f(⌊x/2⌋), because only one of the two recursive calls is chosen, and the worst case is to pass ⌊x/2⌋ elements to the next invocation.

Let us “guess” that f(x) = t1(2 + log2(x)), provided that x is a power of 2. We can prove our guess as follows.

The basis: f(0) = t1 by definition. However, 0 is not a power of 2. By the recursive definition of f, f(1) = t1 + f(⌊1/2⌋) = t1 + f(0) = 2t1. Our guess also suggests that f(1) = t1(2 + log2(1)) = 2t1. This is the base case of the inductive proof.

The induction step: assuming f(x) = t1(2 + log2(x)) for x = 2^k, prove that f(2x) = t1(2 + log2(2x)). This can be done rather easily:

f(2x) = t1 + f(x)               (5)
      = t1 + t1(2 + log2(x))    (6)
      = t1(1 + 2 + log2(x))     (7)
      = t1(2 + log2(2x))        (8)

Consequently, given that x = 2^k, binary search requires t1(2 + log2(x)) time. This means that f(n) ∈ Θ(ln(n)), because t1(2 + log2(x))/ln(x) = t1/ln(2) + 2t1/ln(x) is a monotonic function that approaches t1/ln(2) as x → ∞. Because 0 < t1/ln(2) < ∞, ln(n) is a tight bound for f(n).
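The closed form can also be checked numerically against the recurrence for powers of 2 (a small sketch; the unit cost t1 = 1 is an arbitrary choice):

```python
import math

T1 = 1.0   # arbitrary unit cost for the non-recursive work (t1 in the text)

def f(x):
    """Worst-case time of binary search, defined by the recurrence."""
    if x == 0:
        return T1                 # base case: f(0) = t1
    return T1 + f(x // 2)         # recursive step: f(x) = t1 + f(floor(x/2))

# For x a power of 2, the recurrence matches the closed form t1 * (2 + log2(x)).
for k in range(11):
    x = 2 ** k
    assert f(x) == T1 * (2 + math.log2(x))
```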