Algorithm Notes: Understanding of Binary Search

4min read

Background

Recently, I have recovered my algorithm praticing. It's like something is messing in my mind, maybe due to the shallow understanding before. When I tried to implement a binary search to find the first number that is not less than XX, I failed several times. Then I realized that I had to make it clear.

Binary Search is an extremely efficient algorithm for searching a particular number in a sorted array, as it reduces the time complexity from O(n)O(n) to O(logn)O(logn).

I watched two videos which gives through explanation about binary search, but with slightly difference.

Implementation from 花花酱.

L = 0; R = N;
while (L < R) {
  mid = L + (R-L)/2;
  if (nums[mid] >= X)
    R = mid;
  else 
    L = mid + 1;
}
return L;

Implementation from Errichto

L = 0; R = N - 1;
ans = -1;
while (L <= R) {
  mid = L + (R-L)/2;
  if (nums[mid] >= X) {
    ans = mid;
    R = mid - 1;
  } else {
    L = mid + 1;
  }
}
return ans;

Questions

I was confused by 3 questions.

  1. Why it is mid = L + (R-L)/2, not mid = (L+R)/2.
  2. Why in 花花酱‘'s version it's R = N, while(L<R), R = mid, while in Errichto's version it's R = N - 1, while(L<=R), R=mid-1.
  3. why in 花花酱‘'s version LL is returned as the answer.

Explanation

The first question is easy to explain. It is to avoid the overflow of the sum of LL and RR.

The second question seemed a little bit silly after I got it clear. It's just about the current range for searching the target, and the terminaion condition shouldn't miss the very last element to be tested. For example, in 花花酱‘'s version, the search range is [L,R)[L, R). So:

  1. RR starts from NN, which is the length of the array, not being reached by index.
  2. LL is always less than RR, or the while loop should be terminated.
  3. When updating RR, it means we update the search range into [L,mid)[L, mid). midmid is not included, as it is already been examined. The same with updating LL, so it should be L=mid+1, as midmid shouldn't be included.

In Errichto's version, the range is [L,R][L, R]. I will save the explanation here.

The third question is really where the tricky is. First I tried to understand Errichto's version, which is quite straight forward. It narrows the search range gradually and updates the answer value with the midmid each time it is not less than XX, utils there is no more value to search. While the right boundary reduces, the answer approaches to the smallest value that is not less than XX. Let's take an example:

nums: 2, 3, 5, 6, 8, 10, 12
X: 4

first loop:  L = 0, R = 6, mid = 3, ans = 6
second loop: L = 0, R = 2, mid = 1, ans = 6
third loop:  L = 2, R = 2, mid = 2, ans = 5
out of loop, so ans = 5

When it comes to 花花酱‘'s version, it reduces one variable and simply returns LL. Actually, this is what binary search commonly does. So we can also reduce the variable of ansans in Errichto's version and return LL. But why LL is the answer? Let me explain. I'll take the search range as [L,R][L, R].

Suppose we are in the last loop.

Then L==RL == R or L==R1L == R - 1, or there will be another loop.

Then mid==Lmid == L.

There are obviously 2 situations: nums[mid]>=Xnums[mid] >= X or nums[mid]<Xnums[mid] < X.

If nums[mid]>=Xnums[mid] >= X, then nums[L]>=Xnums[L] >= X. Next we can prove nums[L1]<Xnums[L-1] < X.

  • if LL never changed(equals 0), then it means all the numbers in the array are >=X>=X. The first element in the array is obviously the smallest one satisfying >=X>=X.

  • if LL changed, no matter how many times it changed, the last time it changed must be in some loop that went this way:

              ...
              else            // means nums[mid] < x
                  L = mid + 1     // means mid == L - 1
              ...

    Are we seeing something here?

    Yes, nums[L1]==nums[mid]<Xnums[L-1] == nums[mid] < X.

    So nums[L1]<X<=nums[L]nums[L-1] < X <= nums[L], the LLth element is the smallest one satisfying >=X>= X.

If the search range is [L,R)[L, R), we could also deduce the result.

Usage

The above is a very common usage of binary search, which is to find the lower bound of a value in a sorted array. There are some variant usages:

  1. find the upper bound, which is the smallest value satisfying nums[i]>Xnums[i] > X. With a slight modification:
           ...
           if (nums[mid] > X)
               R = mid         // or R = mid - 1 if search range [L, R]
           ...
  1. find the first occurance of the target. With adding some code after the while loop:
   ...
   return (L < nums.length && nums[L] === X) ? L : -1;
  1. find the last occurance of the target. This can be solved by searching the first occurance of X+1X+1 and then returning the previous one.
   while (L <= R) {
     mid = L + (R-L)/2;
     if (nums[mid] >= X + 1) 
       R = mid - 1;
     else 
       L = mid + 1;
   }
   return (L > 0 && nums[L-1] === X) ? L-1 : -1;

MORE FROM THE BLOG

Algorithm Notes: Traversal of Binary...

Mostly we know that the traversal of binary tree can be easily done by recursion....

3min read

Algorithm Notes: Binary Heap

A Binary Heap is a complete Binary Tree. It is either a Min Heap, where parent node's value is less...

2min read

Algorithm Notes: KMP

String searching is one of the most common interview problems. Of course, it can be solved by using the simple...

10min read

Use React Context + useReducer...

React Hooks have brought us much convenience in writing concise readable code. Recently, I have been enjoying the usage of...

4min read