Here’s how to calculate the number of comparisons in a binary search:
The number of comparisons is telling you how many searches a computer has to make before finding an item.
For a binary function, the maximum number of comparisons is found by taking the number of items in the list.
The number of times you can divide that by two before it rounds down to one is the number of comparisons.
So if you want to learn all about why calculating the number of comparisons matters and how then this article is for you.
Let’s dig deeper into it!
What Is a Binary Search?
When you want a computer to find a specific file, data point, or item stored on it, then you need to use a search method.
A binary search is one such search method that a programmer can use to tell the computer how to find files.
The binary search is also known as a split-half search, half-interval search, binary chop, and plenty of other names.
They all work on the same principle, and you can understand it by thinking about how you might search for a file by hand.
Imagine you have an alphabetical list of items, and you need to find one entry on that list.
In order to find the entry, you’re going to follow the alphabet until you get to what you want, but if you’re explaining the raw logic of how you go about this search in a way a computer could understand, a binary algorithm makes a lot of sense.
Let’s say you’re looking for something that starts with the letter “c.”
Since the list is organized, you can immediately eliminate a lot of the entries by excluding everything that doesn’t start with that letter.
So, you’ll go to the “c” section of your list.
Then, you’ll do that same method, but the exclusion will be based on the second letter in the name.
You’ll keep going with that same repeated process of elimination until you get to the entry you want.
For a computer, binary searches are a little more generalized.
Binary searches don’t have to be based on the alphabet; that was just an easy example.
Instead, a binary search uses an ordered list, and it goes through the list by eliminating possibilities, kind of like you were doing when alphabetizing.
But, it’s called a binary search because it tries to split the list in half.
So, the binary search will start with the item in the exact middle of the list and compare it to the desired value.
Based on that comparison, it will then eliminate everything above or below that item, based on the internal search logic.
It keeps doing this—eliminating half of the remaining possibilities at a time—until only the desired result is left.
In this way, it is binary in its function, and it is able to search very large volumes of data in an incredibly small number of steps.
So, going back to the alphabetical example, the computer wouldn’t eliminate all of the letters in the first step.
Instead, it would eliminate half of the alphabet.
In the second step, it would eliminate half of the remaining alphabet, and so on until the search is complete.
That’s the gist of the binary search, and it will make more sense as I explain the numbers of comparisons and the real power of this method.
What Is a Number of Comparisons?
One of the important things to understand when comparing search methods is the number of comparisons.
In short, this is the mathematical description of how many steps it takes to find a single item on a list.
Basically, it’s the mathematical formula that measures the efficiency of any given search method.
A binary search will have a low maximum number of comparisons based on how many items are on the list.
No matter which item you are trying to find, it can only take a handful of iterations to get to the desired result.
Since a binary search starts with the item in the exact middle of the list, it’s possible to find your search on the first try.
But, even if the item is in the last possible location, this method churns through possibilities very quickly.
That brings us to the idea of the variable “N.”
N is used to represent the exact number of steps it takes to complete a search.
That’s the literal number of comparisons, but there are a few other ways to think about this.
If your search function has a variable N value, then you might want to consider the best-case N value, worst-case N value, and average N value for your search function and list.
Let me highlight this idea by explaining a different way to search for items: linearly.
In a linear search, the algorithm simply goes through all of the items in the list, in order.
So, the computer knows what you’re trying to find, and it simply scrolls through the list until it hits the desired item.
As you might imagine, that’s not usually efficient, and the number of comparisons will depend on what you’re trying to find.
In the best-case scenario, the search finds your item on the first try.
In the worst-case scenario, it has to read every item on the list, so N is equal to the number of entries.
That’s as inefficient as it can get.
The average N value for a linear search is going to be half the total number of items on a list.
Compare that to the binary search where the N value can’t be nearly as diverse.
Since each comparison eliminates half of all remaining possibilities, you can go through thousands of items on a list in fewer than a dozen comparisons.
The N value range for a binary search is a lot more stable, and the average N value is always going to be relatively close to the best and worst-case N values.
Why Does the Number of Comparisons Matter?
We’re getting a little deep into the topic, so let me try to bring this back to a matter of practicality.
Why do we care so much about the number of comparisons?
Well, it’s a great way to figure out which search method is ideal for finding stuff.
If you have a very short list, a linear function might actually be better than a binary function.
As lists get longer, binary searches become more valuable.
There are also plenty of other approaches to searching, and when you have all of these options, then you’ll want to use the number of comparisons to weigh these options against each other.
In other words, programmers use the number of comparisons to decide how to program a search function in order to make sure it is both fast and accurate.
How Do You Calculate the Number of Comparisons?
Alright. We’ve covered a lot of ground.
Now, it’s time to explain exactly how this number is calculated for a binary search.
And as we go through it, you’ll see why this is such a popular search mechanism, as long as the list is ordered in a way that makes binary searches possible.
Let’s try an example.
Say you have 100 emails in your inbox, and you’re looking for a specific one.
You use the search feature, and it runs its binary algorithm to get the desired result.
Since it’s using a binary function, it will start by checking the email in the middle of the list, and if that’s not the one you want, half of the whole list will be eliminated right then and there.
Now, the search will go to the middle of the remaining list, and it will eliminate 25 possible emails.
In 2 searches, it has removed 75 possibilities.
The next search will remove 13 possibilities (the one that is checked plus the half of emails that are either above or below that email according to the logic).
The following searches eliminate 6 emails, then 3, and finally 2.
At that point, you have the email you were looking for, and it took 6 comparisons to get there, so your number of comparisons is 6.
Keep in mind that this is assuming a worst-case scenario.
If any of the comparisons had found the original request, then the N value would be even less.
So, another way to think about the maximum N value for a binary search is to first figure out how many items are on that list.
Divide that number by 2, and keep dividing by 2 again and again until you get down to 1.
The number of times you divide by 2 is the number of comparisons for that list.
Well, it’s the maximum number of comparisons for that list, and for a binary search, this is the most important value.
If you want a simple mathematical formula, the maximum N value for any binary search is the log(N)/log(2), where N is equal to the number of items in the list.
Ultimately, this whole thing is showing that a binary search can scale logarithmically.
You have to double the amount of data in the list to increase the maximum number of comparisons by one.
So, it takes a maximum of 10 searches to find an item in a set of 1024 entries.
It only takes 11 searches if you double that number to 2048.
If you double the number of searches to 20, then the binary search function is guaranteed to find your result on a list with up to 1,048,576 items.
Think about that scaling.
You went from 10 to 20 comparisons, and now you’re sorting through more than a million items.
Double your maximum N value again, and now your binary search function is guaranteed to find any item out of more than a trillion—all in 40 comparisons or less.
That’s a logarithmic scale, and it means that binary searches only get better as lists get longer.
This is why the number of comparisons is so powerful.