What's the difference between the programmer who can easily "nail" any coding interview, get a job at any company including Google or NASA, win any coding competition, and the others?
Is it how many programming languages he knows?
Is it how fast he types?
Is it how clean his code looks?
Is it being able to write a binary tree without surfing through Stack Overflow?
Well, maybe all of the above.
To me, however, one of the key traits of a good programmer is the ability to solve problems efficiently. The ability to analyse and understand the problem. The ability to design, implement and optimize the solution.
The truth is, not even decades of experience can guarantee that you will obtain these skills. What I want to achieve with this article is to introduce you to this topic and to motivate you to focus, instead of on learning new languages and technologies, on the efficiency of your solutions to the problems you are given.
Let's say we are given a simple task to write a function that takes two parameters, b and e, and returns b^e (the power function).
For example, power(2, 4) = 2^4 = 16.
Anyone with at least a little knowledge of programming should be able to come up with something like this:
Power(b, e):
Input: number b and number e
1. result := 1
2. while e > 0:
3. result := result * b
4. e := e - 1
Output: result
Equivalent C code would look like:
// a 64-bit exponent so that the huge exponents measured below fit;
// the result itself overflows int for most inputs, but here we only care about the running time
int power(int b, long long e) {
    int result = 1;
    while (e > 0) {
        result *= b;
        e--;
    }
    return result;
}
Now we can run some tests to make sure it works correctly (check it out for yourself) and we are done. Great job! We solved the task, the code is clean and we can call ourselves good programmers.
But is it enough to work at Google or NASA? (Both companies are used as inspirational examples, since most would agree that that's where "good programmers" work.)
Let's say our program is going to be used for calculations with a really large exponent e. How fast is it going to be?
We can run some tests and measure the time. The following table is the result of running the code on an "average" CPU (2 GHz, single core).
exponent size | running time |
---|---|
10 | 0.000 ms |
100 | 0.001 ms |
1,000 | 0.005 ms |
10,000 | 0.047 ms |
100,000 | 0.460 ms |
1,000,000 | 4.610 ms |
10,000,000 | 46.02 ms |
100,000,000 | 460.1 ms |
1,000,000,000 | 4.599 s |
10,000,000,000 | 46.03 s |
As you can see, for small exponents the calculation is relatively fast.
But for an exponent of just 10 billion, the calculation already takes 46 seconds!
You may ask why would anyone need to calculate with an exponent that large in the first place.
What kind of real-world problems would require such calculation?
Well, have you ever visited a website over the HTTPS protocol? (Literally any secured website like Google.com, Facebook.com, Youtube.com, ...)
This protocol encrypts the communication between your device and the server using an algorithm called RSA.
And this algorithm, in order to work and to make the encryption secure, uses exponents as large as 2^1024, which is:
179769313486231590772930519078902473361797697894230657273430081157732675805500963132708477322407536021120113879871393357658789768814416622492847430639474124377767893424865485276302219601246094119453082952085005768838150682342462881473913110540827237163350510684586298239947245938479716304835356329624224137216
(yep, seriously)
That's a lot more than 10 billion.
Now, how long would it take our program to calculate this?
Well, actually about
26220679155974177115376501096440077469105652437549804022029369712263132210735651293042265354882123385129230491497002808474057569426124536233017976657260471188471362567357065411705775780967419184256318934979598616351453396318816665993235782011512379843988420165065680556176536786006138473836151
years.
Which, by the way, is about
1748045277064945141025100073096005164607043495836653601468624647484208814049043419536151023658808225675282032766466853898270504628408302415534531777150698079231424171157137694113718385397827945617087928998639907756763559754587777732882385467434158656265894677671045370411769119067075
times the age of the universe.
And this calculation happens every time you visit or refresh a website. Can you imagine waiting that long for a website to load?
Fortunately, the people who designed RSA were good programmers and their algorithm can perform such calculation in a few milliseconds.
How do they do it? You will find out later in this article :-)
And hopefully by that time, you will be able to start coding efficiently like a good programmer yourself.
In The basic concept of Algorithms article, you were introduced to algorithmic complexity analysis and something called Big-O notation.
Simply put, O(f(n)) describes how (asymptotically) efficient our algorithm is in the worst case in relation to n, which stands for the size of the data our algorithm works with.
I will just put here a table that sums up the common asymptotic functions used in Big-O notation and their approximate running times (considering 1 operation takes 1 ns):
complexity \ n | 1 | 10 | 100 | 1,000 | 10,000 | 1,000,000 | 1,000,000,000 |
---|---|---|---|---|---|---|---|
O(1) | 1 ns | 1 ns | 1 ns | 1 ns | 1 ns | 1 ns | 1 ns |
O(log n) | 1 ns | 3 ns | 6 ns | 10 ns | 13 ns | 20 ns | 30 ns |
O(n) | 1 ns | 10 ns | 100 ns | 1 μs | 10 μs | 1 ms | 1 s |
O(n log n) | 1 ns | 30 ns | 600 ns | 10 μs | 130 μs | 20 ms | 30 s |
O(n²) | 1 ns | 100 ns | 10 μs | 1 ms | 100 ms | 17 minutes | 32 years |
O(n³) | 1 ns | 1 μs | 1 ms | 1 s | 17 minutes | 32 years | ~31 billion years |
O(2ⁿ) | 2 ns | 1 μs | ~4×10^13 years | ~3×10^284 years | ~10^2994 years | ¯\_(ツ)_/¯ | ¯\(°_o)/¯ |
From this table we should be able to determine which complexity we should aim for depending on the expected input size.
For example, if we are to build an algorithm on a database of hundreds of records, it should be fine if we come up with anything more efficient than exponential complexity.
Similarly, if we want to build an algorithm on a large data set of, let's say, billions of records, we should definitely aim for as efficient an algorithm as possible.
We also have some other notions:
- O (big O): worst-case complexity (the algorithm will never be slower than this)
- Θ (big theta): average-case, exact complexity (the algorithm will never be slower or faster than this)
- Ω (big omega): best-case complexity (the algorithm will never be faster than this)
There is more new data created every day now than there ever was a few years ago. In IT, we need to be able to sort and search through all this data. Searching is one of the most common tasks a program(mer) has to deal with all the time.
Especially in large systems, the more data we have, the more efficiently we need to be able to search for certain records.
Task: Given an array of numbers, write a function that will, for a given value, return whether this value is in our array or not.
Example: Given array [1, 3, 5, 8, 10, 38] and value 5, the function will return true because 5 is in the array.
Example: Given array [1, 3, 5, 8, 10, 38] and value 4, the function will return false because 4 isn't in the array.
Simple task, isn't it? Let's start with a naive solution first.
FindX(M, x):
Input: array M and value x
1. for each element i of array M:
2. if i == x:
3. return true
4. return false
Output: true or false
Equivalent C code:
// returns 1 on true, 0 on false
int findX(int arr[], int arrSize, int x) {
    for (int i = 0; i < arrSize; i++)
        if (arr[i] == x) return 1;
    return 0;
}
This solution basically iterates through the whole array one by one until it finds the value we are looking for. If it doesn't find it by the end of the array, it returns false.
This solution works correctly. But let's analyze its efficiency (or rather its complexity).
How many times does it iterate in the worst case? The worst case here is when the value we are searching for is not in the array at all. In that case, our algorithm iterates through the whole array. In other words, it iterates through n values, where n is the size of the given array. This leads to a complexity of O(n).
We already know that O(n) is relatively fast and can be performed within seconds even on an array consisting of billions of values.
But keep in mind that searching is something we typically need to perform very frequently.
Let's come up with something more efficient.
The idea of a faster algorithm for searching is simple.
"I am thinking of a number in a range between 0 and 100. Guess the number."
What's the most efficient way to guess the correct number?
We can keep asking "1? 2? 3? ... 99?". That might work but what if the number was 100? We would have to ask 100 times before we get it correctly.
Or we can just guess randomly. Well, in that case, the average number of guesses to get it right would be 50. And if we are not lucky, there is still a chance that we will have to guess 100 times.
Let's modify the task a bit:
"I am thinking of a number in a range between 0 and 100. Guess the number and I will tell you if the number is larger or smaller."
Now that we know whether our guess was too large or too small, our best bet is to always guess the middle of the remaining range:
Example for 23:
"Is it 50?" No, smaller (= it's between 0 and 49)
"Is it 25?" No, smaller (= it's between 0 and 24)
"Is it 13?" No, larger (= it's between 14 and 24)
"Is it 19?" No, larger (= it's between 20 and 24)
"Is it 22?" No, larger (= it's between 23 and 24)
"Is it 23?" Yes.
Example for 99:
"Is it 50?" No, it's larger (= it's between 51 and 100)
"Is it 75?" No, it's larger (= it's between 76 and 100)
"Is it 88?" No, it's larger (= it's between 89 and 100)
"Is it 94?" No, it's larger (= it's between 95 and 100)
"Is it 97?" No, it's larger (= it's between 98 and 100)
"Is it 99?" Yes.
This is the principle of binary search. Using this method, how many times do we have to guess at most? Since the range of numbers shrinks by half with each guess, the maximum number of guesses is log₂(n) (in a range between 0 and 1,000,000 we only need to guess up to 20 times). In other words, binary search has a complexity of O(log n).
Now is the time to apply this principle to our previous task.
BinaryFindX(M, x):
Input: sorted array in ascending order M and value x
1. left := 0
2. right := length of M - 1
3. while left <= right:
4. mid := (left + right) / 2
5. if M[mid] == x:
6. return true
7. else if M[mid] < x:
8. left := mid + 1
9. else if M[mid] > x:
10. right := mid - 1
11. return false
Output: true or false
Equivalent C code:
int binaryFindX(int arr[], int arrSize, int x) {
    int left = 0;
    int right = arrSize - 1;                  // index of the last element
    while (left <= right) {
        int mid = left + (right - left) / 2;  // integer division; written this way to avoid overflow
        if (arr[mid] == x) return 1;
        else if (arr[mid] > x) right = mid - 1;
        else left = mid + 1;
    }
    return 0;
}
How does it work? Take the sorted array [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] (indices 0 to 9):
Searching for value 5:
left = 0, right = 9, mid = 4 → M[mid] = 5 → TRUE
Searching for value 11:
left = 0, right = 9, mid = 4 → M[mid] = 5 < 11 → left = 5
left = 5, right = 9, mid = 7 → M[mid] = 8 < 11 → left = 8
left = 8, right = 9, mid = 8 → M[mid] = 9 < 11 → left = 9
left = 9, right = 9, mid = 9 → M[mid] = 10 < 11 → left = 10
left > right → the loop ends → FALSE
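You can also check it out for yourself with a small test program (compiled together with the binaryFindX() function above; the array and values below are just an example):

#include <stdio.h>

int binaryFindX(int arr[], int arrSize, int x);   // defined above

int main(void) {
    int arr[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};  // must already be sorted
    int arrSize = sizeof(arr) / sizeof(arr[0]);

    printf("%d\n", binaryFindX(arr, arrSize, 5));   // prints 1 (found)
    printf("%d\n", binaryFindX(arr, arrSize, 11));  // prints 0 (not found)
    return 0;
}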
Iterative searching has a time complexity of O(n).
Binary searching has a time complexity of O(log n).
In the previous example, we learned how to use binary search for efficient searching in a sorted array. The problem comes when the array is not sorted.
In that case we have no other choice but to sort the array before performing the binary search.
Let's start with the naive solution - BubbleSort:
BubbleSort(M):
Input: array M of n values
1. while M is not sorted:
2. for each element i in M:
3. if M[i] > M[i+1]:
4. swap M[i] and M[i+1]
5. return M
Output: sorted array in ascending order
Equivalent C code:
void bubbleSort(int arr[], int arrSize) {
    int sorted = 0;
    while (sorted == 0) {
        sorted = 1;                        // assume sorted until we have to swap something
        for (int i = 0; i < arrSize-1; i++) {
            if (arr[i] > arr[i+1]) {
                sorted = 0;
                int tmp = arr[i];          // swap the neighbouring pair
                arr[i] = arr[i+1];
                arr[i+1] = tmp;
            }
        }
    }
}
BubbleSort needs to swap up to n elements each time it iterates through the array, and it needs to iterate through the array up to n times before the array is sorted. In total, the complexity of this sorting algorithm is O(n²), and it is actually one of the slowest sorting algorithms out there.
This algorithm is data sensitive, meaning it can run either fast or slowly depending on the format (order) of the given data. In fact, this BubbleSort is one of the fastest algorithms for an array that is already almost sorted.
We can rewrite this algorithm into a data insensitive one that always performs in Θ(n²), which can be, on average, a bit faster than the previous BubbleSort:
void bubbleSort(int arr[], int arrSize) {
    for (int i = 0; i < arrSize-1; i++) {
        for (int j = i+1; j < arrSize; j++) {
            if (arr[i] > arr[j]) {
                int tmp = arr[i];
                arr[i] = arr[j];
                arr[j] = tmp;
            }
        }
    }
}
There are more sorting algorithms that are faster than BubbleSort while having the same complexity of O(n²),
such as SelectionSort, where we repeatedly find the smallest remaining value and move it to the beginning of the array until we end up with a sorted array:
SelectionSort(M):
Input: array M of n values
1. for i := 0 to n:
2. min := i
3. for j := i+1 to n:
4. if M[j] < M[min]:
5. min := j
6. swap M[i] and M[min]
7. return M
Output: sorted array in ascending order
or InsertionSort, where we take each element and move it to its place in the already sorted part of the array (C sketches of both follow below):
InsertionSort(M):
Input: array M of n values
1. for i := 1 to n:
2. tmp := M[i]
3. j := i - 1
4. while j >= 0 and M[j] > tmp:
5. M[j + 1] := M[j]
6. j := j - 1
7. M[j + 1] := tmp
8. return M
Output: sorted array in ascending order
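For completeness, here is what both of these could look like in C, following the same conventions as the earlier examples (plain int arrays with an explicit size); a minimal sketch:

// SelectionSort: repeatedly move the smallest remaining value to the front
void selectionSort(int arr[], int arrSize) {
    for (int i = 0; i < arrSize - 1; i++) {
        int min = i;
        for (int j = i + 1; j < arrSize; j++)
            if (arr[j] < arr[min]) min = j;
        int tmp = arr[i];          // swap the smallest remaining value to position i
        arr[i] = arr[min];
        arr[min] = tmp;
    }
}

// InsertionSort: insert each element into its place in the already sorted prefix
void insertionSort(int arr[], int arrSize) {
    for (int i = 1; i < arrSize; i++) {
        int tmp = arr[i];
        int j = i - 1;
        while (j >= 0 && arr[j] > tmp) {
            arr[j + 1] = arr[j];   // shift larger elements one position to the right
            j--;
        }
        arr[j + 1] = tmp;
    }
}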
The application of these two algorithms can be, despite their quadratic complexity, very efficient for certain tasks. For example, if we have an almost already sorted array, the best choice to sort it is using InsertionSort (or the data sensitive BubbleSort).
MergeSort is one of the fastest sorting algorithms for large amounts of data. Its time complexity is O(n log n). Apart from that, MergeSort is based on an elegant recursive principle called Divide and conquer. It does, however, need O(n) extra memory (space complexity).
The idea is that if we have 2 sorted arrays, we can easily merge them into 1 sorted array.
So what we do is we split the array into 2 smaller arrays, recursively apply MergeSort to each of these arrays to sort them and then we merge them.
MergeSort(M):
Input: array M of n values
1. if n == 1:
2. return M (a single element is already sorted)
3. X := MergeSort(first half of M: elements 1 to n/2)
4. Y := MergeSort(second half of M: elements n/2 + 1 to n)
5. return Merge(X, Y)
Output: sorted array in ascending order
Merge(X, Y):
Input: array X of m values and array Y of n values
1. i := 1, j := 1, k := 1
2. Z[] := new array
3. while i <= m and j <= n:
4. if X[i] <= Y[j]:
5. Z[k] := X[i]
6. i = i + 1
7. else:
8. Z[k] := Y[j]
9. j = j + 1
10. k = k + 1
11. if i <= m:
12. Z[k ... m+n] := X[i ... m]
13. if j <= n:
14. Z[k ... m+n] := Y[j ... n]
15. return Z
Output: sorted array in ascending order
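Here is a minimal C sketch of the same idea; the temporary buffer allocated in the merge step is exactly where the O(n) extra memory comes from. Call it as mergeSort(arr, 0, arrSize - 1):

#include <stdlib.h>

// merge the two sorted halves arr[left..mid] and arr[mid+1..right]
static void merge(int arr[], int left, int mid, int right) {
    int size = right - left + 1;
    int *tmp = malloc(size * sizeof(int));    // the O(n) extra memory
    int i = left, j = mid + 1, k = 0;

    while (i <= mid && j <= right)
        tmp[k++] = (arr[i] <= arr[j]) ? arr[i++] : arr[j++];
    while (i <= mid) tmp[k++] = arr[i++];
    while (j <= right) tmp[k++] = arr[j++];

    for (k = 0; k < size; k++) arr[left + k] = tmp[k];
    free(tmp);
}

// sort arr[left..right]: split in half, sort each half, then merge them
void mergeSort(int arr[], int left, int right) {
    if (left >= right) return;                // 0 or 1 elements are already sorted
    int mid = left + (right - left) / 2;
    mergeSort(arr, left, mid);
    mergeSort(arr, mid + 1, right);
    merge(arr, left, mid, right);
}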
QuickSort is probably the most popular and most widely used algorithm for sorting large amounts of data.
Even though its absolute worst-case time complexity is O(n²), it performs in O(n log n) on average and is also, on average, faster than MergeSort.
The idea behind this algorithm is to choose a random pivot in the array and then split the array into 3 parts: elements smaller than the pivot, elements equal to the pivot, and elements larger than the pivot. Then we apply this algorithm recursively to the first and third part. By the end, we get a sorted array.
QuickSort(M):
Input: array M of n values
1. if n <= 1:
2. return M (the array is already sorted)
3. pivot := random element from M
4. L := all elements in M smaller than pivot
5. P := all elements in M equal to pivot
6. R := all elements in M larger than pivot
7. L := QuickSort(L)
8. R := QuickSort(R)
9. Y := concatenate L, P and R
10. return Y
Output: Sorted array in ascending order
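Here is one possible in-place C sketch of the same idea. For simplicity it splits the array into just two parts (smaller than the pivot and the rest), so elements equal to the pivot are not grouped separately as in the pseudocode above; call it as quickSort(arr, 0, arrSize - 1):

#include <stdlib.h>

static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

// sort arr[left..right] in place; picks a random pivot as in the pseudocode above
void quickSort(int arr[], int left, int right) {
    if (left >= right) return;                     // 0 or 1 elements are already sorted

    int pivotIndex = left + rand() % (right - left + 1);
    swap(&arr[pivotIndex], &arr[right]);           // move the pivot out of the way
    int pivot = arr[right];

    // partition: everything smaller than the pivot goes to the left part
    int store = left;
    for (int i = left; i < right; i++)
        if (arr[i] < pivot) swap(&arr[i], &arr[store++]);
    swap(&arr[store], &arr[right]);                // put the pivot between the two parts

    quickSort(arr, left, store - 1);
    quickSort(arr, store + 1, right);
}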
As you can guess, the time complexity of QuickSort strongly depends on the pivot choice.
We usually choose a random element from the array, or simply the first one.
The best possible choice for the pivot is the median of the array values.
This comes with its own problems though; if you are interested, look up median searching and approximate ("almost") median searching.
CountingSort is the fastest sorting algorithm for sorting numbers from a small range: it performs in linear time O(n + r), where r is the range between the smallest and the largest number.
The algorithm is divided into 3 steps:
First, we calculate the number of occurrences of each number in the array.
array = [3, 2, 5, 4, 2, 1, 3, 2, 2, 1]
occurrences = [
0 => 0,
1 => 2,
2 => 4,
3 => 2,
4 => 1,
5 => 1
]
Then, we calculate their output beginning positions based on the number of their occurrences.
start_position = [
0 => 0,
1 => 0,
2 => 2,
3 => 6,
4 => 8,
5 => 9
]
At last, we calculate the output position for each number.
output = [
0 => 1
1 => 1
2 => 2
3 => 2
4 => 2
5 => 2
6 => 3
7 => 3
8 => 4
9 => 5
]
The full algorithm:
CountingSort(M):
Input: Array M of n numbers in a range from 0 to r
1. for i = 0 to r:
2. occurrences[i] := 0
3. for i = 0 to n-1:
4. occurrences[M[i]] := occurrences[M[i]] + 1
5. start_position[0] := 0
6. for i = 1 to r:
7. start_position[i] := start_position[i-1] + occurrences[i-1]
8. for i = 0 to n-1:
9. output[start_position[M[i]]] := M[i]
10. start_position[M[i]] := start_position[M[i]] + 1
11. return output
Output: Sorted array in ascending order
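In C, assuming all values lie in the range 0 to r, the three steps could look like this (a sketch; the caller provides the out array of the same length):

#include <stdlib.h>

// sort arr (values in the range 0..r) into out; both have length arrSize
void countingSort(const int arr[], int out[], int arrSize, int r) {
    int *occurrences   = calloc(r + 1, sizeof(int));
    int *startPosition = calloc(r + 1, sizeof(int));

    // step 1: count how many times each value occurs
    for (int i = 0; i < arrSize; i++)
        occurrences[arr[i]]++;

    // step 2: the starting output position of each value
    startPosition[0] = 0;
    for (int v = 1; v <= r; v++)
        startPosition[v] = startPosition[v - 1] + occurrences[v - 1];

    // step 3: place every element at its output position
    for (int i = 0; i < arrSize; i++)
        out[startPosition[arr[i]]++] = arr[i];

    free(occurrences);
    free(startPosition);
}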
Although this algorithm has linear time complexity, the real running time strictly depends on the range of the values, not just on how many of them there are.
For example, when sorting an array of 10 numbers where the smallest number is 1 and the largest is 10 billion, the performance will not be efficient at all.
Also keep in mind that the space complexity of this algorithm is O(r) as well, which makes it not very memory-efficient.
Sorting algorithms like BubbleSort, SelectionSort and InsertionSort have a time complexity of O(n²) and are suited for small data sets.
InsertionSort is one of the best algorithms for sorting an already almost sorted array.
MergeSort and QuickSort have a time complexity of O(n log n) and are suitable for sorting large data sets.
MergeSort has O(n) space complexity (don't use MergeSort on a device with very limited memory).
QuickSort has the best results on average, and its performance strictly depends on the pivot choice.
CountingSort can theoretically sort in linear time; however, the real complexity depends on the value range of the numbers being sorted.
Dynamic programming is a technique we use for designing and optimizing recursive algorithms that tend to lead to exponential time complexity.
We typically optimize these algorithms by using the fact that many recursive calculations are repeated over and over. So instead of repeating these calculations, we store their results (this process is called memoization) and then just read them from memory instead of performing the recursive calculation again.
Task: Write a program that will return the n-th Fibonacci number.
Normal recursive solution:
Fib(n):
Input: Number n
1. if n <= 2:
2. return 1
3. else:
4. return Fib(n-1) + Fib(n-2)
Output: Fibonacci's n-th number
Simple, short, correct and elegant. The time complexity of this solution, however, is O(2ⁿ), that's exponential!
Really. Calculating the 100th Fibonacci number using this algorithm would take tens of thousands of years to finish.
This happens because for every Fib(n) we have to calculate Fib(n-1) and Fib(n-2), then for every Fib(n-1) we have to calculate Fib(n-2) and Fib(n-3), and so on, over and over again.
Let's optimize this algorithm using the technique of dynamic programming by storing the results of all these calculations:
Fib(n):
Input: Number n
1. if T[n] exists:
2. return T[n]
3. if n <= 2:
4. T[n] := 1
5. else:
6. T[n] := Fib(n-1) + Fib(n-2)
7. return T[n]
Output: Fibonacci's n-th number
With this, once we successfully calculate a certain Fib(n), it will be stored in T[n] so we won't need to calculate it anymore.
The time complexity of this solution is O(n), which is much faster and much more efficient than the previous solution.
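In C, the memoized version could look something like this (the fixed table size and the use of long long are just assumptions of this sketch; fib(92) is the largest value that still fits into a signed 64-bit integer):

#define MAX_N 92

long long T[MAX_N + 1];            // T[k] == 0 means "not calculated yet"

// works for 1 <= n <= 92
long long fib(int n) {
    if (T[n] != 0) return T[n];    // already calculated: just read it from memory
    if (n <= 2) return T[n] = 1;
    return T[n] = fib(n - 1) + fib(n - 2);
}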
This solution can also be rewritten into an iterative one, which is even more memory-efficient:
Fib(n):
Input: Number n
1. T[1] := T[2] := 1
2. for k = 3 to n:
3. T[k] := T[k-1] + T[k-2]
4. return T[n]
Output: Fibonacci's n-th number
Now the final solution is simple, short, elegant, iterative and efficient both time and memory wise.
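For illustration, here is the same iterative idea in C; since we only ever need the previous two values, we don't even have to keep the whole table:

// works for 1 <= n <= 92 (larger values overflow 64 bits)
long long fib(int n) {
    if (n <= 2) return 1;
    long long prev = 1, curr = 1;        // Fib(1), Fib(2)
    for (int k = 3; k <= n; k++) {
        long long next = prev + curr;
        prev = curr;
        curr = next;
    }
    return curr;
}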
Task: Given an array of numbers, write a program that will return the length of the longest increasing subsequence.
Example: Given [6, 5, 2, 4, 13, 8, 11, 3], return 4
Example: Given [1, 2, 8, 7, 6, 5, 4, 3, 4], return 4
Example: Given [9, 8, 7, 6, 5, 4, 3, 2, 1], return 1
The recursive solution is simple: for each position i we calculate the length of the longest increasing subsequence that starts at that position (the answer for the whole array is then the maximum over all starting positions):
LIS(i):
Input: Array M of n numbers and number i
1. res := 1
2. for j = i+1 to n:
3. if M[i] < M[j]:
4. res := max(res, 1 + LIS(j))
5. return res
Output: The length of the longest increasing subsequence starting at position i
Again, short, simple and elegant. But the time complexity of this solution is O(2ⁿ).
In other words, if we wanted to find the length of the longest increasing subsequence in an array of 100 elements, it might take longer than the age of the universe.
Let's optimize this solution using memoization:
LIS(i):
Input: Array M of n numbers and number i
1. if T[i] exists:
2. return T[i]
3. T[i] := 1
4. for j = i+1 to n:
5. if M[i] < M[j]:
6. T[i] := max(T[i], 1 + LIS(j))
7. return T[i]
Output: The length of the longest increasing subsequence starting at position i
A simple modification, and our solution turns into O(n²) time complexity, which is way better than exponential time.
This time, finding a solution in an array of 100 elements will take less than 1 ms.
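A possible memoized sketch in C; note that LIS(i) gives the length of the longest increasing subsequence starting at position i, so the final answer is the maximum over all starting positions (the global variables and the MAX_N bound are just assumptions of this sketch):

#define MAX_N 1000               // assumed upper bound on the array size

int M[MAX_N];                    // the input array
int n;                           // the number of elements actually used
int T[MAX_N];                    // T[i] = length of the longest increasing subsequence
                                 //        starting at position i (0 = not calculated yet)

// longest increasing subsequence starting at index i
int lis(int i) {
    if (T[i] != 0) return T[i];                  // already calculated
    T[i] = 1;                                    // the element M[i] by itself
    for (int j = i + 1; j < n; j++)
        if (M[i] < M[j] && 1 + lis(j) > T[i])
            T[i] = 1 + lis(j);
    return T[i];
}

// the answer for the whole array is the best starting position
int longestIncreasingSubsequence(void) {
    int best = 0;
    for (int i = 0; i < n; i++)
        if (lis(i) > best) best = lis(i);
    return best;
}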
Most tasks require us to store and work with various kinds of data. Depending on the task that we are to perform on them, it's a good idea to store them in a data structure for efficient usage.
Up until now, we have been storing our data in a simple structure called Array.
An array is a simple data structure that has its advantages but also its disadvantages. For example, if we wanted to efficiently search for a particular element in this array, we would need to keep the array sorted. And while keeping it sorted, inserting new data into it becomes more expensive.
Let's have a look at other alternatives that we commonly use for storing our data with the complexity of the most common operations:
data structure | Insert | Delete | Find | Minimum |
---|---|---|---|---|
Array | O(1) | O(n) | O(n) | O(n) |
Sorted Array | O(n) | O(n) | O(log n) | O(1) |
Linked List | O(1) | O(n) | O(n) | O(n) |
Sorted Linked List | O(n) | O(n) | O(n) | O(1) |
Binary Search Tree | O(log n)** | O(log n)** | O(log n)** | O(log n)** |
Hash Table | O(1)* | O(1)* | O(1)* | O(n) |
(* expected time, assuming a good hash function with few collisions (see below); ** assuming the tree is kept balanced)
Task: Write a user database that will be able to efficiently store and find usernames.
Naive solution using array:
FindUsername(username, M):
Input: string username to find and array M of stored usernames
1. for i = 0 to M.length:
2. if M[i] == username:
3. return true
4. return false
Output: true if username is found in our database, false otherwise
AddUsername(username, M):
Input: string username to add and array M of stored usernames
1. if FindUsername(username, M) == true:
2. throw UsernameAlreadyExists exception
3. M[M.length] := username
4. M.length := M.length + 1
5. return M
Output: array M with the new username appended
Throws: UsernameAlreadyExists exception if username already exists
FindUsername() - iterates through the array until it finds our desired username. This function has linear time complexity O(n).
AddUsername() - first, we search through the array to see if this username is already stored in our database, which is O(n). If it's not, we add it at the end of the array in O(1) time. That's O(n) time for each call of this function.
If we called AddUsername() n times, to store n usernames, that would lead to O(n²) complexity.
Let's optimize it. The most problematic part in our program is FindUsername() function. We already know how to optimize it using binary searching.
However, in order to use binary search, we need to keep our array sorted, which causes problems for insertions:
FindUsername(username, M):
Input: string username to find and array M of stored usernames
1. BinarySearch() from the previous chapter
Output: true if username is found in our database, false otherwise
AddUsername(username, M):
Input: string username to add and array M of stored usernames
1. if FindUsername(username, M) == true:
2. throw UsernameAlreadyExists exception
3. i := M.length - 1
4. while i >= 0 and M[i] > username:
5. M[i+1] := M[i]
6. i := i - 1
7. M[i+1] := username
8. return M
Output: array M with the new username inserted (keeping M sorted)
Throws: UsernameAlreadyExists exception if username already exists
This time FindUsername() will perform in O(log n) time and AddUsername() in O(n). Although FindUsername() performs much faster than in our previous approach, AddUsername() is not any faster. Quite the opposite: besides searching, it now also has to shift elements around to keep the array sorted.
The reason is that we are using an array, which can be efficient either for inserting or for finding, but not both at the same time.
Looking at the table above, the best choice is to use either a Binary Search Tree or a Hash Table instead of an array. Since BSTs are a little more complicated (balancing etc.) and deserve a standalone article of their own, I will show you an example using a Hash Table:
HashTable structure:
HT[] := new array of size 1000
storeKey(username):
hash := hashOf(username) % 1000 # our (very simplified) hash function turning the string into a number
HT[hash] := username # let's ignore collisions for this case
getKey(username):
hash := hashOf(username) % 1000
return HT[hash]
AddUsername(username, HT):
Input: string username to add and HashTable HT of stored usernames
1. if FindUsername(username, HT) == true:
2. throw UsernameAlreadyExists exception
3. HT.storeKey(username)
4. return HT
Output: HashTable HT with the new username added
Throws: UsernameAlreadyExists exception if the username already exists
FindUsername(username, HT):
Input: string username to find and HashTable HT of stored usernames
1. if (HT.getKey(username) == username):
2. return true
3. return false
Output: true if username is found in our database, false otherwise
This time, both AddUsername() and FindUsername() perform in O(1) (expected) time.
It's so fast that you might wonder why use any other data structure for similar tasks.
Well, hash tables come with their disadvantages too. I am not going to explain in detail how hash tables work here, but keep in mind that their time complexity varies depending on the frequency of collisions, which in turn depends on how well we designed our hashing function.
The code above works well only if there are no collisions at all; in order to deal with possible collisions, the code has to be a bit more complicated.
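One common way to deal with collisions is chaining: every slot of the table holds a linked list of all usernames that hash to it. A rough C sketch (the table size and the hash function are just illustrative choices):

#include <stdlib.h>
#include <string.h>

#define TABLE_SIZE 1000

typedef struct Node {
    char *username;
    struct Node *next;
} Node;

Node *HT[TABLE_SIZE];                      // each slot is a linked list (a "chain")

// a simple illustrative string hash
unsigned hash(const char *s) {
    unsigned h = 5381;
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h % TABLE_SIZE;
}

// returns 1 if the username is stored, 0 otherwise
int findUsername(const char *username) {
    for (Node *node = HT[hash(username)]; node != NULL; node = node->next)
        if (strcmp(node->username, username) == 0) return 1;
    return 0;
}

void addUsername(const char *username) {
    if (findUsername(username)) return;    // username already exists
    Node *node = malloc(sizeof(Node));
    node->username = malloc(strlen(username) + 1);
    strcpy(node->username, username);
    node->next = HT[hash(username)];       // prepend to the chain for this slot
    HT[hash(username)] = node;
}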
This chapter was rather rushed, but the main idea, that data structures are necessary for efficient data storage and manipulation, should have come across.
Of course, there are many, many more kinds of data structures that weren't even mentioned here. Notable ones are AVL trees, heaps, sets, etc.
The study of data structures is so broad and important that they deserve an article of their own. For now, just keep in mind that this topic is one of the cornerstones of efficient programming.
In order to use bitwise operations, one needs to understand how binary works (which is something any student of IT or computer science should already be familiar with).
Bitwise operations are generally faster than normal arithmetic operations (a bit shift takes far fewer CPU operations than the corresponding arithmetic calculation):
x = x * 64
// is equal to
x = x << 6 // 6 bit shift to the left
Both lines do exactly the same thing, except the second one, using a bitwise operation, performs about 3 times faster.
Some more tricks that perform faster using bitwise operations:
// negate value
x = -x;
x = ~x + 1; // about 3 times faster
// modulo (this trick only works when the modulus is a power of two)
x = y % 8;
x = y & (8 - 1); // about 6 times faster
// check if x is even
(x % 2) == 0;
(x & 1) == 0; // about 6 times faster
// absolute value
y = abs(x); // using Math.abs()
y = (x ^ (x >> 31)) - (x >> 31); // about 25 times faster
These little tricks may come in handy in some implementations, but they are not always practical, since they compromise code readability (and modern compilers often apply such optimizations automatically anyway).
And furthermore, 6 times faster or not, most of these operations execute practically instantly; in practice we don't usually care about the difference between 1 ns and 6 ns.
The best case to use these operations is when we need to deal with rather large numbers.
Like the ones from the motivational example from the beginning of this article.
That's right. Now is the time to optimize our first problem, using bitwise operations.
Task: Write a function that will calculate b^e for a given b and e.
Our original solution works fine, but for an exponent as large as 2^1024, the calculation would take an enormous number of ages of our universe.
The trick to this problem is simple. Let's write the exponent e as a sum of powers of two (its binary representation); for example 138 = 128 + 8 + 2 = 2^7 + 2^3 + 2^1. Aren't these numbers familiar?
That's right, they are all powers of the number 2. Since b^(x+y) = b^x · b^y, we get b^138 = b^128 · b^8 · b^2, so it's enough to compute b^2, b^4, b^8, ... by repeatedly squaring b and to multiply together the powers whose bits are set in e. (And as we already know, multiplying a number by a power of 2 is just a bit shift to the left: b · 128 can be easily calculated as b << 7.)
So this time, instead of iterating e times (which led to linear time O(e)), we only iterate once per bit of e: we keep squaring b and, whenever the corresponding bit of e is set to 1, multiply the current power into our result. The final number of iterations will be log₂(e).
Power(b, e):
Input: number b and number e
1. result := 1
2. i := 1
3. while i <= e:
4. if (e & i) != 0: # is this bit of e set? (bitwise AND)
5. result := result * b
6. b := b * b # b, b^2, b^4, b^8, ...
7. i := i << 1 # move to the next bit (1 bit shift to the left)
8. return result
Output: result
Equivalent C code:
int power(int b, long long e) {
    int result = 1;
    // i runs through the bits of e: 1, 2, 4, 8, ...
    for (long long i = 1; i <= e; i <<= 1) {
        if (e & i) result *= b;   // this bit of e is set: multiply the current power into the result
        b *= b;                   // b, b^2, b^4, b^8, ...
    }
    return result;
}
Let's test the speed of this approach using the same table as in our original approach:
exponent size | running time before | running time now |
---|---|---|
10 | 0.000 ms | 0.000 ms |
100 | 0.001 ms | 0.000 ms |
1,000 | 0.005 ms | 0.000 ms |
10,000 | 0.047 ms | 0.000 ms |
100,000 | 0.460 ms | 0.000 ms |
1,000,000 | 4.610 ms | 0.000 ms |
10,000,000 | 46.02 ms | 0.000 ms |
100,000,000 | 460.1 ms | 0.000 ms |
1,000,000,000 | 4.599 s | 0.000 ms |
10,000,000,000 | 46.03 s | 0.000 ms |
100,000,000,000,000,000 | ¯\_(ツ)_/¯ | 0.000 ms |
It's so fast that we don't even know how fast. We can't even measure it.
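One last note to close the loop with the RSA example from the beginning: the integer result of b^e of course overflows for anything but tiny inputs. In practice, RSA keeps the result reduced modulo some number m after every multiplication, which keeps the intermediate values small while preserving the same log₂(e) number of iterations. A sketch of the idea (assuming m fits into 32 bits so the products below don't overflow; real RSA works with much larger numbers and a big-integer library):

#include <stdint.h>

// computes (b^e) mod m using O(log e) multiplications
uint64_t powerMod(uint64_t b, uint64_t e, uint64_t m) {
    uint64_t result = 1;
    b %= m;
    while (e > 0) {
        if (e & 1)                       // is the lowest bit of e set?
            result = (result * b) % m;
        b = (b * b) % m;                 // square the base
        e >>= 1;                         // move to the next bit
    }
    return result;
}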
Bitwise operations are faster than normal arithmetic operations. They do, however, compromise code readability, and their speed improvement usually isn't that significant.
We should carefully consider whether, where and how it's worth using them. They are mostly used in places where every nanosecond counts, including high-performance computing, rendering, low-level programming, decryption/cracking systems...
There are many other various techniques to design and optimize efficient algorithms. In this article, I tried to point out the most common and basic ones.
Through this article, I wanted to spread awareness of the existence of complexity, why it is so important and why it is worth studying and analyzing, in order to let you, fellow readers, know that programming is not just about learning programming languages or writing code. It's mostly about thinking.
If you have finished reading this article, you should, at least to some degree, be able to come up with more efficient solutions to (not just) algorithmic problems. You should be able to see the importance of studying algorithms and data structures. And you should have no problem improving yourself in this field to eventually become a great programmer.
Thank you for reading. If you have any questions, message us on our Facebook.