Bitmask and Dynamic Programming
Bit manipulation is a crucial aspect of computer programming and one of the most powerful tools for bit manipulation is bitmasks.
Let's first understand what a bit is. A bit is a binary digit. It's the smallest piece of data in a computer and can be only either 0 or 1.
Binary Numbers
A binary number is a number expressed with a bunch of bits, where each bit is either 0 or 1 like described earlier. The integer types we use in programming languages are actually stored as binary numbers. Here is how binary numbers convert to decimal numbers:
You can check that in python:
>>> bin(21) '0b10101'
Note that 0b
means it's a binary number. The leading 0s are omitted and that's why we have 10101
instead of 010101
.
Bitwise Operations
Bitwise operations are operations that work on individual bits of a number. Here are the primary bitwise operations that you'll use and encounter a lot in algorithmic problem solving.
Bitwise AND (&
)
Given two integers x
and y
, the expression x & y
is evaluated by doing the following:
First write the two integers in binary, in each column look at the bit in x
and y
, if both bits are 1
, then the
respective resulting bit will also be 1
. Otherwise, if any bit in that column is 0
, then the respective resulting
bit will be 0
.
This intermediate operation can be thought of as the boolean operator &&
and functions the same way.
Example
21 & 37 = 5
0 1 0 1 0 1 & 1 0 0 1 0 1 ============= 0 0 0 1 0 1
Bitwise OR (|
)
Given two integers x
and y
, the expression x | y
is evaluated by doing the following:
First write the two integers in binary, in each column look at the bit in x
and y
, if any bit is 1
, then the
respective resulting bit will also be 1
. Otherwise, if both bits in that column are 0
, then the respective resulting
bit will be 0
.
This intermediate operation can be thought of as the boolean operator ||
and functions the same way.
Example
21 | 37 = 53
0 1 0 1 0 1 | 1 0 0 1 0 1 ============= 1 1 0 1 0 1
Bitwise XOR (^
)
Given two integers x
and y
, the expression x ^ y
(not to be confused with exponents) is evaluated by doing the following:
First write the two integers in binary, in each column look at the bit in x
and y
, if the two bits are the same, then the
respective resulting bit will be 0
. Otherwise, if the two bits in that column are different, then the respective resulting
bit will be 1
.
This operation is also called the "exclusive or" operation.
Example
21 ^ 37 = 24
0 1 0 1 0 1 ^ 1 0 0 1 0 1 ============= 1 1 0 0 0 0
Notice that the 3 operations AND(&
), OR(|
), and XOR(^
) are commutative. This means that
the order of elements doesn't matter (i.e. x & y == y & x
)
Bitwise Shifts (<<
and >>
)
Given two integers x
and y
, the expression x << y
(bitwise left shift) is evaluated by doing the following:
Take the integer x
and write it in binary and move all the 1
bits to the left y
spaces.
This operation essentially takes the integer x
and multiplies it by 2
exactly y
times.
Example
21 << 2 = 84
0 0 1 0 1 0 1
<< 2
================
1 0 1 0 1 0 0
Similarly, the expression x >> y
(bitwise right shift) is evaluated by doing the following:
Take the integer x
, write it in binary, and move all the 1
bits to the right y
spaces.
Notice that it might be possible for some bits to "disappear" if you move it past the "ones" bit.
In this case, we'll let them disappear.
It may be helpful to think of this operation as taking the integer x
, and dividing it by 2
exactly
y
times. Here, the division is floor division, meaning we discard any remainders.
Example
21 >> 3 = 2
0 0 1 0 1 0 1 >> 3 ================ 0 0 0 0 0 1 0
Bitmask
Now, finally, what's a mask? We can construct a binary number such that a certain digit is set to 1 and other digits are all set to 0. This creates a "mask" that when we AND
it to another binary it "turns off" (set to 0) all digits except the 1 digit in the mask.
Encoding the state
How is this all useful to algorithms and dynamic programming? The answer is to encode states. Remember in backtracking problems like permutation, we used a used
boolean array to keep track of state, i.e. which character has been used. If we were to memoize the state, we need to somehow serialize the array, e.g. turning it into a string which has extra computational cost. With a bitmask, the serialization from boolean array to integer is already done for you - it's just an integer. This reduces memory consumption and makes the algorithm faster and simpler.
For instance, let us say that we are on a graph, and the problem requires us to remember all the nodes that we visited. We can encode each node as a boolean bit value as part of our bitmask and use that value as our dp state.
In the example, we have visited node 1
and node 4
. We encode that as 1001
which is equal to 2^0 + 2^3 = 9
.
Bitmask questions can vary and additional dp states may be required so the code provided will loosely follow what a bitmask type question will look like.
Common bit manipulation operations
Typically, these problems use a recursive function as well in order to do dp. Below is some pseudocode to outline this idea. Here are some useful operations to keep in mind:
-
(1 << i)
-> simply2
to the power ofi
, or an integer where the only bit that is1
is thei
th bit. -
(bitmask & (1 << i))
-> this operation checks if thei
th bit in the bitmask is set to1
or not -
(bitmask ^ (1 << i))
-> this operation toggles thei
th bit in the bitmask to1
if it was0
or to0
if it was1
.
Here's another example.
Use bitmask to generate all subsets:
nums = [1, 2, 3]
n = len(nums)
subsets = []
for mask in range(2 ** n):
subset = []
for i in range(n):
if mask & (1 << i):
subset.append(nums[i])
subsets.append(subset[:])
# subsets = [[], [1], [2], [1, 2], [3], [1, 3], [2, 3], [1, 2, 3]]
We didn't need to use backtracking for this. Pretty cool, eh!
Bitmask and Dynamic Programming
Now here is pseudocode to show how to use bitmask for dynamic programming.
function f(int bitmask, int [] dp) { if calculated bitmask { return dp[bitmask]; } for each state you want to keep track of { if current state not in mask { temp = new bitmask; dp[bitmask] = max(dp[bitmask], f(temp,dp) + transition_cost); } } return dp[bitmask]; }
Bitmask DP is often used to solve problems that involve finding the shortest path to visit all nodes in a graph. The bitmask is used to keep track of which nodes have been visited. The state transition is to visit a node that has not been visited yet. The transition cost is the cost to go from the current node to the next node. Let's look at an example in the next article.