Bitmask and Dynamic Programming

Bit manipulation is a crucial aspect of computer programming and one of the most powerful tools for bit manipulation is bitmasks.

Let's first understand what a bit is. A bit is a binary digit. It's the smallest piece of data in a computer and can be only either 0 or 1.

Binary Numbers

A binary number is a number expressed with a bunch of bits, where each bit is either 0 or 1 like described earlier. The integer types we use in programming languages are actually stored as binary numbers. Here is how binary numbers convert to decimal numbers:

You can check that in python:

1>>> bin(21)
2'0b10101'

Note that 0b means it's a binary number. The leading 0s are omitted and that's why we have 10101 instead of 010101.

Bitwise Operations

Bitwise operations are operations that work on individual bits of a number. Here are the primary bitwise operations that you'll use and encounter a lot in algorithmic problem solving.

Bitwise AND (&)

Given two integers x and y, the expression x & y is evaluated by doing the following:

First write the two integers in binary, in each column look at the bit in x and y, if both bits are 1, then the respective resulting bit will also be 1. Otherwise, if any bit in that column is 0, then the respective resulting bit will be 0.

This intermediate operation can be thought of as the boolean operator && and functions the same way.

Example

21 & 37 = 5

1  0 1 0 1 0 1
2& 1 0 0 1 0 1
3=============
4  0 0 0 1 0 1

Bitwise OR (|)

Given two integers x and y, the expression x | y is evaluated by doing the following:

First write the two integers in binary, in each column look at the bit in x and y, if any bit is 1, then the respective resulting bit will also be 1. Otherwise, if both bits in that column are 0, then the respective resulting bit will be 0.

This intermediate operation can be thought of as the boolean operator || and functions the same way.

Example

21 | 37 = 53

1  0 1 0 1 0 1
2| 1 0 0 1 0 1
3=============
4  1 1 0 1 0 1

Bitwise XOR (^)

Given two integers x and y, the expression x ^ y (not to be confused with exponents) is evaluated by doing the following:

First write the two integers in binary, in each column look at the bit in x and y, if the two bits are the same, then the respective resulting bit will be 0. Otherwise, if the two bits in that column are different, then the respective resulting bit will be 1.

This operation is also called the "exclusive or" operation.

Example

21 ^ 37 = 24

1  0 1 0 1 0 1
2^ 1 0 0 1 0 1
3=============
4  1 1 0 0 0 0

Notice that the 3 operations AND(&), OR(|), and XOR(^) are commutative. This means that the order of elements doesn't matter (i.e. x & y == y & x)

Bitwise Shifts (<< and >>)

Given two integers x and y, the expression x << y (bitwise left shift) is evaluated by doing the following:

Take the integer x and write it in binary and move all the 1 bits to the left y spaces.

This operation essentially takes the integer x and multiplies it by 2 exactly y times.

Example

21 << 2 = 84

1   0 0 1 0 1 0 1
2<<             2
3================
4   1 0 1 0 1 0 0

Similarly, the expression x >> y (bitwise right shift) is evaluated by doing the following:

Take the integer x, write it in binary, and move all the 1 bits to the right y spaces.

Notice that it might be possible for some bits to "disappear" if you move it past the "ones" bit.

In this case, we'll let them disappear.

It may be helpful to think of this operation as taking the integer x, and dividing it by 2 exactly y times. Here, the division is floor division, meaning we discard any remainders.

Example

21 >> 3 = 2

1   0 0 1 0 1 0 1
2>>             3
3================
4   0 0 0 0 0 1 0

Bitmask

Now, finally, what's a mask? We can construct a binary number such that a certain digit is set to 1 and other digits are all set to 0. This creates a "mask" that when we AND it to another binary it "turns off" (set to 0) all digits except the 1 digit in the mask.

Encoding the state

How is this all useful to algorithms and dynamic programming? The answer is to encode states. Remember in backtracking problems like permutation, we used a used boolean array to keep track of state, i.e. which character has been used. If we were to memoize the state, we need to somehow serialize the array, e.g. turning it into a string which has extra computational cost. With a bitmask, the serialization from boolean array to integer is already done for you - it's just an integer. This reduces memory consumption and makes the algorithm faster and simpler.

For instance, let us say that we are on a graph, and the problem requires us to remember all the nodes that we visited. We can encode each node as a boolean bit value as part of our bitmask and use that value as our dp state.

In the example, we have visited node 1 and node 4. We encode that as 1001 which is equal to 2^0 + 2^3 = 9.

Bitmask questions can vary and additional dp states may be required so the code provided will loosely follow what a bitmask type question will look like.

Common bit manipulation operations

Typically, these problems use a recursive function as well in order to do dp. Below is some pseudocode to outline this idea. Here are some useful operations to keep in mind:

  1. (1 << i) -> simply 2 to the power of i, or an integer where the only bit that is 1 is the ith bit.

  2. (bitmask & (1 << i)) -> this operation checks if the ith bit in the bitmask is set to 1 or not

  3. (bitmask ^ (1 << i)) -> this operation toggles the ith bit in the bitmask to 1 if it was 0 or to 0 if it was 1.

Here's another example.

Use bitmask to generate all subsets:

1nums = [1, 2, 3]
2n = len(nums)
3subsets = []
4for mask in range(2 ** n):
5    subset = []
6    for i in range(n):
7        if mask & (1 << i):
8            subset.append(nums[i])
9    subsets.append(subset[:])
10
11# subsets = [[], [1], [2], [1, 2], [3], [1, 3], [2, 3], [1, 2, 3]]

We didn't need to use backtracking for this. Pretty cool, eh!

Bitmask and Dynamic Programming

Now here is pseudocode to show how to use bitmask for dynamic programming.

1function f(int bitmask, int [] dp) {
2    if calculated bitmask {
3        return dp[bitmask];
4    }
5    for each state you want to keep track of {
6        if current state not in mask {
7            temp = new bitmask;
8            dp[bitmask] = max(dp[bitmask], f(temp,dp) + transition_cost);
9        }
10    }
11    return dp[bitmask];
12}

Bitmask DP is often used to solve problems that involve finding the shortest path to visit all nodes in a graph. The bitmask is used to keep track of which nodes have been visited. The state transition is to visit a node that has not been visited yet. The transition cost is the cost to go from the current node to the next node. Let's look at an example in the next article.


Got a question?ย Ask the Teaching Assistantย anything you don't understand.

Still not clear? Ask in the Forum, ย Discordย orย Submitย the part you don't understand to our editors.

โ†
โ†‘TA ๐Ÿ‘จโ€๐Ÿซ