Memoization

Prereq: Backtracking

Memoization is a fancy word for a simple concept (so is the case for a lot of things we learn in school). It means saving the previous function call result in a dictionary and reading from it when we do the exact same call again. And no I didn't spell it wrong. The word is meant to mean writing down on a "memo".

A classic example is calculating Fibonacci number.

1def fib(n):
2    if n == 0 or n == 1:
3        return n
4    return fib(n - 1) + fib(n - 2)
5

1int fib(int n) {
2    if (n == 0 || n == 1) {
3        return n;
4    }
5    return fib(n - 1) + fib(n - 2);
6}
7

1function fib(n) {
2    if (n === 0 || n === 1) {
3        return n;
4    }
5    return fib(n - 1) + fib(n - 2);
6}
7

1int fib(int n) {
2    if (n == 0 || n == 1) {
3        return n;
4    }
5    return fib(n - 1) + fib(n - 2);
6}
7

1int Fib(int n)
2{
3    if (n == 0 || n == 1) return n;
4    return Fib(n - 1) + Fib(n - 2);
5}
6

1func fib(n int) int {
2    if n == 0 || n == 1 {
3        return n
4    }
5    return fib(n-1) + fib(n-2)
6}
7

This results in a lot of repeated computations

The solution is simply saving previous results in a map of function arguments to results (the "memo"), checking it, and returning previous results if it has been done before. Otherwise, we carry out the computation and save the results in the map.

1def fib(n, memo):
2    if n in memo: # check in memo, if found, retrieve and return right away
3        return memo[n]
4
5    if n == 0 or n == 1:
6        return n
7
8    res = fib(n - 1, memo) + fib(n - 2, memo)
9
10    memo[n] = res # save in memo before returning
11    return res
12

1int fib(int n, int[] memo) {
2    // check in memo, if found, retrieve and return right away
3    if (memo[n] != 0) return memo[n];
4
5    if (n == 0 || n == 1) return n;
6
7    int res = fib(n - 1, memo) + fib(n - 2, memo);
8
9    // save result to memo before returning
10    memo[n] = res;
11    return res;
12}
13

1function fib(n, memo) {
2    // check in memo, if found, retrieve and return right away
3    if (n in memo) return memo[n];
4
5    if (n === 0 || n === 1) return n;
6
7    const res = fib(n - 1, memo) + fib(n - 2, memo);
8
9    // save result to memo before returning
10    memo[n] = res;
11    return res;
12}
13

1int fib(int n, int memo[]) {
2    // check in memo, if found, retrieve and return right away
3    if (memo[n] != 0) return memo[n];
4
5    if (n == 0 || n == 1) return n;
6
7    int res = fib(n - 1, memo) + fib(n - 2, memo);
8
9    // save result to memo before returning
10    memo[n] = res;
11    return res;
12}
13

1int Fib(int n, int[] memo)
2{
3    // check in memo, if found, retrieve and return right away
4    if (memo[n] != 0) return memo[n];
5
6    if (n == 0 || n == 1) return n;
7
8    int res = Fib(n - 1, memo) + Fib(n - 2, memo);
9
10    // save result to memo before returning
11    memo[n] = res;
12    return res;
13}
14

1func fib(n int, memo int[]) int {
2    // check in memo, if found, retrieve and return right away
3    if val, ok := memo[n]; ok {
4        return val
5    }
6
7    if n == 0 || n == 1 {
8      return n
9    }
10
11    res := fib(n-1) + fib(n-2)
12
13    // save result to memo before returning
14    memo[n] = res
15    return res
16}
17

When to memoize

After you draw the state-space tree, if you see duplicate subtrees, then it might be a good time to consider memoization.

What to memoize

Think about the duplicate subtrees, what attribute(s) do they share? In the Fibonacci example, the duplicate subtrees have the same n value. Usually, the key to the memo is the start_index or any additional states that may appear multiple times during the recursion.

Time Complexity Analysis

The benefit of memoization is that we store the obtained information in our memo so that we can access it in constant time.

The time to do backtracking is proportional to the number of nodes in the state-space tree. However, with memoization, we avoid duplicate subtrees (e.g. the red subtrees in the 2nd and 3rd slides). Therefore, the actual number of nodes visited is proportional to the size of the memo array.

For the above Fibonacci example, the size of the memo is O(n) (space complexity) and for each node we do O(1) work to combine the results from child recursive calls. Therefore the overall time complexity is O(n).

Memoization is particularly useful for combinatorial problems that have large repeated state-space tree branches. We will see that in the next module.