Memoization
Prereq: Backtracking
Memoization is a fancy word for a simple concept (so is the case for a lot of things we learn in school). It means saving the previous function call result in a dictionary and reading from it when we do the exact same call again. And no I didn't spell it wrong. The word is meant to mean writing down on a "memo".
A classic example is calculating Fibonacci number.
1def fib(n):
2 if n == 0 or n == 1:
3 return n
4 return fib(n - 1) + fib(n - 2)
5
1int fib(int n) {
2 if (n == 0 || n == 1) {
3 return n;
4 }
5 return fib(n - 1) + fib(n - 2);
6}
7
1function fib(n) {
2 if (n === 0 || n === 1) {
3 return n;
4 }
5 return fib(n - 1) + fib(n - 2);
6}
7
1int fib(int n) {
2 if (n == 0 || n == 1) {
3 return n;
4 }
5 return fib(n - 1) + fib(n - 2);
6}
7
1int Fib(int n)
2{
3 if (n == 0 || n == 1) return n;
4 return Fib(n - 1) + Fib(n - 2);
5}
6
1func fib(n int) int {
2 if n == 0 || n == 1 {
3 return n
4 }
5 return fib(n-1) + fib(n-2)
6}
7
This results in a lot of repeated computations
The solution is simply saving previous results in a map of function arguments to results (the "memo"), checking it, and returning previous results if it has been done before. Otherwise, we carry out the computation and save the results in the map.
1def fib(n, memo):
2 if n in memo: # check in memo, if found, retrieve and return right away
3 return memo[n]
4
5 if n == 0 or n == 1:
6 return n
7
8 res = fib(n - 1, memo) + fib(n - 2, memo)
9
10 memo[n] = res # save in memo before returning
11 return res
12
1int fib(int n, int[] memo) {
2 // check in memo, if found, retrieve and return right away
3 if (memo[n] != 0) return memo[n];
4
5 if (n == 0 || n == 1) return n;
6
7 int res = fib(n - 1, memo) + fib(n - 2, memo);
8
9 // save result to memo before returning
10 memo[n] = res;
11 return res;
12}
13
1function fib(n, memo) {
2 // check in memo, if found, retrieve and return right away
3 if (n in memo) return memo[n];
4
5 if (n === 0 || n === 1) return n;
6
7 const res = fib(n - 1, memo) + fib(n - 2, memo);
8
9 // save result to memo before returning
10 memo[n] = res;
11 return res;
12}
13
1int fib(int n, int memo[]) {
2 // check in memo, if found, retrieve and return right away
3 if (memo[n] != 0) return memo[n];
4
5 if (n == 0 || n == 1) return n;
6
7 int res = fib(n - 1, memo) + fib(n - 2, memo);
8
9 // save result to memo before returning
10 memo[n] = res;
11 return res;
12}
13
1int Fib(int n, int[] memo)
2{
3 // check in memo, if found, retrieve and return right away
4 if (memo[n] != 0) return memo[n];
5
6 if (n == 0 || n == 1) return n;
7
8 int res = Fib(n - 1, memo) + Fib(n - 2, memo);
9
10 // save result to memo before returning
11 memo[n] = res;
12 return res;
13}
14
1func fib(n int, memo int[]) int {
2 // check in memo, if found, retrieve and return right away
3 if val, ok := memo[n]; ok {
4 return val
5 }
6
7 if n == 0 || n == 1 {
8 return n
9 }
10
11 res := fib(n-1) + fib(n-2)
12
13 // save result to memo before returning
14 memo[n] = res
15 return res
16}
17
When to memoize
After you draw the state-space tree, if you see duplicate subtrees, then it might be a good time to consider memoization.
What to memoize
Think about the duplicate subtrees, what attribute(s) do they share? In the Fibonacci example, the duplicate subtrees have the same n
value.
Usually, the key
to the memo
is the start_index
or any additional states
that may appear multiple times during the recursion.
Time Complexity Analysis
The benefit of memoization is that we store the obtained information in our memo
so that we can access it in constant time.
The time to do backtracking is proportional to the number of nodes in the state-space tree. However, with memoization, we avoid duplicate subtrees (e.g. the red subtrees in the 2nd and 3rd slides).
Therefore, the actual number of nodes visited is proportional to the size of the memo
array.
For the above Fibonacci example, the size of the memo
is O(n)
(space complexity) and for each node we do O(1)
work to combine the results from child recursive calls. Therefore the overall time complexity is O(n)
.
Memoization is particularly useful for combinatorial problems that have large repeated state-space tree branches. We will see that in the next module.