Backtracking Template

Prereq: DFS with States

Combinatorial search problems

Combinatorial search problems involve finding groupings and assignments of objects that satisfy certain conditions. Finding all permutations, combinations, subsets and solving sudoku are classic combinatorial problems. The time complexity of combinatorial problems often grows rapidly with the size of the problem. Feel free to go back to math basics section for a review.

The algorithm we use to solve a combinatorial search problem is often called backtracking.

Backtracking == DFS on a tree

In combinatorial search problems, the search space (a fancy way of saying all the possibilities to search) is in the shape of a tree. The tree that represents all the possible states is called a state-space tree (because it represents all the possible states in the search space).

Below are the state-space trees for all 2-letter words composed using only 'a' and 'b':

Each node of the state-space tree represents a state we can reach in a combinatorial search (by making a particular combination). Leaf nodes are the solutions to the problem. Solving combinatorial search problems boils down to DFS on the state-space tree. Since the search space can be quite large, we often have to "prune" the tree, i.e. discard branches and stop further traversals. This is why it's often called backtracking.

Difference between previous DFS problems and backtracking

If you had followed the content in order, you would have gone through quite a few DFS-on-tree problems. The main difference between those problems and the backtracking problems is that in backtracking, we are not given a tree to traverse but rather we construct the tree/ generate the edges and tree nodes as we go. At each step of a backtracking algorithm, we write logic to find the edges and child nodes. This may sound abstract but I promise itโ€™ll be much clearer once we start seeing a couple of problems.

How to implement a backtracking algorithm

Draw the tree, draw the tree, draw the tree!!!

Draw a state-space tree to visualize the problem. A small test case that's big enough to reach at least one solution (leaf node). We can't stress how important this is. Once you draw the tree, the rest of the work becomes so much easier - simply traverse the tree depth-first.

When drawing the tree, bear in mind:

  1. how do we know if we have reached a solution?
  2. how do we branch (generate possible children)?

Then, apply the following backtracking template:

1function dfs(start_index, path):
2  if is_leaf(start_index):
3    report(path)
4    return
5  for edge in get_edges(start_index):
6    path.add(edge)
7    dfs(start_index + 1, path)
8    path.pop()

start_index is used to keep track of the current level of the state-space tree we are in.

edge is the choice we make; the string a, b in the above state-space trees.

The main logic we have to fill out are

  1. is_leaf
  2. get_edges

which correspond to the two questions above.

Notice how very similar this is to the Ternary Tree Path code we've seen in DFS with States module. That problem has an explicit tree. For combinatorial search problems, we have to find our own tree.

Note that the is_leaf and get_edges helper functions can be just one line of code, in which case it wouldn't be necessary to separate into another function.

Time and space complexity

We visit each node of the state-space tree exactly once so the time complexity of a backtracking algorithm is proportional to the number of nodes in the state-space tree. The number of nodes in a tree can be calculated by multiplying number of children of each node ^ height of the tree.

The space complexity of a backtracking algorithm is typically the height of the tree because that's where the DFS recursive call stack is of maximum height and uses the most memory.

Combinatorial Example Problem

Find all n-letter words composed by 'a' and 'b', return them in a list of strings in lexicographical order.

Input: 2
Output: ["aa", "ab", "ba", "bb"]

Input: 4
Output: ["aaaa", "aaab", "aaba", "aabb", "abaa", "abab", "abba", "abbb", "baaa", "baab", "baba", "babb", "bbaa", "bbab", "bbba", "bbbb"]

Solution

We can start from an empty string and add letters repeatedly until we get to the n length. Each time we add a letter, we can pick from either a or b.

Visualization

Draw the state-space tree when n=2.

We will reach leaf nodes (solution) with values "aa", "ab", "ba", "bb".

Implementation

Applying the backtracking template to get the solution.

is_leaf: when word is n letters, we have reached a leaf (solution).
get_edges: each letter is either 'a' or 'b'.

1from typing import List
2
3def letter_combination(n):
4    def dfs(start_index, path):
5        if start_index == n:
6            res.append(''.join(path))
7            return
8        for letter in ['a', 'b']:
9            path.append(letter)
10            dfs(start_index+1, path)
11            path.pop()
12
13    res = []
14    dfs(0, [])
15    return res
16
17if __name__ == '__main__':
18    n = int(input())
19    res = letter_combination(n)
20    for line in sorted(res):
21        print(line)
22
1import java.util.ArrayList;
2import java.util.Collections;
3import java.util.List;
4import java.util.Scanner;
5import java.util.stream.Collectors;
6
7class Solution {
8    private static void dfs(Integer n, List<String> res, int startIndex, List<Character> path) {
9        if (startIndex == n) {
10          res.add(path.stream()
11                    .map(e -> e.toString())
12                    .collect(Collectors.joining()));
13          return;
14        }
15        for (char letter : new char[] {'a', 'b'}) {
16          path.add(letter);
17          dfs(n, res, startIndex+1, path);
18          path.remove(startIndex);
19        }
20    }
21
22    public static List<String> letterCombination(Integer n) {
23        List<String> res = new ArrayList<>();
24        dfs(n, res, 0, new ArrayList<>());
25        return res;
26    }
27
28    public static void main(String[] args) {
29        Scanner scanner = new Scanner(System.in);
30        int n = Integer.parseInt(scanner.nextLine());
31        scanner.close();
32        List<String> res = letterCombination(n);
33        Collections.sort(res);
34        for (String line : res) {
35            System.out.println(line);
36        }
37    }
38}
39
1function dfs(n, res, startIndex, path) {              
2    if (startIndex == n) {
3      res.push(path.join(''));
4      return;
5    }
6    ["a", "b"].forEach(
7      letter => {
8        path.push(letter);
9        dfs(n, res, startIndex+1, path)
10        path.pop();
11      }
12    )
13}
14
15function letterCombination(n) {
16    var res = [];
17    dfs(n, res, 0, []);
18    return res;
19}
20
21function* main() {
22    const n = parseInt(yield);
23    const res = letterCombination(n);
24    res.sort();
25    for (const line of res) {
26        console.log(line);
27    }
28}
29
30class EOFError extends Error {}
31{
32    const gen = main();
33    const next = (line) => gen.next(line).done && process.exit();
34    let buf = '';
35    next();
36    process.stdin.setEncoding('utf8');
37    process.stdin.on('data', (data) => {
38        const lines = (buf + data).split('\n');
39        buf = lines.pop();
40        lines.forEach(next);
41    });
42    process.stdin.on('end', () => {
43        buf && next(buf);
44        gen.throw(new EOFError());
45    });
46}
47
1#include <algorithm> // sort
2#include <iostream> // cin, cout, streamsize
3#include <limits> // numeric_limits
4#include <string> // string
5#include <vector> // vector
6
7void dfs(int n, std::vector<std::string> &res, int startIndex, std::string path) {
8    if (startIndex == n) {
9      std::string s(path.begin(), path.end());
10      res.emplace_back(s);
11      return;
12    }
13    for (std::string letter : {"a", "b"}) {
14      dfs(n, res, startIndex+1, path+letter);
15    }
16}
17
18std::vector<std::string> letter_combination(int n) {
19    std::vector<std::string> res;
20    dfs(n, res, 0, "");
21    return res;
22}
23
24void ignore_line() {
25    std::cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
26}
27
28int main() {
29    int n;
30    std::cin >> n;
31    ignore_line();
32    std::vector<std::string> res = letter_combination(n);
33    std::sort(res.begin(), res.end());
34    for (const std::string& line : res) {
35        std::cout << line << '\n';
36    }
37}
38

Time complexity

For each node there are a maximum of 2 children. The height of the tree is n. The number of nodes is 2^n (see the "perfect binary tree" section of Everything about trees for a quick review). It takes O(n) to join the n characters at each leaf node. So the overall time complexity is O(2^n*n).

Space complexity

The height of the tree is O(n) so the space complexity is O(n).

Summary

Backtracking is one of the most headache-inducing category of interview questions. Now that you've seen it, it isn't too bad, right? It becomes mechanical once you master tree drawing and applying the template. Also did I tell you backtracking + memoization = dynamic programming? aka the hardest category of interview questions. Crazy how far we've come. In the next few sections, we will gradually evolve the basic template we have as we add more complexity.


Still not clear?ย Submitย the part you don't understand to our editors.