Backtracking Template
Prereq: DFS with States
Combinatorial search problems
Combinatorial search problems involve finding groupings and assignments of objects that satisfy certain conditions. Finding all permutations, combinations, subsets, and solving Sudoku are classic combinatorial problems. The time complexity of combinatorial problems often grows rapidly with the size of the problem. Feel free to go back to the math basics section for a review.
The algorithm we use to solve a combinatorial search problem is often called backtracking.
Backtracking == DFS on a tree
In combinatorial search problems, the search space (a fancy way of saying all the possibilities to search) is in the shape of a tree. The tree that represents all the possible states is called a state-space tree (because it represents all the possible states in the search space).
Below are the state-space trees for all 2-letter words composed using only 'a' and 'b':
Each node of the state-space tree represents a state we can reach in a combinatorial search (by making a particular combination). Leaf nodes are the solutions to the problem. Solving combinatorial search problems boils down to DFS on the state-space tree. Since the search space can be quite large, we often have to "prune" the tree, i.e. discard branches and stop further traversals. This is why it's often called backtracking.
Difference between previous DFS problems and backtracking
If you had followed the content in order, you would have gone through quite a few DFS-on-tree problems. The main difference between those problems and the backtracking problems is that in backtracking, we are not given a tree to traverse but rather we construct the tree/ generate the edges and tree nodes as we go. At each step of a backtracking algorithm, we write logic to find the edges and child nodes. This may sound abstract but I promise it’ll be much clearer once we start seeing a couple of problems.
How to implement a backtracking algorithm
Draw the tree, draw the tree, draw the tree!!!
Draw a state-space tree to visualize the problem. A small test case that's big enough to reach at least one solution (leaf node). We can't stress how important this is. Once you draw the tree, the rest of the work becomes so much easier - simply traverse the tree depth-first.
When drawing the tree, bear in mind:
- how do we know if we have reached a solution?
- how do we branch (generate possible children)?
Then, apply the following backtracking template:
function dfs(start_index, path): if is_leaf(start_index): report(path) return for edge in get_edges(start_index): path.add(edge) dfs(start_index + 1, path) path.pop()
start_index
is used to keep track of the current level of the state-space tree we are in.
edge
is the choice we make; the string a
, b
in the above state-space trees.
The main logic we have to fill out is
is_leaf
get_edges
which correspond to the two questions above.
Notice how very similar this is to the Ternary Tree Path code we've seen in DFS with States module. That problem has an explicit tree. For combinatorial search problems, we have to find our own tree.
Note that the is_leaf
and get_edges
helper functions can be just one line of code, in which case it wouldn't be necessary to separate into another function.
Time and space complexity
We visit each node of the state-space tree exactly once, so the time complexity of a backtracking algorithm is proportional to the number of nodes in the state-space tree.
The number of nodes in a tree can be calculated by multiplying number of children of each node ^ height of the tree
.
The space complexity of a backtracking algorithm is typically the height of the tree because that's where the DFS recursive call stack is of maximum height and uses the most memory.
Combinatorial Example Problem
Given a non-negative integer n
, find all n
-letter words composed by 'a' and 'b', return them in a list of strings in lexicographical order.
Input: 2
Output: ["aa", "ab", "ba", "bb"]
Input: 4
Output: ["aaaa", "aaab", "aaba", "aabb", "abaa", "abab", "abba", "abbb", "baaa", "baab", "baba", "babb", "bbaa", "bbab", "bbba", "bbbb"]
Solution
We can start from an empty string and add letters repeatedly until we get to the n
length. Each time we add a letter, we can pick from either a
or b
.
Visualization
Draw the state-space tree when n=2
.
We will reach leaf nodes (solution) with values "aa", "ab", "ba", "bb".
Implementation
Applying the backtracking template to get the solution.
is_leaf
: when word
is n
letters, we have reached a leaf (solution).
get_edges
: each letter is either 'a' or 'b'.
1from typing import List
2
3def letter_combination(n: int) -> List[str]:
4 res: List[str] = []
5
6 def dfs(start_index: int, path: List[str]) -> None:
7 if start_index == n:
8 res.append("".join(path))
9 return
10 for letter in "ab":
11 path.append(letter)
12 dfs(start_index + 1, path)
13 path.pop()
14
15 dfs(0, [])
16 return res
17
18if __name__ == "__main__":
19 n = int(input())
20 res = letter_combination(n)
21 for line in sorted(res):
22 print(line)
23
1import java.util.ArrayList;
2import java.util.Collections;
3import java.util.List;
4import java.util.Scanner;
5import java.util.stream.Collectors;
6
7class Solution {
8 private static void dfs(Integer n, List<String> res, int startIndex, List<Character> path) {
9 if (startIndex == n) {
10 res.add(path.stream()
11 .map(e -> e.toString())
12 .collect(Collectors.joining()));
13 return;
14 }
15 for (char letter : new char[] {'a', 'b'}) {
16 path.add(letter);
17 dfs(n, res, startIndex + 1, path);
18 path.remove(startIndex);
19 }
20 }
21
22 public static List<String> letterCombination(Integer n) {
23 List<String> res = new ArrayList<>();
24 dfs(n, res, 0, new ArrayList<>());
25 return res;
26 }
27
28 public static void main(String[] args) {
29 Scanner scanner = new Scanner(System.in);
30 int n = Integer.parseInt(scanner.nextLine());
31 scanner.close();
32 List<String> res = letterCombination(n);
33 res.stream().sorted().forEach(System.out::println);
34 }
35}
36
1"use strict";
2
3function dfs(n, res, startIndex, path) {
4 if (startIndex === n) {
5 res.push(path.join(""));
6 return;
7 }
8 for (const letter of "ab") {
9 path.push(letter);
10 dfs(n, res, startIndex + 1, path);
11 path.pop();
12 }
13}
14
15function letterCombination(n) {
16 const res = [];
17 dfs(n, res, 0, []);
18 return res;
19}
20
21function* main() {
22 const n = parseInt(yield);
23 const res = letterCombination(n);
24 res.sort();
25 for (const line of res) {
26 console.log(line);
27 }
28}
29
30class EOFError extends Error {}
31{
32 const gen = main();
33 const next = (line) => gen.next(line).done && process.exit();
34 let buf = "";
35 next();
36 process.stdin.setEncoding("utf8");
37 process.stdin.on("data", (data) => {
38 const lines = (buf + data).split("\n");
39 buf = lines.pop();
40 lines.forEach(next);
41 });
42 process.stdin.on("end", () => {
43 buf && next(buf);
44 gen.throw(new EOFError());
45 });
46}
47
1#include <algorithm>
2#include <iostream>
3#include <limits>
4#include <string>
5#include <vector>
6
7void dfs(int n, std::vector<std::string>& res, int startIndex, std::string& path) {
8 if (startIndex == n) {
9 res.emplace_back(path);
10 return;
11 }
12 for (char letter : {'a', 'b'}) {
13 path.push_back(letter);
14 dfs(n, res, startIndex + 1, path);
15 path.pop_back();
16 }
17}
18
19std::vector<std::string> letter_combination(int n) {
20 std::vector<std::string> res;
21 std::string path;
22 dfs(n, res, 0, path);
23 return res;
24}
25
26void ignore_line() {
27 std::cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
28}
29
30int main() {
31 int n;
32 std::cin >> n;
33 ignore_line();
34 std::vector<std::string> res = letter_combination(n);
35 std::sort(res.begin(), res.end());
36 for (const std::string& line : res) {
37 std::cout << line << '\n';
38 }
39}
40
1package main
2
3import (
4 "bufio"
5 "fmt"
6 "os"
7 "sort"
8 "strconv"
9 "strings"
10)
11
12func dfs(n int, res *[]string, startIndex int, path []string) {
13 if startIndex == n {
14 *res = append(*res, strings.Join(path, ""))
15 return
16 }
17 for _, letter := range []string{"a", "b"} {
18 dfs(n, res, startIndex+1, append(path, letter))
19 }
20}
21
22func letterCombination(n int) []string {
23 var result []string
24 dfs(n, &result, 0, nil)
25 return result
26}
27
28func main() {
29 scanner := bufio.NewScanner(os.Stdin)
30 scanner.Scan()
31 n, _ := strconv.Atoi(scanner.Text())
32 res := letterCombination(n)
33 sort.Strings(res)
34 for _, row := range res {
35 fmt.Println(row)
36 }
37}
38
Time complexity
For each node there are a maximum of 2 children. The height of the tree is n
. The number of nodes is 2^n-1
or O(2^n)
, (see the "perfect binary tree" section of Everything about trees for a quick review).
It takes O(n)
to join the n
characters at each leaf node.
So the overall time complexity is O(2^n*n)
.
Space complexity
We store 2^n
strings in total, each with a length of n
so this gives us O(2^n*n)
space. In addition, our recursion
depth is O(n)
. Adding the two together, we still get a space complexity of O(2^n*n)
.
Summary
Backtracking is one of the most headache-inducing category of interview questions. Now that you've seen it, it isn't too bad, right? It becomes mechanical once you master tree drawing and applying the template. Also did I tell you backtracking + memoization = dynamic programming? aka the hardest category of interview questions. Crazy how far we've come. In the next few sections, we will gradually evolve the basic template we have as we add more complexity.