Topological Sort | Topological Order

Prereq: Breadth First Search Review

What is a topological order?

Topological sort or topological ordering of a directed graph is an ordering of nodes such that every node appears in the ordering before all the nodes it points to. Topological sort is not unique. For example, for a graph like this

Both [4, 5, 2] and [5, 4, 2] are valid topological sort.

And for the following graph:

Task 3 is completely independent of task 2 and 4, and it can be anywhere in the order as long as it is before task 1 which depends on it. All of the following ordering are valid topological orderings.

[4, 2, 3, 1], [4, 3, 2, 1], [3, 4, 2, 1]

Graphs with Cycles Do Not Have Topological Ordering

Cyclic graph

It should be obvious that if a graph with a cycle does not have a topological ordering. In the above example, 2 has to come before 5 which has come before 4 which has to come before 2 which has to come before 5... which is impossible.

Circular dependency is a common problem in real-world software engineering. If you have experience with web development, you probably have seen dreaded errors like these πŸ˜…:

Kahn's Algorithm

To obtain a topological order, we can use Kahn's algorithm which is very similar to Breadth First Search.

In order to understand why we must use this algorithm we first look at the problem it is trying to solve.

Given a directed graph does there exist a way to remove the nodes such that each time we remove a node we guarantee that no other nodes point to that particular node?

For this algorithm we systematically remove one node at a time, each time removing a node such that no other nodes point to that node. If no such node exists then there exists a cycle and there does not exist a solution to the problem (because if there isn't a cycle then we can always find a node that does not depend on any other node). For each of the neighbouring nodes to that node we then check if it has no other nodes that point to it. If there isn't one then we push it into the queue. Notice it is important to keep track of the number of nodes pointing to the node in question as we only push the node into the queue once all the parents have been removed. Here is a graphic to demonstrate the idea.

Let's summarize the steps here:

  1. Initialize a hashmap of node to parent count.
  2. Go through the nodes, count how many parents each node has (a parent node means another node pointing to the current).
  3. Push the nodes with 0 parents into the queue.
  4. Pop each node from the queue, subtract 1 from the parent count of each node it points to.
  5. If a node's parent count drops to 0, then push it into the queue.
  6. repeat until the queue is empty. If the queue is not empty, then there must be a cycle.

Topological Sort vs BFS

Notice the topological sort algorithm is very similar to BFS. The main difference is that we only push nodes with 0 parents into the queue in topological sort whereas in BFS we push all the neighboring nodes into the queue.

Topological Sort Implementation in Python, Java, Javascript and C++

Similar to BFS, we keep things short and clear by separating the logic into functions count_parents (step 1) and topo_sort (step 2-5).

1from collections import deque
2
3def count_parents(graph):
4    counts = { node: 0 for node in graph }
5    for parent in graph:
6        for node in graph[parent]:
7            counts[node] += 1
8    return counts
9
10
11def topo_sort(graph):
12    res = []
13    q = deque()
14    counts = count_parents(graph)
15    for node in counts:
16        if counts[node] == 0:
17            q.append(node)
18    while len(q) > 0:
19        node = q.popleft()
20        res.append(node)
21        for child in graph[node]:
22            counts[child] -= 1
23            if counts[child] == 0:
24                q.append(child)
25    return res if len(graph) == len(res) else None
26
1// count the number of parents that a particular node has by adding to the node, need to initialize this before
2public static <T> Map<T, Integer> countParents(Map<T, List<T>> graph) {
3    Map<T, Integer> counts = new HashMap<>();
4    graph.keySet().forEach(node -> {
5        counts.put(node, 0);
6    });
7    // loop through every node and add to the child node 1 parent
8    graph.entrySet().forEach(entry -> {
9        for (T node : entry.getValue()) {
10            counts.put(node, counts.get(node) + 1);
11        }
12    });
13    return counts;
14}
15
16// topological sort the list
17public static <T> List<T> topoSort(Map<T, List<T>> graph) {
18    // return a list of the topological sorted list
19    List<T> res = new ArrayList<>();
20    // make a queue that we will use for our solution
21    Queue<T> q = new ArrayDeque<>();
22    // loop through all nodes and add all nodes that do not have any parent
23    Map<T, Integer> counts = countParents(graph);
24    counts.entrySet().forEach(entry -> {
25        if (entry.getValue() == 0) {
26            q.add(entry.getKey());
27        }
28    });
29    // perform bfs with queue, mostly the same as template bfs
30    while (!q.isEmpty()) {
31        T node = q.poll();
32        // add node to list to keep track of topological order
33        res.add(node);
34        for (T child : graph.get(node)) {
35            // subtract one from every neighbour
36            counts.put(child, counts.get(child) - 1);
37            // once the number of parents reaches zero you add it to the queue
38            if (counts.get(child) == 0) {
39                q.add(child);
40            }
41        }
42    }
43    // check for cycle
44    return (graph.size() == res.size()) ? res : null;
45}
46
1function countParents(graph) {
2    const counts = new Map();
3    for (let node of graph.keys()) {
4        counts.set(node, 0);
5    }
6    for (let parent of graph.keys()) {
7        for (node of graph.get(parent)) {
8            counts.set(node, counts.get(node) + 1);
9        }
10    }
11    return counts;
12}
13
14function topoSort(graph) {
15    const res = [];
16    const q = [];
17    const counts = countParents(graph);
18    for (let node of counts.keys()) {
19        if (counts.get(node) == 0) {
20            q.push(node);
21        }
22    }
23    while (q.length > 0) {
24        const node = q.shift();
25        res.push(node);
26        for (let child of graph.get(node)) {
27            counts.set(child, counts.get(child) - 1);
28            if (counts.get(child) == 0) {
29                q.push(child);
30            }
31        }
32    }
33    return (graph.size === res.length) ? res : null;
34}
35
1template <typename T> std::unordered_map<T, int> count_parents(std::unordered_map<T, std::vector<T>> graph) {
2    std::unordered_map<T, int> counts;
3    for (auto entry : graph) {
4        counts[entry.first] = 0;
5    }
6    for (auto entry : graph) {
7        for (auto node : entry.second) {
8            counts[node] += 1;
9        }
10    }
11    return counts;
12}
13
14template <typename T> std::vector<T> topo_sort(std::unordered_map<T, std::vector<T>> graph) {
15    std::vector<T> res;
16    std::queue<T> q;
17    std::unordered_map<T, int> counts = count_parents(graph);
18    for (auto entry : counts) {
19        if (entry.second == 0) q.push(entry.first);
20    }
21    while (q.size() > 0) {
22        T node = q.front();
23        res.emplace_back(node);
24        for (T child : graph[node]) {
25            counts[child] -= 1;
26            if (counts[child] == 0) q.push(child);
27        }
28        q.pop();
29    }
30    if (graph.size() == res.size()) return res;
31    else {
32        std::cout << "Invalid topo sort: graph contains cycle" <<'\n';
33        return std::vector<T>();
34    }
35}
36

Topological Sort Example

An example problem is Task Scheduling.


Still not clear?Β SubmitΒ the part you don't understand to our editors.