Topological Sort

Prereq: Breadth First Search Review

Topological sort or topological ordering of a directed graph is an ordering of nodes such that every node appears in the ordering before all the nodes it points to. Topological sort is not unique. For example, for a graph like this

Both [4, 5, 2] and [5, 4, 2] are valid topological sort.

And for the following graph:

Task 3 is completed independent of task 2 and 4, and it can be anywhere in the order as long as it is before task 1 which depends on it. All of the following ordering are valid topological orderings.

[4, 2, 3, 1], [4, 3, 2, 1], [3, 4, 2, 1]

Graph with Cycles Do Not Have Topological Ordering

Cyclic graph

It should be obvious that if a graph with a cycle does not have a topological ordering. In the above example, 2 has to come before 5 which has come before 4 which has to come before 2 which has to come before 5... which is impossible.

Circular dependency is a common problem in real-world software engineering:

Kahn's Algorithm

To obtain a topological order, we can use Kahn's algorithm which is very similar to Breadth First Search.

In order to understand why we must use this algorithm we first look at the problem it is trying to solve.

Given a directed graph does there exist a way to remove the nodes such that each time we remove a node we guarantee that no other nodes point to that particular node?

For this algorithm we systematically remove one node at a time, each time removing a node such that no other nodes point to that node. If no such node exists then there exists a cycle and there does not exist a solution to the problem (because if there isn't a cycle then we can always find a node that does not depend on any node). For each of the neighbouring nodes to that node we then check if it has no other nodes that point to it. If there isn't one then we push it into the queue. Notice it is important to keep track of the number of nodes pointing to the node in question as we only push the node into the queue once all the parents have been removed. Here is a graphic to demonstrate the idea.

Now we show an implementation of these ideas in the code below.

// count the number of parents that a particular node has by adding to the node, need to initialize this before
// doing topological sort
public static void childrenCount(List<Integer> [] adj, int [] parentCount) {
    // loop through every node and add to the child node 1 parent
    for (int i = 0; i < adj.length; i++) {
        for (int j = 0; j < adj[i].size(); j++) {
            parentCount[adj[i].get(j)]++;
        }
    }
}
// topological sort the list
public static List<Integer> topologicalSort(List<Integer> [] adj, int [] parentCount){
    // return a list of the topological sorted list
    List<Integer> ret = new ArrayList<Integer>();
    // make a queue that we will use for our solution
    LinkedList<Integer> q = new LinkedList<Integer>();
    // loop through all nodes and add all nodes that do not have any parent
    for (int i = 1; i < parentCount.length; i++) {
        if (parentCount[i] == 0) {
            q.add(i);
        }
    }
    // perform bfs with queue, mostly the same as template bfs
    while(!q.isEmpty()) {
        int currentNode = q.poll();
        // add node to list to keep track of topological order
        ret.add(currentNode);
        // loop through every neighbour of current node
        for (int neighbour: adj[currentNode]) {
            // subtract one from every neighbour
            parentCount[neighbour]--;
            // once the number of parents reaches zero you add it to the queue
            if (parentCount[neighbour] == 0) {
              q.add(neighbour);
            }
        }
    }
    return ret;
}